Information Processing Managers Association Newsletter

October 1987

Monthly Association Meeting Agenda

Date: November 5, 1987

Location: Kings Table
1818 E. 4th

Time: 12:00 Noon

TOPIC: "The Mission and Role of DIS"

Speaker: Will Wolf, DIS


Newsletter Table of Contents

October 1, 1987 Association Meeting Minutes

November 5, 1987, Association Meeting Agenda

Chairman's Message

October 14, 1987, Board Meeting Minutes

Michael Brackett Article on Data Redundancy


We want your input!

The IPMA Newsletter staff is always looking for articles of specific interest to our State information processing community.

Send your drafts to:
Kathy Marston, Editor
DIS - 1310 Jefferson


October 1, 1987 Association Meeting Minutes

Submitted by: Tara Wolff, IPMA Secretary

Darrel Riffe, Chairman, opened the meeting at 12:25 at our new location (The Kings Table).

Luncheon Speaker

Program chairman Jeff Boyce introduced Bob Shaw, a consultant from Browns Point Systems. Bob has a broad DP background and has worked for a number of organizations, including Boeing, General Dynamics, and Xerox. Bob's topic at the IPMA luncheon was ISDN, or Integrated Services Digital Network.

Bob's presentation addressed three areas:

(1) What ISDN Means

(2) How ISDN Came About

(3) ISDN Implications for Those With Signature Authority (i.e., planning or management responsibility)

Bob indicated that ISDN is a hot topic these days and that the amount of literature available is almost overwhelming. He concluded his presentation with five criteria for evaluating ISDN, which he called the 5 "C"s: cost, control, capability, connectivity, and complexity.

Business Meeting

There was no old business.

Darrel Riffe introduced Judy Bayard Cushing, who is a Computer Science Professor at The Evergreen State College (TESC). Judy talked about the importance of mixing academic learning with on-the-job practice, and about the TESC internship program. Judy encourages state agencies to contact her to find out more about the internship program. Judy also indicated that TESC will be producing some seminars that may be of interest to state employees. In December, Karl Anderson Healy will be conducting a 2-day workshop on Data Modeling (for $125). In addition, she alerted state employees to the fact that TESC plans to offer more short workshops, which will be open to interested state employees. To find out more about workshops or interns, call Judy at 866-6000 (ext. 6652).

Dr. Marti Ruben announced that 1987 flyers for the Information Systems Forum are now available. He indicated that pre-registration for the Forum is NOT required; however, advance notice of participation is requested. If you are a state employee, he suggested you contact your HRD designee (non-state employees should call Barbara Yandle at (206) 586-1343). The registration desk at the Forum will be open at 8:00 a.m. both days.

Darrel Riffe indicated that there are still openings for DP Managers Seminars at Pack Forest during the months of November and December. Some sessions are as much as 8 to 10 students light and may be in danger of cancellation if the number signing up for the seminar does not increase.

Bruce Spong presented sample name tags to be used at IPMA luncheons. Further research will be conducted as to cost and feasibility.

Darrel Riffe asked those in attendance how they liked the new accommodations (The Kings Table). Response was favorable; it was noted that parking was easier and the menu better.

The meeting adjourned at 1:20 p.m.

Treasurer's Report

Bank balance as of 10/1/87:

Checking $15,486.40

Savings $ 8,345.19


November 5, 1987, Association Meeting Agenda

1. Introduction of Guests

2. Guest Speaker

Will Wolf, Department of Information Services

"The Mission and Role of DIS"

3. Approval of Minutes

4. IPMA Board Report

5. ISB Announcements

6. Old Business

7. New Business

8. Correspondence

9. Other Comments/Announcements


Chairman's Message

The eighth annual IPMA Fall Forum was a huge success. We had excellent participation from the vendors and speakers. The topic sessions were timely and highly informative. Vendor displays were well done and, of course, the Forum gives all of us a chance for good conversations and fun. On behalf of the IPMA membership, a special thanks goes to all of the committee members who spent many, many hours planning and making the Forum possible. As many of you know, the job is not easy. Thanks also to the vendors who participated, the speakers, the coordinators, the registrants, the moderators, and all of the other participants and workers.

Despite the fact that the Tyee was under construction, the facilities, in my opinion, were better in some respects than in the past. (The Tyee did assure us that the construction would be completed by the date of the Forum, but it was apparent that it was not.) Most important, there was adequate room for everyone in all of the topic sessions. Adequate room was a problem in past Forums. Also, the vendors were all in a single room, and we did not have to travel the entire hotel in fear of missing one of the displays.

One of the new features was pre-arranged lunches and dinner. Both of the lunches turned out well and I encourage that we do that again. The dinner was not well attended. Even though about 150 members said they planned to go, fewer than 25 showed up, and the IPMA had to make up the cost for a guaranteed number of 75 persons. I suggest that a dinner not be planned again (those who attended had a great time with entertainment by the Pointless Sisters; they were great).

I am not sure what the attendance was, and won't know until the registration entries are counted, but it appears that attendance met at least last year's count.

Everyone should be reminded to complete the FORUM Evaluation Questionnaire and send it to the Department of Personnel, Employee Development and Training Division. Feedback is important for planning next year's Forum. We will also be asking the participating vendors for their comments. Again, thanks to everyone who helped make Forum '87 successful.


October 14, 1987 Board Meeting Minutes

By: Kathy Marston substituting for Tara Wolff, IPMA Secretary

Darrel Riffe, Chairman, opened the meeting at 11:15 a.m. with a quorum present (Rich Morgan, Gary Longmire, Jeff Boyce, and Glenn Medeiros. Jim Andersen arrived later.)

Darrel reported that as far as he knew, everything was going well for the FORUM. He mentioned that he had received a call from a vendor requesting the IPMA member mailing list. The vendor wanted to send out a flyer about their display at the FORUM. After some discussion it was decided that it would not be appropriate to do that without consulting the membership first and with time so short, that would not be possible. We agreed to pursue the idea for next year, for all vendors.

Darrel brought some sample name tags and the board agreed to pursue having them made for members who attend meetings with any regularity.

There was considerable discussion about what to do about a Professional Development Committee. The general consensus was that the reason no one had volunteered to take over the responsibility was that it requires a tremendous amount of time. Rich suggested that if we could contract for the implementation part of the seminars, finding speakers and coordinating times would not be nearly as much of a burden on anyone. Rich volunteered to arrange to have such a contractor come to the next board meeting and give us an overview of what can be done and what options are available.

There was some discussion about how the organization is set up to function and what policies are in place. Darrel volunteered to give the board an overview at the next meeting.

The meeting adjourned at 12:55 p.m.


Membership Input Sought

The Board is entertaining the idea (at the request of a vendor) of providing the names and addresses of IPMA members to the vendors who purchase space at the Fall FORUM starting in 1988. This would enable the vendors to distribute information ahead of time about the displays/presentations they are planning for FORUM. Anyone who has any objection to our doing this is invited to call Darrel Riffe at 753-2058.


Professional Development Workshop in Data Modeling at the Evergreen State College

As was announced at the last IPMA meeting, Evergreen has a two-day seminar coming up in Entity/Relationship Modeling and Data Modeling. This seminar will provide a thorough practical introduction to logical data modeling. Topics include E/R Modeling, logical database design, translating the E/R Model into a conceptual data model, developing user views, access modeling, view synthesis and reconciliation, relation models, normalization and translation to physical design.

The seminar is scheduled for December 4th and 11th, from 8:30 to 4:00 in Lecture Hall 1. The instructor is Karl Anderson Healy, an Assistant Vice President at Seattle 1st National Bank.

For more information, call Judy Bayard Cushing at 866-6652. Registration is $105 before November 2 and $125 after November 2. To register, call Conference Services at 866-6000, ext. 6192.


Data Redundancy

By: Michael Brackett, Washington State University

INTRODUCTION

Data redundancy is a term that is frequently used, often misused, and generally misunderstood. Statements like "redundant data is bad", "the relational model eliminates redundant data", and "redundant data is needed for operational efficiency" are often heard. To resolve this confusion the true meaning of data redundancy and its control must be reviewed.

Much of the discussion about redundant data has been brought about by the emergence of the subject database as a replacement for traditional application files. Traditional application files contained all the data that was needed by a particular application and those files were closely tied to the application. In many cases access to the application files was hard coded into the application programs. Separate applications had separate application files and the application tightly controlled the data in those files.

This environment was acceptable when applications were relatively isolated and supported only the core business functions such as personnel, financial, etc. The sets of data for those core business functions were relatively distinct and could be "owned" by a specific user and application. Generally there was very little redundant data between applications and there was very little problem with updating data as changes occurred. By today's standards life was rather easy.

However, as computers became less expensive and more powerful, applications began to spread throughout the enterprise. More business functions were supported and more data was collected and stored. Data that already existed in other application files was extracted, sorted, and merged with new data, and new application files were developed for the new applications. These new application files contained redundant data, which created a problem with updating that data consistently. This is where the real data redundancy problems started.

As the three schema concept and the relational model evolved, the concept of a subject database emerged as a solution to the problem of data redundancy. Data would be stored by subject so that all characteristics for that subject would be stored and maintained in one place. For instance, all data about an employee would be stored in one place. Any application that wanted employee data would access that single repository and obtain the necessary data. Any updates to employee data would be made to that single repository.

One of the claims of the subject database concept was that it eliminated all redundant data. However, those who looked closely at a correctly designed subject database could see that the data necessary for navigation between the subjects of data was stored redundantly. Also, as communication networks expand, personal computers and departmental computers proliferate, and information and knowledge bases become popular, the existence of redundant data becomes more obvious. The same data can occur in multiple locations on multiple databases; that data can be uploaded, downloaded, cross loaded, distributed, and dispersed.

DEFINITION OF DATA REDUNDANCY

So, what is the definition of redundant data? When should it be allowed and when should it be eliminated? How should it be controlled when it is allowed? These questions must be answered to have an effective and efficient database.

The dictionary gives two basic definitions of redundancy. The first is "superfluous or needless repetition that exceeds what is normal or necessary and can be eliminated without loss." The second is "a duplicate or backup to prevent failure or to recover from failure." Both of these definitions apply to data redundancy. Generally, data that meets the first definition should be eliminated, and data that meets the second definition should be maintained.

Therefore, data redundancy falls into two broad categories: that which is necessary and that which is unnecessary. Redundant data that is necessary must be maintained and controlled, and redundant data that is unnecessary must be eliminated. The key to maintaining necessary redundant data is controlling the redundancy. This is the major difference between yesterday's redundant data and today's redundant data. Yesterday's redundant data was largely uncontrolled and today's redundant data must be controlled.

TYPES OF DATA REDUNDANCY

With these definitions in place, the types of necessary data redundancy and their control can be reviewed. First, the data needed for navigation in a subject database is necessary redundant data and must be maintained. The data attributes of the primary key of a data entity and the data attributes of the foreign keys in its subordinate data entities are redundant data and are necessary to navigate between parent and subordinate data entities. The values of these keys must be consistently maintained for effective navigation.
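As an illustration only, the requirement that key values be consistently maintained can be sketched as a small check. The entities, key names, and values here are hypothetical and do not come from any particular database; the sketch simply looks for subordinate rows whose foreign key has no matching primary key in the parent.

```python
# Hypothetical data: names and values are illustrative only.
# Parent data entity, keyed by its primary key.
departments = {"D10": "Accounting", "D20": "Data Processing"}

# Subordinate data entity; dept_id is the foreign key back to the parent.
employees = [
    {"emp_id": "E1", "name": "Smith", "dept_id": "D10"},
    {"emp_id": "E2", "name": "Jones", "dept_id": "D20"},
    {"emp_id": "E3", "name": "Brown", "dept_id": "D30"},  # no matching parent
]


def orphaned_rows(children, parents, foreign_key):
    """Return child rows whose foreign key value has no matching primary key."""
    return [row for row in children if row[foreign_key] not in parents]


orphans = orphaned_rows(employees, departments, "dept_id")
print([row["emp_id"] for row in orphans])  # prints ['E3']
```

A row such as E3 cannot be navigated back to its parent, which is the kind of inconsistency that controlled maintenance of the keys is meant to prevent.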

Second, the data maintained in a distributed or dispersed database can be redundant. It is necessary redundancy because of operational efficiency and/or cost, and must be controlled very closely to assure that it is in sync on some prescribed interval. In other words, there must be a regular, formal update procedure for keeping the data in sync, and the users of that data must know what that procedure is so that they can perform their business activities accordingly.
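One way to picture a prescribed interval is a simple check of when a copy was last brought into sync. This is a minimal sketch under stated assumptions: the function name, the 24-hour interval, and the dates are all hypothetical, and a real update procedure would involve far more than a timestamp comparison.

```python
from datetime import datetime, timedelta


def is_within_interval(last_sync, now, interval):
    """True if the copy was synchronized no longer than `interval` ago."""
    return now - last_sync <= interval


nightly = timedelta(hours=24)            # hypothetical prescribed interval
last_sync = datetime(1987, 10, 1, 2, 0)  # hypothetical time of the last update

print(is_within_interval(last_sync, datetime(1987, 10, 1, 17, 0), nightly))  # prints True
print(is_within_interval(last_sync, datetime(1987, 10, 3, 8, 0), nightly))   # prints False
```

Users who know the interval can judge from a check like this whether the copy in front of them is still covered by the formal update procedure.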

Third, data that is downloaded from a primary repository to a secondary repository is redundant data, but differs from the redundant data in a distributed or dispersed database. It is fixed at the point in time when it is downloaded and is not updated according to a regular, formal update procedure. It is a set of data representing an instance of time.

At this point the terms primary and secondary repository need to be explained. A primary repository is a database that is maintained as close to reality as needed. In other words, the instance of the database is as close to the instance of reality as possible for the type of processing being performed. For example, the instance of data for a heart-lung machine during a heart transplant operation must be very close to the instance of reality of the patient, while the instance of data for a customer's account can be farther from the instance of reality for the customer.

A secondary repository is a database that is not maintained and may even be altered for processing purposes. For example, data about the receivables for an enterprise may be downloaded to a personal computer for analysis of trends. As soon as that data is downloaded it is considered to be out of sync with reality. In addition, as the analysis of trends proceeds, that data may be altered to evaluate "what if" alternatives for receivables and investment options.
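The relationship between a primary and a secondary repository might be sketched as follows. The account names, amounts, and the `download` helper are hypothetical; the point is only that the downloaded copy is fixed at a stated instance of time and can then be altered without affecting the primary repository.

```python
import copy
from datetime import datetime


def download(primary, as_of):
    """Copy the primary data into a secondary repository fixed at `as_of`."""
    return {"as_of": as_of, "data": copy.deepcopy(primary)}


# Primary repository (hypothetical receivables by customer).
receivables = {"ACME": 1200.00, "Zenith": 450.00}

snapshot = download(receivables, datetime(1987, 10, 1, 9, 0))

# The primary repository keeps tracking reality after the download.
receivables["ACME"] = 900.00

# The secondary copy may be altered for "what if" analysis
# without touching the primary repository.
snapshot["data"]["Zenith"] *= 2

print(snapshot["data"]["ACME"])  # prints 1200.0
print(receivables["Zenith"])     # prints 450.0
```

The `as_of` value records which instance of time the downloaded set represents, so it is not mistaken for current data.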

Fourth, historical data, such as audit trails and archived data, may be necessary redundant data. This data may be on an active database available for access or it may be archived on magnetic tape in a vault. In either case the data is an instance of the database that shows its status at a point in time. Generally this type of data is not altered because it represents reality at a point in time.

At this point the distinction between redundant data values and redundant data attributes needs to be explained. Data redundancy as used here means redundant data values representing the same instance of time for the same occurrence of a data entity. For example, a customer's current phone number that is stored with each order for that customer creates redundant data values (123-456-789) for the same instance of time (current) for the same occurrence (John J. Smith) of data entity (customer). These redundant data values are unnecessary and need to be removed.
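The phone number example can be sketched in a few lines. The order records and the `split_out_phone` helper are hypothetical, and the customer name stands in for a proper customer key purely for illustration; the sketch moves the current phone number out of the orders and into a single customer record.

```python
# Hypothetical orders, each carrying the customer's current phone number.
orders = [
    {"order_id": 1, "customer": "John J. Smith", "phone": "123-456-789", "item": "paper"},
    {"order_id": 2, "customer": "John J. Smith", "phone": "123-456-789", "item": "ribbon"},
]


def split_out_phone(rows):
    """Store each customer's phone once; orders keep only the customer key."""
    customers = {}
    slim_orders = []
    for row in rows:
        customers[row["customer"]] = {"phone": row["phone"]}
        slim_orders.append({k: v for k, v in row.items() if k != "phone"})
    return customers, slim_orders


customers, slim_orders = split_out_phone(orders)
print(customers)  # prints {'John J. Smith': {'phone': '123-456-789'}}
```

After the split, a change of phone number is made in one place, and every order for that customer sees the same value.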

However, a customer's payments and resulting account balance vary with each payment and represent different instances of the database. These values might be stored as an audit trail and the data attribute names may be the same, i.e., Payment Amount and Customer Balance, but the data is not redundant data. The data attribute name may be the same, but the data values represent different instances of time.
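The distinction between redundant values and different instances of time can be made concrete with a small check. The record layout, dates, and amounts are hypothetical; the sketch treats a value as redundant only when the same value is stored more than once for the same occurrence, attribute, and instance of time.

```python
from collections import Counter

# Each record: (occurrence, attribute, instance_of_time, value). Values are hypothetical.
records = [
    ("John J. Smith", "Phone Number", "current", "123-456-789"),
    ("John J. Smith", "Phone Number", "current", "123-456-789"),  # stored with a second order
    ("John J. Smith", "Customer Balance", "1987-09-01", 500.00),
    ("John J. Smith", "Customer Balance", "1987-10-01", 350.00),  # same attribute, later instance
]


def redundant_keys(rows):
    """Return (occurrence, attribute, time) keys whose value is stored more than once."""
    counts = Counter(rows)
    return sorted(row[:3] for row, n in counts.items() if n > 1)


print(redundant_keys(records))
# prints [('John J. Smith', 'Phone Number', 'current')]
```

The two Customer Balance records share an attribute name but are not reported, because they represent different instances of time.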

Fifth, the data needed for backup and recovery is considered an instance of the database. Each enterprise must decide what data it needs to back up, the frequency, the procedure, the number of generations, etc. Generally, this has been handled very well on mainframes and backbone production applications, and has been handled very poorly on minis and micros. One area that an enterprise needs to emphasize is the backup and recovery of critical data on minis and micros.
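The idea of keeping a fixed number of backup generations, each identified by the instance of time it represents, might be sketched as follows. The `BackupSet` class, the date stamps, and the account data are hypothetical and are meant only to illustrate the decisions listed above, not to describe a real backup procedure.

```python
import copy
from collections import deque


class BackupSet:
    """Keep the most recent `generations` backups, each labeled with a stamp."""

    def __init__(self, generations):
        # A deque with maxlen drops the oldest generation automatically.
        self._kept = deque(maxlen=generations)

    def back_up(self, stamp, data):
        self._kept.append((stamp, copy.deepcopy(data)))

    def stamps(self):
        return [stamp for stamp, _ in self._kept]

    def restore(self, stamp):
        for kept_stamp, data in self._kept:
            if kept_stamp == stamp:
                return copy.deepcopy(data)
        raise KeyError(stamp)


backups = BackupSet(generations=2)
account = {"balance": 100}

backups.back_up("1987-10-01", account)
account["balance"] = 80
backups.back_up("1987-10-02", account)
account["balance"] = 60
backups.back_up("1987-10-03", account)  # the 1987-10-01 generation is dropped

print(backups.stamps())               # prints ['1987-10-02', '1987-10-03']
print(backups.restore("1987-10-02"))  # prints {'balance': 80}
```

Each generation is a separate instance of the database, so the stamp is what tells a recovery which point in time is being restored.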

Sixth, the concept of information entities that is evolving may result in apparent storage of more than a multiple subject data entity which, in terms of data normalization, is un-normalized data. However, information entities contain the data that are used to manage the business.

This form of redundant data was not apparent when un-normalized data was in the form of inputs to or outputs from an application, such as documents, reports, etc. However, the concept of uploading documents and downloading reports electronically has resulted in the storage of redundant data. The data in information entities should be considered as instances of the database.

CONCLUSION

Data redundancy is the storage of the same data values for the same instance of time for the same occurrence of a data entity. Data that appears redundant can be necessary redundant data, unnecessary redundant data, or data for a different instance of time. Data representing a different instance of time is not considered to be redundant data.

Necessary redundant data includes the keys used for navigating the database, and the data maintained in distributed and dispersed databases. This type of redundant data must be maintained and controlled very carefully and users must be aware of the control procedures to assure that data is effectively and efficiently used.

Unnecessary redundant data includes the same data values for the same instance of time for the same occurrence of a data entity in the same database. Unnecessary redundant data includes data that is explicitly redundant and data that is implicitly redundant, such as the redundant data found in multiple characteristic data attributes. This type of redundant data must be removed from the database.

Different instances of data include backup data, data in information entities, downloaded data, and historical data. This type of data should be carefully defined so that it is not confused with data that is truly redundant. In addition, it should contain some type of a chronology stamp to indicate what instance of time it represents.

A successful database needs to be well planned, controlled, and documented in order to provide maximum support to the enterprise. One aspect of a successful database is the control of data redundancy. If unnecessary data redundancy is removed from the database, necessary data redundancy is properly controlled, and instances of data are properly defined, then a large step has been made in the direction of developing a successful database.