Normalization of Tables - From 1NF to 2NF
When converting an unnormalized table (i.e., one containing repeated groups) to
table that is in first normal form, one generally ends up with a table that
still has many problems. The Orders table given below is a good example that
illustrates these problems.
In our PowerPoint lecture notes Topics-Ch4Ch5.ppt we discussed the various normal forms
and the reasons for using them when designing our database tables. Namely, that they minimize the chance of update
anomalies occurring. Let's take a look at the Orders table given in Figure 5-7, p. 157 of our textbook.
The description of a specific part (DR93, for example) occurs twice in the table. This redundancy causes several problems.
It is wasteful of space, but more serious are the update anomalies that they cause.
- Making a change to a part description may require changes in many rows - more work, more time, more complicated logically,
increased chance of human error.
- A part that appears in many rows could have many different descriptions. This causes
inconsistencies, for example, how
would we find all of the information pertaining to a part given just its description.
- The primary key in this table is (OrderNum, PartNum). So if we wanted to insert a new part into the table, how would we
do it if there were not yet any orders for the part, i.e., what would we enter for OrderNum?
- What if we deleted an order. For example, what if order 21608 were deleted? Then we would lose information about part
AT94 - you would no longer know that part AT94 is an Iron.
Why do all these anomalies exist?
If you think about each of the four situations given above, you'll realize that what is causing the problems is that the
attribute Description depends on PartNum, but not on OrderNum.
That is the 2NF condition we call partial dependency.
There is an algorithm for removing partial dependencies.
- Form all subsets of the attributes making up the primary key.
- Begin a new table for each subset, using the subset as the primary key.
- Now, from the original table, add to each subset the attributes that depend on that subsets primary key.
- Name each of the new tables appropriately.
Let's see how this works for our Orders table. Remember, the primary key there was (OrderNum, PartNum). So after steps 1 and
2 we have three subsets (as tables):
(OrderNum,
(PartNum,
(OrderNum, PartNum,
Applying step 3: OrderDate only depends on OrderNum, so it gets put in the first subset (table). Description only depends on
PatNum, so it gets put in the second subset. Both NumOrdered and QuotedPrice depend on OrderNum and PartNum, so they get put
in the third subset. This, we get the resulting three tables:
(OrderNum, OrderDate)
(PartNum, Description)
(OrderNum, PartNum, NumOrdered, QuotedPrice)
And now, after naming the tables we have:
Now convince yourself that all of the update anomalies have been eliminated.