Normalization
Jeremy Vigil
August 24, 2002
DBM/350
Instructor
Vickie Martin
Normalization
Normalization
is removing anomalies from your data.
This can be insert, delete, or update anomalies. Below are explanations of 3 normalization
forms. Each normalization form reduces
the amount of redundancies and anomalies from the previous normalization form.
1nf
1nf
only requires you not to have any redundant columns. This is the most basic normalization. It is also common sense that you do not want redundant
columns. However, this only removes
redundant data it still leaves the problem of having to update multiple fields
for one change. See Below Example
|
|
|
|
|
|
|
|
|
Customer id |
Customer name |
Purchased Item |
Cost |
|
|
|
256 |
John |
Lawnmower |
500 |
|
|
|
257 |
George |
Lawnmower |
500 |
|
|
|
258 |
John |
Rake |
15 |
|
|
|
259 |
Hank |
Rake |
15 |
|
|
|
260 |
Henry |
Lawnmower |
500 |
|
|
|
261 |
Harry |
Drill |
50 |
|
|
|
262 |
Hank |
Drill |
50 |
|
|
|
|
|
|
|
|
2nf
2nf
helps with redundant and update anomalies.
The below example demonstrates the update anomaly. What if we want to sell an additional item
that no one has purchased yet. This can
be chain saw. If we add it to the above
table, we would be missing customer information. Below we can add the chain saw and price with out having to
update the customer information.
|
|
|
|
|
|
|
|
Customer id |
Customer name |
Purchased Item |
|
|
|
256 |
John |
Lawnmower |
|
|
|
257 |
George |
Lawnmower |
|
|
|
258 |
John |
Rake |
|
|
|
259 |
Hank |
Rake |
|
|
|
260 |
Henry |
Lawnmower |
|
|
|
261 |
Harry |
Drill |
|
|
|
262 |
Hank |
Drill |
|
|
|
|
|
|
|
|
|
|
Purchase Item |
Cost |
|
|
|
|
Lawnmower |
500 |
|
|
|
|
Drill |
50 |
|
|
|
|
Rake |
15 |
|
|
|
|
|
|
|
3nf
3nf reduces transitive
dependencies. This is another layer
added to 2nf, which further reduces the redundant data. Below is an example from the book.
|
|
|
|
|
|
|
|
|
|
Sid |
Building |
Fee |
|
|
|
|
|
100 |
Randolph |
1200 |
|
|
|
|
|
150 |
Ingersoll |
1100 |
|
|
|
|
|
200 |
Randolph |
1200 |
|
|
|
|
|
250 |
Pitkin |
1100 |
|
|
|
|
|
300 |
Randolph |
1200 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Sid |
Building |
|
Building |
Fee |
|
|
|
100 |
Randolph |
|
Randolph |
1200 |
|
|
|
150 |
Ingersoll |
|
Ingersoll |
1100 |
|
|
|
200 |
Randolph |
|
Pitkin |
1100 |
|
|
|
250 |
Pitkin |
|
|
|
|
|
|
300 |
Randolph |
|
|
|
|
|
|
|
|
|
|
|
|
Justification
I
would select 3nf to do normalization.
It reduces most of the redundant data and manipulation anomalies. Although Domain key takes care of all
redundant data, it would probably be too time consuming to implement. 1nf and 2nf do not reduce redundant data
enough to use.
Give example