# RELATIONAL DATABASE THEORY ON NORMAL FORMS

Click here for audio-text lecture (for both this unit and the previous one) and feed it to the speech agent
Click here for an audio lecture that can be played using RealPlayer

## FUNCTIONAL DEPENDENCY

X -> Y, or X functionally determines Y if and only if whenever two tuples of a relational instance r(R) of R agree on their X-value, they must also agree on the Y-value.

Example: SSN -> NAME

We say X determines Y, or Y is dependent on X

Example: {S#,P#} -> QTY

Since QTY depends on BOTH attributes S# and P#, this is called a FULL functional dependency.

Since SSN -> ADDRESS, this is called a partial functional dependency.

We also write:

S# P# -> QTY

# NORMALIZATION

The relation on the left-hand side is unnormalized and contains nested relations. The relation on the right-hand side is normalized. This is the first normal form.

# NORMAL FORMS

• A relation is in a "normal form" if it satisfies a certain set of constraints.

• First Normal Form (1NF): A relation's underlying domains contain atomic values only.

• Second Normal Form (2NF): A relation's every nonkey attribute is fully dependent on the primary key.

• Third Normal Form (3NF): A relation's nonkey attributes are: (a) mutually independent, and (b) fully dependent on the primary key.

• EVERY RELATION MUST BE IN 1NF. This is one of the basic properties of a relation. Not all 1NF relation is in 2NF. Not all 2NF relation is in 3NF.

# EXAMPLE OF NORMALIZATION

The following relation FIRST is in first normal form.
```S#       STATUS     CITY      P#    QTY
---      ------   --------   ---   ----
S1          20   London      P1   300
S1          20   London      P2   200
S1          20   London      P3   400
S1          20   London      P4   200
S1          20   London      P5   100
S1          20   London      P6   100
S2          10    Paris      P1   300
S2          10    Paris      P2   400
S3          10    Paris      P2   200
S4          20   London      P2   200
S4          20   London      P4   300
S4          20   London      P5   400
```

The FIRST relation has the following functional dependencies:

The functional dependencies in relations S, P and SP:

# UPDATE ANOMALIES

• Relation FIRST(S#,STATUS,CITY,P#,QTY)

• INSERT anomaly

We cannot insert (S5,20,London) into the relation FIRST.

• DELETE anomaly

If we delete (S3,10,Paris,P2,200), we lose the information that S3 is in Paris.

SOLUTION

• Decompose FIRST into two relations: SECOND(S#,STATUS,CITY) and SP(S#,P#,QTY)

• Decompose SECOND into two relations: SC(S#, CITY) and CS(CITY, STATUS)

• We end up with three 3NF relations SP, SC and CS.
Functional dependencies in the relations SECOND and SP.

# KINDS OF NORMAL FORMS

The following diagram illustrates the various kinds of normal forms.

# KINDS OF RELATIONS

(duplicated page)

• Base relations: The real relations. Called "base table" in SQL.

• Views: The virtual relations. A view is a named, derived relation.

• Snapshots: A snapshot is a real, not virtual, named derived relation.

• Query results: The final output relation from a specified query. It may not be named and has no permanent existence.

• Temporary relations: A nonpermanent named derived relation.

# INFERENCE RULES FOR FUNCTIONAL DEPENDENCIES

IR1 (Reflective rule) IF X contains Y, then X -> Y
IR2 (Augmentation rule) {X->Y} implies XZ -> YZ
IR3 (Transitive rule) {X->Y,Y->Z} imples X->Z
IR4 (Projection rule) {X->YZ} implies X->Y
IR5 (Union rule) {X->Y,X->Z} implies X->YZ
IR6 (Pseudotransitive rule) {X->Y,WY->Z} implies WX->Z

Inference rules IR1 to IR3 (Armstrong) are SOUND and COMPLETE
SOUND (inferred dependencies hold if the original set of dependencies hold for r(R))
COMPLETE (don't need more rules)