Database Management System: Unit II: Databases Design

Normal Forms

Databases Design - Database Management System

Normalization is the process of reorganizing data in a database so that it meets two basic requirements: 1) There is no redundancy of data (all data is stored in only one place), and 2) data dependencies are logical (all related data items are stored together)

Normal Forms

Normalization is the process of reorganizing data in a database so that it meets two basic requirements:

    1) There is no redundancy of data (all data is stored in only one place), and

    2) data dependencies are logical (all related data items are stored together)

Need for normalization

1) It eliminates redundant data.

2) It reduces chances of data error.

3) The normalization is important because it allows database to take up less disk space.

4) It also help in increasing the performance.

5) It improves the data integrity and consistency.

First Normal Form

The table is said to be in 1NF if it follows following rules –

i) It should only have single (atomic) valued attributes/columns.

ii) Values stored in a column should be of the same domain

iii) All the columns in a table should have unique names.

iv) And the order in which data is stored, does not matter.

Consider following Student table

Student

As there are multiple values of phone number for sid 1 and 3, the above table is not in 1NF. We can make it in 1NF. The conversion is as follows -

Second Normal Form

Before understanding the second normal form let us first discuss the concept of j functional dependency and prime and non prime attributes.

Concept of Partial Functional Dependency

Partial dependency means that a nonprime attribute is functionally dependent on part of a candidate key.

For example: Consider a relation R(A,B,C,D) with functional dependency {AB->CD,A->C}

Here (AB) is a candidate key because

(AB)+ = {ABCD} = {R}

Hence {A,B) are prime attributes and {C,D) are non prime attribute. In A->C, the non prime attribute C is dependent upon A which is actually a part of candidate key AB. Hence due to A->C we get partial functional dependency.

Prime and Non Prime Attributes

Prime attribute: An attribute, which is a part of the candidate-key, is known as a prime attribute.

Non-prime attribute: An attribute, which is not a part of the prime-key, is said to be a non-prime attribute.

Example : Consider a Relation R ={A,B,C,D) and candidate key as AB, the Prime attributes: A, B

Non Prime attributes: C, D

The Second Normal Form

For a table to be in the Second Normal Form, following conditions must be followed

 i) It should be in the First Normal form.

ii) It should not have partial functional dependency.

For example: Consider following table in which every information about a the Student is maintained in a table such as student id(sid), student name(sname), course id(cid) and course name(cname).

Student_Course

This table is not in 2NF. For converting above table to 2NF we must follow the following steps -

Step 1: The above table is in 1NF.

Step 2: Here sname and sid are associated similarly cid and cname are associated with each other. Now if we delete a record with sid=2, then automatically the course C++ will also get deleted. Thus,

sid->sname or cid->cname is a partial functional dependency, because {sid,cid} should be essentially a candidate key for above table. Hence to bring the above table to 2NF we must decompose it as follows:

Student:

Course:

Thus now table is in 2NF as there is no partial functional dependency

Third Normal Form

Before understanding the third normal form let us first discuss the concept of transitive dependency, super key and candidate key.

Concept of Transitive Dependency

A functional dependency is said to be transitive if it is indirectly formed by two functional dependencies. For example -

X -> Z is a transitive dependency if the following functional dependencies hold true:

X->Y

Y->Z

Concept of Super key and Candidate Key

Superkey: A super key is a set or one of more columns (attributes) to uniquely identify rows in a table.

Candidate key: The minimal set of attribute which can uniquely identify a tuple is known as candidate key. For example consider following table:

Superkeys

{RegID}

{RegID, RollNo}

{RegID,Sname}

{RollNo,Sname}

{RegID, RollNo,Sname}

Candidate Keys

{RegID}

{RollNo}

Third Normal Form

A table is said to be in the Third Normal Form when,

i) It is in the Second Normal form.(i.e. it does not have partial functional dependency)

ii) It doesn't have transitive dependency.

Or in other words

In other words 3NF can be defined as: A table is in 3NF if it is in 2NF and for each functional dependency

X-> Y

at least one of the following conditions hold :

i) X is a super key of table

ii) Y is a prime attribute of table

For example: Consider following table Student_details as follows -

Here

Super keys: {sid}, {sid,sname}, {sid,sname,zipcode}, {sid,zipcode,cityname}... and so on.

Candidate keys:{sid}

Non-Prime attributes: {sname,zipcode,cityname,state}

The dependencies can be denoted as,

sid->sname

sid->zipcode

zipcode->cityname

cityname->state

The above denotes the transitive dependency. Hence above table is not in 3NF. We can convert it into 3NF as follows:

Student

Zip

Example 2.11.1 Consider the relation R = {A, B, C, D, E, F, G, H, I, J} and the set of functional dependencies F= {{A, B} C, A {D, E}, BF, F {G, H}, D {I, J} }

1. What is the key for R? Demonstrate it using the inference rules.

2. Decompose R into 2NF, then 3NF relations.

Solution: Let,

A DE (given)

A D, A E

AsDIJ, AIJ

Using union rule we get

A DEIJ

As  A A

we get    A ADEIJ

Using augmentation rule we compute AB

AB ABDEIJ

But

ABC (given)

AB ABCDEIJ

B F (given) F GH ..         B GH (transitivity)

AB AGH is also true

Similarly  AB AF               BF (given)

Thus now using union rule

AB ABCDEFGHIJ

AB is a key

The table can be converted to 2NF as,

R1= (A, B, C)

R2= (A, D, E, I, J) smoot

R3= (B, F, G, H)

The above 2NF relations can be converted to 3NF as follows:

R1= (A, B, C)

R2= (A, D, E)

R1= (D, I, J)

R1= (B, E)

R3= (E, G, H).


Review Questions

1. What is database normalization? Explain the first normal form, second normal form and third normal form.AU: May-18, Marks 13; Dec.-15, Marks 16

2. What are normal forms. Explain the types of normal form with an example. AU: Dec.-14, Marks 16

3. What is normalization? Explain in detail about all Normal forms.   AU: May-19, Marks 13
4. Briefly discuss about the functional dependency concepts.    AU: May-19, Marks 13

Database Management System: Unit II: Databases Design : Tag: : Databases Design - Database Management System - Normal Forms