
What Is Normalization

Normalization is a database design technique that organizes data to minimize redundancy. It involves dividing large tables into smaller tables and linking them. The document discusses the various normal forms including 1NF, 2NF, 3NF, and BCNF. 1NF requires each cell contain a single value. 2NF requires tables be in 1NF and have no partial dependencies. 3NF requires tables be in 2NF with no transitive dependencies. BCNF is sometimes called 3.5NF and is free from certain types of redundancy even when a table is in 3NF. Normalization aims to reduce data anomalies and improve data integrity.
What is Normalization? 1NF, 2NF, 3NF & BCNF with Examples

What is Normalization?
Normalization is a database design technique which organizes tables in a
manner that reduces redundancy and dependency of data.

It divides larger tables into smaller tables and links them using relationships.

In this tutorial, you will learn:

• Database Normal Forms
• 1NF Rules
• What is a KEY?
• What is a Composite Key?
• 2NF Rules
• Database - Foreign Key
• What are transitive functional dependencies?
• 3NF Rules
• Boyce-Codd Normal Form (BCNF)

The inventor of the relational model, Edgar Codd, proposed the theory of
normalization with the introduction of First Normal Form, and he continued
to extend the theory with Second and Third Normal Form. Later he joined with
Raymond F. Boyce to develop the theory of Boyce-Codd Normal Form.

The theory of data normalization in SQL is still being developed further.
For example, there are discussions even on 6th Normal Form. However, in
most practical applications, normalization achieves its best in
3rd Normal Form. The evolution of normalization theories is illustrated
below-

Functional Dependency
The functional dependency is a relationship that exists between two attributes. It typically exists between t
attribute within a table.

X → Y

The left side of an FD is known as the determinant; the right side is known as the dependent.

For example:
Assume we have an employee table with attributes: Emp_Id, Emp_Name, Emp_Address.

Here the Emp_Id attribute can uniquely identify the Emp_Name attribute of the employee table, because if we know
the Emp_Id, we can tell the employee name associated with it.

Functional dependency can be written as:

Emp_Id → Emp_Name

We can say that Emp_Name is functionally dependent on Emp_Id.
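
One way to see what this means for actual rows is a small Python sketch that checks whether a dependency holds in a given set of rows. The function name holds and the sample rows are assumptions made up for illustration; only the column names come from the example above.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first dependent value seen for this determinant;
        # a later row with a different value breaks the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows for the employee table (not from the original text).
employees = [
    {"Emp_Id": 1, "Emp_Name": "Asha", "Emp_Address": "Pune"},
    {"Emp_Id": 2, "Emp_Name": "Ravi", "Emp_Address": "Delhi"},
    {"Emp_Id": 3, "Emp_Name": "Asha", "Emp_Address": "Agra"},
]

id_determines_name = holds(employees, ["Emp_Id"], ["Emp_Name"])
name_determines_id = holds(employees, ["Emp_Name"], ["Emp_Id"])
```

In this sample, Emp_Id → Emp_Name holds, while Emp_Name → Emp_Id does not, because two employees share a name. Note that a check on sample rows can only disprove a dependency; whether it holds in general is a design decision.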

Types of Functional dependency

1. Trivial functional dependency

o A → B is a trivial functional dependency if B is a subset of A.
o Dependencies such as A → A and B → B are also trivial.

Example:

Consider a table with two columns, Employee_Id and Employee_Name.
{Employee_Id, Employee_Name} → Employee_Id is a trivial functional dependency, as
Employee_Id is a subset of {Employee_Id, Employee_Name}.
Also, Employee_Id → Employee_Id and Employee_Name → Employee_Name are trivial dependencies.

2. Non-trivial functional dependency

o A → B is a non-trivial functional dependency if B is not a subset of A.
o When the intersection of A and B is empty, A → B is called completely non-trivial.

Example:
ID → Name
Name → DOB
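
The three cases above can be told apart with plain set operations. The following is a minimal Python sketch; the function name classify_fd is an assumption, not something defined in this tutorial.

```python
def classify_fd(lhs, rhs):
    """Classify the dependency lhs -> rhs by comparing the two attribute sets."""
    lhs, rhs = set(lhs), set(rhs)
    if rhs <= lhs:
        return "trivial"                 # B is a subset of A
    if lhs & rhs:
        return "non-trivial"             # B is not a subset of A, but they overlap
    return "completely non-trivial"      # A and B share no attribute


first = classify_fd({"Employee_Id", "Employee_Name"}, {"Employee_Id"})
second = classify_fd({"ID"}, {"Name"})
third = classify_fd({"ID", "Name"}, {"Name", "DOB"})
```

Here first is "trivial", second is "completely non-trivial", and third is "non-trivial", because Name appears on both sides while DOB appears only on the right.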

Database Normalization Examples -


Assume a video library maintains a database of movies rented out. Without
any normalization, all information is stored in one table as shown below.

Table 1

Here you see that the Movies Rented column has multiple values.

Database Normal Forms


Now let's move on to First Normal Form.

1NF (First Normal Form) Rules


• Each table cell should contain a single value.
• Each record needs to be unique.
• A relation is in first normal form if every attribute in it is single-valued. If a
relation contains a composite or multi-valued attribute, it violates first normal form.

The above table in 1NF-

Example 1 –

ID   Name   Courses
---------------------
1    A      c1, c2
2    E      c3
3    M      c2, c3

In the above table, Courses is a multi-valued attribute, so the table is not in 1NF.

The table below is in 1NF, as there is no multi-valued attribute:

ID   Name   Course
--------------------
1    A      c1
1    A      c2
2    E      c3
3    M      c2
3    M      c3
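
The conversion in Example 1 can be sketched in Python as follows. The function name to_1nf is an assumption, and the sketch assumes that the multi-valued cell is stored as a comma-separated string, as in the first table. It keeps the original column name Courses rather than renaming it to Course.

```python
def to_1nf(rows, column):
    """Split a comma-separated column so that every cell holds a single value."""
    flat = []
    for row in rows:
        for value in row[column].split(","):
            new_row = dict(row)              # copy the other columns unchanged
            new_row[column] = value.strip()  # one value per row
            flat.append(new_row)
    return flat


students = [
    {"ID": 1, "Name": "A", "Courses": "c1, c2"},
    {"ID": 2, "Name": "E", "Courses": "c3"},
    {"ID": 3, "Name": "M", "Courses": "c2, c3"},
]

flat_students = to_1nf(students, "Courses")
```

The three original rows become five rows, one per (student, course) pair, matching the second table.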

2NF (Second Normal Form) Rules


• Rule 1- Be in 1NF
• Rule 2- No partial dependencies (a table whose candidate keys are all
single columns satisfies this automatically)

It is clear that we can't move our simple database into Second Normal Form
unless we partition the table above.

To be in second normal form, a relation must be in first normal form and
must not contain any partial dependency. That is, no non-prime attribute
(an attribute that is not part of any candidate key) may depend on a proper
subset of any candidate key of the table.
Example 1 – Consider Table 3 below.

STUD_NO   COURSE_NO   COURSE_FEE
1         C1          1000
2         C2          1500
1         C4          2000
4         C3          1000
4         C1          1000
2         C5          2000
Partial Dependency – If a proper subset of a candidate key determines a
non-prime attribute, it is called a partial dependency.

Note that there are many courses having the same course fee.

Here,
COURSE_FEE alone cannot decide the value of COURSE_NO or STUD_NO;
COURSE_FEE together with STUD_NO cannot decide the value of
COURSE_NO;
COURSE_FEE together with COURSE_NO cannot decide the value of
STUD_NO;
Hence,
COURSE_FEE is a non-prime attribute, as it does not belong to the
only candidate key {STUD_NO, COURSE_NO};
But COURSE_NO -> COURSE_FEE, i.e., COURSE_FEE is dependent on
COURSE_NO, which is a proper subset of the candidate key. The non-prime
attribute COURSE_FEE is dependent on a proper subset of the candidate
key, which is a partial dependency, and so this relation is not in 2NF.

To convert the above relation to 2NF, we need to split the table into two tables:

Table 1: STUD_NO, COURSE_NO
Table 2: COURSE_NO, COURSE_FEE

Table 1                     Table 2
STUD_NO   COURSE_NO         COURSE_NO   COURSE_FEE
1         C1                C1          1000
2         C2                C2          1500
1         C4                C3          1000
4         C3                C4          2000
4         C1                C5          2000
2         C5
NOTE: 2NF tries to reduce the redundant data getting stored in memory.
For instance, if there are 100 students taking the C1 course, we don't need to
store its fee as 1000 in all 100 records; instead we can store it once in
the second table, as the course fee for C1 is 1000.
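
The split can be sketched in Python using the rows of the example table. The variable names are assumptions chosen for illustration. The sketch also joins the two tables back together to show that no information is lost.

```python
# Rows of the example table: (STUD_NO, COURSE_NO, COURSE_FEE).
enrollment = [
    (1, "C1", 1000), (2, "C2", 1500), (1, "C4", 2000),
    (4, "C3", 1000), (4, "C1", 1000), (2, "C5", 2000),
]

# Table 1: which student takes which course (key: STUD_NO, COURSE_NO).
student_course = sorted({(stud, course) for stud, course, _ in enrollment})

# Table 2: one fee per course (key: COURSE_NO).
course_fee = {}
for _, course, fee in enrollment:
    # COURSE_NO -> COURSE_FEE must hold, otherwise the split is not valid.
    if course_fee.setdefault(course, fee) != fee:
        raise ValueError(f"conflicting fees for {course}")

# Joining the two tables on COURSE_NO rebuilds the original rows.
rejoined = sorted(
    (stud, course, course_fee[course]) for stud, course in student_course
)
```

The fee for C1 now appears once in course_fee instead of once per enrolled student, and rejoined contains the same rows as the original table.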

3NF (Third Normal Form) Rules


• Rule 1- Be in 2NF
• Rule 2- Has no transitive functional dependencies
• A relation is in third normal form if it is in second normal form and
has no transitive dependency for non-prime attributes.
• A relation is in 3NF if at least one of the following conditions holds for
every non-trivial functional dependency X -> Y:
   X is a super key.
   Y is a prime attribute (each element of Y is part of some candidate key).

To move our 2NF table into 3NF, we again need to divide our table.

3NF Example

[Figure: 3NF example tables (image5)]

What are transitive functional dependencies?


A transitive functional dependency exists when changing a non-key column
might cause any of the other non-key columns to change.

Consider Table 1. Changing the non-key column Full Name may change
Salutation.
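
A transitive dependency can be traced by computing an attribute closure, that is, the set of all columns that a given set of columns determines. The Python sketch below is only an illustration: Membership_ID is a hypothetical key column, and the two dependencies are assumptions based on the Full Name and Salutation remark above.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# Assumed dependencies: the key determines Full_Name, and Full_Name
# (a non-key column) determines Salutation.
fds = [
    ({"Membership_ID"}, {"Full_Name"}),
    ({"Full_Name"}, {"Salutation"}),
]

key_closure = closure({"Membership_ID"}, fds)
name_closure = closure({"Full_Name"}, fds)
```

Salutation is reached from Membership_ID only through Full_Name, which is not a key. That chain is the transitive dependency that 3NF removes by moving Full_Name and Salutation into their own table.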

Boyce-Codd Normal Form (BCNF)


Even when a database is in 3rd Normal Form, anomalies can still result
if it has more than one candidate key. A relation is in BCNF if, for every
non-trivial functional dependency X -> Y, X is a super key.

BCNF is sometimes also referred to as 3.5 Normal Form.

Key Points –
1. BCNF is free from redundancy caused by functional dependencies.
2. If a relation is in BCNF, then 3NF is also satisfied.
3. If all attributes of a relation are prime attributes, then the relation is always in 3NF.
4. A relation in a relational database is always at least in 1NF.
5. Every binary relation (a relation with only 2 attributes) is always in BCNF.
6. If a relation has only singleton candidate keys (i.e., every candidate key consists of
only 1 attribute), then the relation is always in 2NF (because no partial functional
dependency is possible).
7. Sometimes decomposing into BCNF may not preserve all functional dependencies. In that
case, go for BCNF only if the lost FD(s) are not required; otherwise normalize only up to 3NF.
8. There are more normal forms beyond BCNF, such as 4NF and higher, but
in real-world database systems it is generally not required to go beyond BCNF.

4NF (Fourth Normal Form) Rules


If no database table instance contains two or more independent,
multivalued data items describing the relevant entity, then it is in 4th Normal
Form.

5NF (Fifth Normal Form) Rules


A table is in 5th Normal Form only if it is in 4NF and it cannot be
decomposed into any number of smaller tables without loss of data.

6NF (Sixth Normal Form) Proposed


6th Normal Form is not yet standardized; however, it has been discussed by
database experts for some time. Hopefully, we will have a clear and
standardized definition for 6th Normal Form in the near future.

That's all to Normalization!!!

• Lossless Join and Dependency Preserving Decomposition

Decomposition of a relation is done when a relation in the relational model is not in an
appropriate normal form. Relation R is decomposed into two or more relations such that the
decomposition is lossless join as well as dependency preserving.

Lossless Join Decomposition

If we decompose a relation R into relations R1 and R2,

• Decomposition is lossy if R1 ⋈ R2 ⊃ R

• Decomposition is lossless if R1 ⋈ R2 = R

To check for lossless join decomposition using the FD set, the following conditions must hold:

1. The union of the attributes of R1 and R2 must be equal to the attributes of R. Each
attribute of R must be either in R1 or in R2.

Att(R1) U Att(R2) = Att(R)

2. The intersection of the attributes of R1 and R2 must not be empty.

Att(R1) ∩ Att(R2) ≠ Φ

3. The common attributes must be a key for at least one relation (R1 or R2).

Att(R1) ∩ Att(R2) -> Att(R1) or Att(R1) ∩ Att(R2) -> Att(R2)

For example, a relation R(A, B, C, D) with FD set {A->BC} is decomposed into R1(ABC) and
R2(AD), which is a lossless join decomposition as:

1. The first condition holds true, as Att(R1) U Att(R2) = (ABC) U (AD) = (ABCD) = Att(R).

2. The second condition holds true, as Att(R1) ∩ Att(R2) = (ABC) ∩ (AD) = (A) ≠ Φ.

3. The third condition holds true, as Att(R1) ∩ Att(R2) = A is a key of R1(ABC) because
A->BC is given.
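
The three conditions can be checked mechanically. The following Python sketch is one possible way to do it; the function names closure and is_lossless are assumptions, and attributes are written as single letters so that a string such as "ABC" stands for the set {A, B, C}.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(r, r1, r2, fds):
    """Check the three lossless join conditions for splitting r into r1 and r2."""
    r, r1, r2 = set(r), set(r1), set(r2)
    common = r1 & r2
    # Conditions 1 and 2: the union covers R and the intersection is not empty.
    if r1 | r2 != r or not common:
        return False
    # Condition 3: the common attributes determine all of R1 or all of R2.
    reach = closure(common, fds)
    return r1 <= reach or r2 <= reach


fds = [("A", "BC")]
good_split = is_lossless("ABCD", "ABC", "AD", fds)
bad_split = is_lossless("ABCD", "BCD", "AD", fds)
```

For the example above, good_split is True. The second split shares only D, which is not a key of either part, so bad_split is False.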

Dependency Preserving Decomposition

If we decompose a relation R into relations R1 and R2, all dependencies of R must either be
a part of R1 or R2 or be derivable from the combination of the FDs of R1 and R2.
For example, a relation R(A, B, C, D) with FD set {A->BC} is decomposed into R1(ABC) and
R2(AD), which is dependency preserving because the FD A->BC is a part of R1(ABC).
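
A common way to test this without listing every derived dependency is to grow the left side of each FD using only what each decomposed relation can see. The Python sketch below follows that idea; the function names are assumptions, and the second test case (A->B, B->C split into AB and AC) is an added illustration, not part of the original example.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def preserves_dependencies(parts, fds):
    """Return True if every FD can be enforced using the parts alone."""
    for lhs, rhs in fds:
        reach = set(lhs)
        changed = True
        while changed:
            changed = False
            for part in parts:
                part = set(part)
                # What this part alone lets us derive from what we already have.
                gained = closure(reach & part, fds) & part
                if not gained <= reach:
                    reach |= gained
                    changed = True
        if not set(rhs) <= reach:
            return False
    return True


preserved = preserves_dependencies(["ABC", "AD"], [("A", "BC")])
not_preserved = preserves_dependencies(["AB", "AC"], [("A", "B"), ("B", "C")])
```

In the first case A->BC lies entirely inside R1(ABC), so it is preserved. In the second case B and C never appear together in one part, so B->C cannot be checked without a join.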

Summary
• Database design is critical to the successful implementation of a
database management system that meets the data requirements of
an enterprise system.
• Normalization helps produce database systems that are cost-effective
and have better security models.
• Functional dependencies are a very important component of the
data normalization process.
• Most database systems are normalized up to the third
normal form.
• A primary key uniquely identifies a record in a table and cannot be
null.
• A foreign key helps connect tables and references a primary key.

Important points for solving the above type of question.


1) It is always a good idea to start checking from BCNF, then 3NF, and so on.
2) If a functional dependency satisfies a normal form, then there is no need to check it for lower
normal forms. For example, ABC -> D is in BCNF (note that ABC is a superkey), so there is no need to
check this dependency for lower normal forms.
• Candidate keys in the given relation are {ABC, BCD}.
• BCNF: ABC -> D is in BCNF. Let us check CD -> AE. CD is not a super key, so this
dependency is not in BCNF. So, R is not in BCNF.
• 3NF: We don't need to check ABC -> D, as it already satisfies BCNF.
Let us consider CD -> AE. Since E is not a prime attribute, the relation is not in 3NF.
• 2NF: In 2NF, we need to check for partial dependency. CD is a proper subset of a
candidate key, and it determines E, which is a non-prime attribute. So, the given relation is also
not in 2NF. So, the highest normal form is 1NF.
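
The checks above can be reproduced with a short Python sketch. The question itself is not reproduced in this text, so the relation R(A, B, C, D, E) with FDs ABC -> D and CD -> AE is an assumption reconstructed from the solution steps. The function names are also assumptions, and the brute-force key search is only practical for small relations.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Brute-force search for minimal attribute sets that determine everything."""
    attributes = set(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            combo = set(combo)
            if any(key <= combo for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == attributes:
                keys.append(combo)
    return keys


def highest_normal_form(attributes, fds):
    """Return "BCNF", "3NF", "2NF" or "1NF", checking from the top down."""
    attributes = set(attributes)
    keys = candidate_keys(attributes, fds)
    prime = set().union(*keys)
    non_prime = attributes - prime

    def is_superkey(attrs):
        return closure(attrs, fds) == attributes

    # Keep only the non-trivial part of each dependency.
    checks = [(set(l), set(r) - set(l)) for l, r in fds]
    checks = [(l, r) for l, r in checks if r]
    if all(is_superkey(l) for l, _ in checks):
        return "BCNF"
    if all(is_superkey(l) or r <= prime for l, r in checks):
        return "3NF"
    # Partial dependency: a proper subset of a key determines a non-prime attribute.
    partial = any(
        closure(subset, fds) & non_prime
        for key in keys
        for size in range(1, len(key))
        for subset in combinations(sorted(key), size)
    )
    return "1NF" if partial else "2NF"


fds = [("ABC", "D"), ("CD", "AE")]
keys = candidate_keys("ABCDE", fds)
result = highest_normal_form("ABCDE", fds)
```

Under these assumptions the sketch finds the candidate keys ABC and BCD and reports 1NF as the highest normal form, in line with the steps above.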
