
Chapter 6
Normalization of Database Tables
CSC 3326
Learning Objectives
• After completing this chapter, you will be able to:
• Explain normalization and its role in the database design process
• Identify and describe each of the normal forms: 1NF, 2NF, 3NF, BCNF, and 4NF
• Explain how tables can be transformed from lower normal forms to higher normal
forms
• Apply normalization rules to evaluate and correct table structures
• Use a data-modeling checklist to check that the ERD meets a set of minimum
requirements
Introduction
• Good database design must be matched to good table structures.

• Having good relational database software is not enough to avoid data redundancy.

• The table is the basic building block of database design.

• Ideally, the database design process yields good table structures.


• Yet, it is possible to create poor table structures that contain data redundancy.

• How do you recognize a poor table structure, and how do you produce a good one?

• Normalization is a process for evaluating and correcting table structures to minimize
data redundancies, thereby reducing the likelihood of data anomalies (update,
insertion, and deletion).
• The normalization process involves assigning attributes to tables based on the
concepts of determination and functional dependency.
Introduction
• Normalization works through a series of stages called normal forms:
• The first three stages are:
 First normal form (1NF)
 Second normal form (2NF)
 Third normal form (3NF)

• From a structural point of view, higher normal forms are better than lower normal forms.
Need for Normalization
• Database designers commonly use normalization in two situations:
1) When designing a new database structure based on the business requirements of
the end users.
o After the initial design is complete, the designer can use normalization to analyze
the relationships among the attributes within each entity and determine if the
structure can be improved through normalization.

2) When modifying existing data structures that can be in the form of flat files,
spreadsheets, or older database structures.
o Can use the normalization process to improve the existing data structure and
create an appropriate database design.
• Example: a construction company that manages several building projects produces a
sample periodic report in which the data is organized around projects.
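To make that structure concrete, the sketch below shows, in Python rather than a printed report, how such project-centric data might look; the attribute names and values are illustrative assumptions, not the report from the slides. Note that the employee details form a repeating group inside each project entry.

```python
# A minimal sketch (hypothetical attribute names and values) of a
# project-centric report structure with a repeating group of employees.
report = [
    {
        "PROJ_NUM": 15,
        "PROJ_NAME": "Evergreen",
        "EMPLOYEES": [   # repeating group: several employee entries per project
            {"EMP_NUM": 103, "EMP_NAME": "A. Smith",
             "JOB_CLASS": "Elect. Engineer", "CHG_HOUR": 84.50, "HOURS": 23.8},
            {"EMP_NUM": 101, "EMP_NAME": "B. Jones",
             "JOB_CLASS": "Database Designer", "CHG_HOUR": 105.00, "HOURS": 19.4},
        ],
    },
]
```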
The Normalization Process
• Normalization is used to produce a set of normalized relations
(tables) that will be used to generate the required information.
• In normalization terminology:
 Any attribute that is part of a candidate key is known as
a prime attribute instead of the more common term key attribute.
 A nonprime attribute, or a nonkey attribute, is not part of any
candidate key.
The Normalization Process
• The objective of normalization is to ensure that each table conforms to the
concept of well-formed tables:
• Each table represents a single subject
• Each row/column intersection contains only one value and not a group of values
• No data item will be unnecessarily stored in more than one table (tables have
minimum controlled redundancy), which ensures data is updated in only one place.
• All nonprime attributes in a table are dependent on the primary key, the
entire primary key (in case of composite key) and nothing but the primary
key.
• Each table has no insertion, update, or deletion anomalies
The Normalization Process

• The concepts of keys and functional dependencies are central to the normalization process.
• From the data modeler’s point of view, the objective of normalization is to
ensure that all tables are at least in 3NF.
• Normal forms such as the fifth normal form (5NF) and domain-key normal
form (DKNF) are not likely to be encountered in a business environment and
are mainly of theoretical interest.
=> Example (from the figure): STU_NUM is the determinant and STU_LNAME is the dependent.
=> The dependency shown is a functional dependency but not a full functional dependency.
The Normalization Process
• The normalization process works one relation at a time
 Identifies the dependencies of a relation (table)
 Progressively breaks the relation up into a new set of relations
• Two types of functional dependencies are of special interest in normalization.
• Assumption: one candidate key = primary key
 A partial dependency (applicable only to composite keys): exists when there is a
functional dependence in which the determinant is only part of the primary key.
If (A, B) → (C, D), B → C, and (A, B) is the primary key, then the functional dependence B → C is a
partial dependency because only part of the primary key (B) is needed to determine the value of C.
 Transitive dependency: attribute is dependent on another attribute that is not part of the
primary key
X → Y, Y → Z, and X is the primary key.
• Transitive dependencies are more difficult to identify among a set of data.
• They occur only when a functional dependence exists among nonprime attributes.
The normalization process takes a table through a series of steps that lead to successively higher normal forms.
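The following sketch is a minimal illustration under the slide's assumption that a single candidate key serves as the primary key: it classifies an already-listed functional dependency by looking only at its determinant, using the attribute names A, B, C, D from the example above.

```python
# A minimal sketch, assuming one candidate key = primary key and functional
# dependencies already identified and listed explicitly.
def classify_dependency(determinant, primary_key):
    """Classify a functional dependency by inspecting its determinant."""
    det, pk = set(determinant), set(primary_key)
    if det == pk:
        return "dependency on the entire primary key"
    if det < pk:                 # determinant is only part of a composite key
        return "partial dependency"
    if not det & pk:             # determinant is made up of nonprime attributes
        return "candidate transitive dependency"
    return "other"

pk = ("A", "B")
fds = {("A", "B"): ("C", "D"), ("B",): ("C",), ("C",): ("D",)}
for det, dependents in fds.items():
    print(det, "->", dependents, ":", classify_dependency(det, pk))
# ('A', 'B') -> entire key; ('B',) -> partial; ('C',) -> candidate transitive
```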
Conversion to First Normal Form (1NF)
• 1NF deals with the repeating groups and ensures that the table conforms to the
requirements for a relational table.
• A repeating group is a set of one or more multivalued attributes that are related.
• If repeating groups do exist, they must be eliminated by making sure that each row
defines a single entity instance and that each row-column intersection has only a
single value.
• Three step process:
Step 1) Eliminate the Repeating Groups: Start by presenting the data in a tabular
format, where each cell has a single value and there are no repeating groups.
Step 2) Identify the Primary Key.

Step 3) Identify all Dependencies: anomalies may still exist because there are
dependencies other than the primary key dependency.
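As a hypothetical illustration of Step 1, the sketch below flattens a nested, project-centric structure into one row per project-employee combination so that every row-column intersection holds a single value; the attribute names and values are assumptions, not the slides' data.

```python
# A minimal sketch of eliminating a repeating group (Step 1 of the 1NF conversion).
nested = {
    (15, "Evergreen"):  [(103, "Arbough", 23.8), (101, "News", 19.4)],
    (18, "Amber Wave"): [(105, "Johnson", 12.6)],
}

flat_rows = [
    {"PROJ_NUM": p_num, "PROJ_NAME": p_name,
     "EMP_NUM": e_num, "EMP_LNAME": e_lname, "HOURS": hours}
    for (p_num, p_name), employees in nested.items()
    for (e_num, e_lname, hours) in employees
]
# Step 2 would then identify the composite primary key (PROJ_NUM, EMP_NUM);
# Step 3 would note the remaining dependencies, e.g. PROJ_NUM -> PROJ_NAME.
```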
Conversion to First Normal Form (1NF)

• Dependency diagram: depicts all dependencies found within a given table structure
• Helps to get an overview of all relationships among the table's attributes
• Makes it less likely that an important dependency will be overlooked
1. The primary key attributes are bold, underlined, and in a different color.
2. The arrows above the attributes indicate all desirable dependencies, that is,
dependencies based on the primary key.
3. The arrows below the dependency diagram indicate less desirable dependencies.
Conversion to First Normal Form (1NF)

• 1NF describes a tabular format in which:
• All key attributes are defined
• There are no repeating groups in the table
• All attributes are dependent on the primary key

• All relational tables satisfy 1NF requirements
• However, a 1NF table may still contain partial and transitive dependencies
• These dependencies are a source of update, insertion, and deletion anomalies
Conversion to Second Normal Form (2NF)

• Conversion to 2NF applies only when a 1NF table has a composite primary key. If a 1NF table has a single-attribute
primary key, it is automatically in 2NF (under the assumption that one candidate key = PK).
• Step 1: Make new tables to eliminate partial dependencies
• For each component of the primary key that acts as a determinant in a partial dependency, create a new table
with a copy of that component as the primary key.
• The determinants must remain in the original table because they will be the foreign keys for the relationships
needed to relate these new tables to the original table.
• Step 2: Reassign corresponding dependent attributes:
• Use the dependency diagram in 1NF to determine attributes that are dependent in the partial dependencies
• The attributes that are dependent in a partial dependency are removed from the original table and placed in
the new table with the dependency’s determinant.
• Table is in 2NF when it:

Is in 1NF
and
Includes no partial dependencies
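A minimal sketch of this two-step conversion is shown below, continuing the flat project-employee rows used earlier; the attribute names and the assumed partial dependencies (PROJ_NUM -> PROJ_NAME and EMP_NUM -> EMP_LNAME) are illustrative, not taken from the slides.

```python
# A minimal sketch of a 2NF decomposition; composite PK is (PROJ_NUM, EMP_NUM).
flat_rows = [
    {"PROJ_NUM": 15, "PROJ_NAME": "Evergreen", "EMP_NUM": 103, "EMP_LNAME": "Arbough", "HOURS": 23.8},
    {"PROJ_NUM": 15, "PROJ_NAME": "Evergreen", "EMP_NUM": 101, "EMP_LNAME": "News",    "HOURS": 19.4},
]

# Step 1: each partial-dependency determinant becomes the PK of a new table.
# Step 2: the dependent attributes move to the new table with their determinant.
project  = {r["PROJ_NUM"]: {"PROJ_NAME": r["PROJ_NAME"]} for r in flat_rows}
employee = {r["EMP_NUM"]:  {"EMP_LNAME": r["EMP_LNAME"]} for r in flat_rows}

# The original table keeps the composite key (now also acting as foreign keys)
# plus the attribute that depends on the entire key.
assignment = [
    {"PROJ_NUM": r["PROJ_NUM"], "EMP_NUM": r["EMP_NUM"], "HOURS": r["HOURS"]}
    for r in flat_rows
]
```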
Conversion to Third Normal Form (3NF)
• Step 1: Make new tables to eliminate transitive dependencies

• For every transitive dependency, write a copy of its determinant as a primary key for a new table.

• The determinant must remain in the original table because it will be the foreign key for the relationship
needed to relate this new table to the original table.
• Step 2: Reassign corresponding dependent attributes

• Identify the attributes that are dependent on each determinant identified in Step 1 and move them to the new table with that determinant.

• A table is in third normal form (3NF) when:

It is in 2NF.
and
It contains no transitive dependencies.
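The sketch below gives a hypothetical 3NF conversion of an EMPLOYEE table in which CHG_HOUR depends on the nonprime attribute JOB_CLASS; the attribute names and values are assumptions used only to illustrate the two steps.

```python
# A minimal sketch of a 3NF decomposition. In this 2NF EMPLOYEE table,
# EMP_NUM -> JOB_CLASS and JOB_CLASS -> CHG_HOUR, so CHG_HOUR depends on the
# key only transitively.
employee_2nf = [
    {"EMP_NUM": 103, "EMP_LNAME": "Arbough", "JOB_CLASS": "Elect. Engineer",   "CHG_HOUR": 84.50},
    {"EMP_NUM": 101, "EMP_LNAME": "News",    "JOB_CLASS": "Database Designer", "CHG_HOUR": 105.00},
]

# Step 1: the transitive determinant (JOB_CLASS) becomes the PK of a new table.
job = {r["JOB_CLASS"]: {"CHG_HOUR": r["CHG_HOUR"]} for r in employee_2nf}

# Step 2: the dependent attribute moves with it; JOB_CLASS stays behind in
# EMPLOYEE as the foreign key to the new JOB table.
employee = [
    {"EMP_NUM": r["EMP_NUM"], "EMP_LNAME": r["EMP_LNAME"], "JOB_CLASS": r["JOB_CLASS"]}
    for r in employee_2nf
]
```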
Improving the design
• After cleaning up the partial and transitive dependencies (3NF), the focus shifts to improving the database's
ability to provide information and to enhancing its operational characteristics.

• Normalization is valuable because it helps eliminate data redundancies; however, several other issues
must still be addressed to produce a good set of normalized tables.

1) Evaluate PK assignments
Evaluate PK against the PK characteristics

Consider the JOB_CLASS primary key: it has too much descriptive content to be usable as a key and
creates a risk of referential integrity violations.
Therefore, it is better to add a JOB_CODE attribute (a surrogate key) to create a unique
identifier.
A surrogate key is an artificial PK introduced by the designer to simplify the
assignment of primary keys to tables.
Surrogate keys are usually numeric and are often generated automatically by the DBMS.
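A minimal sketch of the idea, with hypothetical starting value and job data, is shown below: the numeric JOB_CODE is generated automatically and becomes the identifier other tables reference, so the descriptive text can change freely.

```python
# A minimal sketch of introducing a numeric surrogate key for the JOB table.
from itertools import count

next_code = count(start=500)         # hypothetical auto-generated key values
job = {next(next_code): {"JOB_DESCRIPTION": desc, "JOB_CHG_HOUR": rate}
       for desc, rate in [("Elect. Engineer", 84.50),
                          ("Database Designer", 105.00)]}
# EMPLOYEE rows would store the numeric JOB_CODE as a foreign key instead of
# repeating the descriptive text.
```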
Improving the design
2) Naming conventions
• Entity name:
• Be descriptive of the objects in the business environment
• Use terminology that is familiar to the users

• Attribute name:
• Required to be descriptive of the data represented by the attribute
• A good practice to prefix the name of an attribute with the name or abbreviation of the
entity in which it occurs: CUSTOMER/CUS_CREDIT_NUMBER
=> CHG_HOUR will be changed to JOB_CHG_HOUR to indicate its association with the JOB
table.
=> Attribute name JOB_CLASS does not quite describe entries such as Systems Analyst,
Database Designer, and so on; the label JOB_DESCRIPTION is used.
Improving the design
3) Refine attribute atomicity
• An atomic attribute is an attribute that cannot be further subdivided to produce meaningful
components. For example, a person’s last name attribute cannot be meaningfully subdivided.
• By improving the degree of atomicity, querying flexibility is gained.
• In general, designers prefer to use simple, single-valued attributes, as indicated by the business rules
and processing requirements.
=> EMP_NAME in the EMPLOYEE table is not atomic because EMP_NAME can be decomposed into a last
name, a first name, and an initial.
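As a hypothetical illustration (assuming names are stored in the form "First I. Last"), the helper below splits EMP_NAME into atomic components so that queries can sort or search on the last name alone.

```python
# A minimal sketch of refining attribute atomicity for EMP_NAME.
def split_name(emp_name: str) -> dict:
    first, initial, last = emp_name.split(" ", 2)
    return {"EMP_FNAME": first,
            "EMP_INITIAL": initial.rstrip("."),
            "EMP_LNAME": last}

print(split_name("June E. Arbough"))
# -> {'EMP_FNAME': 'June', 'EMP_INITIAL': 'E', 'EMP_LNAME': 'Arbough'}
```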

4) Identify New Attributes


• Several other attributes would have to be added.
=> An employee hire date attribute (EMP_HIREDATE) could be used to track an employee’s job longevity.

5) Identify New Relationships


=> EMPLOYEE and PROJECT can be related through a manages relationship.
Improving the design
6) Refine primary keys as required for data granularity
• Granularity: Level of detail represented by the values stored in a table’s row
• Changing granularity requirements might dictate changes in primary key selection.
=> Does ASSIGN_HOURS represent the daily total, weekly total, monthly total, or yearly total?
=> Using a surrogate primary key such as ASSIGN_NUM provides lower granularity and yields greater flexibility.

7) Maintain Historical Accuracy


 Writing the job charge per hour into the ASSIGNMENT table is crucial to maintaining the historical accuracy of the
table’s data.

8) Evaluate Using Derived Attributes


 Use a derived attribute in the ASSIGNMENT table to store the actual charge made to a project.
 The derived attribute, named ASSIGN_CHARGE, is the result of multiplying ASSIGN_HOURS by
ASSIGN_CHG_HOUR.
 Storing the derived attribute in the table makes it easy to write the application software to produce the desired
results.
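The short sketch below, with hypothetical attribute names and values, ties points 7 and 8 together: the charge per hour is copied into the ASSIGNMENT row when the work is recorded, and ASSIGN_CHARGE is derived from it once.

```python
# A minimal sketch of a stored derived attribute in an ASSIGNMENT row.
assign_hours = 3.5
assign_chg_hour = 84.50              # copied from JOB at assignment time (point 7)
assignment_row = {
    "ASSIGN_HOURS": assign_hours,
    "ASSIGN_CHG_HOUR": assign_chg_hour,
    "ASSIGN_CHARGE": round(assign_hours * assign_chg_hour, 2),   # 295.75 (point 8)
}
```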
Multiple candidate keys
• The concept of keys is central to the normalization process.
• A candidate key has the same characteristics as a primary key, but for some reason, it was not chosen to be the
primary key.
• Normalization rules focus on candidate keys, not just the primary key.
• The previous normalization process (1NF, 2NF, 3NF) should be generalized for multiple candidate keys (instead of a single
candidate key = PK)
• 1NF
 Identify all candidate keys
 Make sure that all non-prime attributes are determined by all candidate keys.

• 2NF
 Partial dependencies should be identified for all candidate keys

• 3NF
 The remaining non-prime attributes should not have transitive dependencies
The CLASS table has two candidate keys:
•CLASS_CODE
•CRS_CODE + CLASS_SECTION

The table is in 1NF because the key attributes are defined and all nonkey attributes are
determined by both candidate keys.

The table is in 2NF because it is in 1NF and there are no partial dependencies on either
candidate key.
Finally, the table is in 3NF because there are no transitive dependencies.
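The sketch below uses hypothetical CLASS rows to show both candidate keys at work; it simply checks that each candidate key value is unique across the sample rows, which is the uniqueness property the reasoning above relies on.

```python
# A minimal sketch (hypothetical rows) of a CLASS table with two candidate keys:
# CLASS_CODE and (CRS_CODE, CLASS_SECTION).
class_rows = [
    {"CLASS_CODE": 10012, "CRS_CODE": "SALE-470", "CLASS_SECTION": 1,
     "CLASS_TIME": "MWF 8:00-8:50", "PROF_NUM": 228},
    {"CLASS_CODE": 10013, "CRS_CODE": "SALE-470", "CLASS_SECTION": 2,
     "CLASS_TIME": "TTh 6:00-7:15", "PROF_NUM": 114},
]

# Uniqueness check for both candidate keys on the sample rows.
assert len({r["CLASS_CODE"] for r in class_rows}) == len(class_rows)
assert len({(r["CRS_CODE"], r["CLASS_SECTION"]) for r in class_rows}) == len(class_rows)
```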
Normalization and Database Design

• Normalization should be part of the design process
• The proposed entities should meet the required normal form before the table structures are
created
• If the designer follows the design procedures, the likelihood of data anomalies will be small
• Even the best database designers are known to make occasional mistakes that come to
light during normalization checks.
• Designer should be aware of good design principles and procedures as well as normalization
procedures.
1) ERD is created through an iterative process => Macro view of an organization’s data
requirements and operations.
2) Normalization focuses on the characteristics of specific entities=> normalization represents
a micro view of the entities within the ERD.
Normalization and Database Design
(Example)
• Business rules for the contracting company:
• The company manages many projects.
• Each project requires the services of many employees.
• An employee may be assigned to several different projects.
• Some employees are not assigned to a project and perform duties not specifically related
to a project.
• Each employee has a single primary job classification, which determines the hourly billing
rate.
• Many employees can have the same job classification. For example, the company employs
more than one electrical engineer.
EMPLOYEE contains a transitive dependency.

The removal of EMPLOYEE’s transitive dependency yields three entities:


PROJECT (PROJ_NUM, PROJ_NAME)
EMPLOYEE (EMP_NUM, EMP_LNAME, EMP_FNAME, EMP_INITIAL, JOB_CODE)
JOB (JOB_CODE, JOB_DESCRIPTION, JOB_CHG_HOUR)
Data-Modeling Checklist
• Designer should go over this checklist to ensure that all modeling tasks were successfully done.
• Business rules
• Properly document and verify all business rules with the end users
• Ensure that all business rules are written precisely, clearly, and simply
• The business rules must help identify entities, attributes, relationships, and constraints
• Identify the source of all business rules, and ensure that each business rule is justified, dated, and signed off by an
approving authority

• Data modeling
• Naming conventions: all names should be limited in length

• Entity names:
• Should be nouns that are familiar to the business and should be short and meaningful
• Should document abbreviations, synonyms, and aliases for each entity
• Should be unique within the model
• For composite entities, may include a combination of abbreviated names of the entities linked through the
composite entity
Data-Modeling Checklist

• Attribute names:
• Should be unique within the entity
• Should use the entity abbreviation as a prefix
• Should be descriptive of the characteristic
• Should use suffixes such as _ID, _NUM, or _CODE for the PK attribute
• Should not be a reserved word
• Should not contain spaces or special characters such as @, !, or &

• Relationship names:
• Should be active or passive verbs that clearly indicate the nature of the
relationship
Data-Modeling Checklist
• Entities:
• Each entity should represent a single subject
• Each entity should represent a set of distinguishable entity instances
• All entities should be in 3NF or higher
• Granularity of the entity instance should be clearly defined
• PK should be clearly defined and support the selected data granularity
Data-Modeling Checklist
• Attributes:
• Should be simple and single-valued (atomic data)
• Should document default values, constraints, synonyms, and aliases
• Derived attributes should be clearly identified and include source(s)
• Should not be redundant unless this is required for transaction accuracy,
performance, or maintaining a history
• Nonkey attributes must be fully dependent on the PK attribute

• Relationships:
• Should clearly identify relationship participants
• Should clearly define participation and connectivity, and document cardinality
Data-Modeling Checklist
• ER model:
• Should be validated against expected processes: inserts, updates,
and deletions
• Should evaluate where, when, and how to maintain a history
• Should minimize data redundancy to ensure single-place updates
• Should conform to the minimal data rule: All that is needed is there,
and all that is there is needed
