
Unit-II Design and Normalization

Uploaded by

sahayabrainyjad
Copyright © All Rights Reserved

DESIGN AND NORMALIZATION

Entity-Relationship model – E-R Diagrams – Enhanced-ER Model –
ER-to-Relational Mapping – Functional Dependencies – Non-loss
Decomposition – First, Second, Third Normal Forms – Dependency
Preservation – Boyce/Codd Normal Form – Multi-valued Dependencies
and Fourth Normal Form – Join Dependencies and Fifth Normal Form.
ENTITY-RELATIONSHIP MODEL
• ER model stands for an Entity-Relationship model. It
is a high-level data model. This model is used to define
the data elements and relationship for a specified
system.
• It produces a conceptual design for the database: a simple
view of the data that is easy to design and to understand.
• In ER modeling, the database structure is portrayed as a
diagram called an entity-relationship diagram.
Example:
Entities and Attributes

• Entity: an object that is involved in the enterprise and that can be
distinguished from other objects.
• Can be person, place, event, object, concept in the real world
• Can be physical object or abstraction
• Entity Type: set of similar objects or a category of entities; they
are well defined
• A rectangle represents an entity set
• Ex: students, courses
• We often just say "entity" and mean "entity type"
E-R Model
• An entity type is named and is described by a set of attributes, e.g.
Student: Id, Name, Address, Hobbies
• Domain: possible values of an attribute.
• Note that the value for an attribute can be a set or list of
values, sometimes called "multi-valued" attributes
• This is in contrast to the pure relational model which
requires atomic values. E.g., (111111, John, 123 Main St,
(stamps, coins))
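As a rough illustration of an entity type with a multi-valued attribute, the Student example can be sketched in Python. The class layout and the helper `flatten` are assumptions made for this sketch; only the attribute names and the sample values come from the slide.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Student:
    """One entity of the Student entity type."""
    id: int                      # single-valued attributes
    name: str
    address: str
    hobbies: frozenset = field(default_factory=frozenset)  # multi-valued attribute


# The entity from the example above.
john = Student(111111, "John", "123 Main St", frozenset({"stamps", "coins"}))


def flatten(student):
    """Return one atomic (id, hobby) pair per hobby (hypothetical helper)."""
    return sorted((student.id, hobby) for hobby in student.hobbies)
```

`flatten(john)` yields one row per hobby, which is how a relational table restricted to atomic values would have to store the same information.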
GRAPHICAL REPRESENTATION IN E-R DIAGRAM
• Rectangle -- Entity
• Ellipses -- Attribute (underlined attributes are [part of] the
primary key)
• Double ellipses -- multi-valued attribute
• Dashed ellipses-- derived attribute, e.g. age is derivable from
birthdate and current date.
Relationships
• Relationship: an association that connects two or more entities.
One-to-one relationship
• When only one instance of each entity is associated with the relationship,
it is known as a one-to-one relationship.
• For example, a woman can be married to one man, and a man can be married
to one woman.
One-to-many relationship

• When one instance of the entity on the left can be associated
with more than one instance of the entity on the right, the
relationship is known as a one-to-many relationship.
• For example, a scientist can invent many inventions, but each
invention is made by one specific scientist.
Many-to-one relationship

• When more than one instance of the entity on the left can be
associated with only one instance of the entity on the right,
the relationship is known as a many-to-one relationship.
• For example, a student enrolls in only one course, but a
course can have many students.
Many-to-many relationship

• When more than one instance of the entity on the left can be
associated with more than one instance of the entity on the
right, the relationship is known as a many-to-many relationship.
• For example, an employee can be assigned to many projects, and
a project can have many employees.
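The four cardinalities can be told apart mechanically from a set of (left, right) pairs. The function below is a minimal sketch with hypothetical sample data. It reports only the cardinality that a given set of pairs exhibits; the actual constraint is a design decision about all possible data.

```python
from collections import defaultdict


def cardinality(pairs):
    """Classify a set of (left, right) pairs by the cardinality they exhibit."""
    rights_of = defaultdict(set)
    lefts_of = defaultdict(set)
    for left, right in pairs:
        rights_of[left].add(right)
        lefts_of[right].add(left)
    left_fans_out = any(len(r) > 1 for r in rights_of.values())   # one left, many rights
    right_fans_out = any(len(l) > 1 for l in lefts_of.values())   # one right, many lefts
    if left_fans_out and right_fans_out:
        return "many-to-many"
    if left_fans_out:
        return "one-to-many"
    if right_fans_out:
        return "many-to-one"
    return "one-to-one"


# Hypothetical instances of the four examples above.
married = {("w1", "m1"), ("w2", "m2")}
invents = {("scientist1", "inv1"), ("scientist1", "inv2")}
enrolls = {("student1", "course1"), ("student2", "course1")}
works_on = {("emp1", "proj1"), ("emp1", "proj2"), ("emp2", "proj1")}
```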
ENHANCED ENTITY-RELATIONSHIP MODEL

• The Extended (Enhanced) Entity-Relationship Model is a more complex,
high-level model that extends an E-R diagram to include more types of
abstraction and to express constraints more clearly.
• All of the concepts contained within an E-R diagram are included in the EE-R
model, along with additional concepts that cover more semantic
information.
• These additional concepts include generalization/specialization, union,
inheritance, and subclass/superclass.
• Specialization
• Generalization
• Attribute Inheritance
• Constraints on Generalizations
• Aggregation
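• Specialization
The process of designating subgroupings within an entity set is
called specialization. It is a top-down design process: an initial
entity set is refined into successive levels of entity
subgroupings, making distinctions explicit.
• Generalization
Generalization is the bottom-up design process in which multiple
entity sets are synthesized into a higher-level entity set on
the basis of their common features.
• Attribute inheritance
A crucial property of the higher- and lower-level entities
created by specialization and generalization is attribute
inheritance: the attributes of the higher-level entity set are
inherited by the lower-level entity sets.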
Example of Specialization
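As a rough illustration, specialization with attribute inheritance resembles class inheritance in Python. The entity sets `Person`, `Employee` and `Student` and their attributes are hypothetical, chosen only for this sketch.

```python
class Person:
    """Higher-level entity set (superclass)."""

    def __init__(self, person_id, name):
        self.person_id = person_id
        self.name = name


class Employee(Person):
    """Lower-level entity set: inherits person_id and name, adds salary."""

    def __init__(self, person_id, name, salary):
        super().__init__(person_id, name)
        self.salary = salary


class Student(Person):
    """Lower-level entity set: inherits person_id and name, adds tot_credits."""

    def __init__(self, person_id, name, tot_credits):
        super().__init__(person_id, name)
        self.tot_credits = tot_credits


employee = Employee(1, "Asha", 50000)
student = Student(2, "Ravi", 30)
```

Both lower-level entities carry `person_id` and `name` without declaring them, while `salary` and `tot_credits` stay local to their own subclass.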
ER-TO-RELATIONAL MAPPING

• After designing the ER diagram of a system, we need to convert it to a
relational model, which can be implemented directly in any RDBMS
such as Oracle or MySQL.
• How to convert ER diagram to Relational Model for different
scenarios?
Example of generalization
Constraints on Generalizations
Condition-defined
• In condition-defined lower-level entity sets, membership is evaluated on
the basis of whether or not an entity satisfies an explicit condition.
User-defined
• User-defined lower-level entity sets are not constrained by a membership
condition; rather, the database user assigns entities to a given entity set.
Disjoint
• A disjointness constraint requires that an entity belong to no more
than one lower-level entity set.
Overlapping
• In overlapping generalizations, the same entity may belong to more
than one lower-level entity set within a single generalization.
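The difference between the disjoint and overlapping cases can be checked on sample memberships. This is a minimal sketch: the function name `is_disjoint` and the entity identifiers are assumptions, not part of the slides.

```python
def is_disjoint(lower_level_sets):
    """True if no entity appears in more than one lower-level entity set."""
    seen = set()
    for members in lower_level_sets.values():
        members = set(members)
        if seen & members:
            return False
        seen |= members
    return True


# Hypothetical memberships keyed by lower-level entity set name.
disjoint_case = {"employee": {1, 2}, "student": {3}}
overlapping_case = {"employee": {1, 2}, "student": {2, 3}}  # entity 2 is in both
```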
FUNCTIONAL DEPENDENCIES
• The whole database is described by a single universal relation
schema R = {A1, A2, ..., An}.
Definition:
• A functional dependency, denoted by X → Y, between two sets of
attributes X and Y that are subsets of R specifies a constraint on
the possible tuples that can form a relation state r of R.
• The constraint is that, for any two tuples t1 and t2 in r that have
t1[X] = t2[X], they must also have t1[Y] = t2[Y].
• The values of the Y component of a tuple in r depend
on, or are determined by, the values of the X
component.
• The values of the X component of a tuple uniquely (or
functionally) determine the values of the Y
component.
• We say that there is a functional dependency (FD or f.d.) from
X to Y, or that Y is functionally dependent on X.
• X functionally determines Y in a relation schema R if,
and only if, whenever two tuples of r(R) agree on their
X-value, they must necessarily agree on their Y value.
Consider the relation schema EMP_PROJ in Figure;
from the semantics of the attributes and the relation,
the following functional dependencies should hold:
• Ssn → Ename
• Pnumber → {Pname, Plocation}
• {Ssn, Pnumber} → Hours
These functional dependencies specify that:
• The value of an employee's Social Security number (Ssn)
uniquely determines the employee name (Ename).
• The value of a project's number (Pnumber) uniquely determines
the project name (Pname) and location (Plocation).
• A combination of Ssn and Pnumber values uniquely determines
the number of hours the employee currently works on the project
per week (Hours).
• Alternatively, we say that Ename is functionally determined by
(or functionally dependent on) Ssn.
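The definition translates directly into a check on a relation state: no two tuples may agree on X and differ on Y. The sketch below assumes rows are stored as dictionaries, and the sample EMP_PROJ rows are invented for illustration. A single state can refute a functional dependency, but it cannot prove that the dependency holds for every legal state.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds in this relation state."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first Y value seen for each X value; any later
        # tuple with the same X value must carry the same Y value.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Illustrative EMP_PROJ state (sample values are assumptions).
emp_proj = [
    {"Ssn": "111", "Pnumber": 1, "Hours": 32.5, "Ename": "Smith",
     "Pname": "ProductX", "Plocation": "Bellaire"},
    {"Ssn": "111", "Pnumber": 2, "Hours": 7.5, "Ename": "Smith",
     "Pname": "ProductY", "Plocation": "Sugarland"},
    {"Ssn": "222", "Pnumber": 1, "Hours": 20.0, "Ename": "Wong",
     "Pname": "ProductX", "Plocation": "Bellaire"},
]
```

On this state the three dependencies listed above hold, while `Ssn → Hours` and `Pnumber → Hours` do not.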
NON-LOSS(LOSSLESS-JOIN) DECOMPOSITION

• Let R be a relation schema, and let F be a set of functional
dependencies on R. Let R1 and R2 form a decomposition of R.
• This decomposition is a lossless-join decomposition of R if at
least one of the following functional dependencies is in F+
(the closure of F):
• R1 ∩ R2 → R1
• R1 ∩ R2 → R2
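The test can be carried out with an attribute closure: compute the closure of R1 ∩ R2 under F and see whether it covers R1 or R2. The following is a sketch in which functional dependencies are represented as (left, right) pairs of Python sets, a representation chosen here for convenience. The example reuses the EMP_PROJ dependencies from the previous section.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """Binary test: R1 ∩ R2 -> R1 or R1 ∩ R2 -> R2 must be in F+."""
    common_closure = closure(r1 & r2, fds)
    return r1 <= common_closure or r2 <= common_closure


emp_proj_fds = [
    ({"Ssn"}, {"Ename"}),
    ({"Pnumber"}, {"Pname", "Plocation"}),
    ({"Ssn", "Pnumber"}, {"Hours"}),
]

# Splitting off (Ssn, Ename) is lossless because Ssn -> Ename.
good = is_lossless({"Ssn", "Ename"},
                   {"Ssn", "Pnumber", "Hours", "Pname", "Plocation"},
                   emp_proj_fds)

# Sharing only Plocation is lossy: Plocation determines nothing else.
bad = is_lossless({"Ename", "Plocation"},
                  {"Ssn", "Pnumber", "Hours", "Pname", "Plocation"},
                  emp_proj_fds)
```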
ER DIAGRAM FOR LIBRARY
MANAGEMENT SYSTEM
BANK MANAGEMENT SYSTEM
ER diagram of Bank has the following description :
• The bank has customers.
• Banks are identified by a name, code, and address of main office.
• Banks have branches.
• Branches are identified by a branch_no., branch_name, and address.
• Customers are identified by name, cust-id, phone number, and address.
• A customer can have one or more accounts.
• Accounts are identified by account_no., acc_type, and balance.
• A customer can avail loans.
• Loans are identified by loan_id, loan_type, and amount.
• Accounts and loans are related to the bank's branch.
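Part of this description can be mapped to relational tables as a quick sketch, here with Python's built-in sqlite3 module. The column types, the sample rows, and the assumption that each account belongs to exactly one customer are choices made for this illustration. The description itself only says that a customer can have one or more accounts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (
    cust_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    phone   TEXT,
    address TEXT
);
-- The one-to-many relationship is mapped by a foreign key on the "many" side.
CREATE TABLE account (
    account_no INTEGER PRIMARY KEY,
    acc_type   TEXT NOT NULL,
    balance    REAL NOT NULL,
    cust_id    INTEGER NOT NULL REFERENCES customer(cust_id)
);
""")

# Hypothetical sample data: one customer with two accounts.
conn.execute("INSERT INTO customer VALUES (1, 'Asha', '555-0100', '12 Lake Rd')")
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?, ?)",
    [(101, "savings", 2500.0, 1), (102, "current", 900.0, 1)],
)

accounts_per_customer = conn.execute(
    "SELECT cust_id, COUNT(*) FROM account GROUP BY cust_id"
).fetchall()
```

Branches and loans would follow the same pattern, with a foreign key pointing at the branch table.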
ER DIAGRAM FOR BANK
MANAGEMENT SYSTEM
What is Normalization?
• Normalization is the process of organizing the data in the
database.
• Normalization is used to minimize the redundancy from a
relation or set of relations. It is also used to eliminate
undesirable characteristics like Insertion, Update, and
Deletion Anomalies.
• Normalization divides a larger table into smaller tables and
links them using relationships.
• The normal form is used to reduce redundancy from the
database table.
First Normal form
• It states that the domain of an attribute must include only
atomic (simple, indivisible) values and that the value of any
attribute in a tuple must be a single value from the domain
of that attribute.
• It disallows having a set of values, a tuple of values, or a
combination of both as an attribute value for a single tuple.

A table in 1NF
First technique:

• Remove the attribute Dlocations and place it in
a separate relation DEPT_LOCATIONS, along with the
primary key Dnumber.
• The primary key of this relation is the combination
{Dnumber, Dlocation}.
• A distinct tuple in DEPT_LOCATIONS exists for each location
of a department.
• This decomposes the non-1NF relation into two 1NF
relations.
Second Technique:
• Expand the key so that there is a separate tuple in the
original DEPARTMENT relation for each location of a
department.
• The primary key becomes the combination {Dnumber,
Dlocation}.
• Disadvantage: introducing redundancy in the relation.
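Both techniques can be tried on a small non-1NF DEPARTMENT relation. The rows below are illustrative sample data, and the Python structures are only a sketch of the two resulting layouts.

```python
# Non-1NF DEPARTMENT: Dlocations holds a set of values (illustrative data).
department = [
    {"Dnumber": 5, "Dname": "Research",
     "Dlocations": {"Bellaire", "Sugarland", "Houston"}},
    {"Dnumber": 4, "Dname": "Administration", "Dlocations": {"Stafford"}},
]

# First technique: move the locations into DEPT_LOCATIONS(Dnumber, Dlocation).
dept = [{"Dnumber": d["Dnumber"], "Dname": d["Dname"]} for d in department]
dept_locations = sorted(
    (d["Dnumber"], loc) for d in department for loc in d["Dlocations"]
)

# Second technique: one tuple per location; the key becomes (Dnumber, Dlocation).
expanded = sorted(
    (d["Dnumber"], d["Dname"], loc) for d in department for loc in d["Dlocations"]
)

# Redundancy introduced by the second technique: Dname repeats per location.
research_rows = [row for row in expanded if row[1] == "Research"]
```

In the first layout each department name is stored once. In the second, "Research" is repeated in every one of its location tuples.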
Second Normal Form
• It is based on the concept of full functional dependency.
• A functional dependency X → Y is a full functional
dependency if removal of any attribute A from X means that
the dependency does not hold any more.
• A functional dependency X → Y is a partial dependency if
some attribute A ∈ X can be removed from X and the
dependency still holds.
• In the following figure, {Ssn, Pnumber} → Hours is a full
dependency (neither Ssn → Hours nor Pnumber→Hours
holds).
• However, the dependency {Ssn, Pnumber}→Ename is partial
because Ssn→Ename holds.
• The EMP_PROJ relation is in 1NF but is not in 2NF.
• Label the dependencies FD1: {Ssn, Pnumber} → Hours,
FD2: Ssn → Ename, and FD3: Pnumber → {Pname, Plocation}.
• FD2 and FD3 make Ename, Pname, and Plocation partially
dependent on the primary key {Ssn, Pnumber} of EMP_PROJ.
• The functional dependencies FD1, FD2, and FD3 lead to the
decomposition of EMP_PROJ into the three relation
schemas EP1, EP2, and EP3 shown in the figure, each of which
is in 2NF.
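Partial dependencies on a composite key can be found by taking the closure of each proper subset of the key. The sketch below assumes a single candidate key and represents dependencies as pairs of Python sets. The function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, fds):
    """Map each proper nonempty subset of key to the non-key attributes it determines."""
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = closure(set(subset), fds) - set(key)
            if determined:
                found[subset] = determined
    return found


emp_proj_fds = [
    ({"Ssn", "Pnumber"}, {"Hours"}),          # full dependency
    ({"Ssn"}, {"Ename"}),                     # partial
    ({"Pnumber"}, {"Pname", "Plocation"}),    # partial
]
partial = partial_dependencies({"Ssn", "Pnumber"}, emp_proj_fds)
```

Each entry of `partial` suggests one relation of the 2NF decomposition (Ssn with Ename, Pnumber with Pname and Plocation). Hours appears in no entry, so it stays with the full key.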
Third Normal Form
• It is based on the concept of transitive dependency.
• A functional dependency X → Y in a relation schema R is a
transitive dependency if there exists a set of attributes Z in
R that is neither a candidate key nor a subset of any key of
R, and both X → Z and Z → Y hold.
• The dependency Ssn → Dmgr_ssn is transitive through Dnumber
in EMP_DEPT in the figure, because both Ssn → Dnumber and
Dnumber → Dmgr_ssn hold, and Dnumber is neither a key itself
nor a subset of the key of EMP_DEPT.
• Definition: A relation schema R is in 3NF if it satisfies 2NF and
no nonprime attribute of R is transitively dependent on the
primary key.
• The relation schema EMP_DEPT is in 2NF but not in 3NF
because of the transitive dependency.
• EMP_DEPT is normalized by decomposing it into the two 3NF relation
schemas ED1 and ED2.
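The decomposition can be tried on a small EMP_DEPT state by projecting onto the two schemas and joining them back. This is a sketch: the sample rows are invented, only a subset of the attributes is kept for brevity, and `project` and `natural_join` are hypothetical helpers.

```python
def project(rows, attrs):
    """Project rows onto attrs, removing duplicate tuples."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in sorted(unique)]


def natural_join(left, right):
    """Natural join of two lists of dict rows on their shared attribute names."""
    joined = []
    for l in left:
        for r in right:
            shared = l.keys() & r.keys()
            if all(l[a] == r[a] for a in shared):
                joined.append({**l, **r})
    return joined


# Illustrative EMP_DEPT state with a reduced attribute list.
emp_dept = [
    {"Ssn": "111", "Ename": "Smith", "Dnumber": 5,
     "Dname": "Research", "Dmgr_ssn": "333"},
    {"Ssn": "222", "Ename": "Wong", "Dnumber": 5,
     "Dname": "Research", "Dmgr_ssn": "333"},
    {"Ssn": "444", "Ename": "Zelaya", "Dnumber": 4,
     "Dname": "Administration", "Dmgr_ssn": "555"},
]

ed1 = project(emp_dept, ["Ssn", "Ename", "Dnumber"])       # employee facts
ed2 = project(emp_dept, ["Dnumber", "Dname", "Dmgr_ssn"])  # department facts
rejoined = sorted(natural_join(ed1, ed2), key=lambda row: row["Ssn"])
```

ED2 stores each department once, and joining ED1 with ED2 on Dnumber gives back the original rows.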
BOYCE CODD NORMAL FORM
• Definition: A relation schema R is in BCNF if whenever a nontrivial
functional dependency X→A holds in R, then X is a superkey of R.
• Example: Consider a relation TEACH(Student, Course, Instructor)
with the following dependencies:
• FD1: {Student, Course} → Instructor
• FD2: Instructor → Course
• {Student, Course} is a candidate key for this relation, and the
dependencies follow the pattern in the figure, with Student as
A, Course as B, and Instructor as C.
• TEACH is in 3NF but not in BCNF, because the left side of FD2,
Instructor, is not a superkey.
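The BCNF condition can be checked by computing the closure of each left-hand side. The sketch below looks only at the dependencies that are listed, which is enough to test the given schema, and represents them as pairs of Python sets. The function names are assumptions for this illustration.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the listed nontrivial FDs whose left side is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        nontrivial = not rhs <= lhs
        if nontrivial and closure(lhs, fds) != set(attributes):
            violations.append((lhs, rhs))
    return violations


teach = {"Student", "Course", "Instructor"}
teach_fds = [
    ({"Student", "Course"}, {"Instructor"}),  # FD1
    ({"Instructor"}, {"Course"}),             # FD2
]
violations = bcnf_violations(teach, teach_fds)
```

Only FD2 is reported, since the closure of {Instructor} does not contain Student.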
Fourth Normal Form

• A relation schema R is in 4NF with respect to a set of
dependencies F if, for every nontrivial multivalued dependency
X →→ Y in F+, X is a superkey for R.
• Consider the EMP relation in the figure. EMP is not in 4NF because
the nontrivial MVDs Ename →→ Pname and Ename →→ Dname hold,
and Ename is not a superkey of EMP.
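A multivalued dependency X →→ Y can be tested on a relation state: whenever two tuples agree on X, the tuple that takes its X and Y values from the first and its remaining values from the second must also be present. The following sketch uses invented EMP rows, and the function name is an assumption.

```python
def mvd_holds(rows, x, y, attributes):
    """Check the MVD X ->> Y on a relation state given as a list of dicts."""
    present = {tuple(row[a] for a in attributes) for row in rows}
    for t1 in rows:
        for t2 in rows:
            if all(t1[a] == t2[a] for a in x):
                # Required tuple: X and Y values from t1, all others from t2.
                required = tuple(
                    t1[a] if (a in x or a in y) else t2[a] for a in attributes
                )
                if required not in present:
                    return False
    return True


attributes = ["Ename", "Pname", "Dname"]
emp = [
    {"Ename": "Smith", "Pname": "X", "Dname": "John"},
    {"Ename": "Smith", "Pname": "Y", "Dname": "Anna"},
    {"Ename": "Smith", "Pname": "X", "Dname": "Anna"},
    {"Ename": "Smith", "Pname": "Y", "Dname": "John"},
]
# Dropping one combination breaks the independence of projects and dependents.
emp_missing_row = emp[:3]
```

Every project is paired with every dependent of the same employee, which is exactly the redundancy that 4NF removes.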
Definition of 5NF
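• A relation R is in 5NF (or project-join normal form, PJNF) if for all join
dependencies of the form *(R1, R2, ..., Rn), where each Ri is a subset
of the set of attributes of R and R = R1 ∪ R2 ∪ ... ∪ Rn, at least one
of the following holds:
• *(R1, R2, ..., Rn) is a trivial join dependency (that is, one of the Ri is R).
• Every Ri is a superkey for R.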
Department     Subject   Student
Comp. Sc.      CP1000    John Smith
Mathematics    MA1000    John Smith
Comp. Sc.      CP2000    Arun Kumar
Comp. Sc.      CP3000    Reena Rani
Physics        PH1000    Raymond Chew
Chemistry      CH2000    Albert Garcia
• 1. The above relation says that Comp. Sc. offers subjects CP1000,
CP2000 and CP3000, which are taken by a variety of students.
• 2. No student takes all the subjects and no subject has all students
enrolled in it; therefore all three fields are needed to represent the
information.
• 3. The above relation does not show MVDs, since the attributes subject
and student are not independent; they are related to each other and the
pairings carry significant information.
• 4. The relation therefore cannot be decomposed into the two relations
(dept, subject) and (dept, student) without losing some important
information.
• 5. The relation can, however, be decomposed into the following three
relations:
• (dept, subject)
• (dept, student)
• (subject, student)
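The claim in points 4 and 5 can be checked on the table above by projecting and joining in Python. The variable names are chosen for this sketch; the rows are the six tuples of the relation.

```python
enrolment = {
    ("Comp. Sc.", "CP1000", "John Smith"),
    ("Mathematics", "MA1000", "John Smith"),
    ("Comp. Sc.", "CP2000", "Arun Kumar"),
    ("Comp. Sc.", "CP3000", "Reena Rani"),
    ("Physics", "PH1000", "Raymond Chew"),
    ("Chemistry", "CH2000", "Albert Garcia"),
}

# The three projections.
dept_subject = {(dept, subj) for dept, subj, _ in enrolment}
dept_student = {(dept, stud) for dept, _, stud in enrolment}
subject_student = {(subj, stud) for _, subj, stud in enrolment}

# Joining only (dept, subject) with (dept, student) pairs every Comp. Sc.
# subject with every Comp. Sc. student, which creates spurious tuples.
two_way = {
    (dept, subj, stud)
    for dept, subj in dept_subject
    for dept2, stud in dept_student
    if dept == dept2
}

# Joining in (subject, student) as well filters the spurious tuples out.
three_way = {t for t in two_way if (t[1], t[2]) in subject_student}

spurious = two_way - enrolment
```

For this data the two-way join contains spurious Comp. Sc. pairings such as CP1000 with Arun Kumar, while the three-way join reproduces the original relation.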