Relational Database Design NOTES
1. Introduction to Normalization
Normalization is a systematic process of organizing data in a database to reduce redundancy and improve data integrity.
Normalization is usually performed in stages called normal forms (NF). Each stage refines the structure
of the database to make it more efficient and maintainable.
2. Objectives of Normalization
Eliminate redundant data: Avoid storing the same data in multiple places.
Ensure data dependencies make sense: Data is logically stored.
Improve data integrity: Reduce the chances of anomalies (insertion, update, and deletion
anomalies).
Facilitate easier maintenance: With fewer duplicates, data is easier to maintain and update.
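3. Data Redundancy and Anomalies
Data redundancy occurs when the same piece of data exists in multiple places. Redundancy can lead to wasted storage, inconsistent data, and the anomalies described below.
Types of Anomalies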
Insertion anomaly: You can’t add data because other data is missing.
Update anomaly: You must change data in multiple places when a single item changes.
Deletion anomaly: Deleting a record causes unintended loss of data.
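The update and deletion anomalies can be reproduced on a small denormalized table. The sketch below uses Python's built-in sqlite3 module; the Enrollment table, its columns, and the sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical denormalized table: the student's name is repeated on every enrollment row.
conn.execute("CREATE TABLE Enrollment (StudentID INTEGER, StudentName TEXT, CourseName TEXT)")
conn.executemany(
    "INSERT INTO Enrollment VALUES (?, ?, ?)",
    [(1, "Ann", "Databases"), (1, "Ann", "Networks"), (2, "Bob", "Databases")],
)

# Update anomaly: the name is corrected on one row only, so two spellings now coexist.
conn.execute(
    "UPDATE Enrollment SET StudentName = 'Anne' "
    "WHERE StudentID = 1 AND CourseName = 'Databases'"
)
names_for_student_1 = sorted(
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT StudentName FROM Enrollment WHERE StudentID = 1"
    )
)

# Deletion anomaly: removing Bob's only enrollment also removes every trace of Bob.
conn.execute("DELETE FROM Enrollment WHERE StudentID = 2 AND CourseName = 'Databases'")
bob_rows = conn.execute(
    "SELECT COUNT(*) FROM Enrollment WHERE StudentID = 2"
).fetchone()[0]

print(names_for_student_1, bob_rows)
```

Student 1 ends up with two different names, and student 2 disappears entirely, which is the kind of inconsistency normalization is meant to prevent.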
4. Normal Forms
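4.1 First Normal Form (1NF)
A table is in 1NF if every column holds atomic (indivisible) values and there are no repeating groups of columns.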
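As a rough illustration of 1NF, the sketch below takes rows in which one field packs several course names together and splits them so that each row holds a single value. The data and the StudentCourse table are assumptions made for this example, not taken from the notes.

```python
import sqlite3

# Hypothetical non-1NF data: the third field holds several values at once.
unnormalized = [(1, "Ann", "Databases, Networks"), (2, "Bob", "Databases")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE StudentCourse (StudentID INTEGER, StudentName TEXT, Course TEXT)")

# Split each multi-valued field so that every row holds exactly one course.
for student_id, name, courses in unnormalized:
    for course in courses.split(","):
        conn.execute(
            "INSERT INTO StudentCourse VALUES (?, ?, ?)",
            (student_id, name, course.strip()),
        )

rows_1nf = conn.execute(
    "SELECT StudentID, Course FROM StudentCourse ORDER BY StudentID, Course"
).fetchall()
print(rows_1nf)
```

Each course now sits in its own row, so it can be queried and indexed directly. The student name is still repeated, which is what the later normal forms address.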
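4.2 Second Normal Form (2NF)
A table is in 2NF if:
It is in 1NF.
All non-key attributes are fully dependent on the entire primary key.
2NF applies mainly to tables with composite keys (i.e., more than one attribute in the primary key).
Example:
| StudentID | CourseID | StudentName | CourseName |
Here the primary key is (StudentID, CourseID), but:
StudentName depends only on StudentID.
CourseName depends only on CourseID.
The table is not in 2NF because these non-key attributes do not depend on the full key.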
After 2NF:
Student Table:
| StudentID | StudentName |
Course Table:
| CourseID | CourseName |
Enrollment Table:
| StudentID | CourseID |
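The three tables above could be declared roughly as follows. This is a minimal sketch using Python's sqlite3 module; the column types, the foreign keys, and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, StudentName TEXT NOT NULL);
CREATE TABLE Course (CourseID INTEGER PRIMARY KEY, CourseName TEXT NOT NULL);
CREATE TABLE Enrollment (
    StudentID INTEGER NOT NULL REFERENCES Student(StudentID),
    CourseID INTEGER NOT NULL REFERENCES Course(CourseID),
    PRIMARY KEY (StudentID, CourseID)
);
INSERT INTO Student VALUES (1, 'Ann');
INSERT INTO Course VALUES (10, 'Databases'), (20, 'Networks');
INSERT INTO Enrollment VALUES (1, 10), (1, 20);
""")

# The name lives in exactly one row, so one UPDATE fixes it for every enrollment.
conn.execute("UPDATE Student SET StudentName = 'Anne' WHERE StudentID = 1")
report = conn.execute("""
    SELECT s.StudentName, c.CourseName
    FROM Enrollment e
    JOIN Student s ON s.StudentID = e.StudentID
    JOIN Course c ON c.CourseID = e.CourseID
    ORDER BY c.CourseName
""").fetchall()
print(report)
```

Because Enrollment stores only the two keys, the joined report reflects the corrected name on every row.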
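4.3 Third Normal Form (3NF)
A table is in 3NF if:
It is in 2NF.
There is no transitive dependency (i.e., non-key attributes do not depend on other non-key attributes).
Example:
| EmployeeID | Name | Department | DeptLocation |
DeptLocation depends on Department, a non-key attribute, rather than directly on EmployeeID, so the table is not in 3NF.
After 3NF: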
Employee Table:
| EmployeeID | Name | Department |
Department Table:
| Department | DeptLocation |
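One way to see the benefit of this split is to move a department and count how many rows change. The sketch below uses Python's sqlite3 module; the column types and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (Department TEXT PRIMARY KEY, DeptLocation TEXT NOT NULL);
CREATE TABLE Employee (
    EmployeeID INTEGER PRIMARY KEY,
    Name TEXT NOT NULL,
    Department TEXT REFERENCES Department(Department)
);
INSERT INTO Department VALUES ('Sales', 'Building A');
INSERT INTO Employee VALUES (1, 'Ann', 'Sales'), (2, 'Bob', 'Sales');
""")

# The location is stored once per department, so relocating touches a single row.
rows_changed = conn.execute(
    "UPDATE Department SET DeptLocation = 'Building B' WHERE Department = 'Sales'"
).rowcount

locations = conn.execute("""
    SELECT e.Name, d.DeptLocation
    FROM Employee e
    JOIN Department d ON d.Department = e.Department
    ORDER BY e.EmployeeID
""").fetchall()
print(rows_changed, locations)
```

In the original single table the same change would have to be repeated for every employee in the department.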
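4.4 Boyce-Codd Normal Form (BCNF)
A table is in BCNF if it is in 3NF and every determinant is a candidate key.
A determinant is any attribute on which some other attribute is fully functionally dependent.
Example:
| StudentID | Course | Instructor |
If each course is taught by exactly one instructor, then Course determines Instructor. Course alone is not a candidate key of this table, so the table is not in BCNF.
After BCNF: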
Course Table:
| Course | Instructor |
Enrollment Table:
| StudentID | Course |
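The idea of a determinant can be checked against sample data with a small helper. The function name `functionally_determines` and the rows below are hypothetical; note that passing the check on sample rows only shows the data is consistent with a dependency, not that the dependency is a rule of the business.

```python
def functionally_determines(rows, lhs, rhs):
    """Return True if equal values of `lhs` always come with equal values of `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        # Remember the first value seen for this key; any later mismatch breaks the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical rows of the original (StudentID, Course, Instructor) table.
rows = [
    {"StudentID": 1, "Course": "Databases", "Instructor": "Dr. Lee"},
    {"StudentID": 2, "Course": "Databases", "Instructor": "Dr. Lee"},
    {"StudentID": 1, "Course": "Networks", "Instructor": "Dr. Kim"},
]

# Course is a determinant of Instructor ...
course_determines_instructor = functionally_determines(rows, ["Course"], ["Instructor"])
# ... but it does not determine the whole row, so it is not a candidate key.
course_is_key = functionally_determines(rows, ["Course"], ["StudentID", "Course", "Instructor"])

# Decomposition into the two tables shown above.
course_table = sorted({(r["Course"], r["Instructor"]) for r in rows})
enrollment_table = sorted({(r["StudentID"], r["Course"]) for r in rows})
print(course_determines_instructor, course_is_key)
```

After the decomposition, Course is the key of the Course table, so the determinant and the key coincide.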
5. Data Integrity
Definition:
Data integrity ensures that data is accurate, consistent, and reliable throughout its lifecycle in a
database.
| Type | Description |
| Entity Integrity | Ensures each table has a unique primary key and that it is not null. |
| Referential Integrity | Ensures that foreign keys correctly refer to primary keys in related tables. |
| Domain Integrity | Enforces valid entries for a given column using data type, format, or constraints. |
| User-Defined Integrity | Custom rules set by users/business logic (e.g., age must be ≥ 18). |
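Each integrity type can be expressed as a constraint that the database enforces on its own. The sketch below uses Python's sqlite3 module; the table layout, the helper `is_rejected`, and the sample values are assumptions, and other database systems may report violations differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Courses (CourseID INTEGER PRIMARY KEY, CourseName TEXT NOT NULL);
CREATE TABLE Students (
    StudentID INTEGER PRIMARY KEY,                     -- entity integrity
    Age INTEGER NOT NULL CHECK (Age >= 18),            -- domain and user-defined rules
    CourseID INTEGER REFERENCES Courses(CourseID)      -- referential integrity
);
INSERT INTO Courses VALUES (1, 'Databases');
INSERT INTO Students VALUES (101, 20, 1);
""")


def is_rejected(sql):
    """Run one statement and report whether the database refused it."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


results = {
    "entity": is_rejected("INSERT INTO Students VALUES (101, 25, 1)"),         # duplicate key
    "referential": is_rejected("INSERT INTO Students VALUES (102, 25, 99)"),   # no such course
    "domain": is_rejected("INSERT INTO Students VALUES (103, NULL, 1)"),       # missing value
    "user_defined": is_rejected("INSERT INTO Students VALUES (104, 17, 1)"),   # age under 18
}
print(results)
```

All four invalid inserts are refused, so the one valid student row is the only one stored.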
1. What is DDL?
Data Definition Language (DDL) refers to SQL commands that define the structure of a database, including creating, modifying, and deleting databases, tables, and other objects like indexes, views, and schemas.
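| Command | Purpose |
| CREATE | Creates a new database object such as a table. |
| ALTER | Modifies the structure of an existing object. |
| DROP | Deletes an existing object. |
2. CREATE TABLE
Used to create a new table. A typical statement creates a table Students with specified columns and a primary key on StudentID.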
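A statement of that shape might look like the one below. The column list is an assumption pieced together from the columns used elsewhere in these notes (FirstName, LastName, Age, Email), and the types are illustrative; the sketch runs it through Python's sqlite3 module and reads the resulting schema back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed column list and types; only the table name and the key column come from the notes.
conn.execute("""
    CREATE TABLE Students (
        StudentID INT PRIMARY KEY,
        FirstName VARCHAR(50),
        LastName  VARCHAR(50),
        Age       INT,
        Email     VARCHAR(100)
    )
""")

# PRAGMA table_info returns one row per column; index 1 is the name, index 5 the key flag.
columns = [(row[1], row[5]) for row in conn.execute("PRAGMA table_info(Students)")]
print(columns)
```

Only StudentID carries the primary-key flag, which matches the description above.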
3. ALTER TABLE
Used to modify a table's structure after it has been created.
Examples:
Add a column:
```sql
ALTER TABLE Students ADD Gender VARCHAR(10);
```
Modify a column:
```sql
-- MODIFY is MySQL/Oracle syntax; other systems use ALTER COLUMN instead.
ALTER TABLE Students MODIFY Age SMALLINT;
```
Drop a column:
```sql
ALTER TABLE Students DROP COLUMN Email;
```
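To see the effect of ALTER TABLE, the schema can be inspected before and after the change. The sketch below uses Python's sqlite3 module and covers only the add-column case, since SQLite does not accept MODIFY and supports DROP COLUMN only in newer versions; the starting table definition and the helper `column_names` are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed starting schema for the example.
conn.execute("CREATE TABLE Students (StudentID INT PRIMARY KEY, Age INT, Email VARCHAR(100))")


def column_names():
    """List the current column names of Students, in declaration order."""
    return [row[1] for row in conn.execute("PRAGMA table_info(Students)")]


before = column_names()
conn.execute("ALTER TABLE Students ADD COLUMN Gender VARCHAR(10)")
after = column_names()
print(before, after)
```

Existing rows, if any, would receive NULL in the new column unless a default is declared.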
4. DROP
Used to permanently remove a table or database.
Examples:
Drop a table:
```sql
DROP TABLE Students;
```
Drop a database:
```sql
DROP DATABASE SchoolDB;
```
⚠️ Caution: DROP permanently deletes the object and everything related to it, including its data. In most systems it cannot be undone.
1. What is DML?
Data Manipulation Language (DML) is used to manipulate and retrieve data from existing tables.
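| Command | Purpose |
| SELECT | Retrieves rows from one or more tables. |
| INSERT | Adds new rows to a table. |
| UPDATE | Modifies existing rows. |
| DELETE | Removes rows from a table. |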
SELECT example:
```sql
SELECT FirstName, LastName FROM Students;
```
INSERT example:
```sql
INSERT INTO Students (StudentID, FirstName, LastName, Age)
VALUES (101, 'John', 'Doe', 20);
```
UPDATE example:
```sql
UPDATE Students SET Age = 21 WHERE StudentID = 101;
```
DELETE example:
```sql
DELETE FROM Students WHERE StudentID = 101;
```
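The four statements above can be run end to end against an in-memory database. This is a minimal sketch using Python's sqlite3 module; the table definition is an assumption, and the `?` placeholders are sqlite3's way of passing values separately from the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed table definition matching the columns used in the examples.
conn.execute(
    "CREATE TABLE Students (StudentID INT PRIMARY KEY, FirstName TEXT, LastName TEXT, Age INT)"
)

# INSERT: add one row, passing values as parameters rather than pasting them into the SQL.
conn.execute(
    "INSERT INTO Students (StudentID, FirstName, LastName, Age) VALUES (?, ?, ?, ?)",
    (101, "John", "Doe", 20),
)
after_insert = conn.execute("SELECT FirstName, LastName, Age FROM Students").fetchall()

# UPDATE: change the age of that row only.
conn.execute("UPDATE Students SET Age = ? WHERE StudentID = ?", (21, 101))
after_update = conn.execute(
    "SELECT Age FROM Students WHERE StudentID = ?", (101,)
).fetchone()[0]

# DELETE: remove the row and check how many rows were affected.
deleted = conn.execute("DELETE FROM Students WHERE StudentID = ?", (101,)).rowcount
remaining = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]
print(after_insert, after_update, deleted, remaining)
```

Without the WHERE clause, the UPDATE and DELETE statements would apply to every row in the table.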
1. JOIN Operations
JOINs are used to combine rows from two or more tables based on a related column between them
(usually foreign key relationships).
INNER JOIN
Returns only the rows that have a match in both tables.
```sql
SELECT A.Name, B.CourseName
FROM Students A
INNER JOIN Courses B ON A.CourseID = B.CourseID;
```
LEFT JOIN
Returns all records from the left table and the matched records from the right. If there is no match, the right-side columns are NULL.
```sql
SELECT A.Name, B.CourseName
FROM Students A
LEFT JOIN Courses B ON A.CourseID = B.CourseID;
```
RIGHT JOIN
Returns all records from the right table and the matched records from the left. If there is no match, the left-side columns are NULL.
```sql
SELECT A.Name, B.CourseName
FROM Students A
RIGHT JOIN Courses B ON A.CourseID = B.CourseID;
```
FULL JOIN
Returns all records from both tables. Where a row has no match in the other table, the missing columns are NULL.
```sql
SELECT A.Name, B.CourseName
FROM Students A
FULL JOIN Courses B ON A.CourseID = B.CourseID;
```
Combines LEFT and RIGHT JOIN. Shows all students and courses, matched or not.
✅ Note: Some RDBMSs (MySQL, for example) do not support FULL JOIN directly. You may need to simulate it using UNION.
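The UNION workaround can be sketched as follows, using Python's sqlite3 module. Two LEFT JOINs with the tables swapped stand in for the LEFT and RIGHT JOIN pair, so the query also runs on engines without RIGHT JOIN; the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Courses (CourseID INTEGER PRIMARY KEY, CourseName TEXT);
CREATE TABLE Students (Name TEXT, CourseID INTEGER);
INSERT INTO Courses VALUES (1, 'Databases'), (2, 'Networks');
INSERT INTO Students VALUES ('Ann', 1), ('Bob', NULL);
""")

# First half keeps every student; second half keeps every course.
# UNION (not UNION ALL) removes the matched rows that appear in both halves.
full_join = set(conn.execute("""
    SELECT A.Name, B.CourseName
    FROM Students A LEFT JOIN Courses B ON A.CourseID = B.CourseID
    UNION
    SELECT A.Name, B.CourseName
    FROM Courses B LEFT JOIN Students A ON A.CourseID = B.CourseID
"""))
print(full_join)
```

Because UNION removes duplicates, it would also collapse rows that are genuinely identical in the data, so this emulation is safest when the selected columns identify a row uniquely.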
2. Aggregate Functions
Aggregate functions perform calculations on multiple rows and return a single result.
With GROUP BY
Used to group rows that share the same values, allowing aggregate functions to be applied to each
group.
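```sql
SELECT CourseID, COUNT(*) AS StudentCount
FROM Students
GROUP BY CourseID;
```
With HAVING
Filters groups after aggregation, keeping only those that satisfy a condition on the aggregate.
```sql
SELECT CourseID, COUNT(*) AS StudentCount
FROM Students
GROUP BY CourseID
HAVING COUNT(*) > 5;
```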
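The difference between the two queries shows up on a small data set. The sketch below uses Python's sqlite3 module with hypothetical data: six students in course 1 and two in course 2. An ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (StudentID INTEGER PRIMARY KEY, CourseID INTEGER)")
# Hypothetical data: students 1-6 take course 1, students 7-8 take course 2.
enrollments = [(i, 1) for i in range(1, 7)] + [(i, 2) for i in range(7, 9)]
conn.executemany("INSERT INTO Students VALUES (?, ?)", enrollments)

# GROUP BY alone: one row per course with its student count.
counts = conn.execute("""
    SELECT CourseID, COUNT(*) AS StudentCount
    FROM Students
    GROUP BY CourseID
    ORDER BY CourseID
""").fetchall()

# HAVING: the same groups, filtered by the aggregate value.
large_courses = conn.execute("""
    SELECT CourseID, COUNT(*) AS StudentCount
    FROM Students
    GROUP BY CourseID
    HAVING COUNT(*) > 5
""").fetchall()
print(counts, large_courses)
```

WHERE filters individual rows before grouping, while HAVING filters the groups afterwards, which is why the count condition has to go in HAVING.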
A CREATE USER statement creates a user named john who can connect from the local machine.
A GRANT statement then allows user john to run SELECT, INSERT, and UPDATE on all tables in SchoolDB.