Book Database
Book Database
Introduction to DBMS
DBMS enables the creation, management, and manipulation of databases, allowing users to efficiently
store, retrieve, update, and manage data.
A relational database organizes data into tables where each table consists of rows and columns. These
tables are interconnected using keys, which enable relationships between tables to be established.
1. Data Integrity: RDB enforce to ensure the accuracy and reliability of data.
2. Flexibility: New data can easily be added without affecting existing applications.
3. SQL Support: They are widely supported by SQL which is used for managing data.
Example: An online shopping system could use a relational database where customer information is
stored in one table and order details are stored in another. A foreign key links customer orders to their
respective customer data, ensuring proper relationships between entities.
A file system and a DBMS are both used to store, retrieve, and manage data, but they differ in
functionality, performance, and how they handle data.
File System
A file system is a method of storing and organizing data in files on a storage medium like a hard disk
or SSD. It provides basic data management capabilities, enabling users to create, read, update, and
delete files.
1. Simple Storage: Files are stored in directories or folders, and the system manages where
these files are physically located.
2. No Central Control: Data access is not controlled centrally, which can lead to issues like
duplication and data inconsistency.
3. Limited Querying: File systems offer no built-in way to query data. Accessing data is often
done programmatically, which can be inefficient.
4. Lack of Security: File systems provide basic permissions but lack the sophisticated security
controls found in modern DBMSs.
5. No Data Integrity: There is no built-in mechanism to enforce data accuracy or relationships
between different files.
• Data Redundancy: Files may be duplicated, leading to wasted storage and possible
inconsistencies.
• Data Inconsistency: Multiple copies of the same file may get out of sync, causing
inconsistent data.
• Lack of Data Security: Minimal security features make it hard to manage who can access or
modify files.
• No Concurrency: A file system does not handle simultaneous access well, which can lead to
data corruption.
• No Transaction Management: File systems lack mechanisms to ensure atomic operations,
leading to possible data corruption if failures occur during data manipulation.
A software designed to manage DB. It offers functionalities to store, retrieve, manipulate, and maintain
large amounts of data in a structured way.
1. Data Organization: Data is organized in tables, records, and fields, allowing efficient data
retrieval.
2. Centralized Control: A DBMS centralizes the management of data, ensuring that multiple
users can access the same data without conflict.
3. Querying with SQL: A DBMS uses SQL to query and manipulate data, providing a powerful
way to access and manage large datasets.
4. Security: Provides advanced user management, allowing access controls and role-based
security to restrict who can access and modify data.
5. Concurrency Control: Supports multiple users accessing the database at the same time while
maintaining data consistency.
Advantages of a DBMS:
• Efficient Data Retrieval: Powerful query tools enable fast data retrieval.
• Data Security: Advanced security features to protect sensitive information.
• Backup and Recovery: Supports automated backup and recovery options to ensure data is
not lost.
Disadvantages of a DBMS:
It is the topmost layer of the architecture. It is responsible for presenting the user interface and
interacting directly with the end users. This tier collects input from the user, sends it to the application
layer for processing, and displays the results.
Key Responsibilities:
Examples:
It handles the application's main tasks.This level gets asked for by the front end, follows the company's
rules, and talks to the database to get or change information.
Key Responsibilities:
• Implements business logic and rules that define how data is created, displayed, stored, and
changed.
• Handles communication between the presentation layer and the database layer.
• Ensures data validation and applies the necessary business workflows.
The Database Tier contains the database management system (DBMS) that stores, retrieves, and
manipulates the data. The application layer communicates with this tier to execute queries, manage
transactions, and retrieve the necessary data.
Key Responsibilities:
1. Separation of Concerns: Each layer has a distinct role, leading to better organization and
easier maintenance of the system.
2. Scalability: The architecture can easily scale horizontally or vertically. For example, the
presentation layer can have multiple clients (web browsers, mobile apps), and the data layer
can use multiple database servers for load distribution.
3. Security: Each tier can implement its own security measures, ensuring that sensitive data is
only accessible by authorized entities. The database is protected from direct access by the
presentation layer.
4. Flexibility: Changes made to one layer do not directly affect the other layers. For instance,
the user interface can be updated without modifying the business logic or database structure.
5. Reusability: The application logic (business rules) can be reused across different user
interfaces (e.g., web, mobile).
6. Maintenance: Modular design makes it easier to debug, update, and maintain the system.
Example of Three-Tier Architecture in Action:
1. Presentation Layer: The user interacts with the website (via a web browser), browsing
products, adding items to the cart, and placing an order.
2. Application Layer: The application server processes the user’s actions. It checks product
availability, calculates total costs, applies discounts, and processes the order. It communicates
with the database to retrieve product details and update stock levels.
3. Data Layer: The database stores all product information, customer orders, inventory levels,
and payment details. The application layer sends queries to fetch or update this information
based on user actions.
View of Data
In a DBMS, the view of data refers to the different ways in which the data stored in a database can be
accessed and manipulated by different users or applications. This concept ensures that users have the
right level of abstraction based on their needs and roles, helping maintain security, simplicity, and
efficiency.
The view of data is typically structured in three levels, which together provide a layered abstraction of
the data stored in the database:
Physical Level explains how data is kept in DB, like the data structures, how files are organized, and
the details of storing data.This is the basic level of how data is actually saved on devices like disks or
memory
• Focuses on data storage details like indexes, file paths, storage blocks, and data compression.
• Managed by the Database Administrator (DBA), not end users.
• Concerned with optimizing performance and storage efficiency.
Example:
In Oracle DBMS, data might be stored in blocks on disk. The physical level would involve managing
how tables, indexes, and other structures are organized on these blocks.
The Logical Level shows how data is kept in the database and how different pieces of data connect
with each other.It simplifies the database's logical design, ignoring the actual storage details of the
data.
Example:
The logical schema might define a student table with attributes like student_id, name, and department,
without specifying how or where this data is stored on the physical disk.
The view level provides multiple views or representations of the database to various users based on
their roles and requirements. A view in this context is a subset of the database that can be tailored to
specific user needs, showing only the data that is relevant to a particular user or application while hiding
the rest.
Example:
A university’s DB might have a student view that only shows names and student IDs to instructors,
while administrators might have a view that includes all details like names, addresses, and financial
information.
Example:
This query creates a view that only shows the student_id, name, and department for students in the
Computer Science department.
1. Abstraction: Different users and applications do not need to understand how data is physically
stored or structured. They interact with a simplified view of the data.
2. Security: The view level allows restricting access to sensitive data. For example, a user might
only be allowed to view certain columns or records.
3. Simplified Access: By creating views, developers and users can work with more relevant and
specific subsets of data, avoiding the complexity of large databases.
4. Data Independence: The separation between the physical, logical, and view levels ensures
changes in one layer don’t impact the others, improving system flexibility and manageability.
In a DBMS, instances and schemas are key concepts that help define and organize how data is structured
and managed. These terms relate to different levels of abstraction within a DBS.
1. Schema
A schema is the structure of the database.The structure stays pretty much the same over time and
shows how the database should be logically organized.
• Defines Structure: The schema describes the structure of the data in the database (e.g.,
tables, columns, relationships).
• Permanent: The schema is generally a static part of the database and changes very rarely.
• Table 1: Students
o Columns: student_id, name, age, department
• Table 2: Courses
o Columns: course_id, course_name, credits
• Relationship: There might be a relationship where students enroll in multiple courses.
This structure forms the schema of the database, which defines the organization and relationships
within the database.
2. Instance
A database instance (DBMS) is where you can find the actual information stored right now.The
schema tells us how the data is organized, while the instance shows us the actual data at a specific
time.
• Current Data: An instance is the current state of the data in the DB. It represents the data
stored in the database at a particular point in time.
• Dynamic: Unlike the schema, which is static, instances can change frequently.
• Multiple Instances: A DB can have multiple instances over time as data is added, deleted, or
updated.
Example of an Instance:
For the Students table in the university database, an instance might look like this:
At this point in time, these two rows represent the instance of the Students table. As new students are
added or existing records are updated or deleted, the instance of the Students table changes accordingly.
1. Schema is blueprint of the database, whereas an instance is the snapshot of the data in the
database at a particular moment.
2. Schema is static (doesn't change frequently), while instances are dynamic (they change
whenever data in the database is modified).
3. A database can have many instances over time, but it usually has only one schema, unless
there are significant changes in the database structure.
Think of a schema as the design of a building, like blueprints or architectural drawings. This design
stays the same (unless you renovate or restructure the building), defining where the rooms, doors, and
windows are.
An instance is the current state of the building at any given time. It includes the people, furniture, and
equipment that occupy the building. Just as these elements change over time while the building’s
structure stays the same, the data in a database (instance) can change while the schema (design)
remains static.
• Naive/Parametric Users: These users interact with the database regularly through predefined
applications or user interfaces, such as ATMs or retail applications.
• Sophisticated Users: These are skilled users who interact directly with the database using
complex SQL queries. They may be analysts or developers who require detailed data analysis.
2. Application Programmers
Application programmers design and build applications that interact with the database. They write
code to execute complex operations using languages like Java, Python, or C#, connecting to the
database through APIs.
• Example: A developer creating a web application that retrieves product information from a
database.
3. Database Designers
Database designers, also known as database architects, are responsible for designing the database
schema, defining the data models, and establishing the relationships between data elements. They
ensure the database structure meets the organization’s requirements for performance, scalability, and
efficiency.
4. System Analysts
System analysts design the database system’s workflows and processes. They determine how data
flows between different parts of the database and work closely with programmers and DBAs to
implement the system.
• Example: An analyst planning how customer data is stored and retrieved during a transaction
in a retail system.
It is responsible for the overall management of the DBS. DBAs ensure the database is running smoothly,
is secure, and is accessible to authorized users.
• Installing the DBMS software (e.g., Oracle, MySQL) and configuring it according to
organizational needs.
• Setting up database instances, storage structures, and memory allocations.
3. Performance Tuning
• Monitoring the database’s performance and making necessary adjustments to ensure
optimal performance.
• Indexing, query optimization, and adjusting memory allocation to improve response
times.
• Managing disk storage to ensure enough space is available for the database to grow.
Example: Creating an index to improve query performance:
4. Data Integrity
• Implementing validation rules to prevent incorrect data from being entered into the
database.
5. Database Maintenance
• Performing routine maintenance tasks like reorganizing tables, clearing temporary
files, and monitoring log files.
• Scheduling maintenance windows to minimize downtime.
6. User Management
• Creating, managing, and deleting database user accounts.
• Monitoring user activities and auditing logs to ensure compliance with security
policies.
7. Database Upgrades
• Upgrading the database system to new versions or patches to ensure the system is up
to date with the latest features and security enhancements.
8. Transaction Management
• Managing database transactions to ensure ACID (Atomicity, Consistency, Isolation,
Durability) properties are maintained.
• Handling transaction failures and ensuring that transactions are rolled back when
necessary.
Data models help in creating the schema of the database, which defines its logical structure. Different
types of data models offer varying levels of abstraction, ranging from high-level conceptual designs to
low-level physical structures.
It arranges data like a tree where each record links to one parent, and records are related in a parent-
child way.The model is straightforward and prevalent in historical database applications.
Key Features:
Example:
A typical example of a hierarchical data model is an organizational structure where the company’s
CEO is at the top, followed by managers, team leads, and employees.
This model lets records link to many parents and children, like a connected map.
Key Features:
Example:
An example of a network model is a transportation network where cities (nodes) are connected by roads
(relationships), and a city can be connected to multiple other cities.
The ER Model shows the links between real-world things.People often use it during the planning part
of building a database to make a picture of the data before turning it into a structured format.
Key Features:
Example:
In an ER diagram for a university system, Students and Courses are entities, and the relationship
between them could be "Enrolls."
Database System Architecture refers to the design and structure of a database system that defines how
different components of the database interact with each other. This architecture organizes the internal
and external structure of the database to ensure data is efficiently stored, managed, and retrieved. There
are various architectures based on how the components are arranged, but the most common is the two-
tier and three-tier architecture. Additionally, the internal architecture of databases can also be
categorized into different layers for data abstraction.
1. One-Tier Architecture
This architecture is mostly used for local applications where the user interacts directly with the DBMS.
There is no network involved, and all components such as the database and user interface exist within
a single system.
Key Characteristics:
Example:
A simple personal finance application on your desktop, where the database and the application run
locally without any network or server.
2. Two-Tier Architecture
The client sends requests to server, which processes these requests and sends the data back to the client.
The business logic is often embedded in the client application.
Key Characteristics:
Example:
A desktop application where the client application (running on a user’s computer) sends SQL queries
to a database server (e.g., MySQL, Oracle). The server processes the query and returns the results.
Advantages:
• Faster communication between client and server as they are directly connected.
• Easier to set up and manage for small systems.
Disadvantages:
3. Three-Tier Architecture
Three-Tier Architecture is the most commonly used architecture in enterprise-level systems. Each layer
is independent and can be modified or scaled without affecting the others.
Key Characteristics:
• Separate layers for the user interface, application logic, and database.
• Allows for better scalability, security, and maintenance.
• Suitable for large-scale applications with multiple users and complex operations.
Components:
Example:
• The client (presentation layer) is the web browser that displays the online store.
• The application layer processes user requests, such as adding products to the cart, managing
inventory, and processing payments.
• The database layer stores product information, customer details, orders, and transactions.
Advantages:
• Scalability: Each layer can be scaled independently, allowing the system to grow with user
demand.
• Security: Business logic and data are separated, which improves security by controlling
access to the data layer.
• Maintenance: Modifying one layer (e.g., updating the business logic) does not affect the
others, making it easier to maintain.
Disadvantages:
• More complex to set up and maintain compared to one-tier and two-tier architectures.
• Involves additional overhead due to communication between layers.
Key Characteristics:
• Improved data availability and reliability, as failure in one location does not affect the entire
system.
• Complex management and synchronization are required to ensure consistency across
distributed databases.
Example:
A multinational corporation may have databases in different countries, where each office stores its
local data, but the data is accessible across the company.
In a Database Management System (DBMS), several types of database languages are used to interact
with and manage databases. These languages allow users to define, manipulate, control, and retrieve
data from the database. Oracle SQL (Structured Query Language) is one of the most widely used
languages for managing relational databases.
This section will cover the different types of database languages, how they are used in Oracle SQL,
and the software involved in managing databases.
It allows you to create, alter, and delete database structures without affecting the data stored in them.
• CREATE: Used to create a new database object such as a table, view, or index.
Example:
• UPDATE:
Example:
• SELECT:
It controls access to the data in the database. It includes commands that allow database administrators
to grant or revoke permissions to users.
Example:
Example:
TCL commands help ensure that transactions are completed fully or not at all.
• COMMIT: Saves all the changes made by a transaction permanently to the database.
Example:
COMMIT;
Example:
ROLLBACK;
• SAVEPOINT: Creates a point within a transaction to which you can later roll back.
Example:
SAVEPOINT sp1;
• SET TRANSACTION: Defines a new transaction with specific properties, such as isolation
level.
Example:
It is generally referred to as the language that performs all operations on a relational database.
Oracle Database is one of the most popular and widely used relational database management systems
(RDBMS) developed by Oracle Corporation. It uses SQL as its primary query language, along with
its extended version known as PL/SQL (Procedural Language/SQL). Oracle SQL provides
comprehensive support for DDL, DML, DCL, and TCL commands, making it ideal for complex
enterprise-level database operations.
• High Availability: Oracle ensures data availability and minimizes downtime using features
like Oracle Real Application Clusters (RAC) and Data Guard.
• Scalability: Oracle can handle large volumes of data and high numbers of users without
performance degradation.
• Security: Oracle Database provides advanced security features like user roles, encryption,
and auditing.
• Data Management: Oracle supports various data management features such as partitioning,
indexing, and materialized views to optimize query performance.
• Backup and Recovery: Oracle offers advanced backup and recovery tools like Oracle
RMAN (Recovery Manager).
• PL/SQL: Oracle SQL supports PL/SQL, a procedural extension to SQL, allowing users to
write complex scripts, procedures, and functions.
1. Information Rule
Data should be stored in tables, with each value stored in a cell, and all data should be accessible through
this table structure. Everything, including metadata, is represented as data within the system.
Example:
Data about employees would be stored in a table with columns like employee_id, name, department,
and rows for each employee.
Every value in a relational database must be directly accessible through a combination of the table it
resides in, the row (identified by the primary key), and the column (attribute).
Example:
To access an employee’s name, you use the table name Employees, the primary key (e.g.,
employee_id), and the column name name.
A relational database must support null values to represent missing or inapplicable data, and this
support must be consistent across all data types.
Example:
If an employee’s date of termination is unknown, a null value can be stored in the termination_date
column.
Example:
Users should be able to query system tables to retrieve information about other tables, columns, and
relationships.
There should be a single language, such as SQL, that supports the complete set of operations on the
database, including the creation, update, and querying of data.
Example:
SQL is an example of such a comprehensive language, which allows users to create tables
(CREATE), insert data (INSERT), and retrieve data (SELECT).
A view is a virtual table based on the result of a query. If a view is created from a set of tables, the
system must allow updates through the view if the updates can be logically applied to the underlying
tables.
Example:
If a view is created to show only active employees, updating a salary in the view should reflect in the
base Employees table.
Relational databases should allow for operations that manipulate sets of data (multiple rows at a time),
not just single rows.
Example:
You should be able to update the salary for all employees in a specific department with a single SQL
query.
• Definition: Changes to how data is stored (e.g., moving from one storage medium to another)
should not affect the way users access data.
• Explanation: The system’s internal storage mechanisms should be transparent to the user.
Changes in physical storage or indexing should not affect application programs or queries.
Example:
If the database administrator decides to store data in a different physical location or change indexing,
users’ queries remain unaffected.
9. Logical Data Independence
Changes in the logical structure of the database (like adding new fields or tables) should not require
modifications to user applications or queries unless those changes are directly relevant to them.
Example:
If a new email column is added to the Employees table, existing queries that don’t involve email
should continue to function as before.
The database itself must enforce integrity rules, such as primary keys, foreign keys, and data validation
(e.g., constraints like NOT NULL). These rules should not be handled by application-level code.
Example:
A primary key constraint for employee_id should be defined directly in the database schema, ensuring
uniqueness across the system.
Example:
Users should be able to query data seamlessly whether it is stored on multiple servers or just one,
without needing to know where the data is physically located.
• Definition: If the system supports a low-level language to access data, that language must not
be able to bypass any of the integrity rules or constraints imposed by the higher-level
language (e.g
., SQL).
Example:
Even if an administrator uses a low-level language to manipulate the database directly, they should
not be able to insert a duplicate value in a column that is constrained by a UNIQUE key.
Database Schema and Basic SQL Programs
1. Tables: The main component of a schema, representing entities in the database. Each table
contains rows (records) and columns (attributes).
2. Columns (Attributes): Define the properties or characteristics of each entity in the table.
Each column has a specific data type.
3. Constraints: Rules applied to data in the database to ensure accuracy and integrity.
In RDBMS, keys are essential for uniquely identifying records and establishing relationships between
tables. They help maintain data integrity and ensure that the data in the database is both consistent and
accurate.
Types of Keys:
1. Primary Key
A Primary Key is a unique identifier for each record in a table. No two rows can have the same
primary key value, and it cannot contain NULL values.
2. Foreign Key
A Foreign Key is a field (or a group of fields) in one table that refers to the primary key in another
table. It establishes a relationship between the two tables and ensures referential integrity.
3. Candidate Key
A Candidate Key is any column or a combination of columns that can uniquely identify a row in a
table. A table can have multiple candidate keys, but only one is chosen as the primary key.
4. Composite Key
A Composite Key is made up of two or more columns that together can uniquely identify a record. It
is used when a single column is insufficient to guarantee uniqueness.
5. Alternate Key
An Alternate Key is a candidate key that is not selected as the primary key. It can still uniquely
identify a row in the table.
6. Super Key
A Super Key is any set of columns that can uniquely identify a row in a table. A super key may
contain additional attributes not necessary for unique identification.
Schema Diagrams
A Schema Diagram is a visual representation of a database's logical structure, illustrating the tables,
columns, and relationships within a database. It provides a clear overview of how data is organized and
interconnected in a relational database, helping database designers and developers understand the layout
and relationships between entities.
In a university database:
The diagram would show tables for Students, Courses, and Enrollments, with lines indicating the
relationships (foreign keys) between them.
Relational Query Languages are languages used to interact with and manage data in a Relational
Database Management System (RDBMS). These languages allow users to perform various operations
such as retrieving, inserting, updating, and deleting data from relational databases.
2. Relational Algebra
o Key Operations:
▪ Selection (σ): Extracts rows based on a condition.
▪ Projection (π): Extracts specific columns.
3. Relational Calculus
o Relational Calculus: It focuses on defining the desired results without specifying the
sequence of operations.
o Two Types:
▪ TRC: Works with tuples.
▪ DRC: Works with domains (attributes).
It is a procedural query language that provides a set of operations for manipulating and retrieving data
from relational databases. It forms the theoretical foundation for SQL and is used to define operations
on relations (tables). Unlike SQL, which is declarative, relational algebra specifies how to retrieve the
data.
1. Selection (σ)
o Purpose: Retrieves rows (tuples) from a relation that satisfy a specific condition.
o Syntax: σ condition (relation)
2. Projection (π)
o Purpose: Extracts specific columns (attributes) from a relation.
o Syntax: π attribute_list (relation)
3. Union ( ∪ )
o Purpose: Combines two relations, returning all distinct rows present in either or both
relations.
o Syntax: relation1 ∪ relation2
4. Set Difference ( − )
o Purpose: Returns rows that are in the first relation but not in the second relation.
o Syntax: relation1 − relation2
5. Cartesian Product ( × )
o Purpose: Combines all rows from the first relation with all rows from the second
relation.
o Syntax: relation1 × relation2
6. Rename (ρ)
o Purpose: Renames the output relation or attributes.
o Syntax: ρ new_name (relation)
Data Retrieval and Joins
Overview of the features of SQL
SQL (Structured Query Language) is the standard language used to manage and manipulate relational
databases. It provides a wide range of features that allow users to define database structures, insert and
update data, retrieve information, and manage access control. SQL is both a declarative and user-
friendly language, making it the most widely used query language in relational database management
systems (RDBMS).
INSERT INTO Students (s_id, name, age) VALUES (150, 'Timur', 20);
▪ UPDATE:
▪ DELETE:
The SELECT statement is used to query and retrieve data from tables. SQL supports complex queries
with filtering, grouping, and sorting.
BEGIN TRANSACTION;
UPDATE Students SET age = 22 WHERE student_id = 1;
COMMIT;
6. Joins
SQL allows combining data from multiple tables using different types of joins (INNER JOIN, LEFT
JOIN, RIGHT JOIN).
7. Aggregate Functions
o SQL provides functions like COUNT(), SUM(), AVG(), MAX(), and MIN() to
perform calculations on data.
o Example:
8. Indexes
o SQL allows the creation of indexes to optimize query performance.
o Example:
9. Views
o SQL allows the creation of virtual tables, known as views, that provide a way to
simplify complex queries and enhance security by restricting access to specific data.
o Example:
10. Constraints
o SQL allows you to enforce data integrity using constraints like PRIMARY KEY,
FOREIGN KEY, UNIQUE, CHECK, and NOT NULL.
o Example:
The SELECT statement is the most commonly used command in SQL for retrieving data from a
database. It allows you to query and retrieve specific columns or all columns from one or more tables
based on certain conditions.
Basic Syntax:
This retrieves all columns (e.g., student_id, name, age, department) for every record
in the Students table.
This retrieves only the name and age columns from the Students table.
This retrieves students' names and ages, sorted by age in descending order.
In SQL, functions are used to perform operations on data. These functions can be classified into two
types based on the number of rows they operate on: Single Row Functions and Multiple Row
(Aggregate) Functions.
Single Row Functions operate on each row of the result set and return a value for each row. These
functions can be used to manipulate strings, numbers, dates, and other data types.
This rounds the salary column to 2 decimal places for each row.
Multiple Row Functions, also known as Aggregate Functions, perform operations on a group of rows
and return a single result. They are typically used with the GROUP BY clause to group rows based on
a condition and apply the function to each group.
sql
Copy code
SELECT COUNT(*) FROM Students;
sql
Copy code
SELECT SUM(salary) FROM Employees;
sql
Copy code
SELECT AVG(age) FROM Students;
In SQL, JOIN operations are used to combine data from two or more tables based on a related
column between them. Joins allow you to retrieve data from multiple tables and are fundamental to
relational database queries. There are several types of joins, each serving a different purpose.
1. INNER JOIN
2. LEFT JOIN (LEFT OUTER JOIN)
3. RIGHT JOIN (RIGHT OUTER JOIN)
4. FULL JOIN (FULL OUTER JOIN)
5. CROSS JOIN
6. SELF JOIN
1. INNER JOIN
An INNER JOIN retrieves records that have matching values in both tables. Only the rows that meet
the join condition in both tables are returned.
• Syntax:
• Example:
This retrieves the names of students and the courses they are enrolled in, combining data from
three tables.
A LEFT JOIN returns all records from the left table and the matched records from the right table. If
there is no match, NULL values are returned for columns from the right table.
• Syntax:
• Example:
This retrieves all students and the courses they are enrolled in. If a student is not enrolled in
any course, NULL is returned for course_id.
A RIGHT JOIN returns all records from the right table and the matched records from the left table. If
there is no match, NULL values are returned for columns from the left table.
• Syntax:
• Example:
This retrieves all courses, and the names of students enrolled in them, if any. If a course has
no enrolled students, NULL is returned for the student's name.
A FULL JOIN returns all records when there is a match in either the left or right table. If there is no
match, NULL values are returned for columns where there is no match.
• Syntax:
• Example:
This retrieves all students and courses, whether or not they are enrolled in a course or have
students.
5. CROSS JOIN
A CROSS JOIN returns the Cartesian product of the two tables. Each row in the first table is combined
with all rows in the second table, resulting in a large number of combinations.
• Syntax:
• Example:
This retrieves all possible combinations of students and courses, without considering any
relationships.
6. SELF JOIN
A SELF JOIN is a join where a table is joined with itself. This is useful when comparing rows within
the same table.
• Syntax:
• Example:
This retrieves the name of employees and their managers from the same Employees table.
Advanced SQL joins go beyond the basic INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN
operations, providing more complex ways to retrieve and manipulate data across multiple tables.
These advanced joins include techniques like Self Joins, Cross Joins, Natural Joins, and Outer
Joins with complex conditions. These joins are essential for querying more intricate relationships
within a relational database.
1. Self Join
2. Cross Join
3. Natural Join
4. Full Outer Join
5. Complex Joins with Multiple Conditions
6. Anti Join (NOT EXISTS or NOT IN)
1. Self Join
A Self Join is when a table is joined with itself to compare rows within the same table. This is useful
for hierarchical data or comparing rows in the same dataset.
• Syntax:
• Example:
This query retrieves employees and their respective managers from the same Employees
table.
2. Cross Join
A Cross Join returns the Cartesian product of two tables, meaning every row from the first table is
combined with every row from the second table. This can result in a large number of combinations.
• Syntax:
• Example:
This retrieves all possible combinations of students and courses, without considering any
relationships between them.
3. Natural Join
A Natural Join automatically joins two tables based on all columns with the same name and data
type. It simplifies joins when columns share the same name, but it can be risky if there are multiple
matching columns you didn't intend to join.
• Syntax:
• Example:
SELECT *
FROM Employees NATURAL JOIN Departments;
This automatically joins the Employees and Departments tables where the columns have the
same name (e.g., department_id).
A Full Outer Join retrieves all records when there is a match in either the left or right table. It returns
NULL for non-matching rows in either table. This is useful when you need to retrieve all records from
both tables, even when there is no matching data.
• Syntax:
SELECT *
FROM table1
FULL OUTER JOIN table2 ON table1.common_column = table2.common_column;
• Example:
This retrieves all students and courses, whether or not they have enrollments.
You can create complex joins by adding multiple conditions in the ON clause. This allows for more
granular control over which rows are included in the result.
• Syntax:
SELECT *
FROM table1
JOIN table2 ON table1.column1 = table2.column2 AND table1.column3 = table2.column4;
• Example:
This retrieves orders and customer names, but only for orders placed after January 1, 2023.
An Anti Join retrieves rows from one table that do not have matching rows in another table. It is
typically implemented using NOT EXISTS or NOT IN.
• Syntax:
SELECT column1
FROM table1
WHERE NOT EXISTS (SELECT * FROM table2 WHERE table1.common_column =
table2.common_column);
• Example:
SELECT name
FROM Students
WHERE NOT EXISTS (SELECT * FROM Enrollments WHERE Students.student_id =
Enrollments.student_id);
This retrieves all students who are not enrolled in any course.
In SQL, tables are the core structure used to store data in relational databases. Creating and managing
tables involves defining the structure of the table, including columns and data types, as well as
maintaining the table by modifying its structure or deleting it when necessary.
1. Creating Tables
You create a table using the CREATE TABLE statement, where you define the table name, columns,
data types, and constraints (if any).
• Syntax:
• Example:
This creates an Employees table with columns for employee_id, name, department, salary,
and hire_date.
2. Managing Tables
After creating a table, you may need to manage it by altering its structure or deleting it.
a. Altering Tables
The ALTER TABLE statement allows you to modify an existing table by adding, modifying, or
dropping columns, as well as adding constraints.
• Add a Column:
• Modify a Column:
• Drop a Column:
b. Dropping a Table
The DROP TABLE statement is used to delete a table and all its data from the database.
• Syntax:
• Example:
c. Truncating a Table
The TRUNCATE TABLE statement removes all rows from a table but retains its structure.
• Syntax:
• Example:
This removes all data from the Employees table without deleting the table itself.
3. Constraints in Tables
When creating or altering tables, you can apply constraints to enforce rules and ensure data integrity.
In SQL, data manipulation refers to the process of inserting, updating, deleting, and retrieving data
from tables in a relational database. These operations are collectively referred to as Data
Manipulation Language (DML) commands. DML is essential for managing and interacting with the
actual data stored in the database.
The INSERT statement is used to add new rows of data into a table.
• Syntax:
• Example:
The UPDATE statement is used to modify existing records in a table. You can update one or more
columns in a specific row or multiple rows based on a condition.
• Syntax:
UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;
• Example:
UPDATE Employees
SET salary = 55000
WHERE employee_id = 1;
The DELETE statement is used to remove rows from a table. You can delete specific rows based on a
condition or all rows if no condition is specified.
• Syntax:
• Example:
This deletes the employee with employee_id 1 from the Employees table.
The SELECT statement is used to query and retrieve data from a table. You can specify which
columns to retrieve and filter the results using conditions.
• Syntax:
• Example:
The WHERE clause is used in SELECT, UPDATE, and DELETE statements to filter data based on
specific conditions.
• Example:
The ORDER BY clause is used to sort the result set based on one or more columns, either in
ascending (ASC) or descending (DESC) order.
• Syntax:
• Example:
This retrieves employees' names and salaries, sorted by salary in descending order.
The LIMIT clause (or TOP in some databases) is used to limit the number of rows returned by a
query.
• Syntax:
• Example:
Subqueries and Compound Queries are advanced SQL techniques used to perform more complex
data retrieval operations. A subquery is a query within another query, while compound queries
combine multiple queries using set operators like UNION, INTERSECT, and EXCEPT. These
techniques allow for flexible and powerful data manipulation.
1. Subqueries
A subquery (also called an inner query or nested query) is a query embedded within another SQL
query. The result of the subquery is used by the outer query. Subqueries can be used in SELECT,
INSERT, UPDATE, and DELETE statements, as well as in conditions (WHERE, HAVING).
Types of Subqueries:
Syntax:
Example:
SELECT name
FROM Employees
WHERE salary > (SELECT AVG(salary) FROM Employees);
This retrieves the names of employees whose salaries are greater than the average salary.
This retrieves employees whose salaries are greater than the average salary in their respective
departments.
2. Compound Queries
Compound Queries are used to combine the results of two or more queries into a single result set.
They are performed using set operators like UNION, UNION ALL, INTERSECT, and EXCEPT (or
MINUS in some databases).
o Example:
This retrieves a list of unique names from both the Employees and Customers tables.
• UNION ALL: Combines the results of two queries without removing duplicates.
o Example:
• INTERSECT: Returns only the rows that are common to both queries.
o Example:
This retrieves the names that exist in both the Employees and Customers tables.
• EXCEPT (or MINUS): Returns rows from the first query that are not present in the second
query.
o Example:
This retrieves names that are in the Employees table but not in the Customers table.
Creating views
A view in SQL is a virtual table that is based on the result of a query. It does not store data physically
but provides a way to simplify complex queries, improve security by restricting data access, and
present data in a specific format. A view can be queried like a regular table but represents a
predefined query’s result set.
You create a view using the CREATE VIEW statement, specifying the view name and the query that
defines the view.
Syntax:
Example:
This creates a view named EmployeeView that displays the name, department, and salary of
employees in the HR department.
3. Querying a View
Once a view is created, you can query it just like a regular table.
Example:
This retrieves all records from the EmployeeView, showing employees in the HR department.
4. Updating a View
Some views allow updating, inserting, or deleting data if they are based on a single table without
complex joins or aggregations. However, not all views are updatable.
Syntax:
UPDATE view_name
SET column1 = value1
WHERE condition;
5. Dropping a View
Syntax:
Example:
DROP VIEW EmployeeView;
Sequences and Indexes are tools used in SQL to enhance the functionality and performance of
databases. Sequences are used to automatically generate unique numbers, often for primary keys,
while indexes improve query performance by enabling faster data retrieval.
1. Sequences
A Sequence in SQL is an object that generates a sequence of unique numeric values, typically used to
auto-generate primary key values. It ensures that each new row gets a unique identifier.
Key Features:
Syntax:
Example:
This creates a sequence named employee_seq that starts at 1 and increments by 1 for each new value.
Using a Sequence:
This inserts a new row into the Employees table, automatically generating a unique employee_id
using the sequence.
2. Indexes
An Index is a database object that improves the speed of data retrieval operations on a table. It works
similarly to an index in a book, allowing faster lookups for specific rows in large datasets.
Key Features:
• Improves query performance, especially on large tables.
• Can be created on one or more columns.
• Reduces the need for full table scans when querying.
Syntax:
Example:
This creates an index on the name column in the Employees table, improving the performance of
queries that search for employees by name.
Example Query:
This query will run faster with the idx_employee_name index in place because it can directly locate
the rows based on the indexed name column.
Queries in SQL
Creating synonyms
A synonym in SQL is an alias or alternative name for a database object, such as a table, view,
sequence, or another synonym. Synonyms simplify access to objects, especially when the original
object resides in a different schema or database. Synonyms provide a way to hide the complexity of
database object names, improve readability, and make accessing remote objects easier.
• Simplify Object Access: Provides a shorter or more convenient name for accessing a
database object.
• Hide Complexity: Masks the underlying schema or database object names.
• Improve Security: Controls access to an object without exposing its real name.
• Access Remote Objects: Allows easy access to objects in different schemas or databases,
especially in distributed environments.
2. Creating a Synonym
You create a synonym using the CREATE SYNONYM statement. There are two types of synonyms:
private (accessible only to the user who created it) and public (accessible to all users).
Syntax:
Example:
This creates a synonym named emp for the Employees table in the hr schema, allowing you to refer to
the table simply as emp in queries.
This creates a public synonym customer_info for the Customers table in the sales schema, making it
accessible to all users.
3. Using a Synonym
Once a synonym is created, you can use it in place of the actual object name in SQL queries.
Example:
This query retrieves data from the Employees table using the synonym emp.
4. Dropping a Synonym
Syntax:
Example:
This drops the synonym emp, but it does not affect the underlying Employees table.
Database triggers
A database trigger is a set of SQL statements that automatically executes or "fires" in response to a
specific event on a particular table or view. Triggers are used to enforce business rules, maintain data
integrity, and automate tasks within the database. They can be triggered by events like INSERT,
UPDATE, or DELETE operations on a table.
1. Types of Triggers
1. Before Trigger: Executes before the triggering event (e.g., before an INSERT, UPDATE, or
DELETE operation).
2. After Trigger: Executes after the triggering event has occurred.
3. Instead of Trigger: Used specifically with views to define behavior for INSERT, UPDATE,
or DELETE operations on the view.
4. Row-Level Trigger: Executes once for each row affected by the triggering event.
5. Statement-Level Trigger: Executes once for the entire operation, regardless of how many
rows are affected.
2. Creating a Trigger
Triggers are created using the CREATE TRIGGER statement. The trigger is associated with a table
and fires when a specified event occurs.
Syntax:
Example:
This trigger executes before an INSERT on the Employees table and checks if the salary is greater
than 0. If not, it raises an error.
3. Trigger Components
• Triggering Event: The event that causes the trigger to fire (INSERT, UPDATE, or
DELETE).
• Timing: When the trigger should fire (BEFORE, AFTER, or INSTEAD OF the event).
• Triggering Table: The table or view associated with the trigger.
• Trigger Body: The SQL statements that execute when the trigger fires.
• Row-Level vs. Statement-Level: Determines whether the trigger runs for each row or just
once per statement.
4. Dropping a Trigger
To remove a trigger, use the DROP TRIGGER statement.
Syntax:
Example:
• Enforcing Business Rules: Automatically checking conditions like valid ranges for values
before data is inserted or updated.
• Audit Trails: Automatically recording changes made to the database for auditing purposes.
• Maintaining Referential Integrity: Automatically updating or deleting related records when
a parent record is updated or deleted.
Stored procedures
A Stored Procedure is a precompiled set of SQL statements and logic that can be stored in a
database and executed as a single unit. Stored procedures are used to encapsulate business logic,
simplify repetitive tasks, and improve the performance and security of database operations.
• Encapsulation of Logic: Allows complex logic, such as multiple SQL statements and control
flow, to be bundled together and reused.
• Improved Performance: Since stored procedures are precompiled, they run faster than ad-
hoc queries.
• Security: Users can execute the procedure without needing direct access to the underlying
tables or raw SQL code.
• Simplified Maintenance: Changes to business logic are easier to manage in stored
procedures than in application code.
Stored procedures are created using the CREATE PROCEDURE statement. They can accept
parameters (input, output, or both) and include multiple SQL statements with control logic.
Syntax:
This procedure, AddEmployee, inserts a new employee into the Employees table with the provided
name and salary.
Stored procedures are executed using the EXEC or CALL command, depending on the database
system.
Syntax:
Example:
This executes the AddEmployee procedure, adding a new employee named "John Doe" with a salary
of 50,000.
This procedure, GetEmployeeSalary, retrieves the salary of an employee based on their employee_id
and returns it via the output parameter emp_salary.
5. Dropping a Stored Procedure
Syntax:
Example:
The ER model provides a high-level, graphical representation of the data. It consists of three main
components:
1. Entities: Objects or things in the system that have data stored about them (e.g., Students,
Courses).
2. Attributes: Characteristics or properties of an entity (e.g., name, age, course_name).
3. Relationships: Associations between entities (e.g., Students enroll in Courses).
Example of ER Diagram:
• Entities: Represent real-world objects that are distinguishable from one another. An entity
can be a person, place, thing, or event.
o Example: Employee, Department
• Attributes: Properties or details of an entity. Each entity will have multiple attributes.
o Example: Employee entity can have attributes like employee_id, name, and salary.
• Relationships: Describe how entities are related to each other. Relationships can be one-to-
one, one-to-many, or many-to-many.
o Example: An Employee works in a Department (one-to-many relationship).
Once the ER model is created, the database design process involves converting the conceptual model
into a physical schema that can be implemented in a relational database.
1. Requirement Analysis: Understand the data needs of the system by talking to stakeholders
and analyzing business processes.
2. Conceptual Design (ER Modeling): Create the ER diagram to visualize entities, attributes,
and relationships.
3. Logical Design: Translate the ER model into a relational schema, identifying tables, primary
keys, foreign keys, and constraints.
4. Normalization: Apply normalization techniques to eliminate redundancy and ensure data
integrity.
5. Physical Design: Define how data will be stored, indexed, and accessed in the database. This
includes creating tables, defining data types, and setting up relationships between tables.
ER Model:
Relational Schema:
• Clear Structure: ER models provide a clear and visual representation of data and
relationships, making it easier to understand the structure of the database.
• Improved Data Integrity: Proper database design ensures the consistency and accuracy of
data through primary keys, foreign keys, and constraints.
• Normalization: Helps remove redundancy and organizes data efficiently, reducing the risk of
anomalies during data operations.
• Scalability: A well-designed database can be easily scaled and modified to accommodate
changing business needs.
Transaction in Databases
1. ACID Properties
1. Atomicity:
o Ensures that all operations within a transaction are completed successfully, or none of
them are. If any part of the transaction fails, the entire transaction is rolled back,
leaving the database unchanged.
o Example: If transferring money between two accounts, both the debit from one
account and the credit to the other must succeed or both must fail.
2. Consistency:
o Ensures that a transaction transforms the database from one valid state to another.
After a transaction completes, all data must be in a valid and consistent state,
adhering to all defined rules, constraints, and relationships.
o Example: A bank account balance cannot be negative if the database has a constraint
enforcing this rule.
3. Isolation:
o Ensures that transactions are executed independently of each other. The operations of
one transaction should not affect or be visible to other concurrently running
transactions.
o Example: Two customers withdrawing money from the same account at the same
time should not interfere with each other’s transactions.
4. Durability:
o Ensures that once a transaction is committed, the changes made are permanent, even
in the event of a system crash or failure.
o Example: After completing a purchase in an e-commerce system, the transaction
details should be saved permanently, even if the system crashes afterward.
2. Transaction Lifecycle
Example:
BEGIN TRANSACTION;
COMMIT;
If both updates succeed, the transaction commits. If one fails, the transaction rolls back.
Example:
BEGIN TRANSACTION;
SAVEPOINT sp1;
ROLLBACK TO sp1;
This allows rolling back to the SAVEPOINT without undoing the entire transaction.
• Banking Systems: Ensuring that debit and credit operations are executed together.
• E-commerce: Handling orders and payments as atomic units to ensure accurate inventory and
financial records.
• Batch Processing: Grouping multiple database updates into a single transaction for
performance and consistency.
Database Security involves protecting the integrity, confidentiality, and availability of data stored in
a database. It encompasses various techniques, policies, and tools to safeguard data from unauthorized
access, misuse, or corruption.
Key Components:
• Authentication: Ensures that only authorized users can access the database by verifying
identities (e.g., usernames and passwords).
• Authorization: Controls what authenticated users are allowed to do by granting or revoking
permissions to read, write, update, or delete data.
• Encryption: Protects sensitive data by encoding it so that unauthorized users cannot read it.
• Auditing and Monitoring: Tracks database activity and logs actions to identify suspicious
behavior or security breaches.
• Access Control: Limits user access to specific areas of the database using role-based or
discretionary access control mechanisms.
Introduction to Cloud Databases
Cloud Databases are databases that run on cloud computing platforms, offering scalability,
flexibility, and ease of management. Cloud providers handle the infrastructure, maintenance, and
backup, allowing users to focus on database usage.
Key Features:
• Scalability: Cloud databases can scale up or down based on demand, ensuring optimal
performance without manual intervention.
• Cost-Effectiveness: Users pay only for the resources they consume, making it a cost-efficient
option for many organizations.
• Managed Services: Cloud providers handle administrative tasks like backups, patching, and
security.
• High Availability: Cloud databases are often configured for high availability, ensuring
minimal downtime.
• Popular Cloud Database Providers: Amazon RDS, Google Cloud SQL, Microsoft Azure
SQL Database.
A Data Warehouse is a centralized repository that stores large volumes of historical data from
multiple sources for analysis and reporting. Big Data refers to extremely large datasets that cannot be
easily managed or analyzed with traditional tools.
Data Warehousing:
• ETL (Extract, Transform, Load): The process of extracting data from various sources,
transforming it into a consistent format, and loading it into the data warehouse.
• OLAP (Online Analytical Processing): A method used for analyzing data stored in a data
warehouse to support decision-making.
Big Data:
• Volume, Velocity, Variety: The three main characteristics of Big Data, referring to the
massive size, fast generation, and diverse types of data.
• Big Data Technologies: Hadoop, Apache Spark, and NoSQL databases like Cassandra and
HBase help manage and analyze Big Data.
NoSQL Databases are designed to handle large volumes of unstructured, semi-structured, and
structured data. Unlike traditional relational databases, NoSQL databases offer flexibility, scalability,
and high performance, particularly in handling Big Data.
Key Features:
• Horizontal Scalability: NoSQL databases can scale out by adding more servers.
• Schema Flexibility: No predefined schema is required, making it easier to handle varying
data types.
• High Performance: Optimized for high-speed reads and writes.
MongoDB Basics
MongoDB is a popular NoSQL database that stores data in a flexible, document-oriented format using
BSON (Binary JSON). It is widely used for handling large volumes of unstructured data.
Key Concepts:
• Documents: The primary unit of data in MongoDB, similar to rows in a relational database
but stored in JSON-like format.
• Collections: Groups of documents, similar to tables in relational databases.
• Databases: A logical container for collections.
Basic Operations:
• Inserting Data:
• Querying Data:
• Updating Data:
• Deleting Data:
Key Features:
• Flexible Schema: Documents can have different fields, allowing for flexible data modeling.
• High Performance: Optimized for read and write operations, particularly in Big Data
environments.
• Scalability: MongoDB supports horizontal scaling through sharding.