Database Management Systems Theory (4th Sem)
COLLEGE ROLL NO: 226617
Program: BCA
Course Name: Database Management Systems
Semester: 4th
UNIT ➖ 01
● Introduction of DBMS ➖
A database management system (DBMS) is a software tool that helps organisations create,
manage, and retrieve data from a database. It acts as an interface between the database and
the end-user, ensuring that the data is well-organised and accessible.
Some characteristics of DBMS include:
1. Security
2. Self-describing nature
3. Insulation between programs and data, and data abstraction
4. Support of multiple views of the data
Some types of DBMS include:
1. Relational databases (RDBMS): The most widely used type of DBMS. It uses a
table-based structure to store and organise data. Each table represents a specific entity,
and data is stored in rows and columns. An RDBMS uses Structured Query Language (SQL)
to interact with databases.
The steps involved in data modelling include: requirements analysis, conceptual
modelling, logical modelling, physical modelling, and maintenance and optimisation.
1. External level ➖
The highest level of abstraction, this level is concerned with how users and user groups see
the data.
2. Conceptual level ➖
This level is concerned with what data is stored in the database and the relationships among
the data, independent of how it is physically stored.
3. Internal level ➖
This level is concerned with how the DBMS and operating system store the data using data
structures and files.
The three-level architecture of a database can be viewed as three levels of abstraction. The
architecture allows for multiple data views to be created to meet the needs of different users.
For example, faculty salary information can be hidden from student view, but shown in admin
view.
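The faculty salary example can be sketched with two SQL views over one table. This is only an illustration using Python's built-in sqlite3 module; the table, column, and view names (faculty, student_view, admin_view) and the sample rows are assumptions made for the sketch, not part of these notes.

```python
import sqlite3

# Hypothetical schema: table, column, and view names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE faculty (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO faculty (name, dept, salary) VALUES (?, ?, ?)",
    [("A. Sharma", "BCA", 52000.0), ("R. Thapa", "BCA", 61000.0)],
)

# External view for students: the salary column is left out.
conn.execute("CREATE VIEW student_view AS SELECT name, dept FROM faculty")
# External view for admins: the salary column is included.
conn.execute("CREATE VIEW admin_view AS SELECT name, dept, salary FROM faculty")

# Column names visible through each view.
student_columns = [d[0] for d in conn.execute("SELECT * FROM student_view").description]
admin_columns = [d[0] for d in conn.execute("SELECT * FROM admin_view").description]
print(student_columns)
print(admin_columns)
```

Both views read the same stored table, so the internal level is unchanged while each user group sees its own external level.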
The ANSI-SPARC Architecture (American National Standards Institute, Standards Planning
And Requirements Committee) is an abstract design standard for a DBMS. It was first
proposed in 1975, but never became a formal standard.
● Components of a DBMS
A database management system (DBMS) has five main components: Hardware, Software,
Data, Procedures, Database access language.
A DBMS is a software program that database developers use to create tables, relationships,
and other structures in a database.
into a database to perform their business operations and responsibilities.
2. Database design
This is the process of creating and organising a database structure to meet an organisation's
needs and requirements. A well-designed database ensures data storage, reliability, and
security.
3. Data security
This is the process of protecting data from unauthorised access. It includes both physical and
logical security measures.
4. Data structure
This is a key feature of a DBMS. All the data is stored in a categorised structure so that it
becomes easier for the user to retrieve the data when it is required.
5. DBMS architecture ➖
This refers to the overall design of a DBMS, which includes its components, modules, and their
interrelationships. It describes how data is stored, accessed, managed, and controlled in a
database system.
Data models provide a blueprint for designing a new database or reengineering a legacy
application. They help design the database at the conceptual, physical, and logical levels.
Data models help avoid the drawbacks of poorly designed data. They are similar to a map that
aids in the organisation of data for better use.
● Hierarchical ➖
In software engineering, a hierarchical structure is a way of organising elements into levels or
layers.
Each level has a parent-child relationship with the next one. The top level is the root, and the
lowest level is the leaf. Each element in a level can have one or more children, but only one
parent.
Hierarchical architecture is a type of distributed system in which the modules are organized
into multiple control levels. This approach is typically used in designing system software such
as network protocols and operating systems.
In a hierarchical system design, each level in the hierarchy represents a different level of
abstraction. Higher levels are more abstract and lower levels are more detailed.
Here are some differences between the network and relational models: ➖
1. Network model ➖
Represents data as record types. Uses links to represent the relationship between entities.
Network databases can be represented as a graph, defined by a schema.
2. Relational model ➖
Represents data as relations or tables. Access is usually through a language like SQL.
Here are some other differences: ➖
1. In a network model, a record can have more than one parent.
2. Network models are complex and difficult to design, implement, and maintain.
3. Network models require a high level of technical skill and knowledge.
It's rare to find a database that uses both models simultaneously.
● Comparison of Network ➖
Networks are made up of endpoints that send and receive data, while software is a collection
of data that runs devices and computers.
Network engineers focus on fixing issues with computer networks, while software engineers
focus on creating applications and software.
Here are some other differences between network and software engineers:
2. Work schedules: Most software engineers work full-time, 40 hours a week, Monday
through Friday. They may work from home or in the office.
A Wireless Local Area Network (WLAN) is a LAN that doesn't use cables to connect to the
network. WLANs are used in the same situations as LANs, but the choice depends on whether
you want a remote cloud solution or an on-premises solution.
A hierarchical data model organises data in a tree-like structure. Data elements are
represented as nodes with parent-child relationships. Due to this approach, hierarchical
databases are especially adept at representing structured data with well-defined relationships.
Each parent can have multiple children, but each child has only one parent.
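A minimal sketch of the parent-child rule, assuming a small made-up hierarchy (the node names below are hypothetical). Storing one parent per child in a dictionary makes "only one parent" hold by construction, while a parent can still appear as the parent of many children.

```python
# Hypothetical data: each child maps to its single parent (the root has None).
parent_of = {
    "College": None,
    "BCA": "College",
    "BBA": "College",
    "Semester 4": "BCA",
}

def children_of(node):
    """Return all nodes whose parent is `node` (a parent may have many children)."""
    return sorted(child for child, parent in parent_of.items() if parent == node)

def path_to_root(node):
    """Follow the single parent link upward until the root is reached."""
    path = [node]
    while parent_of[path[-1]] is not None:
        path.append(parent_of[path[-1]])
    return path

print(children_of("College"))
print(path_to_root("Semester 4"))
```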
What Is A Relational Data Model?
A relational data model represents data as tables consisting of rows and columns. Each row in
a table represents a specific record, while each column represents an attribute or field.
Hierarchical data model | Relational data model
Stores data hierarchically in a tree structure; uses parent-child relationships | Organises data in table form; uses common fields to establish relationships between tables
Complex and difficult to design | Comparatively easy for users
ER Models are represented graphically by ER Diagrams. ER Diagrams are visual tools used to
model and design databases. They represent the relationship between entities in a database
system, as well as the attributes of these entities and their interactions.
ER Modeling is a systematic process for designing a database. It requires analyzing all data
requirements before implementing the database.
2. Cardinality ➖
Cardinality is the number of values in a set. In ER Diagrams, cardinality specifies how many
instances of an entity relate to one instance of another entity. The three main cardinal
relationships are one-to-one, one-to-many, and many-to-many.
UNIT ➖ 02
● Relational Database ➖
A relational database (RDB) is a way to organise information in tables, rows, and columns. It's
also a type of database that stores and provides access to data points that are related to one
another.
In a relational database, each row in the database table is a record with a unique identifier
called a key. Each column holds attributes of the data.
Here are some benefits of the relational database approach:
1. Create meaningful information
Joining tables allows you to understand the relations between the data, or how the tables
connect.
Relational databases are based on the relational model, which is a straightforward way of
representing data in tables. The Structured Query Language (SQL) is used to manipulate
relational databases.
Relational algebra is a procedural query language that uses a step-by-step procedure for
carrying out a query. Relational calculus is a non-procedural or declarative query language that
describes the information to be retrieved. Relational algebra targets how to obtain the result,
while relational calculus targets what result to obtain.
Relational algebra is more operational and useful for representing execution plans. Relational
calculus lets users describe what they want, rather than how to compute it.
A database language is relationally complete if it can express every query that is expressible
in relational calculus.
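The "how" versus "what" contrast can be sketched in a few lines of Python. This is an informal illustration, not a real query engine: the relation, the helper names select and project, and the sample rows are all assumptions made for the sketch.

```python
# Hypothetical relation: a list of rows, each row a dict of attribute -> value.
students = [
    {"roll_no": 1, "name": "Asha", "semester": 4},
    {"roll_no": 2, "name": "Bikram", "semester": 2},
    {"roll_no": 3, "name": "Chandra", "semester": 4},
]

def select(relation, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """Projection: keep only the named attributes, dropping duplicate rows."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result

# Algebra style: an explicit sequence of operations (select first, then project).
algebra_result = project(select(students, lambda r: r["semester"] == 4), ["name"])

# Calculus style: describe the wanted tuples, { name | student is in semester 4 }.
calculus_result = [{"name": r["name"]} for r in students if r["semester"] == 4]

print(algebra_result)
print(calculus_result)
```

Both forms give the same answer here; the difference is that the first spells out the order of operations, while the second only states the condition the result must satisfy.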
Difference between Relational Algebra and Relational Calculus:
S.NO | Basis of Comparison | Relational Algebra | Relational Calculus
2. | Procedure | Relational Algebra means how to obtain the result. | Relational Calculus means what result we have to obtain.
3. | Order | In Relational Algebra, the order is specified in which the operations have to be performed. | In Relational Calculus, the order is not specified.
4. | Domain | Relational Algebra is independent of the domain. | Relational Calculus can be domain-dependent because of domain relational calculus.
5. | Programming language | Relational Algebra is nearer to a programming language. | Relational Calculus is not nearer to a programming language but to natural language.
● SQL Fundamentals
Structured Query Language (SQL) is a fundamental building block of modern database
architecture. It's the primary language used for relational databases, where data is stored in
related tables.
SQL is considered a foundational programming skill. Software developers and programmers
are generally expected to know how to write SQL queries to retrieve data from a database.
A query is the basic unit of work in SQL. For example, the SELECT statement is used to retrieve
data from the database.
SQL joins are a fundamental aspect of SQL and are used in many database management
systems. They allow you to retrieve data from multiple tables and combine it in a single result
set. This is useful for creating complex queries and performing data analysis.
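As a small illustration of a join, the sketch below combines two tables into one result set using Python's built-in sqlite3 module. The table names (student, enrolment), the columns, and the sample rows are hypothetical and chosen only for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE enrolment (roll_no INTEGER, course TEXT);
    INSERT INTO student VALUES (1, 'Asha'), (2, 'Bikram');
    INSERT INTO enrolment VALUES (1, 'DBMS'), (1, 'Networks'), (2, 'DBMS');
""")

# Inner join: match each student row with its enrolment rows on roll_no.
rows = conn.execute("""
    SELECT s.name, e.course
    FROM student AS s
    JOIN enrolment AS e ON e.roll_no = s.roll_no
    ORDER BY s.name, e.course
""").fetchall()

for name, course in rows:
    print(name, course)
```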
Transactions are an important feature of SQL because they help keep the data in the database
consistent. A transaction groups one or more statements into a single unit of work that adheres
to the ACID properties (Atomicity, Consistency, Isolation, Durability). A transaction can be
started, committed, or rolled back.
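A rough sketch of commit and rollback, using Python's built-in sqlite3 module. The account table, the CHECK rule that balances cannot go negative, and the transfer helper are assumptions made for this example. The point is atomicity: when the second update fails, the first one is undone as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(amount):
    """Move `amount` from A to B; either both updates apply or neither does."""
    try:
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = 'B'", (amount,))
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = 'A'", (amount,))
        conn.commit()      # make both changes permanent
        return True
    except sqlite3.IntegrityError:
        conn.rollback()    # undo the credit to B as well
        return False

def balances():
    """Return the current balances as a dict of name -> balance."""
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))
```

For example, transfer(30) succeeds and is committed, while transfer(500) would overdraw A, so it is rolled back and the balances stay as they were.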
1. DDL (Data Definition Language):
A. Definition: DDL is used for defining and modifying the structure of database objects.
B. Key Commands: ➖
❖ CREATE: Used to create database objects like tables, indexes, views, etc.
❖ ALTER: Used to modify the structure of existing database objects.
❖ DROP: Used to delete database objects.
❖ TRUNCATE: Removes all records from a table, but retains the structure for future use.
❖ COMMENT: Adds comments to the data dictionary.
2. DML (Data Manipulation Language):
A. Definition: DML is used for managing data within the database. It includes operations
like inserting, updating, retrieving, and deleting data from tables.
B. Key Commands:
❖ SELECT: Retrieves data from one or more tables.
❖ INSERT: Adds new records to a table.
❖ UPDATE: Modifies existing records in a table.
❖ DELETE: Removes records from a table.
3. DCL (Data Control Language):
A. Definition: DCL is concerned with the rights, permissions, and other access control
aspects of the database. It includes commands for granting and revoking permissions.
B. Key Commands:
❖ GRANT: Provides specific privileges to a user or role.
❖ REVOKE: Removes specific privileges from a user or role.
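The structure-changing and data-changing commands above can be tried out with Python's built-in sqlite3 module, as in the sketch below. The student table and its rows are hypothetical. SQLite has no GRANT, REVOKE, or TRUNCATE statements, so those commands are not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Structure commands: create a table, then change its structure.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("ALTER TABLE student ADD COLUMN semester INTEGER")

# Data commands: add, change, remove, and read rows.
conn.execute("INSERT INTO student VALUES (1, 'Asha', 3)")
conn.execute("INSERT INTO student VALUES (2, 'Bikram', 4)")
conn.execute("UPDATE student SET semester = 4 WHERE roll_no = 1")
conn.execute("DELETE FROM student WHERE roll_no = 2")
remaining = conn.execute("SELECT roll_no, name, semester FROM student").fetchall()

# Structure command again: DROP removes the table itself, not just its rows.
conn.execute("DROP TABLE student")
tables_left = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(remaining)
print(tables_left)
```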
1. PL/SQL (Procedural Language/Structured Query Language):
These concepts are fundamental to understanding and working with relational databases, and
they play a crucial role in designing, managing, and interacting with database systems in
software engineering.
● Cursors ➖
In software engineering, a cursor is a temporary work area that is allocated by the database
server when an SQL statement is executed. It holds the rows returned by the statement and
lets a program fetch and process those rows one at a time.
There are two types of cursors:
1. Implicit cursors: These are created automatically by the database when an SQL
statement such as a SELECT is executed.
2. Explicit cursors: These need to be defined by the user by providing a name. They
can fetch multiple rows, and the program opens them, fetches from them, and closes
them itself.
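The open, fetch, and close cycle of a cursor can be sketched with the cursor object in Python's built-in sqlite3 module. This is a client-side analogue used for illustration, not a PL/SQL explicit cursor; the student table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [(1, "Asha"), (2, "Bikram"), (3, "Chandra")],
)

cur = conn.cursor()                                                 # create the cursor
cur.execute("SELECT roll_no, name FROM student ORDER BY roll_no")   # open it on a query

names = []
row = cur.fetchone()              # fetch one row at a time
while row is not None:
    names.append(row[1])          # process the current row
    row = cur.fetchone()

cur.close()                       # close the cursor when done
print(names)
```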
Cursors can also refer to:
❖ Text insertion cursors: These indicate where text can be inserted.
❖ Pointing cursors: These indicate where the mouse pointer is located.
❖ Selection cursors: These select text or other items.
❖ Busy cursors: These indicate that the computer is busy processing data.
● Stored Procedures ➖
In software engineering, a stored procedure is a collection of pre-compiled SQL statements
that are saved in a database. Stored procedures are also known as subroutines and can be
used by applications that access a relational database management system (RDBMS).
Stored procedures can help improve the performance, security, and maintainability of software
applications. They can also be easily modified and reused, and can automate tasks that require
multiple SQL statements.
2. Remote procedure call: Stored procedures are a form of remote procedure call that
operates in a client-server environment.
Stored procedures can accept input parameters, perform defined operations, and return
multiple output values.
To get the most out of stored procedures, it is important to follow standards and conventions
for designing and implementing them.
● Stored Functions ➖
A stored function is a defined function that is called from within an SQL statement and returns
a single value. It is a PL/SQL unit of code that consists of SQL and PL/SQL statements that
perform a set of related tasks or solve specific problems.
Stored functions are one of the types of stored programs in MySQL. To create a stored
function, you must have a CREATE ROUTINE database privilege.
Here are some differences between stored functions and stored procedures:
1. Return values ➖
A stored function can only return one value, while a stored procedure can return multiple
values or an entire result set.
2. Supported parameters
A stored function only supports input parameters, while a stored procedure supports IN, OUT,
and INOUT parameters in any combination.
3. Execution ➖
A stored procedure is invoked on its own as a unit (for example, with a CALL statement), while
a stored function is called from within an SQL statement or expression.
4. Performance ➖
Stored procedures and functions are pre-compiled and stored in the database, which means
they can be executed more quickly than ad-hoc queries. This can result in faster response
times and better overall performance of the database.
In systems that support it (for example, Oracle PL/SQL packages and PostgreSQL), functions
can be overloaded, meaning that you can create many functions, all with the same name, as
long as each function has a different set of parameters. The SQL engine determines which of
the functions to invoke, based upon the parameters passed.
● Database Triggers ➖
In software engineering, a database trigger is a piece of code that automatically executes
when a specific event occurs in a database. Events can be data manipulation operations, like
updates, inserts, or deletes, or system events, like logins or logouts.
Database triggers are mainly used to maintain the integrity of the information in the database.
They help maintain database integrity by preventing incorrect, unauthorised, or inconsistent
changes to data.
❖ Logon triggers: These triggers fire when the LOGON event of SQL Server is raised.
This event is raised when a user session is being established with SQL Server.
❖ Row-level triggers: These triggers fire once for each row affected by the triggering
statement.
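A row-level trigger can be sketched in SQLite through Python's built-in sqlite3 module. The account and audit_log tables and the trigger name log_balance_change are assumptions made for this example. One UPDATE statement touches two rows, so the trigger body runs twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER);
    CREATE TABLE audit_log (account_id INTEGER, old_balance INTEGER, new_balance INTEGER);

    -- Row-level trigger: runs once for each updated row.
    CREATE TRIGGER log_balance_change
    AFTER UPDATE OF balance ON account
    FOR EACH ROW
    BEGIN
        INSERT INTO audit_log VALUES (OLD.id, OLD.balance, NEW.balance);
    END;

    INSERT INTO account VALUES (1, 100), (2, 200);
    UPDATE account SET balance = balance + 10;
""")

# Each updated row produced one audit entry without any application code.
audit_rows = conn.execute("SELECT * FROM audit_log ORDER BY account_id").fetchall()
print(audit_rows)
```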
UNIT ➖ 03
● Introduction to Normalisation ➖
Database normalisation is a process for organising data in a database. It involves creating
tables and establishing relationships between them. The goal is to eliminate data redundancy
and inconsistent dependency, and to make the database more flexible.
The concept of normalisation was first proposed in the 1970s by IBM researcher E. F. Codd.
3. Simplify the query process
4. Improve workflow
5. Increase security
6. Lessen costs
Here are some details about each normal form:
1. First normal form (1NF)
The first step in normalising a table. In 1NF, you remove repeating groups so that every
attribute holds a single atomic value, and you identify a primary key. A relation is in 1NF if
and only if no attribute domain has relations as elements.
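As a small sketch of moving data into 1NF, the function below splits a multi-valued attribute into separate rows so that each value is atomic. The sample rows, the phones attribute, and the function name to_1nf are hypothetical.

```python
# Hypothetical un-normalised rows: the "phones" attribute holds a list of values.
unnormalised = [
    {"roll_no": 1, "name": "Asha", "phones": ["9800000001", "9800000002"]},
    {"roll_no": 2, "name": "Bikram", "phones": ["9800000003"]},
]

def to_1nf(rows):
    """Split the multi-valued attribute so each row holds one atomic phone value."""
    students = [{"roll_no": r["roll_no"], "name": r["name"]} for r in rows]
    student_phones = [
        {"roll_no": r["roll_no"], "phone": phone}
        for r in rows
        for phone in r["phones"]
    ]
    return students, student_phones

students, student_phones = to_1nf(unnormalised)
print(students)
print(student_phones)
```

In the result, roll_no identifies a row of students, and the pair (roll_no, phone) identifies a row of student_phones.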
● Dependency Preservation ➖
Dependency Preserving Decomposition is a technique used in Database Management System
(DBMS) to decompose a relation into smaller relations while preserving the functional
dependencies between the attributes. The goal is to improve the efficiency of the database by
reducing redundancy and improving query performance.
● Boyce-Codd Normal Form
Boyce-Codd Normal Form (BCNF) is a higher level of database normalization that applies to
relational databases with a primary key. BCNF is a stricter form of 3NF that ensures that each
determinant in a table is a candidate key.
BCNF goes further than requiring every non-key column in a table to depend on the whole
primary key and not on any subset of it, which is the rule of 2NF.
A table is in BCNF if, for every non-trivial functional dependency X → Y, X is a super key of
the table.
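The super key test can be sketched with an attribute closure computation. This is an illustrative sketch only: the function names closure and bcnf_violations, and the example relation with its two functional dependencies, are assumptions and do not come from these notes.

```python
def closure(attributes, fds):
    """Attribute closure: every attribute determined by `attributes` under `fds`."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(relation, fds):
    """Return the non-trivial FDs X -> Y whose left side X is not a super key."""
    violations = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        is_superkey = closure(lhs, fds) >= set(relation)
        if not trivial and not is_superkey:
            violations.append((lhs, rhs))
    return violations

# Hypothetical relation R(student, course, teacher) with
# {student, course} -> teacher and teacher -> course.
R = {"student", "course", "teacher"}
FDS = [(("student", "course"), ("teacher",)), (("teacher",), ("course",))]

print(bcnf_violations(R, FDS))
```

Here teacher determines course, but teacher alone does not determine every attribute of R, so that dependency is reported as a BCNF violation.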
2. Decompose the relation R into XA and R − {A} (R minus A).
Transformation into Boyce-Codd normal form deals with the problem of overlapping keys. An
indirect dependency is resolved by creating a new relation for each entity.
A multi-valued dependency occurs when a table contains more than one independent,
multi-valued relationship among the data. For example, a multi-valued dependency A →→ B
exists in a relation when a single value of A is associated with a set of values of B,
independently of the other attributes.
4NF is similar to the Boyce-Codd normal form, but 4NF is stronger because its condition
applies to multivalued dependencies as well as functional dependencies.
A join dependency occurs when a table can be split into two or more smaller tables, and then
reconstructed by joining them on their common attributes, without losing any information.
A relation is said to be in 5NF if and only if it satisfies 4NF and every non-trivial join
dependency in it is implied by its candidate keys. 5NF is also known as project-join normal
form (PJ/NF).
In other words, a relation in 5NF cannot be losslessly decomposed into smaller tables any
further, except by decompositions based on its candidate keys.
The concept of 5NF is directly based on the concept of join dependency. Similar to a functional
or multivalued dependency, a join dependency is a constraint.
Domain-key normal form (DKNF) goes beyond the Boyce-Codd Normal Form (BCNF) and
addresses additional constraints that BCNF might not capture.
A relation is in DKNF when every constraint on it is a logical consequence of its domain
constraints and key constraints. As a result, insertion and deletion anomalies are not present.
If a relation is in DKNF, it is already in 5NF, 4NF, BCNF, 3NF, 2NF, and 1NF.
UNIT ➖ 04
● Database Recovery ➖
Database recovery in database management systems (DBMS) is the process of restoring a
database to a correct, consistent state after a failure.
Here are some related concepts:
● Concurrency Management ➖
Concurrency management in database management systems (DBMS) is the process of
managing simultaneous operations without conflict. It ensures that database transactions are
performed correctly and concurrently to produce correct results without violating data integrity.
The primary goal of concurrency control is to allow transactions to run concurrently while
maintaining the database's ACID (Atomicity, Consistency, Isolation, and Durability) properties.
2. Lock-based Protocol ➖
Applies a lock on the data element, which prevents other transactions from reading or writing
it while the lock is active. There are two main types of lock: the shared (read-only) lock and
the exclusive lock.
4. Timestamp-based Protocol ➖
Uses the System Time or Logical Counter as a timestamp to serialize the execution of
concurrent transactions.
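The timestamp idea can be sketched with a logical counter and the usual basic timestamp-ordering rules: an older transaction may not read or overwrite what a younger one has already written, and may not overwrite what a younger one has already read. The class and function names below (Item, begin, read, write) are hypothetical, and the sketch only reports whether an operation is allowed; a real DBMS would roll back and restart the rejected transaction.

```python
import itertools

_counter = itertools.count(1)   # logical counter used as the timestamp source

class Item:
    """A data item that remembers the youngest transaction to read and to write it."""
    def __init__(self):
        self.read_ts = 0
        self.write_ts = 0

def begin():
    """Start a transaction: its timestamp is the next counter value."""
    return next(_counter)

def read(ts, item):
    """Allow the read unless a younger transaction has already written the item."""
    if ts < item.write_ts:
        return False            # too late: this transaction would have to roll back
    item.read_ts = max(item.read_ts, ts)
    return True

def write(ts, item):
    """Allow the write unless a younger transaction has already read or written the item."""
    if ts < item.read_ts or ts < item.write_ts:
        return False
    item.write_ts = ts
    return True
```

For example, if T1 starts before T2 and T2 reads an item first, a later write by T1 on that item is rejected, which keeps the outcome equivalent to running T1 and then T2.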
● Database Security ➖
Database security is a set of tools, controls, and measures that protect databases from
unauthorised access, misuse, and other security breaches. The goal of database security is to
protect critical and confidential data from unauthorised access.
Database security is based on three constructs: Confidentiality, Integrity, Availability.
Here are some best practices for database security:
1. Separate database servers and web servers
2. Encrypt data at rest and in transit
3. Use strong authentication
4. Continuously discover sensitive data
5. Revoke privileges continuously
6. Deploy physical database security
7. Ensure database user accounts are secure
8. Monitor database activity
Encryption is a technique that encodes data so that only authorised users can understand it.
However, encryption alone is not enough to secure data.
Authentication is the process of proving a user's identity, for example by entering the correct
user ID and password. Authorization allows each user to access certain data objects and
perform certain database operations.
Integrity constraints are rules that a table's data columns must follow. They restrict the
types of information that can be entered into a table. You can apply integrity constraints at the
column or table level.
There are four main types of integrity constraints: Domain, Entity, Referential, Key.
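The four types can be sketched as table definitions in SQLite, run through Python's built-in sqlite3 module. The department and student tables, the rule that a semester lies between 1 and 8, and the helper is_rejected are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE department (
        dept_id INTEGER PRIMARY KEY,                        -- entity integrity
        name    TEXT NOT NULL UNIQUE                        -- key constraint
    );
    CREATE TABLE student (
        roll_no  INTEGER PRIMARY KEY,                       -- entity integrity
        semester INTEGER CHECK (semester BETWEEN 1 AND 8),  -- domain constraint
        dept_id  INTEGER REFERENCES department(dept_id)     -- referential integrity
    );
    INSERT INTO department VALUES (1, 'BCA');
""")

def is_rejected(sql):
    """Return True if the statement violates one of the constraints above."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

print(is_rejected("INSERT INTO student VALUES (1, 4, 1)"))   # valid row
print(is_rejected("INSERT INTO student VALUES (2, 9, 1)"))   # semester outside its domain
```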
Integrity control is a control that rejects invalid data inputs, prevents unauthorised data outputs,
and protects data and programs against accidental or malicious tampering.
3. Design of distribution ➖
This can be performed top-down or bottom-up. The first approach is typical of a distributed
database developed from scratch. The second approach is typical of the development of a
multidatabase as the aggregation of existing databases.
HAPPY ENDING BY SAHIL RAUNIYAR