DBMS 1 1
DBMS 1 1
Introduction
1.1 History of Database Systems
Storing and manipulating of data have been a major application focus in the earliest days
ofcomputers.
Integrated Data Store: The First general-purpose Database Management System (DBMS) was
designed by Charles Bachman at General Electric in the early 1960s.
Programs owning data come with many programming bottlenecks and the most outstanding
ones to are:-
Data: means raw facts or raw data that can be recorded and that have implicit meaning. It is the
bedrock of knowledge.
A database of even greater size and complexity is maintained by the Federal Inland Revenue
Service (FIRS) to keep track of the tax forms filed by Nigerian taxpayers.
i.A database represents some aspect of the real world, sometimes called the miniworld
or the Universe of Discourse (UoD). Changes to the miniworld are reflected in the
database.
ii.A database is a logically coherent collection of data with some inherent meaning. A
iii.A database is designed, built, and populated with data for a specific purpose. It has an
intended group of users and some preconceived applications in which these users are
interested.
Database Management Systems (DBMS): are software or program products that help in
defining, creating, maintaining and controlling access to a database. The DBMS is hence a
general purpose software system that facilitates the processes of defining, constructing, and
manipulating databases for various applications.
Note: A DBMS is a complex software system that consists of many components or modules,
including modules for implementing the catalog, query language, interface processors, data
access, concurrency control, recovery, and security.
structures
Constructing the database is the process of storing the data itself on some storage medium
that is controlled by the DBMS.
Manipulating a database includes such functions as querying the database to retrieve specific
data, updating the database to reflect changes in the miniworld, and generating reports from the
data.
production of accurate
relevant
timely information.
Types of database
1. Number of users:
I. A single-user database: it supports only one user at a time. In other words, if user A
is using the database, users B and C must wait until user A is done.
2. Location:
II. Distributed (decentralized) database: A database that supports data distributed across
several different sites.
3. A cloud database: is a database that is created and maintained using cloud data services,
such as Microsoft Azure or Amazon’s AWS.
infrastructure:
hardware
operating systems
Examples:
census database
LexisNexis database
ProQuest database
II. Discipline-specific databases: contain data focused on specific subject areas. The data in
this type of database are used mainly for academic or research purposes within a small
set of disciplines.
Examples:
Medical database
CompuStat database
CRSP database
5. How they will be used and on the time sensitivity of the information gathered from them:
The most popular way of classifying databases today.
Other names:
Transactional database
Production database.
Examples:
payments
today operations.
II. Analytical database: focuses primarily on storing historical data and business metrics
used exclusively for tactical or strategic decision making. Such analysis typically requires
extensive “data massaging” (data manipulation) to produce information on which to base
pricing decisions, sales forecasts, market strategies, and so on.
Analytical databases allow the end user to perform advanced data analysis of business
data using sophisticated tools.
Operational databases
External sources
b. Online analytical processing (OLAP): is a set of tools that work together to provide
an advanced data analysis environment for retrieving, processing, and modeling
data from the data warehouse.
It describes a comprehensive approach to capture and process business data with the purpose
of generating information to support business decision making.
I. Unstructured data: are data that exist in their original (raw) state—that is, in the format in
which they were collected. Therefore, unstructured data exist in a format that does not
lend itself to the processing that yields information.
II. Structured data: are the result of formatting unstructured data to facilitate storage, use,
and the generation of information.
III. Semi-structured data: have already been processed to some extent. Actually, most data
you encounter are best classified as semi-structured.
Conclusion: The database types mentioned thus far focus on the storage and management
of highly structured data. However, corporations are not limited to the use of structured data.
Unstructured and semi-structured data storage and management needs are being addressed
through a new generation of databases known as XML databases.
7. An XML database: supports the storage and management of semi-structured XML data.
Extensible Markup Language (XML) is a special language used to represent and manipulate
data elements in a textual format.
Social media refers to web and mobile technologies that enable “anywhere, anytime, always
on” human interactions.
8. NoSQL databases: are designed to handle the unprecedented volume of data, variety of data
types and structures, and velocity ofdata operations that are characteristic of these new
business requirements.
Note: The term NoSQL (Not only SQL) is generally used to describe a new generation of
database management systems that is not based on the traditional relational database model.
Other names:
Specialized database
File processing approach: is a traditional approach to information system design focuses on the
data processing needs of a particular department or area within an organization.
It includes:
Manual systems
Accounts Receivable
Payroll
Inventory Control.
The file processingis being developed with third generation programming language where each
application or program is designed with its own set of data files. Traditional files often existing
environments where there is no overall plan to guide the growth of applications.
i.Data redundancy: refers to the duplication of data. Data redundancy often leads to higher
storage costs and poor access time.
ii.Data redundancy leads to data inconsistency: change of one state of a data while the other
duplicate is neglected. There is disagreement on the stored values
iii.Data Isolation: because data are scattered in various files, and files may be in different
formats, writing new application programs to retrieve the appropriate data is difficult.
v.Atomicity issues: Atomicity of a transaction refers to “All or nothing”, which means either all
the operations in a transaction executes or none. E.g transactions.It is difficult to achieve
atomicity in file processing systems.
vi.Data Security: data should be secured from unauthorized access.Such kind of security
constraints are difficult to apply in file processing systems.
The database approach emphasizes the integration and sharing of data across the organization.
There are several advantages of Database management system over file system, which
include:
i.Minimal Data Redundancy: With the database approach, data files are integrated into a single,
logical structure. We are not suggesting all redundancy is controlled. It is designed into the
system to improve performances (or provide some other benefits), and the system is (or should
be) aware of redundancy.
ii.Consistency of Data: By controlling data redundancy, we greatly reduce the opportunities for
inconsistency. For example, if each address is stored only once, we cannot have disagreement
on the stored values. When controlled redundancy is permitted in the database, the database
system itself should enforce consistency by updating each occurrence of a data item when a
change occurs.
iii.Integration of Data: In a database, data are organized into a single, logical structure, with
logical relationships defined between associated data entities. This makes it easy for users to
relate one item of data to another.
iv.Sharing of Data: Most database systems today permit multiple users to share a database
concurrently, although certain restrictions are imposed such that each user would be able to
view a subset of the conceptual database model.
v.Ease of Application Development: it greatly reduces the cost and time for developing new
business applications as programmer is relieved from the burden of designing, building, and
maintaining master files.
Note: In a database system, data are independent of the application programs that use them.
Within limits, either data or the application programs that use the data can be changed without
necessitating a change.
i.New, Specialized Personnel: Frequently, organizations that adopt the database approach or
purchase a database management system (DBMS) need to hire train individuals to maintain the
new database software, develop and enforce new programming standards, design databases to
achieve the highest possible performance, and manage the staff of new people.
ii.Need for Explicit Backup: Minimal data redundancy, with all its associated benefits, may also
fail to provide backup copies of data. Such backup or independently produced copies are helpful
in restoring damaged files and in providing validity checks on crucial data.
iii.Interference with Shared Data: The concurrent access to shared data via several application
programs can lead to some problems:
a. First, when two concurrent users both want to change the same or related data,
inaccurate results can occur if access to the data is not properly synchronized.
b. Second, when data are used exclusively for updating, different users can obtain control of
different segments of the database and lock up any use of the data (so called deadlock).
Experience has shown that conflicts on how to define data, data length and coding, rights to
update shared data, and associated issues are frequent and difficult managerial issues to
resolve.
For a small personal database, one person typically defines, constructs, and manipulates the
database. However, many persons are involved in the design, use, and maintenance of a large
database with a few hundred users.
Actors on the scene: The people whose jobs involve the day-to-day use of a large database.
i.Database Administrators:In any organization where many persons use the same
resources, there is a need for a chief administrator to oversee and manage these
resources.
authorizing access to the database, for coordinating and monitoring its use, and for
acquiring software and hardware resources as needed.
The DBA is accountable for problems such as breach of security or poor system
response time.
ii.Database Designers: Database designers are responsible for identifying the data to be stored
in the database and for choosing appropriate structures to represent and store this data. These
tasks are mostly undertaken before the database is actually implemented and populated with
data.
iii.End Users: End users are the people whose jobs require access to the database for
querying, updating, and generating reports; the database primarily exists for their use.
System analysts: determine the requirements of end users, especially naive and
parametric end users, and develop specifications for canned transactions that meet
these requirements.
Workers behind the scene: those who work to maintain the database system environment, but
who are not actively interested in the database itself.
Note: In addition to those who design, use, and administer a database, others are associated
with the design, development, and operation of the DBMS software and system environment.
i.DBMS system designers and implementers: are persons who design and implement the
DBMS modules and interfaces as a software package.
The DBMS must interface with other system software, such as the operating system and
compilers for various programming languages.
ii.Tool developers: include persons who design and implement tools, the software packages
that facilitate database system design and use, and help improve performance.
Note: Tools are optional packages that are often purchased separately. They include packages
for database design, performance monitoring, natural language or graphical interfaces,
prototyping, simulation, and test data generation.
iii.Operators and maintenance personnel: are the system administration personnel who are
responsible for the actual running and maintenance of the hardware and software environment
for the database system.
1. System developers: who are resposible for the development of the database system.
2. Data administors: who are resposible for general administration and access control of
the database systems.
3. End users: who are just using fishinised database system for their operational activties
of their respective organisations or their personal business use.
4. The user interface component: is used to interface database users including system
developer, data administors and end users with DBMS, database, application programs
and even database repository.
5. Repository: Centralized knowledge base containing all data definitions, screen and
report formats and definitions of other organizations and system components.
iii.Overhead for providing security, concurrency control, recovery, and integrity functions.
it may be more desirable to use regular files under the following circumstances:
i.The database and applications are simple, well defined, and not expected to change.
ii.There are stringent real-time requirements for some programs that may not be met
The primary function of a database: is to provide timely and reliable information that supports
the daily operations of an organization.
Problem space of the database: the business functions that the database is designed to
address.
A data model: is a conceptual description of the database. It does not include any reference to
the physical layout, or structure, of the database itself.
Database schema or simply the schema: is the physical layout of the database, which
describes how data are organized and stored in the database.
A schema: is in simple terms, the translation of a conceptual data model into a physical
representation that can be implemented using a computer.
Note: Practically speaking, the schema is still a conceptual rather than a physical construct
because it is simply a description of the database, including its tables and the relationsamong
them.
A database engine (database server):is a collection of computer programs that manipulate the
data contained in a database.
Examples:
It serves as the interface between the data in the database, as described by the schema and
application software tools that process the data.
Note: In database implementation projects, the development of the data model and the
database schema is part of a design process. The actual construction of the database starts
when the database administrator implements the schema by instructing the database engine to
prepare the physical space where the database will reside on the computer’s hard drive.
Database contains:
1. Data files.
3. A set of database integrity rules: that must be enforced in order to protect the data
maintained in the database.
4. Defined views: which are logical extractions of specific aspects of the physical database.
5. Stored procedures: which are blocks of Structured Query Language (SQL) code for
defining, managing and querying data in the database.
The term “database” does not include:
1. the user interface (UI) / (database front-end): by which users’ access and interact with
the database.
2. Application programs: that are used to process and analyze data in the database.
2. User interfaces
3. application programs
4. Middleware
Some authors, notably Date (2004) and Elmasri and Navathe (2003), expand the definition of
database systems to include people as an integral component.
2. End users: who use the system for business decision making and operations.
3. Software developers: who design and construct software tools for the application of the
database system.
i. We can separate the development of a database: that is, specification and creation of a
schema to define data in a database –from the user processes that make use of the database.
ii. We can use the three schema architecture as a basis for distinguishing the activities
associated with a schema.
iii. We can represent the constraints to enforce the semantics of the data once within a
database, rather than within every user process that uses the data.
1. Planning: The goal of the planning phase is to align information technology with the business
strategies of an organization.
I. Breaking the functions of an organization down into progressively lower levels of details.
II. Identifying the entity types and the relationship between them (a draft overall E-R).
2. Analysis: The purpose of this phase is to develop detailed specifications for the information
systems required to support the organization.
At the end of this phase: aconceptual data model (a detailed E-R data model) will be built.
II. Relationships
III. Attributes
3. Design: The purpose of this stage is to transform the conceptual data model to an
implementation model that a particular DBMS can process with performance that is acceptable
to all users throughout the organization.
4. Logical database design: is the process of mapping conceptual data model to structures that
are specific to the target DBMS.
For example: if the target environment is a relational DBMS, then the conceptual data models (E-
R data model) are mapped to normalized relations.
5. Physical database design: is the process of mapping the database structures from logical
design into physical storage structures such as files and tables. Indexes, access methodsare
also specified.
Objectives of physical design: are to provide adequate performances for user applications
interms of response times, throughout rates etc.
It includes:
Documentation
Training