
Intro 2 DB

The document provides an introduction to databases, including key concepts and applications. It discusses what a database is at three levels: the data storage, database management system (DBMS) software, and database applications. Examples are given at each level. The document also covers database types, typical database application architecture, data file structures, common acronyms like SQL, CRUD and ACID, advantages of databases over flat files, database concepts like tables, primary keys and foreign keys, and entity-relationship modeling. Contemporary database designs for big data are also introduced.

Uploaded by

israel

Introduction to Databases

Presented by
Mr Pasipanodya
Introduction
• What Is a Database
• Key Concepts
• Typical Applications and Demo
• Latest Trends
What Is a Database
• Three levels to view:
▫ Level 1: literal meaning – the place where data is stored
▪ Database = Data + Base, the actual storage of all the information
of interest
▫ Level 2: Database Management System (DBMS)
▪ The software package that guards and manages data storage,
access, and maintenance. It can be scoped for personal use (MS Access,
SQLite) or for the enterprise (Oracle, MySQL, MS SQL, etc.).
▫ Level 3: Database Application
▪ All the applications built upon the data stored in databases (web site,
BI application, ERP, etc.).
Examples at each level
• Level 1: data collection
▪ text files in a certain format, such as many bioinformatics databases
▪ the actual data files of databases stored through a certain
DBMS, e.g. MySQL, SQL Server, Oracle, PostgreSQL
• Level 2: Database Management System (DBMS)
▪ SQL Server, Oracle, MySQL, SQLite, MS Access, etc.
• Level 3: Database Application
▪ Web/mobile/desktop standalone applications: e-commerce, online banking,
online registration, Wikipedia, etc.
Database Types
• Flat Model
• Navigational databases
▫ Hierarchical (tree) database model
▫ Network/Graph model
• Relational Model
• Object model
• Document model
• Entity–attribute–value model
• Star schema
Typical Database Application Architecture

[Diagram: app, DBMS, DB]
Data File Structure

• Demo:
▫ Take a look at the following file directories:
▪ MySQL -- C:\ProgramData\MySQL\MySQL Server 8.0\Data\
▪ Access -- C:\ARCS_dbtutorial\db\access\
▪ PostgreSQL -- /project/scv/examples/db/tutorial/data/postgresql/testdb/
Data File Structure - MySQL
Data File Structure - Access
Data File Structure - PostgreSQL
ATTENTION!
Database files should not be accessed directly,
only through the database engine, called the “DBMS”.
Three Common Acronyms
• SQL – Structured Query Language

• CRUD – Create, Read, Update, Delete


• ACID – Atomicity, Consistency, Isolation, and Durability (transaction)
Disadvantages of conventional flat files
▫ Redundancy – the same data may be stored in many different copies
▫ Inconsistency – data about the same business entity may appear in
different forms (for example, state name or phone number). This makes it
hard to modify data and keep track of changes, and can even lead to data loss
▫ Mixture of all data together – the logical relationships between the
data columns are unclear, so the data is hard to understand and manage once
the structure gets complex
▫ Hard to maintain and manage
▫ No concurrency support (only one user can operate on the file at a time)
First Acronym - ACID
Atomicity – transactions are all-or-nothing (commit/rollback)
Consistency – only valid data is saved
Isolation – concurrent transactions do not affect each other
Durability – committed data will not be lost

***Good example: bank transaction***

Most of the challenges of ACID compliance come from multiple users
using the database concurrently
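The bank transaction example can be sketched with Python's built-in sqlite3 module. This is a minimal illustration of atomicity only; the account table, the names, and the transfer helper are assumptions made for this sketch and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected an overdraft


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM account"))
```

A transfer that would overdraw the source account fails on the second UPDATE, and the rollback also undoes the credit already applied to the destination, so the balances are left exactly as they were.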
How Do Databases Solve the Problem?
• A database is a self-describing collection of related records (it holds
metadata, i.e. data about data), as explained below:
- Self-describing means:
▫ A database contains not just data but also the definition of the
structure of that data, which can be considered metadata. This includes
table column definitions, index and key information, constraints, etc.
▫ At the application level, a database can also store application-related
metadata, which makes personalization and customization of
the application according to a user profile much easier to handle. A typical
example is the user preferences stored by common social media
or e-commerce sites.
Database Content
Typical Database
• User data: tables to store user data
• Metadata: keeps the structure (schema) of
the data, including table names, column
names and types, and constraints over the
column(s)
• Application metadata: application-specific
metadata regarding user settings or
functions of the application
• Index and other overhead data: used for
improving performance and maintenance,
such as logs, tracking, security, etc.
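One way to see that a database describes itself is to ask it for its own schema. The sketch below uses SQLite through Python's sqlite3 module; the student table and the index name are hypothetical, and other DBMSs expose their catalogs through different tables or commands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " student_id INTEGER PRIMARY KEY,"
    " first_name TEXT NOT NULL,"
    " last_name TEXT NOT NULL)"
)
conn.execute("CREATE INDEX idx_student_last ON student (last_name)")

# sqlite_master is SQLite's catalog: one row per table, index, view, or trigger.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY type, name"
).fetchall()

# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
columns = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")
]
```

Here catalog lists the index and the table, and columns lists each column name with its declared type, all without the program knowing the schema in advance.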
Terminology and Concept – Tables (Relations)
The central concepts of relational databases are tables (relations) and
relationships.

Table (formally called a ‘relation’) – the building block of a relational database. It
stores data in two dimensions: each row represents one instance of a record (tuple), and each
column represents one attribute of all instances. A column may
also be called a ‘field’.

For example, a ‘student’ table may contain (student id, first name, last name,
grade, school name, home address, …). Each row represents one
student’s information, and each column represents one piece of
information about all students. This is called a ‘relation’.
Primary Key and Foreign Key
• Primary key: a unique identifier made of one or more columns that
uniquely identifies rows in a table. If the primary key contains more than
one column, it can also be called a ‘composite key’.
• Foreign key: the primary key of another table, referenced in
the current table. It is the key to establishing the relationship between the
two tables and, through the DBMS, to enforcing referential integrity.
Surrogate Key
• A surrogate key is a unique column added to a relation to use as the
primary key when no natural column can serve as the primary key, or when
a composite key needs to be replaced for various reasons.
• A surrogate key is usually an auto-increment numeric value with
no meaning to the user, and is thus often hidden in the table,
form, or other entity and kept for internal use.
• Surrogate keys are often used in place of a composite key to add more
flexibility to the table.
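A surrogate key replacing a composite key might look like the following in SQLite. The enrollment table and its columns are invented for this sketch; AUTOINCREMENT is SQLite's spelling, and other DBMSs use different keywords for auto-generated values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE enrollment ("
    " enrollment_id INTEGER PRIMARY KEY AUTOINCREMENT,"  # surrogate key
    " student_id INTEGER NOT NULL,"
    " course_id INTEGER NOT NULL,"
    " UNIQUE (student_id, course_id))"  # the natural composite key stays unique
)
cur = conn.execute(
    "INSERT INTO enrollment (student_id, course_id) VALUES (7, 101)"
)
first_id = cur.lastrowid
cur = conn.execute(
    "INSERT INTO enrollment (student_id, course_id) VALUES (7, 102)"
)
second_id = cur.lastrowid
```

The application never supplies enrollment_id; the DBMS generates it, while the UNIQUE constraint still prevents the same student from enrolling in the same course twice.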
Terminology and Concept – E-R model
E-R Model: the Entity-Relationship data model is a common technique used
in database design. It captures the relationships between database tables
and represents them graphically. The relationship between two
entities can be 1:1, 1:N, or M:N, and it is usually established through a
‘foreign key’ constraint.

Examples:
1:1 Employee – Locker
1:N Customer – Order, Order – Order Detail
M:N Student – Course
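An M:N relationship such as Student – Course is commonly implemented with a third (junction) table holding two foreign keys. The sketch below assumes SQLite and hypothetical table names, column names, and rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT);
    -- Junction table: each row links one student to one course.
    CREATE TABLE student_course (
        student_id INTEGER NOT NULL REFERENCES student (student_id),
        course_id  INTEGER NOT NULL REFERENCES course (course_id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO student VALUES (1, 'Tariro'), (2, 'Farai');
    INSERT INTO course  VALUES (10, 'Databases'), (20, 'Networks');
    INSERT INTO student_course VALUES (1, 10), (1, 20), (2, 10);
    """
)
roster = conn.execute(
    "SELECT s.name FROM student s"
    " JOIN student_course sc ON sc.student_id = s.student_id"
    " JOIN course c ON c.course_id = sc.course_id"
    " WHERE c.title = 'Databases' ORDER BY s.name"
).fetchall()
```

Each side of the M:N relationship becomes a 1:N relationship to the junction table, which is why one student can appear with many courses and one course with many students.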
Research Computing

Contemporary database designs: Big Data

• 4Vs:
▫ Volume – how much storage is needed?
▫ Variety – how diverse is the data?
▫ Veracity – can the data be verified/trusted?
▫ Velocity – how fast is the data being generated?

Big Data Technologies


• Predictive analytics: to discover, evaluate, optimize, and deploy predictive models by analyzing big data sources to improve business
performance or mitigate risk.
• NoSQL databases: key-value, document, and graph databases.
• Search and knowledge discovery: tools and technologies to support self-service extraction of information and new insights from large
repositories of unstructured and structured data that resides in multiple sources such as file systems, databases, streams, APIs, and
other platforms and applications.
• Stream analytics: software that can filter, aggregate, enrich, and analyze a high throughput of data from multiple disparate live data
sources and in any data format.
• In-memory data fabric: provides low-latency access and processing of large quantities of data by distributing data across the dynamic
random access memory (DRAM), Flash, or SSD of a distributed computer system.
• Distributed file stores: a computer network where data is stored on more than one node, often in a replicated fashion, for redundancy
and performance.
• Data virtualization: a technology that delivers information from various data sources, including big data sources such as Hadoop and
distributed data stores in real-time and near-real time.
• Data integration: tools for data orchestration across solutions such as Amazon Elastic MapReduce (EMR), Apache Hive, Apache Pig,
Apache Spark, MapReduce, Couchbase, Hadoop, and MongoDB.
• Data preparation: software that eases the burden of sourcing, shaping, cleansing, and sharing diverse and messy data sets to
accelerate data’s usefulness for analytics.
• Data quality: products that conduct data cleansing and enrichment on large, high-velocity data sets, using parallel operations on
distributed data stores and databases.
Agenda
File-based Systems
History of Database
Database Management Systems (DBMS)
File-based Definition

Program defines and manages its own data

Limitations of File-based
Separation and isolation
Duplication
Program & data dependence
Fixed queries
Proliferation of application programs
History of Database
Systems
First generation
Hierarchical model
Information Management System (IMS)
Network model
Conference on Data System Languages (CODASYL)
Data Base Task Group (DBTG)
Limitation
Complex program for simple query
Minimum data independence
No theoretical foundation
Second generation
Relational model
E. F. Codd
DB2, Oracle
Limitation
Limited data modeling
Third generation
Object-relational DBMS
Object-oriented DBMS
Database

Definition
A collection of self-describing and integrated data files
System catalog
Meta data
Data dictionary
Overhead data
Data abstraction
Database Management System
Facility
Data definition language (DDL)
Data manipulation language (DML)
Structured query language (SQL)
Security system
Integrity system
Concurrency control system
Backup & recovery system
View mechanism
DBMS Environment

Hardware
Client-server architecture
Software
DBMS, OS, network, application
Data
Schema, subschema, table, attribute
People
Data administrator & database administrator
Database designer: logical & physical
Application programmer
End-user: naive & sophisticated
Procedure
Start, stop, log on, log off, back up, recovery
Advantages of DBMS

Control redundancy
Consistency
Integrity
Security
Concurrency control
Backup & recovery
Data standard
More information
Data sharing & conflict control
Productivity & accessibility
Economy of scale
Maintenance
Limitations of DBMS

Complexity
Size
Cost
Software
Hardware
Conversion
Performance
Vulnerability
Points to Remember
File-based Systems
History of Database
Database Management Systems (DBMS)
Database Environment

2022 It203 week2 36


Learning Objectives
• Purpose of three-level database architecture.
• Contents of external, conceptual, and internal
levels.
• Purpose of external/conceptual and
conceptual/internal mappings.
• Meaning of logical and physical data
independence.
• Distinction between DDL and DML.
• A classification of data models.
Learning Objectives
• Purpose/importance of conceptual modeling.
• Typical functions and services a DBMS should
provide.
• Software components of a DBMS.
• Function and importance of the system catalog.

Objectives of Three-Level
Architecture
• All users should be able to access the same
data.

• A user’s view is immune to changes made
in other views.

• Users should not need to know physical
database storage details.

Objectives of Three-Level
Architecture
• DBA should be able to change database storage
structures without affecting the users’ views.

• Internal structure of database should be
unaffected by changes to physical aspects of
storage.

• DBA should be able to change conceptual
structure of database without affecting all users.

ANSI-SPARC
Three-Level Architecture

ANSI-SPARC Three-Level
Architecture
• External Level
– Users’ view of the database.
– Describes that part of database that is relevant to a
particular user.

• Conceptual Level
– Community view of the database.
– Describes WHAT data is stored in database and
relationships among the data.

ANSI-SPARC Three-Level
Architecture
• Internal Level
– Physical representation of the database on
the computer.
– Describes HOW the data is stored in the
database.

Differences between Three Levels
of ANSI-SPARC Architecture

Data Independence
• Logical Data Independence
– Refers to immunity of external schemas to
changes in conceptual schema.
– Conceptual schema changes (e.g.
addition/removal of entities)
should not require changes to external
schema or rewrites of application programs.

Data Independence
• Physical Data Independence
– Refers to immunity of conceptual schema to
changes in the internal schema.
– Internal schema changes (e.g. using different
file organizations, storage structures/devices)
should not require change to conceptual or
external schemas.

Data Independence and the
ANSI-SPARC Three-Level
Architecture

Database Languages
• Data Definition Language (DDL)
– Allows the DBA or user to describe and
name entities, attributes, and relationships
required for the application
– plus any associated integrity and security
constraints.

Database Languages
• Data Manipulation Language (DML)
– Provides basic data manipulation operations on
data held in the database.
• Procedural DML
– allows user to tell system exactly how to manipulate
data.
• Non-Procedural DML
– allows user to state what data is needed rather than
how it is to be retrieved.
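The split between DDL and DML can be illustrated with a few SQL statements run through Python's sqlite3 module. The staff table and its rows are hypothetical; SQL here plays the role of both the DDL and a non-procedural DML.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: describes the structure and its integrity constraints.
conn.execute(
    "CREATE TABLE staff ("
    " staff_no TEXT PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " salary INTEGER CHECK (salary > 0))"
)

# DML: inserts, updates, and deletes rows held in that structure.
conn.execute("INSERT INTO staff VALUES ('S1', 'Ann', 30000), ('S2', 'Ben', 12000)")
conn.execute("UPDATE staff SET salary = 13000 WHERE staff_no = 'S2'")
conn.execute("DELETE FROM staff WHERE staff_no = 'S1'")

# Non-procedural retrieval: the query states WHAT is wanted, not HOW to find it.
rows = conn.execute("SELECT staff_no, name, salary FROM staff").fetchall()
```

The SELECT never mentions files, indexes, or loops; choosing an access path is left to the DBMS, which is what makes the language non-procedural.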

Database Languages
• Fourth Generation Language (4GL)
Ex: Microsoft (MS) Access 2010
– Query Languages
– Forms Generators
– Report Generators
– Graphics Generators
– Application Generators.

Data Model
Integrated collection of concepts for describing
data, relationships between data, and
constraints on the data in an organization.

• Data Model comprises:
– a structural part;
– a manipulative part;
– possibly a set of integrity rules.

Data Model
• Purpose
– To represent data in an understandable way.

• Categories of data models include:
– Object-based
– Record-based
– Physical.

Data Models
• Object-Based Data Models
– Entity-Relationship
– Semantic
– Functional
– Object-Oriented.
• Record-Based Data Models
– Relational Data Model
– Network Data Model
– Hierarchical Data Model.
• Physical Data Models
Conceptual Modeling
• Conceptual schema is the core of a system
supporting all user views.
• Should be complete and accurate representation
of an organization’s data requirements.
• Conceptual modelling is process of developing a
model of information use that is independent of
implementation details.
• Result is a conceptual data model.

Functions of a DBMS
• Data Storage, Retrieval, and Update.

• A User-Accessible Catalog.

• Transaction Support.

• Concurrency Control Services.

• Recovery Services.

Functions of a DBMS
• Authorization Services.

• Support for Data Communication.

• Integrity Services.

• Services to Promote Data Independence.

• Utility Services.

Components of a DBMS

Components of Database
Manager (DM)

System Catalog
• Repository of information (metadata)
describing the data in the database.
• Typically stores:
– names of authorized users;
– names of data items in the database;
– constraints on each data item;
– data items accessible by a user and the type of access.
• Used by modules such as Authorization
Control and Integrity Checker.

Web Database Environment

Learning Objectives
• Meaning of client–server architecture and advantages
of this type of architecture for a DBMS.
• The difference between two-tier, three-tier and n-tier
client–server architectures
• The function of an application server
• The meaning of middleware and the different types
of middleware that exist
• Function and uses of Transaction Processing Monitors.

Learning Objectives
• The purpose of a Web service and the
technological standards used
• The meaning of service-oriented
architecture (SOA)
• The difference between distributed DBMSs,
and distributed processing

Acknowledgments
• Some of these slides have been adapted from
Thomas Connolly and Carolyn Begg

Multi-user DBMS Architectures
• Teleprocessing
– Traditional architecture for multi-user systems
– One computer with a single central processing unit
(CPU) and a number of terminals
– Put a huge burden on the central computer
• Downsizing
– Replacing expensive mainframe computers with
more cost-effective networks of personal
computers
Teleprocessing Topology

Multi-user DBMS Architectures
• File-server architecture
– Processing is distributed about the network
– Three main disadvantages
• Large amount of network traffic
• Full copy of DBMS required on each workstation
• Concurrency, recovery, and integrity control are
complex
– Multiple DBMSs can access the same files

File-Server Architecture

Multi-user DBMS Architectures
• Traditional two-tier client–server architecture
– Client process requires some resource
– Server provides the resource
– Basic separation of four main components of
business application
– Typical interaction between client and server

Client-Server Architecture

Alternative Client-Server Topologies

Summary of client–server functions

Multi-user DBMS Architectures
• Three-tier client–server architecture
– User interface layer
– Business logic and data processing layer
– DBMS
– Many advantages over traditional two-tier or
single-tier designs

Multi-user DBMS Architectures
• N-tier architectures
– Three-tier architecture can be expanded to n tiers
• Application servers
– Hosts an application programming interface (API)
to expose business logic and business processes for
use by other applications

Multi-user DBMS Architectures
• Middleware
– Software that mediates with other software
– Communication among disparate applications
– Six main types
• Asynchronous Remote Procedure Call (RPC)
• Synchronous RPC
• Publish/Subscribe
• Message-Oriented middleware (MOM)
• Object-request broker (ORB)
• SQL-oriented data access
Multi-user DBMS Architectures
• Transaction processing monitor
– Controls data transfer between clients/servers
– Provides a consistent environment, particularly for
online transaction processing (OLTP)
– Significant advantages
• Transaction routing
• Managing distributed transactions
• Load balancing
• Funneling
• Increased reliability
Multi-user DBMS Architectures
Transaction processing monitor of a three-tier client-server architecture

Web Services and Service-Oriented
Architectures
• Web service
– Software system that supports interoperable
machine-to-machine interaction over a network
– No user interface
– Examples of Web services
– Uses widely accepted technologies and standards

Relationship between WSDL,
UDDI, and SOAP

Web Services and Service-Oriented
Architectures
• Service-Oriented Architectures (SOA)
– Architecture for building applications that
implement business processes as sets of services
– Published at a granularity relevant to the service
consumer
– Loosely coupled and autonomous services
– Web services designed for SOA different from
other Web services

Traditional vs. SOA Architecture

Distributed DBMSs
• Distributed database
– Logically interrelated collection of shared data
physically distributed over a computer network
• Distributed DBMS
– Software system that permits the management of
the distributed database
– Makes the distribution transparent to users

Distributed DBMSs
• Characteristics of DDBMS
– Collection of logically related shared data
– Data split into fragments
– Fragments may be replicated
– Fragments/replicas are allocated to sites
– Sites are linked by a communications network
– Data at each site is controlled by DBMS
– DBMS handles local apps autonomously
– Each DBMS participates in one or more global apps
Distributed DBMSs
• Distributed processing
– Centralized database that can be accessed over a
computer network
• By contrast, a distributed DBMS consists of data that is
physically distributed across a number of sites in the
network

The Relational Database Model

Learning Objectives
• Terminology of relational model.
• How tables are used to represent data.
• Connection between mathematical relations and
relations in the relational model.
• Properties of database relations.
• How to identify candidate, primary, and foreign
keys.
• Meaning of entity integrity and referential integrity.
• Purpose and advantages of views.

History of the Relational Model
• Relational Database Model history
– Proposed by Codd in 1970
– Pioneer projects such as at IBM and UC-Berkeley in
mid-1970s
– Today, still the dominant database model:
• IBM DB2, ORACLE, INFORMIX, SYBASE
• MICROSOFT Access, SQL Server
• FOXBASE, PARADOX
• …
• The relational model provides a logical
representation of the data

Relational Model Terminology
• A relation is a table with columns and rows.
– Only applies to logical structure of the database, not the physical
structure.
– A relation corresponds to an entity set, or collection of entities. An
entity is a person, place, event, or thing about which data is collected

• Attribute is a named column of a relation. It corresponds to a
characteristic of an entity. Attributes are also called fields.

• Domain is the set of allowable values for one or more
attributes.

Relational Model Terminology
• Tuple is a row of a relation.

• Degree is the number of attributes in a relation.

• Cardinality is the number of tuples in a relation.

• Relational Database is a collection of normalized
relations with distinct relation names.

• NB: a relation is not a relationship, but an entity
set.
Instances of Branch and Staff
(part) Relations

Examples of Attribute Domains

Alternative Terminology for
Relational Model

Mathematical Definition of
Relation
• Consider two sets, D1 & D2, where D1 = {2, 4} and D2
= {1, 3, 5}.
• Cartesian product, D1 × D2, is the set of all ordered
pairs, where the first element is a member of D1 and
the second element is a member of D2.

D1 × D2 = {(2, 1), (2, 3), (2, 5), (4, 1), (4, 3), (4, 5)}

• Alternative way is to find all combinations of
elements with first from D1 and second from D2.
Mathematical Definition of
Relation
• Any subset of Cartesian product is a relation; e.g.
R = {(2, 1), (4, 1)}
• May specify which pairs are in relation using
some condition for selection; e.g.
– second element is 1:
R = {(x, y) | x ∈ D1, y ∈ D2, and y = 1}
– first element is always twice the second:
S = {(x, y) | x ∈ D1, y ∈ D2, and x = 2y}
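The sets D1 and D2 and the relations R and S defined above can be checked directly with Python sets. This is only a sketch of the mathematical definition; the variable name cartesian is chosen here for illustration.

```python
from itertools import product

D1 = {2, 4}
D2 = {1, 3, 5}

# Cartesian product: every ordered pair (first from D1, second from D2).
cartesian = set(product(D1, D2))

# A relation is any subset of the product, here picked out by a condition.
R = {(x, y) for (x, y) in cartesian if y == 1}
S = {(x, y) for (x, y) in cartesian if x == 2 * y}
```

With these particular sets, S contains only (2, 1), because 4 would need the value 2 in D2 to qualify and D2 does not contain it.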

Mathematical Definition of
Relation
• Consider three sets D1, D2, D3 with Cartesian
Product D1 × D2 × D3; e.g.

D1 = {1, 3} D2 = {2, 4} D3 = {5, 6}

D1 × D2 × D3 = {(1,2,5), (1,2,6), (1,4,5), (1,4,6), (3,2,5),
(3,2,6), (3,4,5), (3,4,6)}

• Any subset of these ordered triples is a relation.

Mathematical Definition of
Relation
• The Cartesian product of n sets (D1, D2, . . ., Dn) is:

D1 × D2 × . . . × Dn = {(d1, d2, . . . , dn) | d1 ∈ D1, d2 ∈ D2, . . . , dn ∈ Dn}

usually written as:

×_{i=1}^{n} Di

• Any set of n-tuples from this Cartesian product is a
relation on the n sets.

Database Relations
• Relation schema
– Named relation defined by a set of attribute
and domain name pairs.

• Relational database schema
– Set of relation schemas, each with a distinct
name.

Properties of Relations
• Relation name is distinct from all other relation
names in relational schema.

• Each cell of relation contains exactly one atomic
(single) value.

• Each attribute has a distinct name.

• Values of an attribute are all from the same
domain.
Properties of Relations
• Each tuple is distinct; there are no
duplicate tuples.

• Order of attributes has no significance.

• Order of tuples has no significance,
theoretically.

Table Characteristics
• Each RDBMS has its rules for table and column names.
Example: Access
Table names <= 64 (8 is classical)
Column names <= 64 (10 is classical)
Column names cannot start with digit,
or contain special characters
except underscore and a few others
• Each RDBMS has its rules for associating a data type to
an attribute, but there are classical ones:
text, character, number, date, boolean

Table Characteristics

Relational Keys
• Key
– One or more attributes that determine other
attributes
• Key attribute
• Composite key
• There needs to be full functional
dependence from key to any other attribute

Relational Keys
• Keys may be
– Single
– Composite (composed of several key attributes)
• Example: staff_fName, staff_lName, staff_init,
staff_phone → staff_DOB, staff_position
• Full functional dependence: attribute A2 is
functionally dependent on a composite key A1,
but not on any proper subset of it

Relational Keys
• Functional dependence: an attribute A is
functionally dependent on an attribute K if
each value in column K determines one and
only one value in column A. K → A (K
determines A).
• Attribute K determines attribute A if all
rows in the table that agree in value for
attribute K must also agree in value for
attribute A.
• Attribute A is functionally dependent on K
if K determines A.
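The "rows that agree on K must agree on A" test translates into a short check over a table held as a list of dicts. The determines function and the sample staff rows are hypothetical, written only to illustrate the definition.

```python
def determines(rows, k, a):
    """Return True if the attribute list k determines attribute a in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in k)
        if key in seen and seen[key] != row[a]:
            return False  # two rows agree on K but disagree on A
        seen[key] = row[a]
    return True


# Hypothetical sample data, not taken from the slides.
staff = [
    {"staff_no": "S1", "branch": "B1", "city": "Harare"},
    {"staff_no": "S2", "branch": "B1", "city": "Harare"},
    {"staff_no": "S3", "branch": "B2", "city": "Gweru"},
]
```

A check like this can only refute a dependency on the rows it is given. A functional dependency is a rule about every valid state of the table, so passing on one sample does not prove that it holds in general.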
Relational Keys
• Superkey
– An attribute, or a set of attributes, that uniquely
identifies a tuple within a relation.

• Candidate Key
– Superkey (K) such that no proper subset is a superkey
within the relation.
– In each tuple of R, values of K uniquely identify that
tuple (uniqueness).
– No proper subset of K has the uniqueness property
(irreducibility).
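Uniqueness and irreducibility can be explored by brute force on a small table. The functions is_superkey and candidate_keys, and the sample rows, are assumptions made for this sketch.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows share the same values for attrs (uniqueness)."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)


def candidate_keys(rows, attributes):
    """Superkeys that contain no smaller superkey (irreducibility)."""
    found = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(k) < set(attrs) for k in found):
                continue  # reducible: a proper subset is already a key
            if is_superkey(rows, attrs):
                found.append(attrs)
    return found


# Hypothetical sample instance.
staff = [
    {"staff_no": "S1", "email": "a@x.org", "branch": "B1"},
    {"staff_no": "S2", "email": "b@x.org", "branch": "B1"},
    {"staff_no": "S3", "email": "c@x.org", "branch": "B2"},
]
keys = candidate_keys(staff, ["staff_no", "email", "branch"])
```

Keys are properties of the schema rather than of one set of rows, so the result only shows which attribute sets are not ruled out by this sample. In practice the designer picks one candidate key as the primary key from knowledge of the data.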
Relational Keys
• Primary Key
– Candidate key selected to identify tuples uniquely
within relation.

• Alternate Keys
– Candidate keys that are not selected to be primary
key.

• Foreign Key
– Attribute, or set of attributes, within one relation
that matches candidate key of some (possibly same)
relation.
Relational Integrity
• Null
– Represents value for an attribute that is
currently unknown or not applicable for tuple.
– Deals with incomplete or exceptional data.
– Represents the absence of a value and is not the
same as zero or spaces, which are values.

Relational Integrity
• Entity Integrity
– In a base relation, no attribute of a primary key can
be null.
– Ensures that all entities are unique.

• Referential Integrity
– If foreign key exists in a relation, either foreign key
value must match a candidate key value of some
tuple in its home relation or foreign key value must
be wholly null.

Relational Integrity
• Enterprise Constraints
– Additional rules specified by users or
database administrators.

Relational Database Operators
• Relational algebra determines
table manipulations
• Key operators (minimally relational RDBMS)
– SELECT
– PROJECT
– JOIN
• Other operators
– INTERSECT
– UNION (union compatible tables)
– DIFFERENCE
– PRODUCT
– DIVIDE
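Most of these operators map onto ordinary set operations when a relation is modelled as a set of tuples. The two small relations A and B below are hypothetical, and JOIN and DIVIDE are left out of this sketch.

```python
from itertools import product

# A relation modelled as a set of (staff_no, name) tuples.
A = {("S1", "Ann"), ("S2", "Ben")}
B = {("S2", "Ben"), ("S3", "Cy")}

union = A | B                                   # UNION: all rows, no duplicates
intersect = A & B                               # INTERSECT: rows in both
difference = A - B                              # DIFFERENCE: rows in A but not in B
pairs = {a + b for a, b in product(A, B)}       # PRODUCT: every pairing of rows
selected = {t for t in union if t[0] != "S1"}   # SELECT: rows meeting a criterion
projected = {(t[1],) for t in union}            # PROJECT: keep only the name column
```

UNION, INTERSECT, and DIFFERENCE are only meaningful here because A and B have the same columns, which is the union-compatibility requirement noted in the list.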
Union
Combines all rows

Intersect
Yields rows that appear in both tables

Figure 2.6

Difference
Yields rows not found in other tables

Figure 2.7

Product
Yields all possible pairs from two tables

Figure 2.8

Select
Yields a subset of rows based on specified criterion

Project
Yields all values for selected attributes

Figure 2.10
Join
Information from two or more tables is combined

Figure 2.11

Natural Join Process
• Links tables by selecting rows with
common values in common attribute(s)
• Three-stage process
– Product creates one table
– Select yields appropriate rows
– Project yields single copy of each attribute to
eliminate duplicate columns
• Eliminates duplicates
• Does not include rows that are unmatched
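The three stages of the natural join can be written out literally for tables held as lists of dicts. The natural_join function and the branch and staff rows are hypothetical; a real DBMS would use far more efficient join algorithms than a full product.

```python
from itertools import product


def natural_join(left, right):
    """Natural join of two lists of dicts via product, select, project."""
    if not left or not right:
        return []
    common = [c for c in left[0] if c in right[0]]
    result = []
    for l_row, r_row in product(left, right):          # 1. PRODUCT
        if all(l_row[c] == r_row[c] for c in common):  # 2. SELECT matching rows
            merged = {**l_row, **r_row}                # 3. PROJECT: one copy of shared columns
            result.append(merged)
    return result


# Hypothetical sample data.
branch = [{"branch_no": "B1", "city": "Harare"}, {"branch_no": "B2", "city": "Gweru"}]
staff = [
    {"staff_no": "S1", "branch_no": "B1"},
    {"staff_no": "S2", "branch_no": "B1"},
    {"staff_no": "S3", "branch_no": "B9"},  # unmatched: dropped by the join
]
joined = natural_join(staff, branch)
```

Staff member S3 and branch B2 have no partner row, so neither appears in the result, and branch_no appears only once in each joined row.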

Other Joins
• EquiJOIN
– Links tables based on equality condition that compares
specified columns of tables
– Does not eliminate duplicate columns
– Join criteria must be explicitly defined
• Theta JOIN
– Join that compares specified columns of each
table using a comparison operator other than equality
• Outer JOIN
– Matched pairs are retained
– Unmatched values in other tables left null
– Right and left
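The difference between an inner join and a left outer join shows up as soon as one row has no match. The sketch below assumes SQLite and hypothetical branch and staff tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE branch (branch_no TEXT PRIMARY KEY, city TEXT);
    CREATE TABLE staff  (staff_no TEXT PRIMARY KEY, branch_no TEXT);
    INSERT INTO branch VALUES ('B1', 'Harare'), ('B2', 'Gweru');
    INSERT INTO staff  VALUES ('S1', 'B1'), ('S2', NULL);
    """
)
# Inner (equi) join: only matched pairs survive.
inner = conn.execute(
    "SELECT s.staff_no, b.city FROM staff s"
    " JOIN branch b ON b.branch_no = s.branch_no ORDER BY s.staff_no"
).fetchall()
# Left outer join: every staff row is kept, with NULL where no branch matches.
left_outer = conn.execute(
    "SELECT s.staff_no, b.city FROM staff s"
    " LEFT OUTER JOIN branch b ON b.branch_no = s.branch_no ORDER BY s.staff_no"
).fetchall()
```

Staff member S2 has no branch, so the inner join drops the row while the left outer join keeps it with a null city (None in Python).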
Other Joins

Divide
Requires use of a single-column table and a two-column table
A value in the unshared column must be associated with each
value in the single-column table
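Division answers "which values are paired with every value in the other table". The divide function and the sample data are hypothetical, with the two-column table modelled as a set of pairs and the single-column table as a set of values.

```python
def divide(dividend, divisor):
    """Relational division of (x, y) pairs by a set of y values."""
    xs = {x for x, _ in dividend}
    # Keep x only if it is paired with every y in the divisor.
    return {x for x in xs if all((x, y) in dividend for y in divisor)}


# Hypothetical sample data: (student, course) pairs and the required courses.
takes = {("Ann", "DB"), ("Ann", "Nets"), ("Ben", "DB")}
required = {"DB", "Nets"}
result = divide(takes, required)
```

Ann is the only student paired with both required courses, so she is the only value in the result.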

Figure 2.17

Views
• Base Relation
– Named relation corresponding to an entity
in conceptual schema, whose tuples are
physically stored in database.

• View
– Dynamic result of one or more relational
operations operating on base relations to
produce another relation.

Views
• A virtual relation that does not necessarily
actually exist in the database but is produced
upon request, at time of request.

• Contents of a view are defined as a query on one
or more base relations.

• Views are dynamic, meaning that changes made
to base relations that affect view attributes are
immediately reflected in the view.
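The dynamic nature of a view can be seen by querying it before and after a change to its base relation. The sketch assumes SQLite; the staff table and the staff_public view are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT, salary INTEGER);
    INSERT INTO staff VALUES ('S1', 'Ann', 30000);
    -- The view is a stored query; it hides the salary column from its users.
    CREATE VIEW staff_public AS SELECT staff_no, name FROM staff;
    """
)
before = conn.execute("SELECT * FROM staff_public ORDER BY staff_no").fetchall()
conn.execute("INSERT INTO staff VALUES ('S2', 'Ben', 12000)")  # change the base relation
after = conn.execute("SELECT * FROM staff_public ORDER BY staff_no").fetchall()
```

The view stores no rows of its own. Its contents are recomputed from staff each time it is queried, so the new row appears in it without any extra step.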

Purpose of Views
• Provides powerful and flexible security
mechanism by hiding parts of database from
certain users.

• Permits users to access data in a customized
way, so that same data can be seen by different
users in different ways, at same time.

• Can simplify complex operations on base
relations.
Updating Views
• All updates to a base relation should be
immediately reflected in all views that
reference that base relation.

• If view is updated, underlying base
relation should reflect change.

Updating Views
• There are restrictions on types of
modifications that can be made through
views:
- Updates are allowed if query involves a
single base relation and contains a
candidate key of base relation.
- Updates are not allowed involving multiple
base relations.
- Updates are not allowed involving
aggregation or grouping operations.
Updating Views
• Classes of views are defined as:
– theoretically not updateable;
– theoretically updateable;
– partially updateable.
