Unit 2

2. THE RELATIONAL MODEL


2.1 Relational Model

 Relational Model

 The relational model is a database model that stores data in tables. Its simplicity comes from this single structure: an entity set is represented as a table, and each instance of the entity is a row (tuple) of that table.
 Example: The student entity set is represented as a table, and each individual student corresponds to one row of that table. A relation in the relational model consists of

o A relation schema (i.e. heading of the column)


o A relation instance (i.e. table)

 While describing a relation, a relation schema is defined first followed by the relation instance.

 Relation Schema

 A relation schema contains the basic information of a table or relation. This information includes the
name of the entire table, the names of the column and the data types associated with each column.
 For Example: A relation schema for the relation called students could be expressed using the following
representation.
 Students (sid : string, name: string, login : string, age : integer, gpa : real)
 The name of the entire table or relation is “students”. There are five column names: sid, name, login,
age, gpa with respective data types: string, string, string, integer, real associated with them.

 Relation Instance
 A relation instance is a set of rows (tuples) that all conform to the relation schema. A
relation instance can be thought of as a table in which each tuple is a row, and all rows have the same
number of fields.
 Relational Database Schema
 A relational database schema is a collection of relation schemas, describing one or more relations.

 Domain
 Domain is synonymous with data type. Attributes can be thought of as columns in a table. Therefore,
an attribute domain refers to the data type associated with a column.

 Relation Cardinality
 The relation cardinality is the number of tuples in the relation.

 Relation Degree
 The relation degree is the number of fields in the relation.
 Tuples / Records
 The rows of the table are also known as records or tuples.

 Field/Attributes
 The columns of the table are also known as fields or attributes.
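
The terms above can be made concrete with a small sketch. It models the Students schema as a list of (column name, type) pairs and a relation instance as a set of tuples. The mapping of string/integer/real to Python's `str`/`int`/`float`, the helper name `conforms`, and the sample rows are assumptions made for illustration only.

```python
# Relation schema of Students as (column name, Python type) pairs.
# Mapping string/integer/real to str/int/float is an assumption of this sketch.
STUDENTS_SCHEMA = [("sid", str), ("name", str), ("login", str), ("age", int), ("gpa", float)]

# A relation instance: a set of tuples (the sample rows are made up).
students = {
    ("53666", "Jones", "jones@cs", 18, 3.4),
    ("53688", "Smith", "smith@ee", 18, 3.2),
}

def conforms(row, schema):
    """True if the row has one value of the right type per schema column."""
    return len(row) == len(schema) and all(
        isinstance(value, col_type) for value, (_, col_type) in zip(row, schema)
    )

degree = len(STUDENTS_SCHEMA)   # relation degree: number of fields
cardinality = len(students)     # relation cardinality: number of tuples
```

Here the degree is fixed by the schema, while the cardinality changes whenever tuples are added to or removed from the instance.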

2.2 The process of creating and modifying relations using SQL

 Creating Relations in SQL (Structured Query Language)

 The CREATE TABLE statement is used to define a new table. To create the CUSTOMERS relation, we can
use the following statement.
 Syntax:
create table <table name> (column definition 1, column definition 2, …);

SQL> CREATE TABLE CUSTOMERS
(CID INTEGER,
CNAME CHAR (30),
ACCNO INTEGER,
BNAME CHAR (40),
AMT REAL);

 The above statement creates an empty CUSTOMERS relation. Once tuples have been inserted, an
instance of the relation looks like the one shown below.

CID   CNAME         ACCNO  BNAME        AMT
6001  Kirpal Singh  41     UTI          15000.00
6312  Renuka        52     SBI          20000.00
6448  Ramesh        60     SBH          10500.00
6015  Raju          72     Andhra Bank  18050.00

 Alter A Table
 The ALTER TABLE statement is used to:
o Add a new column
o Change the width of a datatype or the datatype itself
o Include or drop integrity constraints
 Syntax
alter table <table name> modify (column definition 1, column definition 2, …);

alter table <table name> add (column definition 1, column definition 2, …);

SQL> ALTER TABLE CUSTOMERS ADD (ADDRESS VARCHAR2 (25));

The output of this command will be: Table altered

 Drop a column

 If a particular column of a table is no longer needed, Oracle provides the facility to drop it using
the DROP COLUMN clause of ALTER TABLE.
o Syntax:
alter table <table name> drop column <column name>;

SQL> ALTER TABLE CUSTOMERS DROP COLUMN ADDRESS;

The output of this command will be: Table altered

 Truncate Table
 If the records stored in a table are no longer needed but the structure of the table has to be
retained, the records alone can be deleted with TRUNCATE TABLE.
o Syntax:
truncate table <table name>;

SQL> TRUNCATE TABLE CUSTOMERS;

 By adding the REUSE STORAGE clause to the same command, the space already allocated to the table is
kept for the table's future rows instead of being released.
o Syntax:
truncate table <table name> reuse storage;

SQL> TRUNCATE TABLE CUSTOMERS REUSE STORAGE;

 Command to view the table’s structure


 To view the structure of a table, use the DESC (describe) command.
o Syntax:
desc <table name>;

SQL> DESC CUSTOMERS;

The output of this command will be:

Name     Null?  Type
-------  -----  ---------
CID             INTEGER
CNAME           CHAR (30)
ACCNO           INTEGER
BNAME           CHAR (40)
AMT             REAL

 Inserting Tuples in a table


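 Tuples are inserted into a table using the INSERT command:

SQL> INSERT INTO CUSTOMERS
(CID, CNAME, ACCNO, BNAME, AMT) VALUES (6001, 'Kirpal Singh', 41, 'UTI', 15000.00);

 Note: Listing the columns is not compulsory. If the column list is omitted, a value must be
supplied for every column, in the order in which the columns were defined.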


 Deleting tuples from a table
 Tuples in the table can be deleted using a DELETE command in SQL as
SQL > DELETE FROM CUSTOMERS WHERE ACCNO=41;

 Modifying the Column values


 Values in a particular row can be changed using an UPDATE command as

SQL> UPDATE CUSTOMERS
SET CNAME = 'Amarnath'
WHERE CID = 6001;

 The keyword WHERE determines which rows are modified, and the keyword SET determines how they
are modified (i.e., which columns receive which new values).

 When a customer deposits some money in his account, the instance can be updated as

SQL> UPDATE CUSTOMERS
SET AMT = AMT + 1000
WHERE ACCNO = 41;

 The updated instance is shown below.

CID   CNAME         ACCNO  BNAME        AMT
6001  Amarnath      41     UTI          16000.00
6312  Renuka        52     SBI          20000.00
6448  Ramesh        60     SBH          10500.00
6015  Raju          72     Andhra Bank  18050.00
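
The statements of this section can be tried end to end with a short script. The sketch below uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for the Oracle session shown above, so the `SQL>` prompt and Oracle-only commands such as DESC and TRUNCATE are left out. The function name `run_customers_demo` is hypothetical.

```python
import sqlite3

def run_customers_demo():
    """Create, fill, and update CUSTOMERS in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE CUSTOMERS ("
        "CID INTEGER, CNAME CHAR(30), ACCNO INTEGER, BNAME CHAR(40), AMT REAL)"
    )
    rows = [
        (6001, "Kirpal Singh", 41, "UTI", 15000.00),
        (6312, "Renuka", 52, "SBI", 20000.00),
        (6448, "Ramesh", 60, "SBH", 10500.00),
        (6015, "Raju", 72, "Andhra Bank", 18050.00),
    ]
    # No column list is given, so each row supplies all five columns in order.
    cur.executemany("INSERT INTO CUSTOMERS VALUES (?, ?, ?, ?, ?)", rows)
    # WHERE picks the rows, SET says how they change.
    cur.execute("UPDATE CUSTOMERS SET CNAME = 'Amarnath' WHERE CID = 6001")
    cur.execute("UPDATE CUSTOMERS SET AMT = AMT + 1000 WHERE ACCNO = 41")
    cur.execute("SELECT CID, CNAME, ACCNO, BNAME, AMT FROM CUSTOMERS ORDER BY ACCNO")
    result = cur.fetchall()
    conn.close()
    return result

updated_instance = run_customers_demo()
```

The first row of `updated_instance` corresponds to the renamed customer with the increased amount; the other three rows are unchanged.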

2.3 Integrity Constraints: Over Relations

 An integrity constraint is a condition that ensures only correct data is inserted into the database,
thereby preserving the consistency of the data. In other words, the DBMS specifies conditions that must
be satisfied when data is inserted and stored in the database, which prevents the entry of incorrect
information.

 Example: The roll number of a student cannot be a decimal value. The database enforces
the constraint that an instance of roll number can have only integer values.

 Integrity constraints are of following types:

o Entity Integrity Constraints / Key Constraints


o Referential Integrity Constraints/foreign Key constraints
o Domain Constraints
o Column Constraints
o User-defined integrity checks
o General Constraints
 Entity Integrity Constraints / Key Constraints
 An entity integrity constraint or key constraint specifies a condition that restricts the data that
can be stored in a single relation.

 Key Constraint
• A key constraint is a statement that a certain minimal subset of the fields of a relation is a unique
identifier for a tuple.
• Key constraint is the general term; a set of fields that satisfies a key constraint is called a
candidate key.
 Candidate Key
• A set of fields that uniquely identifies a tuple according to a key constraint is called a candidate
key for the relation.
 Super Key
• A super key is a superset of a candidate key, i.e. a set of fields that contains a
candidate key.
• For Example: The set of fields {sid, login} is a super key. Here sid on its own satisfies a key
constraint, and so does login. Any set of fields that includes such a candidate key, like {sid, login}
or {sid, name}, is a super key.
 Specifying Key constraint in SQL
 In SQL, we can declare that columns/fields of a table form a candidate key using two constraints.
o Unique Key: The purpose of a unique key is to ensure that information in the column is unique.
• The data held across the column MUST be UNIQUE; it must not be repeated across the
column.
• The column can be left blank, i.e. hold a NULL value.
• In the example below, no two students can have the same login, but a student may have a
NULL login.
o Primary Key: The purpose of a primary key is to ensure that information in the column is unique
and MUST be entered.

• The data held across the column MUST be UNIQUE; it must not be repeated across the
column.
• The column cannot be left blank in any row; a value must be entered, i.e. the NOT NULL
attribute is active.
• In the example below, no two students can have the same sid, and EVERY student
must have a sid.
 Let us revisit our example of table definition and specify key information:
SQL> CREATE TABLE students
(SID CHAR (20) PRIMARY KEY,
NAME CHAR (30),
LOGIN CHAR (20) UNIQUE,
AGE INTEGER,
GPA REAL);

 In the above example, no two students can have the same login, but a student may have a NULL
login.

 Referential Integrity Constraints (or) Foreign Key Constraint


 A referential integrity constraint specifies a condition that must be satisfied across more than one
relation, where the relations are connected by some relationship.
 For Example: If a relation includes a foreign key matching the primary key of some other relation, then
every value of the foreign key in that relation must be equal to the primary key value of some tuple in the other relation.
 Specifying Referential Integrity Constraint in SQL
 In SQL, we can declare that columns/fields of a table form a referential integrity constraint using:
 FOREIGN KEY:
• Foreign keys represent relationships between tables. A foreign key is a column whose values
are derived from the primary key or unique key of some other table.
• The table in which the foreign key is defined is called the foreign table or detail table. The table
that defines the primary or unique key and is referenced by the foreign key is called the primary
table or master table.
• The master table is named in the foreign key definition by the clause
REFERENCES <master table name>, written when defining the foreign key column in the
foreign table.
• For Example: the foreign key constraint states that every sid value in Enrolled must also
appear in sid of Students.
 Let us create the new relation Enrolled, connected with the Students relation of the previous example.

SQL> CREATE TABLE ENROLLED
(SID CHAR (20),
CID CHAR (20) PRIMARY KEY, -- column name is missing in the source; CID is assumed
GRADE CHAR (10),
FOREIGN KEY (SID) REFERENCES STUDENTS);
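
To see key and foreign key constraints being enforced, the following sketch runs a few inserts against an in-memory SQLite database through Python's `sqlite3` module. Several details are assumptions of the sketch and not part of the notes: SQLite stands in for Oracle, the Enrolled table uses a composite primary key (SID, CID) instead of the single-column key in the definition above, the course column is called CID, and all sample rows are made up.

```python
import sqlite3

def constraint_outcomes():
    """Try four inserts and report which ones the DBMS accepts."""
    conn = sqlite3.connect(":memory:")
    # SQLite only checks foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE STUDENTS ("
        "SID CHAR(20) PRIMARY KEY, NAME CHAR(30), LOGIN CHAR(20) UNIQUE, "
        "AGE INTEGER, GPA REAL)"
    )
    conn.execute(
        "CREATE TABLE ENROLLED ("
        "SID CHAR(20), CID CHAR(20), GRADE CHAR(10), "
        "PRIMARY KEY (SID, CID), "          # assumed composite key
        "FOREIGN KEY (SID) REFERENCES STUDENTS (SID))"
    )
    conn.execute("INSERT INTO STUDENTS VALUES ('s1', 'Asha', 'asha@cs', 19, 3.4)")
    attempts = {
        "duplicate sid": "INSERT INTO STUDENTS VALUES ('s1', 'Ravi', 'ravi@cs', 20, 3.1)",
        "duplicate login": "INSERT INTO STUDENTS VALUES ('s2', 'Ravi', 'asha@cs', 20, 3.1)",
        "unknown sid": "INSERT INTO ENROLLED VALUES ('s9', 'c1', 'A')",
        "valid enrollment": "INSERT INTO ENROLLED VALUES ('s1', 'c1', 'A')",
    }
    outcomes = {}
    for label, statement in attempts.items():
        try:
            conn.execute(statement)
            outcomes[label] = "accepted"
        except sqlite3.IntegrityError:
            # Raised for primary key, unique, and foreign key violations alike.
            outcomes[label] = "rejected"
    conn.close()
    return outcomes

outcomes = constraint_outcomes()
```

Only the last insert satisfies every constraint: the first violates the primary key, the second the unique key, and the third refers to a student that does not exist.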

 Domain Constraints
 A relation schema specifies the domain of each field or column in the relation.
 For Example: consider the Students relation
Students (sid: string, name: string, login: string, age: integer, gpa: real)

Here sid, name, login, age, gpa are column names with domains string, string, string, integer, real
respectively.

The values that may appear in each column are restricted to its domain; these restrictions are called domain constraints.

 Column Constraints

 The values in any column of any table should be controlled by column constraints, which are
defined for that particular column and recorded once and only once in the central data dictionary.
 Example:
SQL> CREATE TABLE PERSONAL
(EMPNO SMALLINT,
BONUSNO SMALLINT NOT NULL,
BONUSAMT DECIMAL);

 User –defined Integrity Constraints

 This type of facility allows business rules to be applied centrally to the database, so that when a
certain action is performed on a set of data, other actions are automatically triggered, conforming
to some user-defined and centrally recorded set of rules.
 In general terms such facilities are implemented as triggers and take the form of a sequence of
three types of definitions:
o Define the condition which activates the trigger
o Specify the test to be made
o Specify the action to be taken if the test fails.

 General Constraints

 There are two types of general constraints.


o Table Constraints: Table constraints are applied on a particular table and are checked every
time that specific table is updated.
o Assertions: Assertions are applied to a collection of tables and are checked every time
any of these tables is updated.

2.4 Query Languages:

 A query language is a language in which a user requests information from the
database. Query languages are usually at a higher level than general-purpose programming
languages.

 Query languages are of two types


o Procedural language
o Non-Procedural language.
 In procedural language, the user has to describe the specific procedure to retrieve the information
from the database.
 Example: the relational Algebra is a procedural language.
 In non-procedural languages, the user retrieves the information from the database without describing
the specific procedure to retrieve it.
 Example: The tuple relational calculus and the domain relational calculus are non-procedural
languages.

 Determining Keys from E-R Sets

 Strong entity set. The primary key of the entity set becomes the primary key of the relation.
 Weak entity set. The primary key of the relation consists of the union of the primary key of the strong
entity set and the discriminator of the weak entity set.
 Relationship set. The union of the primary keys of the related entity sets becomes a super key of
the relation.
o For binary many-to-one relationship sets, the primary key of the “many” entity set becomes
the relation’s primary key.
o For one-to-one relationship sets, the relation’s primary key can be that of either entity set.
o For many-to-many relationship sets, the union of the primary keys becomes the relation’s
primary key

 Schema Diagram for the Banking Enterprise

 Views
 In some cases, it is not desirable for all users to see the contents of the entire model
 A relation that is not in the model but is made visible to a user as a “virtual relation” is called a view.
 Views not available in MySQL 3.23
 Prevent access to tables or columns using the access privilege system
 View Definition
 A view is defined using the create view statement which has the form
create view v as <query expression>

where <query expression> is any legal relational algebra query expression. The view name
is represented by v.

 Once a view is defined, the view name can be used to refer to the virtual relation that the view
generates.
 View definition is not the same as creating a new relation by evaluating the query expression
 Rather, a view definition causes the saving of an expression; the expression is substituted into
queries using the view.

 Views Defined Using Other Views


 One view may be used in the expression defining another view
 A view relation v1 is said to depend directly on a view relation v2 if v2 is used in the expression
defining v1
 A view relation v1 is said to depend on view relation v2 if either v1 depends directly on v2 or there
is a path of dependencies from v1 to v2
 A view relation v is said to be recursive if it depends on itself.

 View Expansion

 A way to define the meaning of views defined in terms of other views.


 Let view v1 be defined by an expression e1 that may itself contain uses of view relations.
 View expansion of an expression repeats the following replacement step:
repeat
Find any view relation vi in e1
Replace the view relation vi by the expression defining vi
until no more view relations are present in e1

 As long as the view definitions are not recursive, this loop will terminate
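
The replacement loop above can be sketched in a few lines. In this illustration view definitions are plain strings held in a dictionary and a view is recognised by its name appearing as a whole word; both choices, the function name `expand_views`, and the sample view names are assumptions of the sketch, not how a real DBMS stores or parses views.

```python
import re

def expand_views(expression, view_definitions):
    """Replace view names by their defining expressions until none remain.

    Like the loop in the text, this terminates only if the view
    definitions are not recursive.
    """
    while True:
        for name, definition in view_definitions.items():
            pattern = r"\b" + re.escape(name) + r"\b"
            if re.search(pattern, expression):
                # Substitute one occurrence, parenthesised to keep its grouping.
                expression = re.sub(
                    pattern, lambda match: "(" + definition + ")", expression, count=1
                )
                break
        else:
            # No view name was found in this pass: expansion is complete.
            return expression

views = {
    "v1": "select(v2)",            # v1 depends directly on v2
    "v2": "account join depositor",
}
expanded = expand_views("project(v1)", views)
```

After two replacement steps the expression mentions only the stored relations account and depositor.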

Relational Algebra

 Six basic operators


o select
o project
o union
o set difference
o Cartesian product
o rename
 The operators take one or two relations as inputs and give a new relation as a result.
 Select Operation
 Notation: \(\sigma_p(r)\)
 p is called the selection predicate
 Defined as:
\(\sigma_p(r) = \{t \mid t \in r \text{ and } p(t)\}\)

Where p is a formula in propositional calculus consisting of terms connected by: \(\land\) (and), \(\lor\) (or), \(\lnot\)
(not)
Each term is one of:

<attribute> op <attribute> or <constant>

where op is one of: \(=, \neq, >, \geq, <, \leq\)

 Example of selection:
 \(\sigma_{\text{branch-name} = \text{"Perryridge"}}(\text{account})\)
 Project Operation

 Notation: \(\Pi_{A_1, A_2, \ldots, A_k}(r)\)

where \(A_1, A_2, \ldots, A_k\) are attribute names and r is a relation name.

 The result is defined as the relation of k columns obtained by erasing the columns that are not listed
 Duplicate rows are removed from the result, since relations are sets
 E.g. To eliminate the branch-name attribute of account
\(\Pi_{\text{account-number}, \text{balance}}(\text{account})\)
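
As an illustration of selection and projection, the sketch below represents a relation as a list of dictionaries, one per tuple. The helper names `select` and `project` and the sample account rows are assumptions made for this example.

```python
def select(relation, predicate):
    """sigma_p(r): keep the tuples for which the predicate holds."""
    return [t for t in relation if predicate(t)]

def project(relation, attributes):
    """Pi_{A1..Ak}(r): keep the listed columns and drop duplicate rows."""
    seen, result = set(), []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:          # relations are sets, so duplicates go
            seen.add(row)
            result.append(dict(zip(attributes, row)))
    return result

# Sample instance of account (rows are made up for illustration).
account = [
    {"account-number": "A-101", "branch-name": "Downtown", "balance": 500},
    {"account-number": "A-102", "branch-name": "Perryridge", "balance": 400},
    {"account-number": "A-201", "branch-name": "Perryridge", "balance": 900},
]

perryridge_accounts = select(account, lambda t: t["branch-name"] == "Perryridge")
branch_names = project(account, ["branch-name"])
```

Projecting on branch-name returns two rows, not three, because the two Perryridge tuples collapse into one.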

 Union Operation

 Notation: \(r \cup s\)
 Defined as: \(r \cup s = \{t \mid t \in r \text{ or } t \in s\}\)
 For \(r \cup s\) to be valid:
1. r, s must have the same arity (same number of attributes)

2. The attribute domains must be compatible (e.g., 2nd column of r deals with the same type of values
as does the 2nd column of s)

 E.g. to find all customers with either an account or a loan
\(\Pi_{\text{customer-name}}(\text{depositor}) \cup \Pi_{\text{customer-name}}(\text{borrower})\)
 Set Difference Operation

 Notation: r – s
 Defined as: \(r - s = \{t \mid t \in r \text{ and } t \notin s\}\)
 Set differences must be taken between compatible relations.
 r and s must have the same arity
 attribute domains of r and s must be compatible

 Cartesian-Product Operation

 Notation: \(r \times s\)
 Defined as: \(r \times s = \{t\,q \mid t \in r \text{ and } q \in s\}\)
 Assume that attributes of r(R) and s(S) are disjoint. (That is, \(R \cap S = \emptyset\)).
 If attributes of r(R) and s(S) are not disjoint, then renaming must be used.

 Composition of Operations

 Can build expressions using multiple operations
 Example: \(\sigma_{A=C}(r \times s)\), obtained by first forming \(r \times s\) and then selecting the tuples with A = C
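
A brief sketch of the set-based operators and of composing them. Relations are Python sets of tuples with attributes identified by position; union and set difference map directly onto Python's `|` and `-`. The helper names and the sample relations r(A, B), s(C, D, E), and the customer-name sets are assumptions chosen for illustration.

```python
def select(relation, predicate):
    """sigma_p(r) over a set of tuples."""
    return {t for t in relation if predicate(t)}

def product(r, s):
    """r x s: every tuple of r concatenated with every tuple of s."""
    return {t + q for t in r for q in s}

r = {("alpha", 1), ("beta", 2)}                       # schema (A, B)
s = {("alpha", 10, "a"), ("beta", 10, "a"),
     ("beta", 20, "b"), ("gamma", 10, "b")}          # schema (C, D, E)

r_times_s = product(r, s)
# Composition: sigma_{A=C}(r x s); A is position 0 and C is position 2.
a_equals_c = select(r_times_s, lambda t: t[0] == t[2])

# Union and difference require the same arity and compatible domains.
depositor_names = {("Hayes",), ("Smith",)}
borrower_names = {("Hayes",), ("Jones",)}
either = depositor_names | borrower_names
loan_but_no_account = borrower_names - depositor_names
```

The product has 2 × 4 = 8 tuples, of which the selection keeps the three whose A and C values agree.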
 Rename Operation
 Allows us to name, and therefore to refer to, the results of relational-algebra expressions.
 Allows us to refer to a relation by more than one name.
Example: \(\rho_x(E)\)

returns the expression E under the name x.

If a relational-algebra expression E has arity n, then

\(\rho_{x(A_1, A_2, \ldots, A_n)}(E)\)

returns the result of expression E under the name x, and with the attributes renamed to \(A_1, A_2, \ldots, A_n\).

 Banking Example

branch (branch-name, branch-city, assets)

customer (customer-name, customer-street, customer-city)

account (account-number, branch-name, balance)

loan (loan-number, branch-name, amount)

depositor (customer-name, account-number)

borrower (customer-name, loan-number)

 Example Queries
 Find all loans of over $1200
\(\sigma_{\text{amount} > 1200}(\text{loan})\)

 Find the loan number for each loan of an amount greater than $1200
\(\Pi_{\text{loan-number}}(\sigma_{\text{amount} > 1200}(\text{loan}))\)

 Find the names of all customers who have a loan, an account, or both, from the bank
\(\Pi_{\text{customer-name}}(\text{borrower}) \cup \Pi_{\text{customer-name}}(\text{depositor})\)

 Find the names of all customers who have a loan and an account at the bank.
\(\Pi_{\text{customer-name}}(\text{borrower}) \cap \Pi_{\text{customer-name}}(\text{depositor})\)

 Find the names of all customers who have a loan at the Perryridge branch.

\(\Pi_{\text{customer-name}}(\sigma_{\text{branch-name} = \text{"Perryridge"}}(\sigma_{\text{borrower.loan-number} = \text{loan.loan-number}}(\text{borrower} \times \text{loan})))\)

 Find the names of all customers who have a loan at the Perryridge branch but do not have an
account at any branch of the bank.
\(\Pi_{\text{customer-name}}(\sigma_{\text{branch-name} = \text{"Perryridge"}}(\sigma_{\text{borrower.loan-number} = \text{loan.loan-number}}(\text{borrower} \times \text{loan}))) - \Pi_{\text{customer-name}}(\text{depositor})\)

 Find the names of all customers who have a loan at the Perryridge branch.
 Query 1
\(\Pi_{\text{customer-name}}(\sigma_{\text{branch-name} = \text{"Perryridge"}}(\sigma_{\text{borrower.loan-number} = \text{loan.loan-number}}(\text{borrower} \times \text{loan})))\)
 Query 2
\(\Pi_{\text{customer-name}}(\sigma_{\text{loan.loan-number} = \text{borrower.loan-number}}((\sigma_{\text{branch-name} = \text{"Perryridge"}}(\text{loan})) \times \text{borrower}))\)

 Find the largest account balance: Rename account relation as d

The query is:

\(\Pi_{\text{balance}}(\text{account}) - \Pi_{\text{account.balance}}(\sigma_{\text{account.balance} < d.\text{balance}}(\text{account} \times \rho_d(\text{account})))\)
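
The idea behind this query is to first collect every balance that is smaller than some other balance and then subtract those from the set of all balances. A minimal sketch over a set of (account-number, branch-name, balance) tuples follows; the sample rows are made up.

```python
# Sample instance of account(account-number, branch-name, balance).
account = {
    ("A-101", "Downtown", 500),
    ("A-102", "Perryridge", 400),
    ("A-201", "Brighton", 900),
}

# Pi_balance(account)
all_balances = {t[2] for t in account}

# Pi_{account.balance}(sigma_{account.balance < d.balance}(account x rho_d(account))):
# pair every tuple t with every tuple d of the renamed copy and keep
# the balances of t that are below some d's balance.
not_largest = {t[2] for t in account for d in account if t[2] < d[2]}

largest = all_balances - not_largest
```

Only the maximum survives the difference, since it is the one balance never found on the smaller side of a pair.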

 Formal Definition
 A basic expression in the relational algebra consists of either one of the following:
o A relation in the database
o A constant relation
 Let E1 and E2 be relational-algebra expressions; the following are all relational-algebra
expressions:
o \(E_1 \cup E_2\)
o \(E_1 - E_2\)
o \(E_1 \times E_2\)
o \(\sigma_P(E_1)\), P is a predicate on attributes in E1
o \(\Pi_S(E_1)\), S is a list consisting of some of the attributes in E1
o \(\rho_x(E_1)\), x is the new name for the result of E1
 Additional Operations
 We define additional operations that do not add any power to the relational algebra, but that
simplify common queries.
 Set intersection
 Natural join
 Division
 Assignment
 Set-Intersection Operation
 Notation: \(r \cap s\)
 Defined as:
 \(r \cap s = \{t \mid t \in r \text{ and } t \in s\}\)
 Assume:
o r, s have the same arity
o attributes of r and s are compatible
 Note: \(r \cap s = r - (r - s)\)
 Natural-Join Operation
 Notation: \(r \bowtie s\)

 Let r and s be relations on schemas R and S respectively.
Then, \(r \bowtie s\) is a relation on schema \(R \cup S\) obtained as follows:
 Consider each pair of tuples \(t_r\) from r and \(t_s\) from s.
 If \(t_r\) and \(t_s\) have the same value on each of the attributes in \(R \cap S\), add a tuple t to the
result, where
• t has the same value as \(t_r\) on the attributes of R
• t has the same value as \(t_s\) on the attributes of S
 Example:
R = (A, B, C, D)

S = (E, B, D)

 Result schema = (A, B, C, D, E)
 \(r \bowtie s\) is defined as:
\(\Pi_{r.A, r.B, r.C, r.D, s.E}(\sigma_{r.B = s.B \land r.D = s.D}(r \times s))\)
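
The pairwise procedure described above translates almost directly into code. In this sketch a tuple is a dictionary from attribute name to value, so the common attributes are simply the shared keys; the function name `natural_join` and the sample tuples are assumptions for illustration.

```python
def natural_join(r, s):
    """Join tuples of r and s that agree on all attributes they share."""
    result = []
    for tr in r:
        for ts in s:
            common = set(tr) & set(ts)               # attributes in both schemas
            if all(tr[a] == ts[a] for a in common):
                merged = {**tr, **ts}                # tuple on the union of schemas
                if merged not in result:             # keep set semantics
                    result.append(merged)
    return result

# r on schema (A, B, C, D) and s on schema (E, B, D); rows are made up.
r = [
    {"A": "a1", "B": 1, "C": "c1", "D": "x"},
    {"A": "a2", "B": 2, "C": "c2", "D": "y"},
]
s = [
    {"E": "e1", "B": 1, "D": "x"},
    {"E": "e2", "B": 2, "D": "x"},
]

joined = natural_join(r, s)
```

The second tuple of r matches nothing: it agrees with the second tuple of s on B but not on D, and both common attributes must match.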
 Division Operation
 Notation: \(r \div s\)

 Suited to queries that include the phrase “for all”.

 Let r and s be relations on schemas R and S respectively where
 \(R = (A_1, \ldots, A_m, B_1, \ldots, B_n)\)
 \(S = (B_1, \ldots, B_n)\)
The result of \(r \div s\) is a relation on schema

\(R - S = (A_1, \ldots, A_m)\)

\(r \div s = \{t \mid t \in \Pi_{R-S}(r) \land \forall u \in s\,(tu \in r)\}\)

 Property
 Let \(q = r \div s\)
 Then q is the largest relation satisfying \(q \times s \subseteq r\)

 Definition in terms of the basic algebra operation

Let r(R) and s(S) be relations, and let \(S \subseteq R\)

\(r \div s = \Pi_{R-S}(r) - \Pi_{R-S}((\Pi_{R-S}(r) \times s) - \Pi_{R-S,S}(r))\)

To see why

 \(\Pi_{R-S,S}(r)\) simply reorders attributes of r
 \(\Pi_{R-S}((\Pi_{R-S}(r) \times s) - \Pi_{R-S,S}(r))\) gives those tuples t in
\(\Pi_{R-S}(r)\) such that for some tuple \(u \in s\), \(tu \notin r\).
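
The definition in terms of the basic operators can be followed step by step in code. To keep the sketch short, r is assumed to be a set of pairs (t, u), where t is the R−S part and u the S part of each tuple, and s is a set of u values; the function name `divide` and the sample data are illustrative assumptions.

```python
def divide(r, s):
    """r ÷ s for r given as a set of (t, u) pairs and s as a set of u values."""
    candidates = {t for t, _ in r}                          # Pi_{R-S}(r)
    # (Pi_{R-S}(r) x s) - Pi_{R-S,S}(r): the pairs that would have to be in r
    # for t to qualify, but are not.
    missing = {(t, u) for t in candidates for u in s} - r
    disqualified = {t for t, _ in missing}                  # Pi_{R-S}(missing)
    return candidates - disqualified

# (customer-name, branch-name) pairs and the branches to be covered.
has_account_at = {
    ("Hayes", "Downtown"),
    ("Hayes", "Uptown"),
    ("Jones", "Downtown"),
}
branches = {"Downtown", "Uptown"}

customers_at_all_branches = divide(has_account_at, branches)
```

Jones is disqualified because the pair (Jones, Uptown) is missing from r, which is exactly the “for some u in s, tu not in r” condition.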


 Assignment Operation

 The assignment operation (\(\leftarrow\)) provides a convenient way to express complex queries.
 Write query as a sequential program consisting of
• a series of assignments
• followed by an expression whose value is displayed as a result of the query.
 Assignment must always be made to a temporary relation variable.
 Example: Write \(r \div s\) as
\(\text{temp1} \leftarrow \Pi_{R-S}(r)\)
\(\text{temp2} \leftarrow \Pi_{R-S}((\text{temp1} \times s) - \Pi_{R-S,S}(r))\)
\(\text{result} = \text{temp1} - \text{temp2}\)

 The result to the right of the \(\leftarrow\) is assigned to the relation variable on the left of the \(\leftarrow\).
 May use variable in subsequent expressions.

 Example Queries

 Find all customers who have an account from at least the “Downtown” and the “Uptown” branches.

Query 1

\(\Pi_{CN}(\sigma_{BN = \text{"Downtown"}}(\text{depositor} \bowtie \text{account})) \cap \Pi_{CN}(\sigma_{BN = \text{"Uptown"}}(\text{depositor} \bowtie \text{account}))\)

where CN denotes customer-name and BN denotes branch-name.

Query 2

\(\Pi_{\text{customer-name}, \text{branch-name}}(\text{depositor} \bowtie \text{account}) \div \rho_{\text{temp}(\text{branch-name})}(\{(\text{"Downtown"}), (\text{"Uptown"})\})\)

 Find all customers who have an account at all branches located in Brooklyn city.

\(\Pi_{\text{customer-name}, \text{branch-name}}(\text{depositor} \bowtie \text{account}) \div \Pi_{\text{branch-name}}(\sigma_{\text{branch-city} = \text{"Brooklyn"}}(\text{branch}))\)

 Modification of the Database

 The content of the database may be modified using the following operations:
 Deletion
 Insertion
 Updating
 All these operations are expressed using the assignment operator.
 Deletion

 A delete request is a query where the selected tuples are removed from the database.
 Can only delete whole tuples; cannot delete values on only particular attributes
 A deletion is expressed in relational algebra by:
\(r \leftarrow r - E\)

where r is a relation and E is a relational algebra query.

 NB E might be a constant relation specifying a single tuple to be deleted

 Deletion Examples

 Delete all account records in the Perryridge branch.


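\(\text{account} \leftarrow \text{account} - \sigma_{\text{branch-name} = \text{"Perryridge"}}(\text{account})\)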


 Delete all accounts at branches located in Needham.


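\(r_1 \leftarrow \sigma_{\text{branch-city} = \text{"Needham"}}(\text{account} \bowtie \text{branch})\)

\(r_2 \leftarrow \Pi_{\text{account-number}, \text{branch-name}, \text{balance}}(r_1)\)

\(r_3 \leftarrow \Pi_{\text{customer-name}, \text{account-number}}(r_2 \bowtie \text{depositor})\)

\(\text{account} \leftarrow \text{account} - r_2\)

\(\text{depositor} \leftarrow \text{depositor} - r_3\)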

 Insertion

 To insert data into a relation, we either:


 specify a tuple to be inserted
 write a query whose result is a set of tuples to be inserted
 In relational algebra, an insertion is expressed by:
\(r \leftarrow r \cup E\)

where r is a relation and E is a relational algebra expression.

 The insertion of a single tuple is expressed by letting E be a constant relation containing one tuple.
 Insertion Examples
 Insert information in the database specifying that Smith has $1200 in account A-973 at the
Perryridge branch.
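\(\text{account} \leftarrow \text{account} \cup \{(\text{A-973}, \text{"Perryridge"}, 1200)\}\)

\(\text{depositor} \leftarrow \text{depositor} \cup \{(\text{"Smith"}, \text{A-973})\}\)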

 Provide as a gift for all loan customers in the Perryridge branch, a $200 savings account. Let the loan
number serve as the account number for the new savings account.
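\(r_1 \leftarrow (\sigma_{\text{branch-name} = \text{"Perryridge"}}(\text{borrower} \bowtie \text{loan}))\)

\(\text{account} \leftarrow \text{account} \cup \Pi_{\text{loan-number}, \text{branch-name}, 200}(r_1)\)

\(\text{depositor} \leftarrow \text{depositor} \cup \Pi_{\text{customer-name}, \text{loan-number}}(r_1)\)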

 Updating

 A mechanism to change a value in a tuple without changing all values in the tuple
 Use the generalized projection operator to do this task
\(r \leftarrow \Pi_{F_1, F_2, \ldots, F_l}(r)\)

 Each Fi is either
 the ith attribute of r, if the ith attribute is not updated, or,
 if the attribute is to be updated Fi is an expression, involving only constants and the
attributes of r, which gives the new value for the attribute

 Update Examples

 Make interest payments by increasing all balances by 5 percent.

\(\text{account} \leftarrow \Pi_{AN, BN, BAL * 1.05}(\text{account})\)

where AN, BN and BAL stand for account-number, branch-name and balance, respectively.

 Pay all accounts with balances over $10,000 6 percent interest and pay all others 5 percent
\(\text{account} \leftarrow \Pi_{AN, BN, BAL * 1.06}(\sigma_{BAL > 10000}(\text{account})) \cup \Pi_{AN, BN, BAL * 1.05}(\sigma_{BAL \leq 10000}(\text{account}))\)
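
Deletion, insertion, and updating can all be read as assignments of a new set to the relation variable. The sketch below replays one example of each on a set of (account-number, branch-name, balance) tuples. The starting rows, the rounding to two decimals, and the function name `pay_interest` are assumptions made for illustration.

```python
# Sample instance of account(account-number, branch-name, balance).
account = {
    ("A-101", "Downtown", 500.0),
    ("A-102", "Perryridge", 400.0),
    ("A-305", "Round Hill", 12000.0),
}

# Deletion: account <- account - sigma_{branch-name = "Perryridge"}(account)
account = account - {t for t in account if t[1] == "Perryridge"}

# Insertion: account <- account U {(A-973, "Perryridge", 1200)}
account = account | {("A-973", "Perryridge", 1200.0)}

def pay_interest(relation):
    """Generalized projection: 6 percent above 10000, 5 percent otherwise."""
    high = {(an, bn, round(bal * 1.06, 2)) for an, bn, bal in relation if bal > 10000}
    low = {(an, bn, round(bal * 1.05, 2)) for an, bn, bal in relation if bal <= 10000}
    return high | low

# Updating: the result of the projection is assigned back to account.
account = pay_interest(account)
```

Splitting the relation with two complementary selections ensures each tuple receives exactly one of the two interest rates.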

Tuple Relational Calculus

 A nonprocedural query language, where each query is of the form


{t | P (t) }

 It is the set of all tuples t such that predicate P is true for t


 t is a tuple variable, t[A] denotes the value of tuple t on attribute A
 \(t \in r\) denotes that tuple t is in relation r
 P is a formula similar to that of the predicate calculus

 Predicate Calculus Formula

1. Set of attributes and constants

2. Set of comparison operators: (e.g., \(<, \leq, =, \neq, >, \geq\))

3. Set of connectives: and (\(\land\)), or (\(\lor\)), not (\(\lnot\))

4. Implication (\(\Rightarrow\)): \(x \Rightarrow y\), if x is true, then y is true

\(x \Rightarrow y \equiv \lnot x \lor y\)

5. Set of quantifiers:

 \(\exists t \in r\,(Q(t))\) \(\equiv\) “there exists” a tuple t in relation r such that predicate Q(t) is true
 \(\forall t \in r\,(Q(t))\) \(\equiv\) Q is true “for all” tuples t in relation r

 Banking Example

 branch (branch-name, branch-city, assets)


 customer (customer-name, customer-street, customer-city)
 account (account-number, branch-name, balance)
 loan (loan-number, branch-name, amount)
 depositor (customer-name, account-number)
 borrower (customer-name, loan-number)

 Example Queries
 Find the loan-number, branch-name, and amount for loans of over $1200
\(\{t \mid t \in \text{loan} \land t[\text{amount}] > 1200\}\)

 Find the loan number for each loan of an amount greater than $1200
\(\{t \mid \exists s \in \text{loan}\,(t[\text{loan-number}] = s[\text{loan-number}] \land s[\text{amount}] > 1200)\}\)

Notice that a relation on schema [loan-number] is implicitly defined by the query

 Find the names of all customers having a loan, an account, or both at the bank
\(\{t \mid \exists s \in \text{borrower}\,(t[\text{customer-name}] = s[\text{customer-name}]) \lor \exists u \in \text{depositor}\,(t[\text{customer-name}] = u[\text{customer-name}])\}\)

 Find the names of all customers who have a loan and an account at the bank
\(\{t \mid \exists s \in \text{borrower}\,(t[\text{customer-name}] = s[\text{customer-name}]) \land \exists u \in \text{depositor}\,(t[\text{customer-name}] = u[\text{customer-name}])\}\)

 Find the names of all customers having a loan at the Perryridge branch
\(\{t \mid \exists s \in \text{borrower}\,(t[\text{customer-name}] = s[\text{customer-name}] \land \exists u \in \text{loan}\,(u[\text{branch-name}] = \text{"Perryridge"} \land u[\text{loan-number}] = s[\text{loan-number}]))\}\)

 Find the names of all customers who have a loan at the Perryridge branch, but no account at any
branch of the bank

\(\{t \mid \exists s \in \text{borrower}\,(t[\text{customer-name}] = s[\text{customer-name}] \land \exists u \in \text{loan}\,(u[\text{branch-name}] = \text{"Perryridge"} \land u[\text{loan-number}] = s[\text{loan-number}])) \land \lnot \exists v \in \text{depositor}\,(v[\text{customer-name}] = t[\text{customer-name}])\}\)

 Find the names of all customers having a loan from the Perryridge branch, and the cities they live
in

\(\{t \mid \exists s \in \text{loan}\,(s[\text{branch-name}] = \text{"Perryridge"} \land \exists u \in \text{borrower}\,(u[\text{loan-number}] = s[\text{loan-number}] \land t[\text{customer-name}] = u[\text{customer-name}] \land \exists v \in \text{customer}\,(u[\text{customer-name}] = v[\text{customer-name}] \land t[\text{customer-city}] = v[\text{customer-city}])))\}\)

 Find the names of all customers who have an account at all branches located in Brooklyn:
\(\{t \mid \exists c \in \text{customer}\,(t[\text{customer-name}] = c[\text{customer-name}]) \land \forall s \in \text{branch}\,(s[\text{branch-city}] = \text{"Brooklyn"} \Rightarrow \exists u \in \text{account}\,(s[\text{branch-name}] = u[\text{branch-name}] \land \exists s \in \text{depositor}\,(t[\text{customer-name}] = s[\text{customer-name}] \land s[\text{account-number}] = u[\text{account-number}])))\}\)
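
Tuple calculus queries read much like set comprehensions: the quantifier "there exists" corresponds to Python's `any`, "for all" to `all`, and the connectives to `and`, `or`, `not`. The sketch below restates two of the queries above over small sample relations. The named tuples, the underscore spelling of attribute names, and all rows are assumptions made for illustration.

```python
from collections import namedtuple

Loan = namedtuple("Loan", "loan_number branch_name amount")
Borrower = namedtuple("Borrower", "customer_name loan_number")
Depositor = namedtuple("Depositor", "customer_name account_number")

# Sample relations (rows are made up).
loan = {
    Loan("L-11", "Round Hill", 900),
    Loan("L-15", "Perryridge", 1500),
    Loan("L-16", "Perryridge", 1300),
}
borrower = {Borrower("Smith", "L-11"), Borrower("Hayes", "L-15"), Borrower("Adams", "L-16")}
depositor = {Depositor("Hayes", "A-102"), Depositor("Jones", "A-217")}

# Loan numbers of loans with an amount greater than 1200.
big_loans = {s.loan_number for s in loan if s.amount > 1200}

# Customers with a loan at Perryridge but no account at any branch.
perryridge_loan_no_account = {
    s.customer_name
    for s in borrower
    if any(u.branch_name == "Perryridge" and u.loan_number == s.loan_number for u in loan)
    and not any(v.customer_name == s.customer_name for v in depositor)
}
```

The comprehension states which tuples qualify without prescribing an evaluation order, which is the sense in which the calculus is nonprocedural.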

 Safety of Expressions

 It is possible to write tuple calculus expressions that generate infinite relations.


• For example, \(\{t \mid \lnot (t \in r)\}\) results in an infinite relation if the domain of any attribute
of relation r is infinite
 To guard against the problem, we restrict the set of allowable expressions to safe expressions.
 An expression {t | P(t)} in the tuple relational calculus is safe if every component of t appears in
one of the relations, tuples, or constants that appear in P
• NOTE: this is more than just a syntax condition.
• E.g. \(\{t \mid t[A] = 5 \lor \text{true}\}\) is not safe --- it defines an infinite set with attribute
values that do not appear in any relation or tuples or constants in P.

Domain Relational Calculus

 A nonprocedural query language equivalent in power to the tuple relational calculus


 Each query is an expression of the form: \(\{\langle x_1, x_2, \ldots, x_n \rangle \mid P(x_1, x_2, \ldots, x_n)\}\)

 x1, x2, …, xn represent domain variables


 P represents a formula similar to that of the predicate calculus
 Example Queries
 Find the loan-number, branch-name, and amount for loans of over $1200
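\(\{\langle l, b, a \rangle \mid \langle l, b, a \rangle \in \text{loan} \land a > 1200\}\)

 Find the names of all customers who have a loan of over $1200
\(\{\langle c \rangle \mid \exists l, b, a\,(\langle c, l \rangle \in \text{borrower} \land \langle l, b, a \rangle \in \text{loan} \land a > 1200)\}\)

 Find the names of all customers who have a loan from the Perryridge branch and the loan amount:
\(\{\langle c, a \rangle \mid \exists l\,(\langle c, l \rangle \in \text{borrower} \land \exists b\,(\langle l, b, a \rangle \in \text{loan} \land b = \text{"Perryridge"}))\}\)

or \(\{\langle c, a \rangle \mid \exists l\,(\langle c, l \rangle \in \text{borrower} \land \langle l, \text{"Perryridge"}, a \rangle \in \text{loan})\}\)

 Find the names of all customers having a loan, an account, or both at the Perryridge branch:
\(\{\langle c \rangle \mid \exists l\,(\langle c, l \rangle \in \text{borrower} \land \exists b, a\,(\langle l, b, a \rangle \in \text{loan} \land b = \text{"Perryridge"})) \lor \exists a\,(\langle c, a \rangle \in \text{depositor} \land \exists b, n\,(\langle a, b, n \rangle \in \text{account} \land b = \text{"Perryridge"}))\}\)
 Find the names of all customers who have an account at all branches located in Brooklyn:

\(\{\langle c \rangle \mid \exists s, n\,(\langle c, s, n \rangle \in \text{customer}) \land \forall x, y, z\,(\langle x, y, z \rangle \in \text{branch} \land y = \text{"Brooklyn"} \Rightarrow \exists a, b\,(\langle a, x, b \rangle \in \text{account} \land \langle c, a \rangle \in \text{depositor}))\}\)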

 Safety of Expressions

 \(\{\langle x_1, x_2, \ldots, x_n \rangle \mid P(x_1, x_2, \ldots, x_n)\}\) is safe if all of the following hold:

1. All values that appear in tuples of the expression are values from dom(P) (that is, the values
appear either in P or in a tuple of a relation mentioned in P).

2. For every “there exists” subformula of the form \(\exists x\,(P_1(x))\), the subformula is true if and only
if there is a value x in dom(P1) such that P1(x) is true.

3. For every “for all” subformula of the form \(\forall x\,(P_1(x))\), the subformula is true if and only if P1(x)
is true for all values x from dom(P1).

SQL
 Basic Structure
 Set Operations
 Aggregate Functions
 Null Values
 Nested Subqueries
 Derived Relations
 Views
 Modification of the Database
 Joined Relations
 Data Definition Language
 Schema Used in Examples

3.1 Basic Structure

 SQL is based on set and relational operations with certain modifications and
enhancements
 A typical SQL query has the form:

select A1, A2, ..., An


from r1, r2, ..., rm
where P

 Each Ai represents an attribute
 Each ri represents a relation
 P is a predicate.

 This query is equivalent to the relational algebra expression

Π A1, A2, ..., An (σP (r1 × r2 × ... × rm))

 The result of an SQL query is a relation.

 The select Clause


 The select clause lists the attributes desired in the result of a query
 corresponds to the projection operation of the relational algebra
 E.g. find the names of all branches in the loan relation
select branch-name
from loan
 In the “pure” relational algebra syntax, the query would be:
Π branch-name (loan)

 NOTE: SQL does not permit the ‘-’ character in names,


 Use, e.g., branch_name instead of branch-name in a real implementation.
 We use ‘-’ since it looks nicer!
 NOTE: SQL names are case insensitive, i.e. you can use capital or small letters.
 You may wish to use upper case wherever we use bold font.
 SQL allows duplicates in relations as well as in query results.
 To force the elimination of duplicates, insert the keyword distinct after select.
 Find the names of all branches in the loan relation, and remove duplicates
select distinct branch-name
from loan

 The keyword all specifies that duplicates not be removed.


select all branch-name
from loan

 An asterisk in the select clause denotes “all attributes”


select *
from loan

 The select clause can contain arithmetic expressions involving the operations +, –, *,
and /, and operating on constants or attributes of tuples.
 The query:
select loan-number, branch-name, amount * 100
from loan

would return a relation which is the same as the loan relation, except that the
attribute amount is multiplied by 100.

 The where Clause


 The where clause specifies conditions that the result must satisfy
 Corresponds to the selection predicate of the relational algebra.
 Find all loan numbers for loans made at the Perryridge branch with loan amounts
greater than $1200.
select loan-number
from loan
where branch-name = ‘Perryridge’ and amount > 1200
 Comparison results can be combined using the logical connectives and, or, and not.
 Comparisons can be applied to results of arithmetic expressions.
 SQL includes a between comparison operator
 E.g. Find the loan number of those loans with loan amounts between $90,000 and
$100,000 (that is, ≥ $90,000 and ≤ $100,000)

select loan-number
from loan
where amount between 90000 and 100000
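The select, where and between examples above can be tried on a small in-memory database. The following is a minimal sketch using Python's standard sqlite3 module; the rows are made-up sample data, and the names use underscores (loan_number) because real SQL does not accept ‘-’ in names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table loan (loan_number text, branch_name text, amount integer)")
# Hypothetical sample rows, not taken from the notes.
conn.executemany("insert into loan values (?, ?, ?)", [
    ("L-11", "Round Hill", 900),
    ("L-15", "Perryridge", 1500),
    ("L-16", "Perryridge", 1300),
    ("L-17", "Downtown", 95000),
])

# where clause: Perryridge loans over $1200
perryridge = conn.execute(
    "select loan_number from loan "
    "where branch_name = 'Perryridge' and amount > 1200 "
    "order by loan_number").fetchall()

# between includes both end points
mid_range = conn.execute(
    "select loan_number from loan "
    "where amount between 90000 and 100000").fetchall()

# distinct removes the second 'Perryridge'
branches = conn.execute(
    "select distinct branch_name from loan order by branch_name").fetchall()

print(perryridge, mid_range, branches)
```

Each result is a list of tuples, one tuple per row, which matches the statement that the result of an SQL query is a relation.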

 The from Clause


 The from clause lists the relations involved in the query
 corresponds to the Cartesian product operation of the relational algebra.
 Find the Cartesian product borrower × loan
select *
from borrower, loan
 Find the name, loan number and loan amount of all customers
having a loan at the Perryridge branch.
select customer-name, borrower.loan-number, amount
from borrower, loan
where borrower.loan-number = loan.loan-number and
branch-name = ‘Perryridge’

 The Rename Operation


 The SQL allows renaming relations and attributes using the as clause:
old-name as new-name
 Find the name, loan number and loan amount of all customers; rename the column
name loan-number as loan-id.
select customer-name, borrower.loan-number as loan-id, amount
from borrower, loan
where borrower.loan-number = loan.loan-number

 Tuple Variables
 Tuple variables are defined in the from clause via the use of the as clause.
 Find the customer names, their loan numbers and loan amounts for all customers having a loan at
some branch.
select customer-name, T.loan-number, S.amount
from borrower as T, loan as S
where T.loan-number = S.loan-number

 Find the names of all branches that have greater assets than some branch located in
Brooklyn.

select distinct T.branch-name


from branch as T, branch as S
where T.assets > S.assets and S.branch-city = ‘Brooklyn’

String Operations
 SQL includes a string-matching operator for comparisons on character strings.
Patterns are described using two special characters:
 percent (%). The % character matches any substring.
 underscore (_). The _ character matches any single character.

 Find the names of all customers whose street includes the substring “Main”.
select customer-name
from customer
where customer-street like ‘%Main%’

 Match the name “Main%”


like ‘Main\%’ escape ‘\’

 SQL supports a variety of string operations such as


 concatenation (using “||”)
 converting from upper to lower case (and vice versa)
 finding string length, extracting substrings, etc.

 Ordering the Display of Tuples

 List in alphabetic order the names of all customers having a loan in Perryridge
branch
select distinct customer-name
from borrower, loan
where borrower.loan-number = loan.loan-number and
branch-name = ‘Perryridge’
order by customer-name

 We may specify desc for descending order or asc for ascending order, for each
attribute; ascending order is the default.
 E.g. order by customer-name desc
Set Operations
 union, intersect, and except
 operate on relations
 correspond to the relational algebra operations ∪, ∩, −
 Each of the above operations eliminates duplicates
 to retain all duplicates use: union all, intersect all and except all.
 Suppose a tuple occurs m times in r and n times in s, then, it occurs:
 m + n times in r union all s
 min(m,n) times in r intersect all s
 max(0, m – n) times in r except all s
 Find all customers who have a loan, an account, or both:
(select customer-name from depositor)
union
(select customer-name from borrower)

 Find all customers who have both a loan and an account.


(select customer-name from depositor)
intersect
(select customer-name from borrower)

 Find all customers who have an account but no loan.


(select customer-name from depositor)
except
(select customer-name from borrower)
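The three set operations can be compared side by side. The sketch below runs them in SQLite through Python's sqlite3 module on made-up rows. Two dialect points are assumptions of this example rather than part of the notes: SQLite does not accept parentheses around the operands, and it supports union all but not intersect all or except all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table depositor (customer_name text, account_number text);
create table borrower  (customer_name text, loan_number text);
-- Hypothetical sample rows: Jones has two accounts and one loan.
insert into depositor values ('Hayes', 'A-102'), ('Jones', 'A-217'), ('Jones', 'A-305');
insert into borrower  values ('Jones', 'L-17'), ('Smith', 'L-11');
""")

def names(sql):
    """Run a one-column query and return its values sorted."""
    return sorted(row[0] for row in conn.execute(sql))

either = names("select customer_name from depositor "
               "union select customer_name from borrower")
both = names("select customer_name from depositor "
             "intersect select customer_name from borrower")
no_loan = names("select customer_name from depositor "
                "except select customer_name from borrower")
# union all keeps duplicates: Jones occurs m + n = 2 + 1 times
keep_dup = names("select customer_name from depositor "
                 "union all select customer_name from borrower")

print(either, both, no_loan, keep_dup)
```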

Aggregate Functions

 These functions operate on the multiset of values of a column of a relation, and


return a value: avg, min, max, sum, count

 Find the average account balance at the Perryridge branch.


select avg (balance)
from account
where branch-name = ‘Perryridge’

 Find the number of tuples in the customer relation.


select count (*)
from customer

 Find the number of depositors in the bank.


select count (distinct customer-name)
from depositor

 Find the number of depositors for each branch.


select branch-name, count (distinct customer-name)
from depositor, account
where depositor.account-number = account.account-number
group by branch-name

 Find the names of all branches where the average account balance is more than
$1,200.
select branch-name, avg (balance)
from account
group by branch-name
having avg (balance) > 1200

Note: Predicates in the having clause are applied after the formation of groups
whereas predicates in the where clause are applied before forming groups.
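The difference between where and having shows up when both appear in one query. The sketch below, again in SQLite via Python with made-up balances, runs the branch-average query with and without a where clause; the threshold of 500 in the where clause is an assumption chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table account (account_number text, branch_name text, balance integer);
-- Hypothetical sample rows
insert into account values
  ('A-101', 'Downtown', 500), ('A-102', 'Perryridge', 400),
  ('A-201', 'Perryridge', 2600), ('A-215', 'Mianus', 700),
  ('A-217', 'Brighton', 3000);
""")

# having only: groups are formed from all rows, then filtered by their average
rows_all = conn.execute("""
    select branch_name, avg(balance)
    from account
    group by branch_name
    having avg(balance) > 1200
    order by branch_name""").fetchall()

# where + having: the 400 balance is dropped before groups are formed,
# so the Perryridge average is computed from the remaining row only
rows_filtered = conn.execute("""
    select branch_name, avg(balance)
    from account
    where balance >= 500
    group by branch_name
    having avg(balance) > 1200
    order by branch_name""").fetchall()

print(rows_all)
print(rows_filtered)
```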

Null Values
 Attributes may have the null value
 The predicate is null can be used to check for null values.
 E.g. Find all loan numbers which appear in the loan relation with null values
for amount.
select loan-number
from loan
where amount is null

 The result of any arithmetic expression involving null is null


 Total all loan amounts
select sum (amount)
from loan

 Above statement ignores null amounts.


 result is null if all amounts are null.
 All aggregate operations except count(*) ignore tuples with null values on the
aggregated attributes.
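The rules for null values can be checked directly. In the sketch below (SQLite via Python, made-up rows), Python's None stands for SQL null in both directions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table loan (loan_number text, branch_name text, amount integer)")
# Hypothetical sample rows; None is stored as SQL null.
conn.executemany("insert into loan values (?, ?, ?)", [
    ("L-11", "Round Hill", 900),
    ("L-14", "Downtown", 1500),
    ("L-93", "Mianus", None),
])

null_loans = conn.execute(
    "select loan_number from loan where amount is null").fetchall()

# sum and count(amount) ignore the null amount; count(*) counts every tuple
total, n_rows, n_amounts = conn.execute(
    "select sum(amount), count(*), count(amount) from loan").fetchone()

# sum over null amounts only gives null
only_null = conn.execute(
    "select sum(amount) from loan where amount is null").fetchone()[0]

# arithmetic involving null gives null
plus_null = conn.execute("select 5 + null").fetchone()[0]

print(null_loans, total, n_rows, n_amounts, only_null, plus_null)
```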
Nested Sub queries
 SQL provides a mechanism for the nesting of subqueries.
 A subquery is a select-from-where expression that is nested within another query.
 Not implemented in MySQL 3.23 (version used in 351)
 need to know about these anyway
 can be simulated
• using temporary tables
• using a procedural language to make SQL calls
 A common use of subqueries is to perform tests for set membership, set comparisons,
and set cardinality.
 Example Nested Subquery
 Find all customers who have both an account and a loan at the bank.
select distinct customer-name
from borrower
where customer-name in (select customer-name
from depositor)

 Find all customers who have a loan at the bank but do not have
an account at the bank
select distinct customer-name
from borrower
where customer-name not in (select customer-name
from depositor)

Set Comparison
 Find all branches that have greater assets than some branch located in Brooklyn.
select distinct T.branch-name
from branch as T, branch as S
where T.assets > S.assets and
S.branch-city = ‘Brooklyn’

 Same query using > some clause


select branch-name
from branch
where assets > some
(select assets
from branch
where branch-city = ‘Brooklyn’)

 Definition of Some Clause

 F <comp> some r ⇔ ∃ t ∈ r s.t. (F <comp> t), where <comp> can be: <, ≤, >, ≥, =, ≠
(= some) ≡ in. However, (≠ some) ≢ not in

 Definition of all Clause

 F <comp> all r ⇔ ∀ t ∈ r (F <comp> t)

(≠ all) ≡ not in. However, (= all) ≢ in
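The two definitions map directly onto Python's built-in any and all. The sketch below is only an illustration of the definitions on a list of plain values; the helper names some and all_ are made up for this example, and null values (SQL's three-valued logic) are deliberately left out.

```python
import operator

def some(f, comp, r):
    """F <comp> some r: the comparison holds for at least one t in r."""
    return any(comp(f, t) for t in r)

def all_(f, comp, r):
    """F <comp> all r: the comparison holds for every t in r."""
    return all(comp(f, t) for t in r)

r = [0, 5, 6]

gt_some = some(5, operator.gt, r)   # 5 > 0 holds
gt_all = all_(5, operator.gt, r)    # 5 > 5 fails
eq_some = some(5, operator.eq, r)   # same answer as: 5 in r
ne_all = all_(5, operator.ne, r)    # same answer as: 5 not in r
ne_some = some(5, operator.ne, r)   # true because 5 != 0, although 5 is in r
eq_all = all_(5, operator.eq, r)    # false because 5 != 0, although 5 is in r

print(gt_some, gt_all, eq_some, ne_all, ne_some, eq_all)
```

The last two lines show why (≠ some) is not the same as not in, and why (= all) is not the same as in.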

 Example Query
 Find the names of all branches that have greater assets than all branches located in
Brooklyn.

select branch-name
from branch
where assets > all
(select assets
from branch
where branch-city = ‘Brooklyn’)

 Test for Empty Relations


 The exists construct returns the value true if the argument subquery is nonempty.
 exists r ⇔ r ≠ Ø
 not exists r ⇔ r = Ø

 Example Query
 Find all customers who have an account at all branches located in Brooklyn.
select distinct S.customer-name
from depositor as S
where not exists (
(select branch-name
from branch
where branch-city = ‘Brooklyn’)
except
(select R.branch-name
from depositor as T, account as R
where T.account-number = R.account-number and
S.customer-name = T.customer-name))

 Note that X – Y = Ø ⇔ X ⊆ Y
 Note: Cannot write this query using = all and its variants
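The not exists / except query can be run as written, apart from two dialect changes assumed here for SQLite: underscores in names, and no parentheses around the operands of except. The rows are made-up sample data in which only Johnson has accounts at both Brooklyn branches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table branch    (branch_name text, branch_city text, assets integer);
create table account   (account_number text, branch_name text, balance integer);
create table depositor (customer_name text, account_number text);
-- Hypothetical sample rows
insert into branch values ('Brighton', 'Brooklyn', 7100000),
                          ('Downtown', 'Brooklyn', 9000000),
                          ('Perryridge', 'Horseneck', 1700000);
insert into account values ('A-101', 'Downtown', 500), ('A-201', 'Brighton', 900),
                           ('A-217', 'Brighton', 750), ('A-102', 'Perryridge', 400);
insert into depositor values ('Johnson', 'A-101'), ('Johnson', 'A-201'),
                             ('Jones', 'A-217'), ('Hayes', 'A-102');
""")

# X = all Brooklyn branches, Y = branches where customer S has an account.
# S qualifies when X - Y is empty, i.e. X is a subset of Y.
rows = conn.execute("""
    select distinct S.customer_name
    from depositor as S
    where not exists (
        select branch_name from branch where branch_city = 'Brooklyn'
        except
        select R.branch_name
        from depositor as T, account as R
        where T.account_number = R.account_number
          and S.customer_name = T.customer_name)
""").fetchall()

print(rows)
```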

Views
 Provide a mechanism to hide certain data from the view of certain users. To create a
view we use the command:

create view v as <query expression>

where:

<query expression> is any legal expression

The view name is represented by v

 Example Queries

 A view consisting of branches and their customers

create view all-customer as


(select branch-name, customer-name
from depositor, account
where depositor.account-number = account.account-number) union
(select branch-name, customer-name
from borrower, loan
where borrower.loan-number = loan.loan-number)

 Find all customers of the Perryridge branch


select customer-name
from all-customer
where branch-name = ‘Perryridge’
Derived Relations

 Find the average account balance of those branches where the average account
balance is greater than $1200.
select branch-name, avg-balance
from (select branch-name, avg (balance)
from account
group by branch-name)
as result (branch-name, avg-balance)
where avg-balance > 1200

 Note that we do not need to use the having clause


 we compute the temporary (view) relation result in the from clause
 the attributes of result can be used directly in the where clause

 Modification of the Database – Deletion

 Delete all account records at the Perryridge branch


delete from account
where branch-name = ‘Perryridge’

 Delete all accounts at every branch located in Needham city.


delete from depositor
where account-number in
(select account-number
from branch, account
where branch-city = ‘Needham’
and branch.branch-name = account.branch-name)
delete from account
where branch-name in (select branch-name
from branch
where branch-city = ‘Needham’)

 Example Query

 Delete the record of all accounts with balances below the average at the bank.

delete from account


where balance < (select avg (balance)
from account)
• Problem: as we delete tuples from account, the average balance changes.

• Solution used in SQL:


1. First, compute avg balance and find all tuples to delete

2. Next, delete all tuples found above (without recomputing avg or retesting
the tuples)

 Modification of the Database – Insertion


 Add a new tuple to account
insert into account
values (‘A-9732’, ‘Perryridge’,1200)
or equivalently

insert into account (branch-name, balance, account-number)


values (‘Perryridge’, 1200, ‘A-9732’)

 Add a new tuple to account with balance set to null


insert into account
values (‘A-777’, ‘Perryridge’, null)

 Provide as a gift for all loan customers of the Perryridge branch, a $200 savings
account. Let the loan number serve as the account number for the new savings
account
insert into account
select loan-number, branch-name, 200
from loan
where branch-name = ‘Perryridge’
insert into depositor
select customer-name, loan-number
from loan, borrower
where branch-name = ‘Perryridge’
and loan.loan-number = borrower.loan-number

 The select from where statement is fully evaluated before any of its results are
inserted into the relation (otherwise queries like
insert into table1 select * from table1
would cause problems)
 Modification of the Database – Updates
 Increase all accounts with balances over $10,000 by 6%, all other accounts receive 5%.
 Write two update statements:
update account
set balance = balance * 1.06
where balance > 10000
update account
set balance = balance * 1.05
where balance <= 10000

 The order is important


 Can be done better using the case statement

 Case Statement for Conditional Updates


 Same query as before: Increase all accounts with balances over $10,000 by 6%, all
other accounts receive 5%.
update account
set balance = case
when balance <= 10000 then balance *1.05
else balance * 1.06
end
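Why the order of the two update statements matters, and why case avoids the problem, can be seen on two made-up accounts. The sketch below uses SQLite via Python; the helper names fresh and balances are made up for this example.

```python
import sqlite3

def fresh():
    """Return a new in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table account (account_number text, balance real)")
    conn.executemany("insert into account values (?, ?)",
                     [("A-1", 9800.0), ("A-2", 20000.0)])
    return conn

def balances(conn):
    """Balances in account-number order, rounded to cents."""
    return [round(b, 2) for (b,) in
            conn.execute("select balance from account order by account_number")]

# Wrong order: the 5% update pushes A-1 above 10000,
# so A-1 then receives the 6% raise as well.
wrong = fresh()
wrong.execute("update account set balance = balance * 1.05 where balance <= 10000")
wrong.execute("update account set balance = balance * 1.06 where balance > 10000")

# case looks at each row once, using its balance before the update.
right = fresh()
right.execute("""
    update account
    set balance = case
        when balance <= 10000 then balance * 1.05
        else balance * 1.06
    end""")

print(balances(wrong))
print(balances(right))
```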

 Update of a View
 Create a view of all loan data in loan relation, hiding the amount attribute
create view branch-loan as
select branch-name, loan-number
from loan

 Add a new tuple to branch-loan


insert into branch-loan
values (‘Perryridge’, ‘L-307’)

This insertion must be represented by the insertion of the tuple

(‘L-307’, ‘Perryridge’, null) into the loan relation

 Updates on more complex views are difficult or impossible to translate, and hence
are disallowed.
 Most SQL implementations allow updates only on simple views (without aggregates)
defined on a single relation

Data Definition Language (DDL)

Allows the specification of not only a set of relations but also information about each
relation, including:

 The schema for each relation.


 The domain of values associated with each attribute.
 Integrity constraints
 The set of indices to be maintained for each relation.
 Security and authorization information for each relation.
 The physical storage structure of each relation on disk.

 SQL Data Definition for Part of the Bank Database


 Domain Types in SQL

 char(n). Fixed length character string, with user-specified length n.
 varchar(n). Variable length character strings, with user-specified maximum length n.
 int. Integer (a finite subset of the integers that is machine-dependent).
 smallint. Small integer (a machine-dependent subset of the integer domain type).
 numeric(p,d). Fixed point number, with user-specified precision of p digits, with d
digits to the right of decimal point.
 real, double precision. Floating point and double-precision floating point numbers,
with machine-dependent precision.
 float(n). Floating point number, with user-specified precision of at least n digits.
 Null values are allowed in all the domain types. Declaring an attribute to be not null
prohibits null values for that attribute.
 create domain construct in SQL-92 creates user-defined domain types
create domain person-name char(20) not null

 Date/Time Types in SQL

 date. Dates, containing a (4 digit) year, month and day


 E.g. date ‘2001-7-27’
 time. Time of day, in hours, minutes and seconds.
 E.g. time ’09:00:30’ time ’09:00:30.75’
 timestamp: date plus time of day
 E.g. timestamp ‘2001-7-27 09:00:30.75’
 Interval: period of time
 E.g. Interval ‘1’ day
 Subtracting a date/time/timestamp value from another gives an interval
value
 Interval values can be added to date/time/timestamp values
 Can extract values of individual fields from date/time/timestamp
 E.g. extract (year from r.starttime)
 Can cast string types to date/time/timestamp
 E.g. cast <string-valued-expression> as date

 Create Table Construct


 An SQL relation is defined using the create table command:
create table r (A1 D1, A2 D2, ..., An Dn,
(integrity-constraint1),
...,
(integrity-constraintk))

 r is the name of the relation


 each Ai is an attribute name in the schema of relation r
 Di is the data type of values in the domain of attribute Ai
 Example:
create table branch
(branch-name char(15) not null,
branch-city char(30),
assets integer)

 Integrity Constraints in Create Table


 not null
 primary key (A1, ..., An)
 check (P), where P is a predicate
Example: Declare branch-name as the primary key for branch and ensure that the
values of assets are non-negative.

create table branch


(branch-name char(15),
branch-city char(30),
assets integer,
primary key (branch-name),
check (assets >= 0))

A primary key declaration on an attribute automatically ensures not null in SQL-92
onwards; in SQL-89 it needs to be stated explicitly.
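The effect of primary key and check can be observed by attempting inserts that break them. The sketch below uses SQLite via Python with made-up rows; not null is written out explicitly here as an assumption of the example, and the helper name rejected is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table branch
        (branch_name char(15) not null,
         branch_city char(30),
         assets      integer,
         primary key (branch_name),
         check (assets >= 0))
""")
conn.execute("insert into branch values ('Perryridge', 'Horseneck', 1700000)")

def rejected(sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

# violates check (assets >= 0)
negative_assets = rejected("insert into branch values ('Mianus', 'Horseneck', -1)")
# violates the primary key: 'Perryridge' already exists
duplicate_key = rejected("insert into branch values ('Perryridge', 'Brooklyn', 10)")

row_count = conn.execute("select count(*) from branch").fetchone()[0]
print(negative_assets, duplicate_key, row_count)
```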

 Drop and Alter Table Constructs


 The drop table command deletes all information about the dropped relation from the
database.
 The alter table command is used to add attributes to an existing relation.
alter table r add A D

where A is the name of the attribute to be added to relation r and D is the domain of
A.

 All tuples in the relation are assigned null as the value for the new
attribute.

 The alter table command can also be used to drop attributes of a relation
alter table r drop A
where A is the name of an attribute of relation r
 Dropping of attributes not supported by many databases

Joins

 Join types:
 Inner join
 Left outer join
 Right outer join
 Full outer join
 Join conditions:
 Natural
 On <predicate>
 Using (A1, …, An)

The loan and borrower Relations

loan INNER JOIN borrower ON loan.loan-number = borrower.loan-number


loan LEFT OUTER JOIN borrower USING (loan-number )

 Join Conditions
 a INNER JOIN b USING (c1, ..., cn)

 Equivalent: a INNER JOIN b ON (a.c1 = b.c1 AND … AND a.cn = b.cn)

 Suppose (c1, …, cn) is a complete list of attributes common to a and b


 Then: a INNER JOIN b USING (c1, …, cn)
is equivalent to: a NATURAL JOIN b
 E.g. since there is only one common attribute between loan and borrower, the
following queries give the same result:
SELECT * FROM loan INNER JOIN borrower USING (loan_number);

SELECT * FROM loan NATURAL JOIN borrower;

 Example: Our languages

Query: What branches are in Brooklyn?

branch-name (branch-city = “Brooklyn” (branch))

{t |  s  branch (t[branch-name] = s[branch-name]


 s [branch-city] = “Brooklyn”)}

{ n  |  c, a ( n, c, a   branch  c = “Brooklyn”)}

SELECT branch-name FROM branch


WHERE branch_city = “Brooklyn”

 Join tricks
 Restrict a to just those rows compatible with b
 SELECT DISTINCT a.* FROM a NATURAL JOIN b;
 E.g. list tuples from the borrower table only for those borrowers who are depositors:
 SELECT DISTINCT borrower.*
FROM borrower NATURAL JOIN depositor;

 Compute set difference without using EXCEPT


(MySQL 3.23 doesn’t have set operations)
i.e. remove from a those tuples compatible with b
 SELECT DISTINCT a.* FROM a NATURAL LEFT OUTER JOIN b WHERE b.x is
NULL
 E.g. list those tuples from the borrower table only for those borrowers who are not
depositors:
 SELECT DISTINCT borrower.*
FROM borrower NATURAL LEFT OUTER JOIN depositor
WHERE depositor.customer_name IS NULL
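Both join tricks can be tried in SQLite, which accepts NATURAL JOIN and NATURAL LEFT OUTER JOIN. In the sketch below (Python's sqlite3, made-up rows) the null test is applied to depositor.account_number, a column that is not part of the join; that choice is an assumption of this example, made so the test does not depend on how a system exposes the merged join column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table borrower  (customer_name text, loan_number text);
create table depositor (customer_name text, account_number text);
-- Hypothetical sample rows
insert into borrower  values ('Jones', 'L-17'), ('Smith', 'L-11'), ('Hayes', 'L-15');
insert into depositor values ('Jones', 'A-217'), ('Hayes', 'A-102'), ('Lindsay', 'A-222');
""")

# Restrict borrower to rows compatible with depositor
also_depositors = conn.execute("""
    select distinct borrower.*
    from borrower natural join depositor
    order by 1""").fetchall()

# Set difference without except: keep borrower rows with no matching depositor.
# Unmatched rows are padded with nulls in the depositor columns.
not_depositors = conn.execute("""
    select distinct borrower.*
    from borrower natural left outer join depositor
    where depositor.account_number is null
    order by 1""").fetchall()

print(also_depositors)
print(not_depositors)
```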
 Review
 A NATURAL JOIN uses common column headings, and it is not possible to attach an
ON or USING clause
 An INNER JOIN results in a table where each tuple is a combination of information
from both argument tables
 An OUTER JOIN is an INNER JOIN padded with additional tuples
 An EQUI JOIN is an inner join with a condition that certain attributes have the same
value
 The USING clause lists common attributes which must be equal on both sides of the join
 The ON clause permits us to enforce equality between attributes having different
names, e.g. ON a.x = b.y
 It also permits us to use inequalities, e.g. ON a.x > b.y

 PS: Theta Join

A ⋈C B ≝ σC (A ⋈ B)

 For notational convenience, we permit ourselves to write a select condition on the


join operator
 This is called “theta join” because the condition is sometimes represented as a theta
subscript:
⋈θ
Example Nested Query

 Find all customers who have an account at all branches located in Brooklyn.

select distinct S.customer-name


from depositor as S
where not exists (
(select branch-name
from branch
where branch-city = ‘Brooklyn’)
except
(select R.branch-name
from depositor as T, account as R
where T.account-number = R.account-number and
S.customer-name = T.customer-name))
 Note that X – Y = Ø ⇔ X ⊆ Y
 Note: Cannot write this query using = all and its variants

 Find all branches located in Brooklyn.

SELECT branch-name
FROM branch
WHERE branch-city = ‘Brooklyn’

branch-name (branch-city = `Brooklyn’ (branch))

r1 = {t |  s  branch (t[branch-name] = s[branch-name]


 s [branch-city] = “Brooklyn”)}

 List all depositors and their branches

SELECT R.branch-name, T.customer-name


FROM depositor AS T,
account AS R
WHERE T.account-number
= R.account-number

Π branch-name, customer-name (depositor ⋈ account)

r2 = {t | ∃ d ∈ depositor, a ∈ account
(t[branch-name] = a[branch-name]
∧ t[customer-name] = d[customer-name]
∧ d[account-number] = a[account-number])}

 Find all customers who have an account at all branches located in Brooklyn

Π branch-name, customer-name (depositor ⋈ account)
÷ Π branch-name (σ branch-city = ‘Brooklyn’ (branch))

 Find all customers who have an account at all branches located in Brooklyn
r1 = {t | ∃ s ∈ branch (t[branch-name] = s[branch-name]
∧ s[branch-city] = “Brooklyn”)}

r2 = {t | ∃ d ∈ depositor, a ∈ account
(t[branch-name] = a[branch-name]
∧ t[customer-name] = d[customer-name]
∧ d[account-number] = a[account-number])}

{t | ∃ d ∈ depositor (t[customer-name] = d[customer-name]
∧ ∀ b (b ∈ r1 → ∃ s ∈ r2
(s[branch-name] = b[branch-name]
∧ s[customer-name] = d[customer-name])))}

 Exercises:
 write this as a single expression
 convert it into an expression in the domain relational calculus

Integrity and Security


 Domain Constraints
 Referential Integrity
 Assertions
 Triggers
 Authorization
 Authorization in SQL

 Domain Constraints
 Integrity constraints:
 guard against accidental damage to the database
 ensure that authorized changes to the database do not result in a loss of data consistency.
 Domain constraints: an elementary form of integrity constraint
 test values inserted in the database
 test queries to ensure that any comparisons make sense
 New domains can be created from existing data types
 E.g. create domain Dollars numeric(12,2)
create domain Pounds numeric(12,2)
 We cannot assign or compare a value of type Dollars to a value of type Pounds.
 However, we can convert type as below
(cast r.A as Pounds)
(Should also multiply by the dollar-to-pound conversion-rate)
 The check clause permits domains to be restricted:
 E.g. Use check clause to ensure that an hourly-wage domain allows only values at or above a
specified value.
create domain hourly-wage numeric(5,2)
constraint value-test check(value >= 4.00)
• constraint ensures hourly-wage is at least 4.00
 The clause constraint value-test is optional
• useful to indicate which constraint was violated by an update
 Can have complex conditions in domain check
 create domain AccountType char(10)
constraint account-type-test
check (value in (‘Checking’, ‘Saving’))
 check (branch-name in (select branch-name from branch))

 Referential Integrity
 Ensures that:
 IF a value appears in one relation for a given set of attributes
 THEN it also appears for a certain set of attributes in another relation
• E.g.: If “Perryridge” is a branch name appearing in one of the tuples in the account relation,
• then there exists a tuple in the branch relation for branch “Perryridge”.
 Referential Integrity: Formal Definition

 Let r1(R1) and r2(R2) be relations with primary keys K1 and K2 respectively.
 The subset α of R2 is a foreign key referencing K1 in relation r1
 if for every t2 in r2
 there must be a tuple t1 in r1
 such that t1[K1] = t2[α].

 This constraint is also called subset dependency
 it can be written as Π α (r2) ⊆ Π K1 (r1)

 Referential Integrity: Example

 E.g.
 r1 = loan(loan-number, branch-name, amount)
 r2 = borrower(customer-name, loan-number)
 K1 = loan-number, K2 = customer-name

 The loan-number attribute of the borrower relation is a foreign key referencing loan-
number in the loan relation because
 for every tuple b in borrower
 there is a tuple l in loan
 such that l[loan-number] = b[loan-number].

 Observe also that
 Π loan-number (borrower) ⊆ Π loan-number (loan)
 Referential Integrity in the E-R Model

 Consider relationship set R between entity sets E1 and E2


 schema for R includes the primary keys K1 of E1 and K2 of E2
 K1 and K2 form foreign keys on the schemas for E1 and E2 resp

 Weak entity sets are a source of referential integrity constraints


 the relation schema for a weak entity set must include the primary key attributes
of the entity set on which it depends

 Checking Referential Integrity

 Suppose we want to preserve the following referential integrity constraint:


 Π α (r2) ⊆ Π K (r1)

 Under what circumstances do we need to check this constraint?


 Consider: insert, delete, update...

 Insert: Suppose a tuple t2 is inserted into r2


 the system must ensure that there is a tuple t1 in r1
 such that t1[K] = t2[α]
 i.e. t2[α] ∈ Π K (r1)

 E.g. Π loan-number (borrower) ⊆ Π loan-number (loan)
 Insert: Suppose we insert (Brown, L-51) into borrower
 the system must ensure
borrower[loan-number] = L-51 ∈ Π loan-number (loan)
 reject insertion, or add tuple to loan at the same time
 Delete: If a tuple t1 is deleted from r1
 the system must compute the tuples in r2 that reference t1:
 σ α = t1[K] (r2)

 If this set is non-empty, either:
 reject the delete command as an error, or
 delete those tuples that reference t1
 NB this may lead to cascading deletions
 E.g. Π loan-number (borrower) ⊆ Π loan-number (loan)
 Delete: Suppose we delete (L-11, Round Hill, 900) from loan
 compute the tuples in borrower that reference this tuple:
 {(L-11, Round Hill, 900)}[loan-number] = L-11
 σ loan-number = L-11 (borrower)
 σ loan-number = L-11 (borrower) = {(Smith, L-11)}


 This set is non-empty, so:
 reject the delete command as an error, or
 delete (Smith, L-11)
• possibly delete tuples in other relations that reference (Smith, L-11)
 Update. There are two cases:
1. tuple t2 is updated in relation r2 modifying values for foreign key α
 a test similar to the insert case is made
 Let t2’ denote the new value of tuple t2
 The system must ensure that t2’[α] ∈ Π K (r1)
2. tuple t1 is updated in r1 modifying values for the primary key (K)
 a test similar to the delete case is made
 Note that t1 denotes the old value of tuple t1
 The system must compute σ α = t1[K] (r2)
 If this set is not empty
 the update may be rejected as an error, or
 the update may be cascaded to the tuples in the set, or
 the tuples in the set may be deleted.
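The insert and delete checks described in this section amount to a subset test and a selection. The sketch below models relations as Python sets of tuples; the helper names project and fk_holds and the sample rows are made up for illustration, and columns are addressed by position rather than by name.

```python
# Relations as sets of tuples (hypothetical sample data).
loan = {("L-11", "Round Hill", 900), ("L-17", "Downtown", 1000)}
borrower = {("Smith", "L-11"), ("Jones", "L-17")}

def project(relation, *positions):
    """Projection of a relation onto the given column positions."""
    return {tuple(t[i] for i in positions) for t in relation}

def fk_holds(r2, alpha, r1, key):
    """True if the projection of r2 on alpha is a subset of the projection of r1 on key."""
    return project(r2, *alpha) <= project(r1, *key)

# loan-number is column 1 of borrower and column 0 of loan
ok = fk_holds(borrower, (1,), loan, (0,))

# Insert check: (Brown, L-51) would break the constraint, no loan L-51 exists
broken = fk_holds(borrower | {("Brown", "L-51")}, (1,), loan, (0,))

# Delete check: tuples in borrower that reference loan L-11
referencing = {b for b in borrower if b[1] == "L-11"}

print(ok, broken, referencing)
```

Since referencing is non-empty, deleting loan L-11 must either be rejected or be accompanied by deleting (Smith, L-11).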
 Referential Integrity in SQL

 Primary and candidate keys and foreign keys can be specified as part of the SQL create table
statement:
 The primary key clause:
• attributes that comprise the primary key
 The unique key clause:
• attributes that comprise a candidate key
 The foreign key clause:
• attributes that comprise the foreign key, and
• the name of the relation referenced by the foreign key
 By default, a foreign key references the primary key attributes of the referenced table
foreign key (account-number) references account

 Short form for specifying a single column as foreign key


account-number char (10) references account

 Reference columns in the referenced table can be explicitly specified


 but must be declared as primary/candidate keys
foreign key (account-number) references account(account-number)
 Referential Integrity in SQL - Example

create table customer


(customer-name char(20),
customer-street char(30),
customer-city char(30),
primary key (customer-name))

create table branch


(branch-name char(15),
branch-city char(30),
assets integer,
primary key (branch-name))

create table account


(account-number char(10),
branch-name char(15),
balance integer,
primary key (account-number),
foreign key (branch-name) references branch)

create table depositor


(customer-name char(20),
account-number char(10),
primary key (customer-name, account-number),
foreign key (account-number) references account,
foreign key (customer-name) references customer)

 Cascading Actions in SQL


create table account

(...
foreign key(branch-name) references branch
on delete cascade
on update cascade
...)

 Due to the on delete cascade clauses:


 if a delete of a tuple in branch results in referential-integrity constraint violation,
the delete “cascades” to the account relation
 this deletes the tuple that refers to the branch that was deleted
 Cascading updates are similar.
 Suppose:
 there is a chain of foreign-key dependencies across multiple relations
 with on delete cascade specified for each dependency
 then a deletion or update at one end of the chain can propagate across the entire
chain
 If a cascading update or delete causes a constraint violation that cannot be handled by a
further cascading operation:
 the system aborts the update
 all the changes caused by the update and its cascading actions are undone
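Cascading actions can be observed in SQLite, with one assumption specific to that system: foreign keys are enforced only after pragma foreign_keys = on has been issued on the connection. The rows below are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite-specific switch
conn.executescript("""
create table branch
    (branch_name char(15), branch_city char(30), assets integer,
     primary key (branch_name));
create table account
    (account_number char(10), branch_name char(15), balance integer,
     primary key (account_number),
     foreign key (branch_name) references branch
         on delete cascade
         on update cascade);
-- Hypothetical sample rows
insert into branch values ('Perryridge', 'Horseneck', 1700000),
                          ('Downtown', 'Brooklyn', 9000000);
insert into account values ('A-102', 'Perryridge', 400), ('A-101', 'Downtown', 500);
""")

# Insert check: an account at an unknown branch is rejected.
try:
    conn.execute("insert into account values ('A-999', 'Nowhere', 10)")
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True

# Renaming a branch cascades to the accounts that reference it.
conn.execute("update branch set branch_name = 'Perry Ridge' "
             "where branch_name = 'Perryridge'")
after_update = conn.execute(
    "select account_number, branch_name from account order by 1").fetchall()

# Deleting a branch deletes the accounts that reference it.
conn.execute("delete from branch where branch_name = 'Downtown'")
after_delete = conn.execute(
    "select account_number, branch_name from account order by 1").fetchall()

print(insert_rejected, after_update, after_delete)
```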
 Transactions (covered later in course)
 Referential integrity is only checked at the end of a transaction
 Intermediate steps are allowed to violate referential integrity provided later steps
remove the violation
 Otherwise it would be impossible to create some database states, e.g. insert two
tuples whose foreign keys point to each other
• E.g. spouse attribute of relation
marriedperson(name, address, spouse)

 Alternative to cascading:
 on delete set null
 on delete set default

 Null values in foreign key attributes complicate SQL referential integrity semantics, and are
best prevented using not null
 if any attribute of a foreign key is null, the tuple is defined to satisfy the foreign key
constraint!

Assertions

 An assertion is a predicate expressing a condition that we wish the database always to


satisfy.
 An assertion in SQL takes the form
create assertion <assertion-name> check <predicate>

 When an assertion is made, the system tests it for validity, and tests it again on every update
that may violate the assertion
 This testing may introduce a significant amount of overhead; hence assertions
should be used with great care.
 Asserting
for all X, P(X)
is achieved indirectly using
not exists X such that not P(X)
 Assertions Examples
 The sum of all loan amounts for each branch must be less than the sum of all account
balances at the branch.
create assertion sum-constraint check
(not exists (select * from branch
where (select sum(amount) from loan
where loan.branch-name =
branch.branch-name)
>= (select sum(balance) from account
where account.branch-name =
branch.branch-name)))

 Every loan has at least one borrower who maintains an account with a minimum balance of
$1000.00

create assertion balance-constraint check


(not exists (
select * from loan
where not exists (
select *
from borrower, depositor, account
where loan.loan-number = borrower.loan-number
and borrower.customer-name = depositor.customer-name
and depositor.account-number = account.account-number
and account.balance >= 1000 )))

Triggers

 A trigger is a statement that is executed automatically by the system as a side effect of a


modification to the database.
 To design a trigger mechanism, we must:
 Specify the conditions under which the trigger is to be executed.
 Specify the actions to be taken when the trigger executes.
 (Triggers were introduced to the SQL standard in SQL:1999, but most databases supported them
even earlier using non-standard syntax.)
 Trigger Example

 Suppose that instead of allowing negative account balances, the bank deals with overdrafts
by
 setting the account balance to zero
 creating a loan in the amount of the overdraft
 giving this loan a loan number identical to the account number of the overdrawn
account
 The condition for executing the trigger is an update to the account relation that results in a
negative balance value.

 Trigger Example in SQL

create trigger overdraft-trigger after update on account
referencing new row as nrow
for each row
when nrow.balance < 0
begin atomic
insert into borrower
(select customer-name, account-number
from depositor
where nrow.account-number =
depositor.account-number);
insert into loan values
(nrow.account-number, nrow.branch-name,
- nrow.balance);
update account set balance = 0
where account.account-number = nrow.account-number
end
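As a runnable counterpart, the overdraft trigger can be sketched in SQLite's trigger dialect through Python's sqlite3 module. SQLite refers to the new row as NEW instead of declaring a name with referencing, and identifiers use underscores because hyphens are not valid there. The sample account and customer ('A-101', 'Hayes') are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (account_number TEXT PRIMARY KEY, branch_name TEXT, balance REAL);
CREATE TABLE depositor (customer_name TEXT, account_number TEXT);
CREATE TABLE loan (loan_number TEXT PRIMARY KEY, branch_name TEXT, amount REAL);
CREATE TABLE borrower (customer_name TEXT, loan_number TEXT);

-- Fires once per updated row, only when the new balance is negative.
CREATE TRIGGER overdraft_trigger AFTER UPDATE OF balance ON account
FOR EACH ROW WHEN NEW.balance < 0
BEGIN
    INSERT INTO borrower
        SELECT customer_name, account_number FROM depositor
        WHERE depositor.account_number = NEW.account_number;
    INSERT INTO loan VALUES (NEW.account_number, NEW.branch_name, -NEW.balance);
    UPDATE account SET balance = 0 WHERE account_number = NEW.account_number;
END;

INSERT INTO account VALUES ('A-101', 'Downtown', 100);
INSERT INTO depositor VALUES ('Hayes', 'A-101');
""")

# Withdraw 150 from an account holding 100: the trigger turns the overdraft into a loan.
conn.execute("UPDATE account SET balance = balance - 150 WHERE account_number = 'A-101'")

balance = conn.execute("SELECT balance FROM account").fetchone()[0]
loan = conn.execute("SELECT loan_number, amount FROM loan").fetchone()
borrower = conn.execute("SELECT customer_name, loan_number FROM borrower").fetchone()
```

After the update, the account balance is zero, a loan for the overdrawn amount carries the account number as its loan number, and the depositor appears as its borrower.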

 Triggering Events and Actions in SQL

 Triggering event can be insert, delete or update

 Triggers on update can be restricted to specific attributes


 E.g. create trigger overdraft-trigger after update of balance on account

 Values of attributes before and after an update can be referenced


 referencing old row as : for deletes and updates
 referencing new row as : for inserts and updates

 Triggers can be activated before an event, which can serve as extra constraints. E.g. convert
blanks to null.

create trigger setnull-trigger before update on r
referencing new row as nrow
for each row
when nrow.phone-number = ' '
set nrow.phone-number = null

 Statement Level Triggers

 Instead of executing a separate action for each affected row, a single action can be executed
for all rows affected by a single SQL statement
 Use for each statement instead of for each row
 Use referencing old table or referencing new table to refer to temporary tables
(called transition tables) containing the affected rows
 Can be more efficient when dealing with SQL statements that update a large number
of rows
 External World Actions

 We sometimes require external world actions to be triggered on a database update


 E.g. re-ordering an item whose quantity in a warehouse has become small, or
turning on an alarm light.

 Triggers cannot be used to directly implement external-world actions, BUT


 Triggers can be used to record actions-to-be-taken in a separate table
 Have an external process that repeatedly scans the table, carries out the external-world
actions and deletes each completed action from the table

 E.g. Suppose a warehouse has the following tables


 inventory(item, level)
 How much of each item is in the warehouse
 minlevel(item, level)
 The minimum desired level of each item
 reorder(item, amount)
 The quantity to re-order at a time
 orders(item, amount)
 Orders to be placed (read by external process)

create trigger reorder-trigger after update of level on inventory
referencing old row as orow, new row as nrow
for each row
when nrow.level <= (select level
from minlevel
where minlevel.item = orow.item)
and orow.level > (select level
from minlevel
where minlevel.item = orow.item)
begin
insert into orders
(select item, amount
from reorder
where reorder.item = orow.item)
end
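The warehouse scheme can be tried end to end with a sketch in Python's sqlite3 module. It assumes the trigger watches the level column of inventory, uses SQLite's OLD and NEW row names, and stands in for the external process with a hypothetical function place_pending_orders that receives a callback. The item 'bolt' and its quantities are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventory (item TEXT PRIMARY KEY, level INTEGER);
CREATE TABLE minlevel (item TEXT PRIMARY KEY, level INTEGER);
CREATE TABLE reorder (item TEXT PRIMARY KEY, amount INTEGER);
CREATE TABLE orders (item TEXT, amount INTEGER);

-- Record an order only when the level crosses the minimum from above.
CREATE TRIGGER reorder_trigger AFTER UPDATE OF level ON inventory
FOR EACH ROW
WHEN NEW.level <= (SELECT level FROM minlevel WHERE minlevel.item = OLD.item)
 AND OLD.level >  (SELECT level FROM minlevel WHERE minlevel.item = OLD.item)
BEGIN
    INSERT INTO orders SELECT item, amount FROM reorder WHERE reorder.item = OLD.item;
END;

INSERT INTO inventory VALUES ('bolt', 10);
INSERT INTO minlevel VALUES ('bolt', 5);
INSERT INTO reorder VALUES ('bolt', 100);
""")

def place_pending_orders(conn, send):
    """External process step: act on each recorded order, then clear the table."""
    pending = conn.execute("SELECT item, amount FROM orders").fetchall()
    for item, amount in pending:
        send(item, amount)          # the external-world action (hypothetical callback)
    conn.execute("DELETE FROM orders")
    return len(pending)

conn.execute("UPDATE inventory SET level = 3 WHERE item = 'bolt'")  # crosses the minimum
conn.execute("UPDATE inventory SET level = 2 WHERE item = 'bolt'")  # already below it
queued = conn.execute("SELECT item, amount FROM orders").fetchall()

sent = []
handled = place_pending_orders(conn, lambda item, amount: sent.append((item, amount)))
remaining = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The check on the old level is what keeps the second update from queuing a duplicate order. A production poller would also need to guard against orders inserted between its read and its delete, which this sketch ignores.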
 When Not To Use Triggers

 Triggers were used earlier for tasks such as


 maintaining summary data (e.g. total salary of each department)
 Replicating databases by recording changes to special relations (called change or
delta relations) and having a separate process that applies the changes over to a
replica
 There are better ways of doing these now:
 Databases today provide built-in materialized view facilities to maintain summary
data
 Databases provide built-in support for replication
 Encapsulation facilities can be used instead of triggers in many cases
 Define methods to update fields
 Carry out actions as part of the update methods instead of
through a trigger

Authorization

 Forms of authorization on parts of the database:


 Read authorization - allows reading, but not modification of data.
 Insert authorization - allows insertion of new data, but not modification of existing
data.
 Update authorization - allows modification, but not deletion of data.
 Delete authorization - allows deletion of data

 Forms of authorization to modify the database schema:


 Index authorization - allows creation and deletion of indices.
 Resources authorization - allows creation of new relations.
 Alteration authorization - allows addition or deletion of attributes in a relation.
 Drop authorization - allows deletion of relations.

 Authorization and Views

 Users can be given authorization on views, without being given any authorization on the
relations used in the view definition
 Ability of views to hide data serves both to simplify usage of the system and to enhance
security by allowing users access only to data they need for their job
 A combination of relation-level security and view-level security can be used to limit a user’s
access to precisely the data that user needs.

 View Example

 Suppose a bank clerk needs to know the names of the customers of each branch, but is not
authorized to see specific loan information.
 Approach: Deny direct access to the loan relation, but grant access to the view cust-
loan, which consists only of the names of customers and the branches at which they
have a loan.
 The cust-loan view is defined in SQL as follows:
create view cust-loan as
select branch-name, customer-name
from borrower, loan
where borrower.loan-number = loan.loan-number

 The clerk is authorized to see the result of the query:


select *
from cust-loan

 When the query processor translates this query into one on the actual relations in the
database, we obtain a query on borrower and loan.
 Authorization must be checked on the clerk’s query before query processing replaces a view
by the definition of the view.
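A short sketch with Python's sqlite3 module shows what the clerk's query returns. SQLite has no grant or revoke, so this only illustrates that the view exposes branch and customer names while the loan amount stays hidden. The underscore identifiers and the sample row ('Jones', 'L-17') are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE loan (loan_number TEXT PRIMARY KEY, branch_name TEXT, amount REAL);
CREATE TABLE borrower (customer_name TEXT, loan_number TEXT);
INSERT INTO loan VALUES ('L-17', 'Downtown', 1000);
INSERT INTO borrower VALUES ('Jones', 'L-17');

CREATE VIEW cust_loan AS
    SELECT branch_name, customer_name
    FROM borrower, loan
    WHERE borrower.loan_number = loan.loan_number;
""")

# The clerk's query: select * from cust-loan
cursor = conn.execute("SELECT * FROM cust_loan")
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()
```

In a system with SQL authorization, the clerk would hold select on cust_loan only, and the check would happen before the view is expanded into the join.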

 Authorization on Views

 Creation of view does not require resources authorization since no real relation is being
created
 The creator of a view gets only those privileges that provide no additional authorization
beyond that he already had.
 E.g. if creator of view cust-loan had only read authorization on borrower and loan, he gets
only read authorization on cust-loan

Granting of Privileges

 The passage of authorization from one user to another may be represented by an
authorization graph.
 The nodes of this graph are the users.
 The root of the graph is the database administrator.
 Consider graph for update authorization on loan.
 An edge Ui → Uj indicates that user Ui has granted update authorization on loan to Uj.

 Authorization Grant Graph


 Requirement: All edges in an authorization graph must be part of some path originating with
the database administrator
 If DBA revokes grant from U1:
 Grant must be revoked from U4 since U1 no longer has authorization
 Grant must not be revoked from U5 since U5 has another authorization path from DBA
through U2
 Must prevent cycles of grants with no path from the root:
 DBA grants authorization to U7
 U7 grants authorization to U8
 U8 grants authorization to U7
 DBA revokes authorization from U7
 Must revoke grant U7 to U8 and from U8 to U7 since there is no path from DBA to U7 or to U8
anymore.
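The path requirement can be sketched as a reachability check in Python. The edge sets below are assumptions chosen to match the cases described above (U4 depends only on U1, U5 also has a path through U2), and the function names prune_unreachable and revoke are hypothetical.

```python
from collections import deque

def prune_unreachable(edges, root="DBA"):
    """Keep only grants that lie on some path starting at the root."""
    reachable, queue = {root}, deque([root])
    while queue:
        user = queue.popleft()
        for grantor, grantee in edges:
            if grantor == user and grantee not in reachable:
                reachable.add(grantee)
                queue.append(grantee)
    # A grant survives only if its grantor still holds the authorization.
    return {(g, h) for (g, h) in edges if g in reachable}

def revoke(edges, grantor, grantee, root="DBA"):
    """Remove one grant edge, then cascade to grants that lost their support."""
    return prune_unreachable(edges - {(grantor, grantee)}, root)

# Assumed edges, chosen to match the cases described in the notes.
grants = {("DBA", "U1"), ("DBA", "U2"), ("U1", "U4"), ("U1", "U5"), ("U2", "U5")}
after_u1 = revoke(grants, "DBA", "U1")

# The cycle case: U7 and U8 grant to each other, then the DBA revokes from U7.
cycle = {("DBA", "U7"), ("U7", "U8"), ("U8", "U7")}
after_u7 = revoke(cycle, "DBA", "U7")
```

Because survival is decided by reachability from the DBA and not by whether a user has any incoming edge, the mutual grants between U7 and U8 do not keep each other alive.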

 Attempt to Defeat Authorization Revocation

 Authorization Graph

 Security Specification in SQL

 The grant statement is used to confer authorization


grant <privilege list>
on <relation name or view name> to <user list>

 <user list> is:


 a user-id
 public, which allows all valid users the privilege granted
 A role (more on this later)
 Granting a privilege on a view does not imply granting any privileges on the underlying
relations.
 The grantor of the privilege must already hold the privilege on the specified item (or be the
database administrator).

 Privileges in SQL

 select: allows read access to a relation, or the ability to query using the view
 Example: grant users U1, U2, and U3 select authorization on the branch relation:
grant select on branch to U1, U2, U3

 insert: the ability to insert tuples


 update: the ability to update using the SQL update statement
 delete: the ability to delete tuples.
 references: ability to declare foreign keys when creating relations.
 usage: In SQL-92; authorizes a user to use a specified domain
 all privileges: used as a short form for all the allowable privileges

 Privilege to Grant Privileges

 with grant option: allows a user who is granted a privilege to pass the privilege on to other
users.
 Example:
grant select on branch to U1 with grant option
gives U1 the select privilege on branch and allows U1 to grant this privilege to others

 Roles
 Roles permit common privileges for a class of users to be specified just once by creating a
corresponding “role”
 Privileges can be granted to or revoked from roles, just like users
 Roles can be assigned to users, and even to other roles
 SQL:1999 supports roles
create role teller
create role manager

grant select on branch to teller
grant update (balance) on account to teller
grant all privileges on account to manager
grant teller to manager
grant teller to alice, bob
grant manager to avi
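How role grants combine can be sketched in Python as a walk over role memberships. This is only a model of the statements above, not how any particular database stores roles. The function name effective_privileges and the tuple encoding of privileges are assumptions, and privileges granted directly to a user are left out for brevity.

```python
def effective_privileges(principal, role_privs, memberships):
    """Collect privileges granted to every role reachable from the principal."""
    seen, stack, result = set(), list(memberships.get(principal, ())), set()
    while stack:
        role = stack.pop()
        if role in seen:
            continue
        seen.add(role)
        result |= role_privs.get(role, set())
        stack.extend(memberships.get(role, ()))   # roles granted to this role
    return result

# Privileges granted to each role, mirroring the grant statements in the notes.
role_privs = {
    "teller": {("select", "branch"), ("update(balance)", "account")},
    "manager": {("all privileges", "account")},
}
# "grant teller to manager", "grant teller to alice, bob", "grant manager to avi"
memberships = {
    "manager": ["teller"],
    "alice": ["teller"],
    "bob": ["teller"],
    "avi": ["manager"],
}

alice_privs = effective_privileges("alice", role_privs, memberships)
avi_privs = effective_privileges("avi", role_privs, memberships)
```

Since teller is granted to manager, avi picks up the teller privileges through the manager role, while alice and bob hold only what teller provides.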

 Revoking Authorization in SQL


 The revoke statement is used to revoke authorization.
revoke <privilege list>
on <relation name or view name> from <user list> [restrict|cascade]

 Example:
revoke select on branch from U1, U2, U3 cascade

 Revocation of a privilege from a user may cause other users also to lose that privilege;
referred to as cascading of the revoke.
 We can prevent cascading by specifying restrict:
revoke select on branch from U1, U2, U3 restrict

With restrict, the revoke command fails if cascading revokes are required.

 <privilege list> may be all to revoke all privileges the revokee may hold.
 If <user list> includes public, all users lose the privilege except those granted it explicitly.
 If the same privilege was granted twice to the same user by different grantors, the user may
retain the privilege after the revocation.
 All privileges that depend on the privilege being revoked are also revoked.
 Limitations of SQL Authorization

 SQL does not support authorization at a tuple level


 E.g. we cannot restrict students to see only (the tuples storing) their own grades
 With the growth in Web access to databases, database accesses come primarily from
application servers.
 End users don't have individual database user ids; all end users of an application (such
as a web application) may be mapped to a single database user id
 The task of authorization in above cases falls on the application program, with no support
from SQL
 Benefit: fine grained authorizations, such as to individual tuples, can be implemented
by the application.
 Drawback: Authorization must be done in application code, and may be dispersed all
over an application
 Checking for absence of authorization loopholes becomes very difficult since it requires
reading large amounts of application code
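Application-level tuple authorization, as described above, might look like the following sketch in Python with sqlite3. The grades table, its columns and the function grades_for are hypothetical. The point is only that the application, not SQL authorization, restricts each student to their own rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE grades (student_id TEXT, course TEXT, grade TEXT);
INSERT INTO grades VALUES ('s1', 'DB', 'A');
INSERT INTO grades VALUES ('s2', 'DB', 'B');
""")

def grades_for(conn, logged_in_student):
    """Application-level check: every query is filtered by the authenticated user."""
    return conn.execute(
        "SELECT course, grade FROM grades WHERE student_id = ?",
        (logged_in_student,),
    ).fetchall()

s1_view = grades_for(conn, "s1")
```

Every code path that reads grades has to go through a filter like this one, which is exactly why auditing for loopholes means reading the application code.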