DBMS Unit-4 & 5

The document discusses transaction processing concepts in a database management system (DBMS). It defines a transaction as a set of logically related operations that must uphold the ACID properties of atomicity, consistency, isolation, and durability. It describes transaction operations like read and write, and explains concepts like serializability, schedules, and recoverability which ensure transactions execute reliably and consistently in the DBMS.

Uploaded by

harshitpanwaar
Copyright
© All Rights Reserved
Transaction Processing Concepts
Transactions

 A transaction is a set of logically related operations. It contains a group of tasks.
 A transaction is an action or series of actions performed by a single user to access or change the contents of the database.
 We can define a transaction as a group of tasks in a DBMS. Here a single task is the minimum processing unit and cannot be divided further. As an example of a simple transaction, suppose an employee transfers Rs 1000 from X's account to Y's account. Even this small transaction involves several low-level tasks.
Operations of Transaction:
 Following are the main operations of transaction:
 Read(X): Reads the value of X from the database and stores it in a buffer in main memory.
 Write(X): Writes the value of X from the buffer back to the database.
 As an example, a transaction that debits an account consists of the following operations:
1. R(X);
2. X = X - 500;
3. W(X);
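The three debit steps can be traced in a few lines of Python. This is only a sketch: the dictionaries `database` and `buffer`, the helper names, and the opening balance of 4000 are assumptions made for illustration, not part of the slides.

```python
# Hypothetical in-memory stand-ins for the database and the main-memory buffer.
database = {"X": 4000}   # assumed opening balance
buffer = {}

def read(item):
    """Read(X): copy the item's value from the database into the buffer."""
    buffer[item] = database[item]
    return buffer[item]

def write(item):
    """Write(X): copy the buffered value back to the database."""
    database[item] = buffer[item]

def debit(item, amount):
    read(item)                # 1. R(X)
    buffer[item] -= amount    # 2. X = X - 500 (only the buffer changes here)
    write(item)               # 3. W(X)

debit("X", 500)
```

Until step 3 runs, the database still holds the old value; only the buffered copy has changed.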
ACID Properties

 A transaction is a small unit of a program that consists of various low-level tasks. Every transaction in a DBMS must maintain the ACID properties: A (Atomicity), C (Consistency), I (Isolation), D (Durability). These properties ensure the completeness, accuracy, and integrity of the data.
Atomicity
• Atomicity states that either all operations of the transaction take effect or none do. If any operation cannot complete, the transaction is aborted.
• There is no midway, i.e., the transaction cannot occur partially. Each transaction is treated as one unit and either runs to completion or is not executed at all.
 Atomicity involves the following two operations:
 Abort: If a transaction aborts, then none of the changes it made are visible.
 Commit: If a transaction commits, then all the changes it made are visible.
 Example: Assume a transaction T that consists of two parts, T1 and T2. Account A holds Rs 600 and account B holds Rs 300. T transfers Rs 100 from account A to account B: T1 deducts the amount from A and T2 adds it to B.
 After completion of the transaction, A holds Rs 500 and B holds Rs 400.
 If the transaction T fails after the completion of T1 but before the completion of T2, then the amount will be deducted from A but not added to B. This leaves the database in an inconsistent state. To ensure correctness of the database state, the transaction must be executed in its entirety.
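A minimal sketch of the all-or-nothing behaviour for the A-to-B transfer is given below. The function `transfer`, its `fail_after_debit` flag, and the snapshot-based undo are illustrative assumptions; a real DBMS uses its recovery subsystem rather than copying the whole state.

```python
def transfer(accounts, src, dst, amount, fail_after_debit=False):
    """Run the debit (T1) and credit (T2) steps as one all-or-nothing unit."""
    snapshot = dict(accounts)            # old values, kept so we can undo
    try:
        accounts[src] -= amount          # T1: deduct from the source account
        if fail_after_debit:
            raise RuntimeError("simulated failure between T1 and T2")
        accounts[dst] += amount          # T2: add to the destination account
        return True                      # commit: all changes stay visible
    except RuntimeError:
        accounts.clear()
        accounts.update(snapshot)        # abort: no change remains visible
        return False

accounts = {"A": 600, "B": 300}
committed = transfer(accounts, "A", "B", 100)
after_commit = dict(accounts)

aborted = transfer(accounts, "A", "B", 100, fail_after_debit=True)
```

The first call commits and leaves A = 500 and B = 400. The second call fails after the debit, so the snapshot is restored and the total stays at 900.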
Consistency
• The integrity constraints are maintained so that the database is consistent before and after
the transaction.
• The execution of a transaction will leave a database in either its prior stable state or a new
stable state.
• The consistency property states that every transaction sees a consistent database instance.
• The transaction is used to transform the database from one consistent state to another
consistent state.
 For example: The total amount must be the same before and after the transaction.
1. Total before T occurs = 600 + 300 = 900
2. Total after T occurs = 500 + 400 = 900
 Therefore, the database is consistent. If T1 completes but T2 fails, the total becomes 500 + 300 = 800 and the database is inconsistent.
Isolation

 Isolation means that data being used during the execution of one transaction cannot be used by a second transaction until the first one is completed.
 If the transaction T1 is being executed and is using the data item X, then that data item cannot be accessed by any other transaction T2 until T1 ends.
 The concurrency control subsystem of the DBMS enforces the isolation property.
Durability

 The durability property indicates the permanence of the database's consistent state. It states that the changes made by a committed transaction are permanent.
 They cannot be lost through the erroneous operation of a faulty transaction or through a system failure. When a transaction completes, the database reaches a consistent state, and that state cannot be lost even in the event of a system failure.
 The recovery subsystem of the DBMS is responsible for the durability property.
States of Transaction
Schedule

 A schedule, as the name suggests, is a process of lining up the transactions and executing them one by one. When multiple transactions run concurrently and the order of their operations must be fixed so that the operations do not interfere with each other, scheduling is brought into play and the transactions are timed accordingly.
Serial Schedules

 Schedules in which the transactions are executed non-interleaved, i.e., in which no transaction starts until the running transaction has ended, are called serial schedules.
 Example: Consider the following schedule involving
two transactions T1 and T2.
 where R(A) denotes that a read operation is
performed on some data item ‘A’
 This is a serial schedule since the transactions
perform serially in the order T1 —> T2
Non-Serial Schedule:
 This is a type of scheduling where the operations of multiple transactions are interleaved, which can give rise to concurrency problems. A non-serial schedule is acceptable only if its end result is correct and the same as that of a serial schedule.
 Unlike the serial schedule, where one transaction must wait for another to complete all its operations, in the non-serial schedule the other transaction proceeds without waiting for the previous transaction to complete.
Serializable:

 Serializability is used to maintain the consistency of the database. It is mainly applied to non-serial schedules to verify whether the schedule will lead to any inconsistency.
 A serial schedule, on the other hand, does not need a serializability check because it starts a transaction only when the previous transaction is complete. A non-serial schedule of n transactions is said to be serializable only when it is equivalent to some serial schedule of the same n transactions.
 Since concurrency is allowed in this case, multiple transactions can execute concurrently. A serializable schedule helps in improving both resource utilization and CPU throughput.
 There are two types of serializability: conflict serializability and view serializability.
Conflict Serializable

 A schedule is called conflict serializable if it can be transformed into a serial schedule by swapping non-conflicting operations.
 Conflicting operations: Two operations are said to be conflicting if all of the following conditions hold:
 They belong to different transactions
 They operate on the same data item
 At least one of them is a write operation
View Serializable

 View serializability is a weaker form of serializability than conflict serializability. A schedule is said to be View-Serializable if it is view equivalent to a Serial Schedule (one in which no interleaving of transactions occurs).
Need of View-Serializability

 There may be some schedules that are not Conflict-Serializable but still give a consistent result, because the Conflict-Serializability test tells us nothing more once the Precedence Graph of a schedule contains a loop/cycle.
 In such a case we cannot predict whether the schedule would be consistent or inconsistent. As per the concept of Conflict-Serializability, we can say that a schedule is Conflict-Serializable (that is, equivalent to a serial schedule and therefore consistent) if its precedence graph does not have any loop/cycle.
 But what if a schedule's precedence graph contains a cycle/loop and the schedule still gives the same consistent result that a conflict serializable schedule would give? View-Serializability covers this case.
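The precedence-graph test described above can be sketched as follows. The schedule encoding, a list of `(transaction, operation, item)` tuples with `"R"` and `"W"` as operations, and all function names are assumptions chosen for illustration.

```python
def precedence_graph(schedule):
    """Edges (Ti, Tj): an operation of Ti conflicts with a later operation of Tj."""
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if ti != tj and item_i == item_j and "W" in (op_i, op_j):
                edges.add((ti, tj))
    return edges

def has_cycle(edges):
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    visiting, done = set(), set()

    def dfs(node):
        visiting.add(node)
        for nxt in graph[node]:
            if nxt in visiting or (nxt not in done and dfs(nxt)):
                return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(node not in done and dfs(node) for node in graph)

def is_conflict_serializable(schedule):
    return not has_cycle(precedence_graph(schedule))

serial_like = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
interleaved = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]
```

For `serial_like` the only edge is T1 to T2, so the schedule is conflict serializable. For `interleaved` the graph has edges in both directions between T1 and T2, which forms a cycle.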
Method to Check the View-Serializability of a Schedule
 A schedule S1 is said to be view-equivalent to a schedule S2 if and only if all three of the following conditions hold for every data item X:
 Initial read: If a transaction Ti reads the initial value of X in S1, then Ti also reads the initial value of X in S2.
 Updated read: If a transaction Ti reads a value of X that was written by a transaction Tj in S1, then Ti also reads the value of X written by Tj in S2.
 Final write: If a transaction Ti performs the final write on X in S1, then Ti also performs the final write on X in S2.
 In other words, two schedules are view-equivalent if every read sees the same value in both schedules and both leave the same final values in the database. Requiring the same order for every pair of conflicting operations is the stricter condition of conflict equivalence.
Process of testing view serializability

 First check whether the given schedule is conflict serializable. If it is conflict serializable, then it is view serializable for sure.
 If the given schedule is not conflict serializable, then check the further conditions.
 Check for blind writes. If there is no blind write, then the schedule is not view serializable.
 If there is any blind write, then check the conditions of view serializability.
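A brute-force sketch of the final step is shown below: it compares the given schedule with every serial order of its transactions, using the standard view-equivalence conditions (same initial reads, same reads-from relationships, same final writes). The tuple encoding `(transaction, "R" or "W", item)` and the function names are assumptions, and trying every permutation is only practical for a handful of transactions.

```python
from itertools import permutations

def view(schedule):
    """Summarise a schedule by who each read reads from and who writes last."""
    last_writer = {}    # item -> transaction that wrote it most recently
    read_count = {}     # (txn, item) -> number of reads seen so far
    reads_from = set()
    for txn, op, item in schedule:
        if op == "R":
            k = read_count.get((txn, item), 0)
            read_count[(txn, item)] = k + 1
            # A writer of None means the read sees the initial value.
            reads_from.add((txn, item, k, last_writer.get(item)))
        else:
            last_writer[item] = txn
    return reads_from, last_writer    # last_writer now holds the final writes

def is_view_serializable(schedule):
    txns = sorted({t for t, _, _ in schedule})
    target = view(schedule)
    for order in permutations(txns):
        serial = [step for t in order for step in schedule if step[0] == t]
        if view(serial) == target:
            return True
    return False

# T2 and T3 write A without reading it first (blind writes).
with_blind_writes = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A"), ("T3", "W", "A")]
without_t3 = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]
```

`with_blind_writes` is not conflict serializable, yet it is view equivalent to the serial order T1, T2, T3: T1 still reads the initial value of A and T3 still performs the final write. Dropping T3 removes that equivalence.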
Recoverability

 Recoverability is a property of database systems that ensures that, in the event of a failure or error, the system can recover the database to a consistent state. Recoverability guarantees that all committed transactions are durable and that their effects are permanently stored in the database, while the effects of uncommitted transactions are undone to maintain data consistency.
 The recoverability property is enforced through the use of transaction logs,
which record all changes made to the database during transaction processing.
When a failure occurs, the system uses the log to recover the database to a
consistent state, which involves either undoing the effects of uncommitted
transactions or redoing the effects of committed transactions.
Recoverable Schedules:

 If some transaction Tj reads a value updated or written by some other transaction Ti, then the commit of Tj must occur after the commit of Ti.
 Consider the following schedule involving two transactions T1 and T2.
Irrecoverable Schedule:

 The table below shows a schedule with two transactions. T1 reads and writes A, and that value is read and written by T2. T2 commits. But later on, T1 fails, so we have to roll back T1. Since T2 has read the value written by T1, it should also be rolled back.
 But T2 has already committed, so this schedule is an irrecoverable schedule. When Tj reads the value updated by Ti and Tj commits before Ti commits, the schedule is irrecoverable.
Recoverable with Cascading Rollback:

 The table below shows a schedule with two transactions. T1 reads and writes A, and that value is read and written by T2. But later on, T1 fails, so we have to roll back T1. Since T2 has read the value written by T1, it should also be rolled back.
 As T2 has not committed, we can roll back T2 as well, so the schedule is recoverable with cascading rollback. Therefore, if Tj reads a value updated by Ti and the commit of Tj is delayed until the commit of Ti, the schedule is called recoverable with cascading rollback.
Cascadeless Recoverable Schedule:

 The table below shows a schedule with two transactions. T1 reads and writes A and commits, and only then is that value read by T2. If T1 fails before its commit, no other transaction has read its value, so there is no need to roll back any other transaction.
 So this is a cascadeless recoverable schedule. If Tj reads a value updated by Ti only after Ti is committed, the schedule is cascadeless recoverable.
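The three cases above can be told apart mechanically. The sketch below assumes a schedule encoded as `(transaction, operation, item)` tuples, where the operation is `"R"`, `"W"` or `"C"` (commit, with `None` as the item); the function name and return strings are illustrative, and aborts are not modelled.

```python
def classify(schedule):
    last_writer = {}    # item -> transaction that wrote it most recently
    committed = set()
    reads_from = []     # (reader, writer, writer had committed when read happened)
    recoverable = True
    for txn, op, item in schedule:
        if op == "W":
            last_writer[item] = txn
        elif op == "R":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.append((txn, writer, writer in committed))
        elif op == "C":
            # Tj commits while some Ti it read from is still uncommitted.
            if any(r == txn and w not in committed for r, w, _ in reads_from):
                recoverable = False
            committed.add(txn)
    if not recoverable:
        return "irrecoverable"
    if all(was_committed for _, _, was_committed in reads_from):
        return "cascadeless"
    return "recoverable with cascading rollback"

t2_commits_first = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"),
                    ("T2", "W", "A"), ("T2", "C", None)]
t2_commits_last = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"),
                   ("T2", "W", "A"), ("T1", "C", None), ("T2", "C", None)]
t2_reads_after_commit = [("T1", "R", "A"), ("T1", "W", "A"), ("T1", "C", None),
                         ("T2", "R", "A"), ("T2", "C", None)]
```

The three sample schedules correspond, in order, to the irrecoverable, cascading-rollback and cascadeless cases described in the text.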
Log based recovery

 The log is a sequence of log records that records all the update activities in the database. The logs for each transaction are maintained in stable storage. Any operation performed on the database is recorded in the log. Before any modification is made to the database, an update log record is created to reflect that modification. An update log record, represented as <Ti, Xj, V1, V2>, has these fields:
1. Transaction identifier: Unique Identifier of the transaction that
performed the write operation.
2. Data item: Unique identifier of the data item written.
3. Old value: Value of data item prior to write.
4. New value: Value of data item after write operation.
 Other types of log records are:
1. <Ti start>: It contains information about when a transaction Ti starts.
2. <Ti commit>: It contains information about when a transaction Ti commits.
3. <Ti abort>: It contains information about when a transaction Ti aborts.
 Undo and Redo Operations: Because every database modification must be preceded by the creation of a log record, the system has available both the old value of the data item prior to the modification and the new value that is to be written. This allows the system to perform redo and undo operations as appropriate:
1. Undo: using a log record sets the data item specified in log record to old value.
2. Redo: using a log record sets the data item specified in log record to new value.
 There are two approaches to modify the database:
 1. Deferred database modification:
• In the deferred modification technique, the transaction does not modify the database until it has committed.
• In this method, all the log records are created and stored in stable storage, and the database is updated only when the transaction commits.
 2. Immediate database modification:
• In the immediate modification technique, database modification occurs while the transaction is still active.
• In this technique, the database is modified immediately after every operation, that is, each write is applied to the actual database as it happens.
Recovery using Log records

 After a system crash has occurred, the system consults the log to determine
which transactions need to be redone and which need to be undone.

 Transaction Ti needs to be undone if the log contains the record <Ti start> but
does not contain either the record <Ti commit> or the record <Ti abort>.
 Transaction Ti needs to be redone if log contains record <Ti start> and either
the record <Ti commit> or the record <Ti abort>.
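A simplified sketch of this rule is given below. The tuple encoding of log records (`(txn, "start")`, `(txn, "commit")`, `(txn, "abort")` and `(txn, item, old, new)` for updates) and the function `recover` are assumptions for illustration. It also assumes that an aborted transaction has already logged the updates that restore its old values, which is why redoing it is safe.

```python
def recover(log, database):
    started, finished = set(), set()
    for record in log:
        if len(record) == 2:
            txn, kind = record
            if kind == "start":
                started.add(txn)
            elif kind in ("commit", "abort"):
                finished.add(txn)
    redo_set = started & finished     # <Ti start> plus <Ti commit> or <Ti abort>
    undo_set = started - finished     # <Ti start> only

    # Redo: scan the log forward and set each item to its new value.
    for record in log:
        if len(record) == 4 and record[0] in redo_set:
            _txn, item, _old, new = record
            database[item] = new
    # Undo: scan the log backward and set each item to its old value.
    for record in reversed(log):
        if len(record) == 4 and record[0] in undo_set:
            _txn, item, old, _new = record
            database[item] = old
    return redo_set, undo_set

log = [("T1", "start"), ("T1", "A", 600, 500), ("T1", "B", 300, 400), ("T1", "commit"),
       ("T2", "start"), ("T2", "A", 500, 450)]
# Assumed state on disk after the crash: T1's write to B was lost, T2's write to A was not.
database = {"A": 450, "B": 300}
redo_set, undo_set = recover(log, database)
```

Here T1 is redone and T2 is undone, which brings the database back to A = 500 and B = 400.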
Checkpoint

 A checkpoint is a mechanism by which all the previous logs are removed from the system and stored permanently on the storage disk.
 The checkpoint is like a bookmark. Checkpoints are marked during the execution of transactions, and as each step of a transaction executes, log records are created.
 When a checkpoint is reached, the updates made so far are written to the database and the log file up to that point is removed. The log file is then filled with the steps of the following transactions until the next checkpoint, and so on.
 The checkpoint declares a point before which the DBMS was in a consistent state and all transactions were committed.
Recovery using Checkpoint
• The recovery system reads the log files from the end to the start, that is, from T4 back to T1.
• The recovery system maintains two lists, a redo-list and an undo-list.
• A transaction is put into the redo-list if the recovery system sees a log with both <Tn, Start> and <Tn, Commit>, or with just <Tn, Commit>. All the transactions in the redo-list are redone before their logs are saved.
• A transaction is put into the undo-list if the recovery system sees a log with <Tn, Start> but no commit or abort record. All the transactions in the undo-list are undone.
Deadlock in DBMS

 A deadlock is a condition where two or more transactions are waiting indefinitely for
one another to give up locks. Deadlock is said to be one of the most feared
complications in DBMS as no task ever gets finished and is in waiting state forever.
 For example: In the student table, transaction T1 holds a lock on some rows and needs
to update some rows in the grade table. Simultaneously, transaction T2 holds locks on
some rows in the grade table and needs to update the rows in the Student table held by
Transaction T1.
 Now the main problem arises: transaction T1 is waiting for T2 to release its lock and, similarly, transaction T2 is waiting for T1 to release its lock. All activity comes to a halt and remains at a standstill until the DBMS detects the deadlock and aborts one of the transactions.
Deadlock Detection
 In a database, when a transaction waits indefinitely to obtain a lock, the DBMS should detect whether the transaction is involved in a deadlock. The lock manager maintains a wait-for graph to detect deadlock cycles in the database.
 Wait-for Graph
• This is a suitable method for deadlock detection. In this method, a graph is created based on the transactions and their locks. If the graph has a cycle or closed loop, then there is a deadlock.
• The wait-for graph is maintained by the system for every transaction that is waiting for some data held by others. The system keeps checking whether there is any cycle in the graph.
 The wait-for graph for the above scenario is shown below:
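For the T1 and T2 scenario, the wait-for graph and its cycle check can be sketched as follows. The dictionary encoding (each transaction maps to the transactions it is waiting for) and the function `find_deadlock` are assumptions for illustration.

```python
def find_deadlock(wait_for):
    """Return the transactions on a cycle of the wait-for graph, or None."""
    def dfs(node, path):
        if node in path:
            return path[path.index(node):]      # walked back onto the path: a cycle
        for nxt in wait_for.get(node, ()):
            cycle = dfs(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in wait_for:
        cycle = dfs(start, [])
        if cycle:
            return cycle
    return None

# T1 waits for T2 (grade table rows) and T2 waits for T1 (student table rows).
deadlocked = {"T1": ["T2"], "T2": ["T1"]}
# Here T2 is not waiting for anyone, so T1 will eventually get its lock.
no_deadlock = {"T1": ["T2"], "T2": []}
```

`find_deadlock(deadlocked)` reports the cycle through T1 and T2, while the second graph has no cycle. A DBMS would then pick one transaction on the cycle to abort.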
Deadlock Prevention

 The deadlock prevention method is suitable for a large database. If the resources are allocated in such a way that a deadlock never occurs, then deadlock is prevented.
 The database management system analyzes the operations of a transaction to determine whether they can create a deadlock situation. If they can, the DBMS never allows that transaction to be executed.
Wound wait scheme

 In the wound-wait scheme, if an older transaction requests a resource that is held by a younger transaction, then the older transaction forces the younger one to abort and release the resource. After a short delay, the younger transaction is restarted, but with the same timestamp.
 If the older transaction holds a resource that is requested by the younger transaction, then the younger transaction is asked to wait until the older one releases it.
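The wound-wait decision reduces to one timestamp comparison. In this sketch the function name and return strings are illustrative, and a smaller timestamp is assumed to mean an older transaction.

```python
def wound_wait(requester_ts, holder_ts):
    """Decide what happens when a requester asks for a resource the holder has locked."""
    if requester_ts < holder_ts:
        # Older requester: the younger holder is aborted ("wounded") and is
        # restarted later with its original timestamp.
        return "abort holder"
    # Younger requester: it waits until the older holder releases the resource.
    return "requester waits"
```

For example, `wound_wait(5, 9)` aborts the holder, whereas `wound_wait(9, 5)` makes the requester wait.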
DBMS Concurrency Control
 Concurrency Control is the management procedure that is required for
controlling concurrent execution of the operations that take place on a
database.
 In a multi-user system, multiple users can access and use the same database at one time, which is known as concurrent execution. It means that transactions from different users run against the same database simultaneously.
 When several users need the database at the same time to perform different operations, their transactions are executed concurrently.
 This simultaneous execution should be done in an interleaved manner in which no operation affects the other executing operations, so that the consistency of the database is maintained. Concurrent execution of transaction operations therefore raises several challenging problems that need to be solved.
Problems with Concurrent Execution

 In a database transaction, the two main operations are READ and WRITE. These two operations need to be managed during the concurrent execution of transactions, because if they are interleaved without control, the data may become inconsistent. The following problems occur with the concurrent execution of operations:
 Problem 1: Lost Update Problem (W-W Conflict)
 The problem occurs when two different database transactions perform read/write operations on the same database items in an interleaved manner (i.e., concurrent execution) that makes the values of the items incorrect, hence making the database inconsistent.
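A short trace makes the lost update concrete. The starting value of 100 and the two update amounts are assumed purely for illustration; each transaction works on its own local copy of X, as in the Read(X)/Write(X) model.

```python
database = {"X": 100}

t1_local = database["X"]     # T1: R(X) reads 100
t2_local = database["X"]     # T2: R(X) reads 100, before T1 has written

t1_local -= 50               # T1: X = X - 50
t2_local += 100              # T2: X = X + 100

database["X"] = t1_local     # T1: W(X) writes 50
database["X"] = t2_local     # T2: W(X) writes 200 and overwrites T1's update

interleaved_result = database["X"]
serial_result = 100 - 50 + 100   # the value either serial order would produce
```

The interleaved run ends with X = 200 instead of 150, because T1's debit was overwritten by T2's write.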
 The concurrency control protocols ensure the atomicity, consistency,
isolation, durability and serializability of the concurrent execution of the
database transactions. Therefore, these protocols are categorized as:

 Lock Based Concurrency Control Protocol


 Time Stamp Concurrency Control Protocol
 Validation Based Concurrency Control Protocol
Two phase locking protocol
 A lock is a variable associated with a data item that describes the status of the item with respect to the operations that can be applied to it. Locks synchronize the access of concurrent transactions to the database items. This protocol requires that all data items be accessed in a mutually exclusive manner. Two common locks are used in this protocol:
1. Shared Lock (S): also known as a read-only lock. As the name suggests, it can be shared between transactions, because a transaction holding this lock does not have permission to update the data item. An S-lock is requested using the lock-S instruction.
2. Exclusive Lock (X): The data item can be both read and written. This lock is exclusive and cannot be held by more than one transaction on the same data item at the same time. An X-lock is requested using the lock-X instruction.
 A transaction may be granted a lock on an item if the requested lock is
compatible with locks already held on the item by other transactions.
 Any number of transactions can hold shared locks on an item, but if any
transaction holds an exclusive(X) on the item no other transaction may hold
any lock on the item.
 If a lock cannot be granted, the requesting transaction is made to wait till all
incompatible locks held by other transactions have been released. Then the
lock is granted.
 Upgrade / Downgrade locks : A transaction that holds a lock on an item A is
allowed under certain condition to change the lock state from one state to
another.
 Upgrade: A S(A) can be upgraded to X(A) if Ti is the only transaction holding
the S-lock on element A.
 Downgrade: We may downgrade X(A) to S(A) when we feel that we no longer
want to write on data-item A. As we were holding X-lock on A, we need not
check any conditions.
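The granting, upgrade and downgrade rules above can be collected in a small lock table. The class `LockTable` and its method names are assumptions for this sketch; a real lock manager would also queue the waiting transactions instead of just returning `False`.

```python
class LockTable:
    def __init__(self):
        self.holders = {}    # item -> {transaction: "S" or "X"}

    def can_grant(self, txn, item, mode):
        others = {t: m for t, m in self.holders.get(item, {}).items() if t != txn}
        if mode == "S":
            return all(m == "S" for m in others.values())
        return not others    # X, or an upgrade to X, needs no other holder

    def lock(self, txn, item, mode):
        if not self.can_grant(txn, item, mode):
            return False     # incompatible: the requester has to wait
        held = self.holders.setdefault(item, {})
        # Asking for S while already holding X must not weaken the lock.
        if held.get(txn) != "X" or mode == "X":
            held[txn] = mode
        return True

    def downgrade(self, txn, item):
        if self.holders.get(item, {}).get(txn) == "X":
            self.holders[item][txn] = "S"   # no condition to check

    def unlock(self, txn, item):
        self.holders.get(item, {}).pop(txn, None)

table = LockTable()
shared_1 = table.lock("T1", "A", "S")        # granted
shared_2 = table.lock("T2", "A", "S")        # granted: S is compatible with S
upgrade_blocked = table.lock("T1", "A", "X") # refused: T2 also holds S(A)
table.unlock("T2", "A")
upgrade_ok = table.lock("T1", "A", "X")      # granted: T1 is the only holder
reader_blocked = table.lock("T2", "A", "S")  # refused: T1 holds X(A)
table.downgrade("T1", "A")
reader_ok = table.lock("T2", "A", "S")       # granted after the downgrade
```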
Two-Phase Locking –

 A transaction is said to follow the Two-Phase Locking protocol if its locking and unlocking are done in two phases.
1. Growing Phase: New locks on data items may be acquired but none can be released.
2. Shrinking Phase: Existing locks may be released but no new locks can be acquired.
 Note: If lock conversion is allowed, then upgrading a lock (from S(a) to X(a)) is allowed in the Growing Phase, and downgrading a lock (from X(a) to S(a)) must be done in the Shrinking Phase.
 In the growing phase, the transaction reaches a point where all the locks it may need have been acquired. This point is called the LOCK POINT.
 After the lock point has been reached, the transaction enters the shrinking phase.
 2-PL ensures serializability, but there are still some drawbacks of
2-PL. Let’s glance at the drawbacks:
• Cascading Rollback is possible under 2-PL.
• Deadlocks and Starvation are possible.
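Whether a single transaction obeys the growing and shrinking phases, and where its lock point lies, can be checked with a short scan. The encoding of a transaction as a list of `("lock", item)` and `("unlock", item)` actions and the function name are assumptions for this sketch.

```python
def check_two_phase(actions):
    """Return (is_two_phase, lock_point_index) for one transaction's actions."""
    shrinking = False
    lock_point = None
    for index, (kind, _item) in enumerate(actions):
        if kind == "unlock":
            shrinking = True             # the shrinking phase has begun
        elif kind == "lock":
            if shrinking:
                return False, None       # a lock after an unlock breaks 2-PL
            lock_point = index           # latest acquisition seen so far
    return True, lock_point

two_phase = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
not_two_phase = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]
```

For `two_phase` the lock point is the action at index 1, the last lock acquired. `not_two_phase` requests a lock on B after releasing A, so it violates the protocol.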
Cascading Rollbacks in 2-PL –

 Because of the Dirty Reads in T2 and T3 in lines 8 and 12 respectively, when T1 fails we have to roll back the others also. Hence, Cascading Rollbacks are possible in 2-PL.
Deadlock in 2-PL –

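 Consider this simple example with two transactions T1 and T2.

 Schedule: Lock-X1(A) Lock-X2(B) Lock-X1(B) Lock-X2(A)
 T1 holds the exclusive lock on A and waits for B, while T2 holds the exclusive lock on B and waits for A. Neither can proceed, so the schedule is deadlocked even though both transactions follow 2-PL.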


Strict 2-PL –

 The basic 2PL protocol gradually obtains locks and then gradually releases them when they are no longer needed. Strict 2PL differs in that, in addition to being two-phase, it holds all exclusive (X) locks until the transaction commits. Instead of releasing these locks one by one, the strict 2PL protocol releases them together once the commit has executed.
 Following Strict 2-PL ensures that our schedule is:
 Recoverable
 Cascadeless
 Hence, it gives us freedom from Cascading Abort which was still there in Basic 2-PL and
moreover guarantee Strict Schedules but still, Deadlocks are possible!
Rigorous 2-PL –
 This requires that, in addition to the locking being 2-Phase, all Exclusive (X) and Shared (S) locks held by the transaction not be released until after the Transaction Commits. Following Rigorous 2-PL ensures that our schedule is:
 Recoverable
 Cascadeless
 Hence, it gives us freedom from the Cascading Aborts that were still possible in Basic 2-PL, and moreover guarantees Strict Schedules, but Deadlocks are still possible!
 Note: The difference between Strict 2-PL and Rigorous 2-PL is that Rigorous is more restrictive: it requires both Exclusive and Shared locks to be held until after the Transaction commits, and this is what makes the implementation of Rigorous 2-PL easier.
Conservative 2-PL –

 This protocol requires the transaction to lock all the items it accesses before the transaction begins execution, by predeclaring its read-set and write-set. If any of the predeclared items cannot be locked, the transaction does not lock any of the items; instead, it waits until all the items are available for locking.
 Conservative 2-PL is deadlock free, but it does not ensure a Strict schedule. It is also difficult to use in practice because of the need to predeclare the read-set and the write-set, which is not possible in many situations. In practice, the most popular variation of 2-PL is Strict 2-PL.
Timestamp Ordering Protocol

• The Timestamp Ordering Protocol orders the transactions based on their timestamps. The order of the transactions is simply the ascending order of their creation.
• The older transaction has the higher priority, which is why it executes first. To determine the timestamp of a transaction, this protocol uses the system time or a logical counter.
• Lock-based protocols manage the order between conflicting pairs among transactions at execution time, whereas timestamp-based protocols start working as soon as a transaction is created.
• Assume there are two transactions T1 and T2. Suppose T1 entered the system at time 007 and T2 entered the system at time 009. T1 has the higher priority, so it executes first, as it entered the system first.
• The timestamp ordering protocol also maintains the timestamps of the last 'read' and 'write' operations on each data item.
Thomas write Rule

 The Thomas Write Rule guarantees a serializable order for the protocol. It improves the Basic Timestamp Ordering Algorithm.
 The basic Thomas write rules for a write on X by transaction T are as follows:
• If TS(T) < R_TS(X), then transaction T is aborted and rolled back, and the operation is rejected.
• If TS(T) < W_TS(X), then do not execute the W_item(X) operation of the transaction, and continue processing.
• If neither condition 1 nor condition 2 occurs, then transaction T is allowed to execute the WRITE operation, and W_TS(X) is set to TS(T).
 If we use the Thomas write rule, then some serializable schedules that are not conflict serializable can be permitted, as illustrated by the schedule in the given figure:
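The three write rules translate directly into code. In this sketch `r_ts` and `w_ts` are assumed to be dictionaries holding R_TS(X) and W_TS(X) per item, with 0 standing for "never read" or "never written"; the function name and return strings are illustrative.

```python
def thomas_write(ts, item, r_ts, w_ts):
    """Apply the Thomas write rule to a write of `item` by a transaction with timestamp ts."""
    if ts < r_ts.get(item, 0):
        return "abort"       # rule 1: a younger transaction has already read X
    if ts < w_ts.get(item, 0):
        return "ignore"      # rule 2: obsolete write, skip it and carry on
    w_ts[item] = ts          # rule 3: perform the write and record its timestamp
    return "write"

r_ts = {"X": 5}
w_ts = {"X": 8}
too_old_for_readers = thomas_write(3, "X", r_ts, w_ts)
obsolete = thomas_write(6, "X", r_ts, w_ts)
accepted = thomas_write(9, "X", r_ts, w_ts)
```

The second call is the case that distinguishes this rule from basic timestamp ordering, which would abort the transaction instead of silently skipping the write.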
Validation Based Protocol

 The validation-based protocol is also known as the optimistic concurrency control technique. In the validation-based protocol, the transaction is executed in the following three phases:
1. Read phase: In this phase, the transaction T is read and executed. It reads the values of various data items and stores them in temporary local variables. It can perform all the write operations on temporary variables without updating the actual database.
2. Validation phase: In this phase, the temporary variable values are validated against the actual data to see whether serializability is violated.
3. Write phase: If the transaction passes validation, then the temporary results are written to the database; otherwise, the transaction is rolled back.
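The three phases can be sketched with a small class. This is a deliberately simplified illustration: the class `OptimisticTransaction` is hypothetical, and validation here just checks that every value read is still the current value, whereas real systems validate using timestamps or version numbers.

```python
class OptimisticTransaction:
    def __init__(self, database):
        self.database = database
        self.read_set = {}    # item -> value seen during the read phase
        self.local = {}       # item -> temporary value written locally

    def read(self, item):
        if item in self.local:
            return self.local[item]
        value = self.database[item]
        self.read_set.setdefault(item, value)
        return value

    def write(self, item, value):
        self.local[item] = value     # read phase: only the local copy changes

    def commit(self):
        # Validation phase: everything we read must still be current.
        if any(self.database[i] != v for i, v in self.read_set.items()):
            return False             # failed: local results are discarded
        # Write phase: copy the temporary results into the database.
        self.database.update(self.local)
        return True

db = {"X": 100}
t1 = OptimisticTransaction(db)
t2 = OptimisticTransaction(db)
t1.write("X", t1.read("X") - 50)
t2.write("X", t2.read("X") + 100)
t1_ok = t1.commit()     # validates and writes X = 50
t2_ok = t2.commit()     # fails validation: X changed since t2 read it
```

Because `t2` is rolled back instead of written, the lost update from interleaved writes does not occur.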
Multiple Granularity
