DBMS Unit-4 & 5
Concepts
Transactions
Isolation
Isolation means that the data being used by one executing transaction cannot
be used by a second transaction until the first one has completed.
Under isolation, if transaction T1 is executing and using the data item X,
then X cannot be accessed by any other transaction T2 until T1 ends.
The concurrency control subsystem of the DBMS enforces the isolation
property.
Durability
Durability means that once a transaction has committed, its updates survive
in the database even if the system fails afterwards; the recovery subsystem
of the DBMS ensures this property.
View Serializability
There may be schedules that are not conflict-serializable but still give a
consistent result, because the concept of conflict-serializability becomes
limited when the precedence graph of a schedule contains a loop/cycle.
In such a case we cannot predict whether the schedule is consistent or
inconsistent. As per the concept of conflict-serializability, a schedule is
conflict-serializable (that is, equivalent to a serial, consistent schedule)
if its corresponding precedence graph does not have any loop/cycle.
But what if a schedule's precedence graph contains a cycle, yet the schedule
gives the same consistent, accurate results that a conflict-serializable
schedule gives? View serializability handles such cases.
Method to Check the View-Serializability
of a Schedule
A schedule S1 is said to be view-equivalent to a schedule S2 if and only if:
1. Initial read: if a transaction reads the initial value of a data item in
S1, it also reads the initial value of that data item in S2.
2. Updated read (reads-from): if a transaction reads a value of a data item
that was written by transaction Tj in S1, it reads the value written by the
same transaction Tj in S2.
3. Final write: for each data item, the final write is performed by the same
transaction in both schedules.
In other words, two schedules are view-equivalent if every read operation
reads the result of the same write in both schedules, and the final value of
every data item is produced by the same transaction. A schedule is
view-serializable if it is view-equivalent to some serial schedule.
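The three conditions can be sketched as a small checker. This is a minimal sketch, assuming schedules are given as lists of (transaction, operation, data_item) tuples such as ("T1", "R", "X"); the schedule format and function names are illustrative, not from any standard library.

```python
from collections import Counter

def reads_from(schedule):
    """For each read, record which transaction's write it reads from
    (None means it reads the initial value of the item)."""
    last_writer = {}                        # data item -> last writer so far
    result = []
    for txn, op, item in schedule:
        if op == "R":
            result.append((txn, item, last_writer.get(item)))
        elif op == "W":
            last_writer[item] = txn
    return Counter(result)

def final_writes(schedule):
    """For each data item, the transaction performing the final write."""
    last = {}
    for txn, op, item in schedule:
        if op == "W":
            last[item] = txn
    return last

def view_equivalent(s1, s2):
    txns = lambda s: {t for t, _, _ in s}
    return (txns(s1) == txns(s2)                   # same set of transactions
            and reads_from(s1) == reads_from(s2)   # same reads-from relation
            and final_writes(s1) == final_writes(s2))  # same final writes

# S1: R1(X) W1(X) R2(X) W2(X);  S2: R2(X) W2(X) R1(X) W1(X)
s1 = [("T1", "R", "X"), ("T1", "W", "X"), ("T2", "R", "X"), ("T2", "W", "X")]
s2 = [("T2", "R", "X"), ("T2", "W", "X"), ("T1", "R", "X"), ("T1", "W", "X")]
print(view_equivalent(s1, s1))  # True
print(view_equivalent(s1, s2))  # False: reads-from and final writes differ
```

Note that comparing multisets of reads-from triples is an approximation adequate for this sketch; a full checker would match repeated reads of the same item per transaction in order.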
Process of testing view serializability
To test whether a schedule S is view-serializable, check whether S is
view-equivalent to some serial schedule of the same transactions using the
conditions above. In general this test is expensive (the problem is
NP-complete), which is why conflict-serializability is used in practice.
Log-Based Recovery
The log is a sequence of log records, recording all the update activities in
the database. Logs for each transaction are maintained in stable storage.
Any operation performed on the database is recorded in the log. Before any
modification is made to the database, an update log record is created to
reflect that modification. An update log record, represented as
<Ti, Xj, V1, V2>, has these fields:
1. Transaction identifier: Unique Identifier of the transaction that
performed the write operation.
2. Data item: Unique identifier of the data item written.
3. Old value: Value of data item prior to write.
4. New value: Value of data item after write operation.
Other types of log records are:
1. <Ti start>: It contains information about when a transaction Ti starts.
2. <Ti commit>: It contains information about when a transaction Ti commits.
3. <Ti abort>: It contains information about when a transaction Ti aborts.
Undo and Redo Operations – Because every database modification must be preceded
by the creation of a log record, the system has available both the old value of
the data item (prior to the modification) and the new value that is to be
written. This allows the system to perform redo and undo operations as
appropriate:
1. Undo: using a log record, sets the data item specified in the log record to its old value.
2. Redo: using a log record, sets the data item specified in the log record to its new value.
There are two approaches to modify the database:
1. Deferred database modification:
• The deferred modification technique occurs if the transaction does not modify the
database until it has committed.
• In this method, all the logs are created and stored in the stable storage, and the database is
updated when a transaction commits.
2. Immediate database modification:
• The immediate modification technique applies database modifications while the
transaction is still active.
• In this technique, the database can be modified immediately after every write
operation, before the transaction commits, so the log must record old values to
allow undo.
Recovery using Log records
After a system crash has occurred, the system consults the log to determine
which transactions need to be redone and which need to be undone.
Transaction Ti needs to be undone if the log contains the record <Ti start> but
does not contain either the record <Ti commit> or the record <Ti abort>.
Transaction Ti needs to be redone if the log contains the record <Ti start> and
either the record <Ti commit> or the record <Ti abort>.
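The undo/redo decision above can be sketched as follows. This is a minimal sketch, assuming log records are tuples of the form ("start", T), ("update", T, item, old, new), ("commit", T) or ("abort", T); the record format is an illustrative assumption, not a real DBMS log layout.

```python
def classify(log):
    """Redo Ti if the log has <Ti start> and <Ti commit>/<Ti abort>;
    undo Ti if it has <Ti start> but neither commit nor abort."""
    started, finished = set(), set()
    for rec in log:
        if rec[0] == "start":
            started.add(rec[1])
        elif rec[0] in ("commit", "abort"):
            finished.add(rec[1])
    return started & finished, started - finished   # (redo set, undo set)

def recover(log, db):
    redo, undo = classify(log)
    for rec in log:                       # forward pass: redo finished work
        if rec[0] == "update" and rec[1] in redo:
            _, _, item, old, new = rec
            db[item] = new
    for rec in reversed(log):             # backward pass: undo incomplete work
        if rec[0] == "update" and rec[1] in undo:
            _, _, item, old, new = rec
            db[item] = old
    return db

# Crash state: T1 committed, T2 never committed but its write reached disk.
log = [("start", "T1"), ("update", "T1", "A", 100, 90),
       ("start", "T2"), ("update", "T2", "B", 50, 60),
       ("commit", "T1")]
print(recover(log, {"A": 100, "B": 60}))  # {'A': 90, 'B': 50}
```

T1 has both start and commit records, so its write to A is redone; T2 has no commit or abort, so its write to B is undone by restoring the old value.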
Checkpoint
A checkpoint is a mechanism by which all the previous logs are removed from
the system and permanently stored on the storage disk.
The checkpoint is like a bookmark. During the execution of transactions,
such checkpoints are marked, and as the transactions execute, log records are
created for their steps.
When a checkpoint is reached, the logged updates are written to the database,
and the log records up to that point are removed from the log file. The log
file is then filled with the records of subsequent transaction steps until
the next checkpoint, and so on.
The checkpoint is used to declare a point before which the DBMS was in a
consistent state and all transactions were committed.
Recovery using Checkpoint
• The recovery system reads the log files from the end back towards the start
(e.g., from T4 back to T1 in a four-transaction scenario).
• The recovery system maintains two lists: a redo-list and an undo-list.
• A transaction is put into the redo-list if the recovery system finds both
<Tn, Start> and <Tn, Commit> in the log, or just <Tn, Commit> (when the
transaction started before the checkpoint). All transactions in the redo-list
are redone.
• A transaction is put into the undo-list if the log contains <Tn, Start> but
no <Tn, Commit> or <Tn, Abort>; such transactions are undone.
Deadlock in DBMS
A deadlock is a condition where two or more transactions wait indefinitely for
one another to give up locks. Deadlock is one of the most feared complications
in a DBMS, since none of the deadlocked transactions can ever finish; each
remains in the waiting state forever.
For example: In the student table, transaction T1 holds a lock on some rows and needs
to update some rows in the grade table. Simultaneously, transaction T2 holds locks on
some rows in the grade table and needs to update the rows in the Student table held by
Transaction T1.
Now the main problem arises: transaction T1 is waiting for T2 to release its
lock, and similarly T2 is waiting for T1 to release its lock. All activity
comes to a standstill and remains so until the DBMS detects the deadlock and
aborts one of the transactions.
Deadlock Detection
In a database, when a transaction waits indefinitely to obtain a lock, the
DBMS should detect whether the transaction is involved in a deadlock. The
lock manager maintains a wait-for graph to detect deadlock cycles in the
database.
Wait for Graph
• This is a suitable method for deadlock detection. A graph is created based
on the transactions and their locks: there is an edge from Ti to Tj if Ti is
waiting for a data item held by Tj. If the graph contains a cycle (closed
loop), then there is a deadlock.
• The wait-for graph is maintained by the system, with a node for every
transaction that is waiting for data held by another. The system periodically
checks whether there is any cycle in the graph.
For the student/grade example above, the wait-for graph contains the edges
T1 → T2 and T2 → T1, forming a cycle, so a deadlock is detected.
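The cycle check on the wait-for graph can be sketched as a depth-first search. This is a minimal sketch in which `wait_for` maps each transaction to the set of transactions it is waiting on; the names are illustrative, and a real lock manager would build this graph from its lock table.

```python
def has_deadlock(wait_for):
    """Return True if the wait-for graph contains a cycle (a deadlock)."""
    WHITE, GRAY, BLACK = 0, 1, 2          # unvisited / on current path / done
    color = {}

    def dfs(t):
        color[t] = GRAY
        for u in wait_for.get(t, ()):
            c = color.get(u, WHITE)
            if c == GRAY:                  # back edge: cycle found
                return True
            if c == WHITE and dfs(u):
                return True
        color[t] = BLACK
        return False

    return any(color.get(t, WHITE) == WHITE and dfs(t) for t in wait_for)

# T1 waits for T2 and T2 waits for T1: the cycle means a deadlock.
print(has_deadlock({"T1": {"T2"}, "T2": {"T1"}}))  # True
print(has_deadlock({"T1": {"T2"}, "T2": set()}))   # False
```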
Deadlock Prevention
In the wound-wait scheme, if an older transaction requests a resource that is
held by a younger transaction, the older transaction wounds (forces the abort
of) the younger one, which releases the resource. After a short delay, the
younger transaction is restarted, but with its original timestamp.
If a younger transaction requests a resource held by an older transaction,
the younger transaction is made to wait until the older one releases it.
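The wound-wait decision can be sketched as a single comparison. This is a minimal sketch, assuming a smaller timestamp means an older transaction; the function and return values are illustrative.

```python
def wound_wait(requester_ts, holder_ts):
    """Decide what happens when a transaction requests a lock that is held."""
    if requester_ts < holder_ts:
        return "wound"   # older requester: abort (wound) the younger holder;
                         # the holder is restarted later with its same timestamp
    return "wait"        # younger requester simply waits for the older holder

print(wound_wait(7, 9))  # "wound": requester (ts=7) is older than holder (ts=9)
print(wound_wait(9, 7))  # "wait": younger requester waits for the older holder
```

Restarting the wounded transaction with its original timestamp guarantees it eventually becomes the oldest transaction and cannot be wounded forever, so starvation is avoided.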
DBMS Concurrency Control
Concurrency Control is the management procedure that is required for
controlling concurrent execution of the operations that take place on a
database.
In a multi-user system, multiple users can access and use the same database
at the same time; this is known as concurrent execution. It means that the
same database is accessed simultaneously by different users on a multi-user
system.
While working with database transactions, multiple users often need to use
the database to perform different operations, in which case the transactions
execute concurrently.
This simultaneous execution is performed in an interleaved manner, and no
operation should affect the other executing operations, so that the
consistency of the database is maintained. However, concurrent execution of
transaction operations gives rise to several challenging problems that need
to be solved.
Problems with Concurrent Execution
In a database transaction, the two main operations are READ and WRITE. These
operations need to be managed during concurrent execution of transactions,
because if the interleaving is not controlled, the data may become
inconsistent. The following problems occur with concurrent execution of the
operations:
Problem 1: Lost Update Problems (W - W Conflict)
The problem occurs when two different database transactions perform read/write
operations on the same database item in an interleaved manner (i.e., in
concurrent execution), so that one transaction's update is overwritten and
lost, leaving the item's value incorrect and the database inconsistent.
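The lost update can be demonstrated with a short sketch of the interleaving R1(X), R2(X), W2(X), W1(X); the starting value and the increments are illustrative.

```python
# Shared data item X; T1 subtracts 30, T2 adds 50.
x = 100

t1_local = x         # T1: read(X)  -> 100
t2_local = x         # T2: read(X)  -> 100 (before T1 writes back)

t2_local += 50
x = t2_local         # T2: write(X) -> 150

t1_local -= 30
x = t1_local         # T1: write(X) -> 70, overwriting T2's update

print(x)             # 70, although either serial order would give 120
```

Because T2 read X before T1 wrote it back, T2's +50 is lost: a write-write (W-W) conflict that serial execution (in either order) would avoid.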
The concurrency control protocols ensure the atomicity, consistency,
isolation, durability and serializability of concurrently executing database
transactions. These protocols are categorized as lock-based protocols (such
as two-phase locking), timestamp-based protocols, and validation-based
protocols.
Because of the dirty reads performed by T2 and T3 (at lines 8 and 12 of the
schedule, respectively), when T1 fails we have to roll back the others also.
Hence, cascading rollbacks are possible in 2-PL.
Deadlock in 2-PL –
Consider a simple example with two transactions T1 and T2: T1 holds an
exclusive lock on item A and requests a lock on B, while T2 holds an
exclusive lock on B and requests a lock on A. Neither can proceed, so 2-PL by
itself does not prevent deadlock.
The 2-PL protocol gradually obtains locks and then gradually releases them
when they are no longer needed. The difference between basic 2-PL and strict
2-PL is that strict 2-PL holds a transaction's exclusive locks until the
commit executes: instead of releasing locks gradually one by one, strict 2-PL
releases them all at once at commit.
Following Strict 2-PL ensures that our schedule is:
Recoverable
Cascadeless
Hence, it gives us freedom from cascading aborts, which were still possible in
basic 2-PL, and moreover guarantees strict schedules; but deadlocks are still
possible!
Rigorous 2-PL –
This requires that, in addition to the locking being two-phase, all
Exclusive (X) and Shared (S) locks held by the transaction not be released
until after the transaction commits. Following Rigorous 2-PL ensures that our
schedule is:
Recoverable
Cascadeless
Hence, it gives us freedom from cascading aborts, which were still possible in
basic 2-PL, and moreover guarantees strict schedules; but deadlocks are still
possible!
Note: The difference between Strict 2-PL and Rigorous 2-PL is that
Rigorous is more restrictive, it requires both Exclusive and Shared locks
to be held until after the Transaction commits and this is what makes the
implementation of Rigorous 2-PL easier.
Conservative 2-PL –
this protocol requires the transaction to lock all the items it accesses
before the transaction begins execution, by predeclaring its read-set and
write-set. If any of the predeclared items cannot be locked, the transaction
does not lock any of the items; instead, it waits until all the items are
available for locking.
Conservative 2-PL is deadlock-free, but it does not ensure a strict schedule.
It is also difficult to use in practice because of the need to predeclare the
read-set and write-set, which is not possible in many situations. In
practice, the most popular variation of 2-PL is Strict 2-PL.
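The strict 2-PL behavior discussed above can be sketched with a toy lock manager. This is a minimal sketch with exclusive locks only; shared locks, wait queues, and deadlock handling are omitted, and all class and method names are illustrative assumptions, not a real DBMS API.

```python
class Transaction:
    def __init__(self, name):
        self.name = name
        self.locks = set()
        self.shrinking = False    # once True, no further locks may be acquired

class LockManager:
    def __init__(self):
        self.owner = {}           # data item -> transaction holding the X-lock

    def acquire(self, txn, item):
        if txn.shrinking:
            raise RuntimeError("2-PL violation: lock request in shrinking phase")
        holder = self.owner.get(item)
        if holder is not None and holder is not txn:
            return False          # conflict: the caller must wait
        self.owner[item] = txn
        txn.locks.add(item)
        return True

    def commit(self, txn):
        # Strict 2-PL: all locks are released together, only at commit.
        for item in txn.locks:
            del self.owner[item]
        txn.locks.clear()
        txn.shrinking = True

lm, t1, t2 = LockManager(), Transaction("T1"), Transaction("T2")
print(lm.acquire(t1, "X"))   # True: T1 gets the lock on X
print(lm.acquire(t2, "X"))   # False: T2 must wait; T1 holds X until commit
lm.commit(t1)
print(lm.acquire(t2, "X"))   # True: the lock was released at T1's commit
```

Because no lock is released before commit, no other transaction can read T1's uncommitted data, which is exactly what makes the resulting schedules recoverable and cascadeless.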
Timestamp Ordering Protocol
• The Timestamp Ordering Protocol is used to order transactions based on
their timestamps. The order of the transactions is simply the ascending order
of their creation times.
• The priority of the older transaction is higher, which is why it executes
first. To determine the timestamp of a transaction, this protocol uses the
system time or a logical counter.
• The lock-based protocol is used to manage the order between conflicting pairs among
transactions at the execution time. But Timestamp based protocols start working as
soon as a transaction is created.
• Let's assume there are two transactions, T1 and T2. Suppose transaction T1
entered the system at time 007 and transaction T2 entered at time 009. T1 has
the higher priority, so it executes first, as it entered the system first.
• The timestamp ordering protocol also maintains the timestamps of the last
'read' and 'write' operations on each data item.
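The basic timestamp ordering checks can be sketched as follows. This is a minimal sketch, assuming each data item carries the timestamp of its last read (r_ts) and last write (w_ts); the dictionary representation and function names are illustrative assumptions.

```python
def to_read(ts, item):
    """Basic TO: T may read X only if no younger transaction has written X."""
    if ts < item["w_ts"]:
        return False                       # abort and roll back T
    item["r_ts"] = max(item["r_ts"], ts)
    return True

def to_write(ts, item):
    """Basic TO: T may write X only if no younger transaction has read or written X."""
    if ts < item["r_ts"] or ts < item["w_ts"]:
        return False                       # abort and roll back T
    item["w_ts"] = ts
    return True

x = {"r_ts": 0, "w_ts": 0}
print(to_write(5, x))   # True: W_TS(X) becomes 5
print(to_read(3, x))    # False: T(ts=3) would read a value from its "future"
print(to_read(7, x))    # True: R_TS(X) becomes 7
print(to_write(6, x))   # False: a younger transaction (ts=7) already read X
```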
Thomas Write Rule
The Thomas Write Rule improves the Basic Timestamp Ordering Algorithm by
ignoring obsolete writes, and it permits schedules that are view-serializable
but not conflict-serializable.
The basic Thomas write rules are as follows:
• If TS(T) < R_TS(X), then transaction T is aborted and rolled back, and the
operation is rejected.
• If TS(T) < W_TS(X), then do not execute the write_item(X) operation of the
transaction and continue processing: the write is obsolete and can safely be
ignored.
• If neither condition 1 nor condition 2 holds, then the WRITE operation is
executed by transaction T and W_TS(X) is set to TS(T).
If we use the Thomas Write Rule, then some serializable schedules can be
permitted that are not conflict-serializable, as illustrated by the schedule
in the accompanying figure.
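The three rules can be sketched directly; note how the middle case differs from basic timestamp ordering, which would abort instead of ignoring the write. This is a minimal sketch assuming the same r_ts/w_ts item representation as in the timestamp ordering discussion above; the names are illustrative.

```python
def thomas_write(ts, item):
    """Apply the Thomas Write Rule for a write on item X by T with TS(T)=ts."""
    if ts < item["r_ts"]:
        return "abort"        # rule 1: a younger transaction already read X
    if ts < item["w_ts"]:
        return "ignore"       # rule 2: obsolete write, skipped (basic TO aborts)
    item["w_ts"] = ts         # rule 3: perform the write and update W_TS(X)
    return "write"

x = {"r_ts": 2, "w_ts": 5}
print(thomas_write(1, x))  # "abort": TS(T)=1 < R_TS(X)=2
print(thomas_write(3, x))  # "ignore": 3 < W_TS(X)=5, so the write is obsolete
print(thomas_write(7, x))  # "write": W_TS(X) becomes 7
```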
Validation Based Protocol