Slides

S1: Ch#18: Concurrency Control (transaction schedulers)

Learning Goals:
- Compare concurrency control in general
- ...and those to ensure ACID properties
- Understand data base solutions
- ...Locking, Time-Stamping, Multiversion, Certification
- Additional Issues:
- ...Item Granularity, Phantom Records, Interactive Transactions
Chapter 18: Concurrency Control Techniques (CCTs) + ?Serializabe
- 18.1 2PL (lock types, basic/2PL/strict, deadlocks)
- 18.2 Timestamp Ordering (timestamps, ordering algo)
- 18.3 Multiversion CCT (w time-stamps, 2-phase)
- 18.4 Validation (Optimistic) basec CCT
- 18.5 Granularity of Data Item
- 18.6 CCT for Indexes : B-link trees
- 18.7 Other Issues (phantom records, interactive transactions)

S2: Concurrency Control

Def.: Protocols, Algorithms, Techniques to avoid problems
- ...resulting from concurrent operations on shared data
- Problems of lost update, incorrect summary, phantom records, ...
A Well-known problem in O.S., Client-Server, Shared Resource, ...
Ex. Concurrent processes/threads in Operating Systems
- Mutual Exclusion Protocols: semaphores, critical sections, ...
Q.1.? What is different about CC of transactions?
Q.2.? Consider 3 problems in Fig. 18.3 (pp. 589)?
- 2a.? Will semaphores solve these problems ?
- 2b.? Will locks solve these problems ?

S3: A Naive Concurrency Control Technique

CCT 0: Conflict Graph Scheduler (Using Alg. 17.1)
- Input: a schedule of actions from concurrent Ts in natural order
- Output: a serializable schedule of same set of actions
- ...Output is a permutation of input
- Maintain a conflict graph among active transactions,
- ...Update CG before every action in the input schedule
- ...If the next action in input schedule leads to a cycle,
- ...then postpone action or abort Ti to avoid cycle.
- ...Add implementation detail to schedule postponed actions
- Cycle checking is O(n*n), n = number of actions,
- ...and n*n = max. #conflicts.
Q.3.? Why not use Conflict Graph Scheduler w/Alg. 17.1?

S4: Hints for Exercises

Ans. 1: ACID property, serializability
Ans. 2: 2 decisions: number of semaphore, critical section (CS)
- A. 1 semaphore, CS = whole transaction
- ...= > serial schedules, very little concurrency
- B. 1 semaphore/data-item, CS = all read-write per data-item
- ...= > temporary update problem (Fig. 17.3(b)
- ...Serializability not guaranteed. (analogy Fig. 18.3 (c))
Recall ACID properties, Operation Conflict Semantics, Commit/Abort
- Failure atomicity = > tentative actions,
- ...e.g. Ti aborts = > Tj may abort too!
- Isolation: Ti should only see committed wj(X)
- Concurrent reads okay, but write can't overlap with other ops.
Ans. 2b. Active learning exercise.
Ans. 3: Alg. 19.1 is not fast enough as discussed before
- Expensive, O(n*n) cost before each operation, n = #active ops
- Arrival/Completion of a Xaction/operation during the algorithms

S5: Common Concurrency Control Techniques (CCTs)

General Idea behind various CCTs
- Goal: Quickly ensure serializability
- Fast filter to avoid violation of serializability
- ...may rule many possible serializable schedule
- Prescribes a Protocol to be observed by each transaction
- ... < current state, new op. request > - > grant/delay op, abort T
Simple Techniques
- 2 phase locking (2PL) protocol for transactions
- ...Schedule: order Ti by the order of lock acquisition
- TSO: item-based violation test at read/write
- ...Schedule: order Ti by their time-stamps
Other schemes
- Optimistic, 3 phases (read, validate, write)
- ...Validate phase = conflict? w/committed/validated Tj
- Multiversion - uses version semantics

S6: Lock Based CCT : Basic Definitions

LOCK[X] = a variable describing status of data item X
- ...wrt possible applicable operations without conflict.
Lock(X, op)= permission to Ti to perform op on X
- NOTE Convention: LOCK(X) = variable, Lock(X) = permission
Operations on LOCK(X) requested by Ti to Lock Manager
- ...lock_item(X), lock_item(X, read), unlock(X), ...
- Protocol: Ti needs Lock(X, op) to perform op(X)
- Lock Manager: (operation on LOCK(X)) -- >
- ...{ grant, postpone, deny }
- ...See Fig. 18.1, 18.2 (pp 585-7) for algorithms
Types of locks
- binary locks: (value = 0 or 1)
- ...ops.: lock_item(X), unlock_item(X))
- ...See protocol rules on pp 585
- Read/Write locks: (values = no_lock, read_lock, write_lock).
- ...read_lock(X) is shared: Multiple read locks can be allowed on X.
- ...write_lock(X) exclusive, no other concurrent lock on X

S7: Lock based CCT: 2 PL protocol

Two-phase Locking (2PL): Ti follows 2PL if
- ...all locking ops precede the first unlock op in T
Two Phases relate to the text/execution of T
- Expand / Grow the set of permissions (Locks)
- Shrink: Release the Locks
Ex. : Does 2PL solve problems in Fig. 17.3 (pp. 556)?
Thm. If all Ti in a schedule S follows 2PL, then S is serializable
- (:-) Eqv. serial schedule= orders Ti(s) by time of getting all locks
No need to test serializability, But limits Concurrency
- Disallows/Rearranges a few serializable schedules
- Conservative: expects heavy contention
Variation of 2PL:
- To get Strict Schedules (see 17.4.2), Use Strict 2PL
- ...Strict 2PL if T commits or aborts before any unlock() operations.
- ...Strict 2PL does not avoid deadlocks.

S8: Lock based CCT: Deadlock and Livelock Problems

Livelocks or Deadlocks - possible with 2PL
Livelock and Starvation
- Priority based on age (timestamp of first submission)
Solution 1: Prevent Deadlocks
- Conservative 2PL if T acquires all locks before T starts execution.
- No deadlocks, if all locks requested together (no hold & wait).
Sol. 2: Detect and Resolve Deadlocks
- Wait-for graph (nodes = Ts, edge = wait-for dependency)
- ...cycle in WFG = > deadlock
- Break deadlock by aborting Ti to break cycles in WFG
Sol. 3: Timeouts: Ti aborts if all locks are not granted within timeout!
- ...wait-die (new time-stamp) or wait-wound (no change in time-stamp)

S9: CCT 2: Timestamp Ordering

IDEA: Make schedule eqv. to serial schedule (FCFS),
- ...defined by the ordering of time-stamps of transactions.
Thus, Abort Ti operate on data-item X, such that
- Ti reads /writes X with write_TS(X) > TS(Ti)
- OR Ti writes X with read_TS(X) > TS(Ti)
- WHERE
- ...TS(Ti) = start time of a transaction Ti
- ...read_TS(X) = largest TS(Tj) for any Tj that has already read X.
- ...write_TS(X) = largest TS(Ti), any Ti has already written X.
Ex. : Does TSO solve problems in Fig. 17.3 (pp. 556)?
Thm. Timestamp Ordering ensures serializability
- Note: Potential conflict detected before operation and avoided.
- Conflict eqv. schedule = order Ti in S by their timestamps.
Comparison of TSO, 2PL, CGS
- 2PL/TSO limit concurrency: rules out some serializable schedules
- 2PL and TSO produce different subsets of CGS

S10: Exercise 1 (Divide among groups).

T1= R1(X), W1(X), R1(Y), W1(Y) :: T2= R2(X), W2(X), R2(Y), W2(Y)
- ...S1= R1(X), W1(X), R2(X), W2(X), R1(Y), W1(Y), R2(Y), W2(Y)
- ...S2= R1(X), W1(X), R1(Y), R2(X), W1(Y), W2(X), R2(Y), W2(Y)
- ...S3= R2(X), W2(X), R2(Y), R1(X), W1(X), W2(Y), R1(Y), W1(Y)
- ...S4= R2(X), W2(X), R1(X), W1(X), R2(Y), W2(Y), R1(Y), W1(Y)
- ...S5= R2(X), W2(X), R1(X), W1(X), R1(Y), W1(Y), R2(Y), W2(Y)
Q1. Determine output of conflict graph scheduler for S1, ..., S5?
Q2. Determine output of Time-stamp scheduler for S1, ..., S5?
- ...Assume max(TS(X), TS(Y), TS(Z)) < TS(T1) < TS(T2)
Q3. Determine output of basic 2PL lock manager for S1, ..., S5?
- Use binary lock for simplying the work.
A1: ok: S1/S2 T1 - > T2; S3/S4: T2 - > T1;
- ...not ok S5: T1 < = > T2 (cycle)
A2: ok: S1/S2; not ok S3, S4 (T2 - > T1), S5 (cycle)
- Test violated for S3 @R1(X), S4 @R1(X), S5 @R1(X)
- Insight: compare each arrow in CSG with TSO (T1 - > T2)
A3: ok: S1/S2/S3/S4; not ok: S5 (cycle)
- violated for S5 @R1(Y)
- ...T2 locked Y before unlocking X for T1.
- ...T2 can not unlock Y yet = > R1(Y), W1(Y) wait = > S5 rearranged
- S4= L2(X), R2(X), W2(X), L2(Y), U2(X), L1(X) R1(X), W1(X),
- ...R2(Y), W2(Y), U2(Y), L1(Y), R1(Y), W1(Y), U1(Y), U1(X)
Self: Q? Regroup exercise by techniques rather than by schedules?

S11: Other CCTs: Versions, Optimistic

18.3 Multiversion CCT
- Uses view serializability, not conflict serializability
- Wi(X) writes a new version of X, old version of X remains
- Some conflicting ri(X) can read older versions
- Can be based on timestamps or 2PL
- ...Oracle: timestamped-versions for read and 2PL for write
18.4 Optimistic Validation Based CCT
- Idea: do all the checking at one time (commit time)
- ...No side-effect to database before validation
- Three phases: read + compute, validate, write
- Abort Ti if conflict with committed / validated Xactions
- Wins if there is small degree of conflicts across concurrent Ts
- Too many aborts if too many conflicts

S12: CCT for B-Tree indexes

Motivation
- 2 PL is correct but too restrictive
- Ex. page-locks for insert(12) in Ex. 14.12 (pp. 480)
B-link tree locking protocol
- Link nodes at each level in B-tree
- allows more concurrency
B-link Tree locking Protocol
- Index operations: point search, insert, delete
- ...insert - binary locks, locks one page at a time (no 2PL)
- ...point search - no locks, search down children or right sibling
Ex: concurrent(search(9), insert(12)) with Ex. 14.12
- search(9) starts at step 2, checks root, suspends
- ...insert(12) changes tree to that in step 4
- ...search(9) restarts follows child then sibling
Example: concurrent(insert, insert)
- Ex. 14.12 - insert(12) == > insert(2, 12)
- insert(12) finds rightmost leaf, locks it, splits it
- insert(2) finds leftmost leaf, locks it, splits it
- One of those get lock(parent = root) at a time!

S13: Other Issues: Granularity, Phantom Records, ...

18.5 Granularity/Size of Data Items
- Ex. field, record, page, file, index, ....
- Trade-off: degree of concurrency vs. size of lock table
- Smaller data item = > more concurrency, but slower lock manager
- Useful to match locking granularity with access granularity
18.7 Problem of Phantom Records
- Problem: Conflict among Insert(), Delete(), read(), write()
- Ex. T2: select e from EMP where e.dno = 5
- ...T1 insert(e in EMP with e.dno = 5) = > phantom record!
- ...Conflict in locking index entry (e.dno = 5), = > no phantom record.
- Solution 2: Predicate locking
Interactive Transatcion T
- Problem: If a user sees and uses a uncommitted value during T
- ...rollback / abort of the T may not be possible.
- Solution:Do not show uncommitted values to Users

S14: Example 2 (CCTs 1 and 2)

Large example, Divide among 9 groups by CCTs or schedules
Consider the 3 transactions T1, T2, T3 and
- 3 input action schedules S1, S2, S3 from Fig. 17.5, pp 567.
- For analyzing basic 2PL: Assume Ti obey basic 2PL
- ...???+ Ti acquires BINARY LOCK(A) just before 1st Ri(X)/Wi(X)
- To analyze Timestamp ordering, Assume
- ...max(TS(X), TS(Y), TS(Z)) < TS(T3) < TS(T1) < (T2)
Q1. Determine output of conflict graph scheduler for S1, S2 and S3?
Q2. Determine output of basic 2PL lock manager for S1, S2 and S3?
Q3. Determine output of Time-stamp scheduler for S1, S2, and S3?
Transactions:
- T1 = R1(X), W1(X), R1(Y), W1(Y)
- T2 = R2(Z), R2(Y), W2(Y), R2(X), W2(X)
- T3 = R3(Y), R3(Z), W3(Y), W3(Z)
Explanation of actions of TSO scheduler is harder
- Either use table to trace time-stamp values
- Or draw conflict arrows over schedule, and check for each arrow!

S15: Example 2 (Contd.)

Schedule S1= R3(Y), R3(Z),R1(X), W1(X),W3(Y), W3(Z),R2(Z),
- ...R1(Y),W1(Y) ,R2(Y), W2(Y), R2(X), W2(X)
Schedule S1: serializable (see Fig. 17.8f, pp 651.)
- serializable(S1) = > Conflict graph scheduler lets it pass
- Timestamp ordered(S1) = > TSO schedule lets it pass
- May be okay with basic 2PL, but not strict/conservative 2PL
- ???? may Violates basic 2PL on R2(Z),
- ...if T2 write_lock(Y), before R2(Z).
- ...Possible output (i) Serial schedules T3 - > T1-_ > T2
- ...OR (ii) serializable schedule where T2 is after last action in T1.
S2 = R3(Y), R3(Z), W3(Y), W3(Z),R2(Z), R2(Y), W2(Y),
- ...R2(X), W2(X),R1(X), W1(X), R1(Y), W1(Y)
Schedule S2: a serial schedule in order T3 - > T2 - > T1.
- Conflict graph scheduler: let it pass
- 2-phase locking scheduler: let it pass
- TS Scheduler: Ordering of Ti != time-stamp ordering
- ...Possible output: a serial schedule (T3 - > T1 - > T2) or
- ...an equivalent serializable schedule S1.

S16: Example 2 (Contd.)

S3 = R2(Z), R2(Y), W2(Y),R3(Y), R3(Z),R1(X), W1(X),
- ...W3(Y), W3(Z), R2(X), R1(Y),W1(Y), W2(X)
Schedule S3: not serializable, (see Fig. 19.8d, pp 651.).
- Thus no scheduler will allow this schedule.
- CGS:output any serializable schedule (e.g S1, S2)
- ...Conflict detected at R1(Y), postpone it to end with W1(Y),
- ...scheduling R1(Y) after W2(X) leads to cycle = > abort T1
- ...Cascaded roolbacks = > abort T2 and T3 as well!
- ...Resubmit T1, T2 , T3, Hope for a better interleaving!
- TSO: Outputs a serializable S eqv. to (T3 - > T1 - > T2), e.g. S1
- 2PL: May give all locks to T3 first, then to T1, finally to T2
- ...i.e. serial schedule T3- > T1- > T2 or eqv. serializable schedules.
- ...Watch for deadlock resolution and order of lock() requests
Schedule S4= R3(Y), R3(Z),R1(X), W1(X),W3(Y), W3(Z),R2(Z),
- ...R2(X), W2(X), R1(Y),W1(Y) ,R2(Y), W2(Y)
- 2PL will fail ????