S1: Ch. 17: OLTP (OnLine Transaction Processing) Concepts
- Learning Goals: Data management in OLTP Applications
- OLTP Applications and their Characteristics
- Examples
- Units of work, Thru'put, concurrency
- Performance issue (1000 TPS) : out of scope
- Correctness issue : database consistency
- ...despite bad interleavings, failure (partial execution)
- Architecture of OLTA
- Basic Concepts
- Simplified Model of transactions
- ...Concurrency = > Correctness issue (bad interleaving, failures)
- Simplified Model of TP Monitor
- Desirable Properties of Transactions: ACID
- Schedules, recoverability, serializability (test, usage)
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S2: OLTP Applications and their Characteristics
- Transaction = a business deal (e.g. a sale)
- Transaction Processing = book-keeping about transactions
- $60B industry [G. Held of Tandem, 93]
- E-commerce: amazon.com, e-bay, ...
- Airlines: researvation system
- Banking: ATM, electronic fund transfer, Home banking
- Securities: NY Stock Exchange, NASDAQ: 1-2 Billion/day
- ...Chicago board of trade (futures) - 200 Million/day
- Telecommunications: call billing, operator services
- Point-of-sale and Retail: credit cards/checks, sales
- Manufacturing: Just-In-Time inventory control
- Student Access System, SAS, (500,000 calls/week), ..
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S3: Characteristics:
- Consistent on-line database
- ...despite failures, interleavings etc.
- Small work units - e.g. transfer, deposit, withdaraw
- High throughput = #transactions per second (TPS)
- High Concurrecncy = #units overlapping a given unit
- Ex. List work units, throughput and concurency for NYSE and SAS:
- NYSE: 1B/day, 5 sec. each :: > 10K TPS, 50K overlaps/unit
- SIS: 100K/day, 1 minute each:: > 1 TPS, 50 overlaps each unit
- List the concurrency levels supported by
- Operating systems e.g. Unix, Windows NT
- Web servers (e.g. IIS, Netscape, Apache) for CGI
- Database servers e.g. Oracle, SQL server, DB2, Sybase, Informix
- Transactions vs. Threads vs. Processes
- List the Transactions per minute rating of modern databases
- visit http://www.tpc.org/default.asp
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S4: Units of Work (transactions)
- Relationship to organizational workflow and forms
- Decompose work into special units
- Often a form represents one unit of work
- Ex. Bank: Forms for deposit, withdraw, transfer, report, ...
- ...Point-of-sale: purchase, return, ...
- ...Airline reservation: reserve, cancel, ...
- Informal properties of Work Units
- Small: 1 - 30 I/Os and 10K-1M instructions
- Independent: open a file, update a record, close the file
- Single function: e.g. build B+tree index to a file
- Recoverable: failure between a bank transfer
- Q? Which of the following are individual transactions?
- A visit to an ATM / bank , A purchase at a retail store
- a SQL query, an ODBC call, A C program w/ embedded SQL
- Types: on-line vs. batch; conversational vs. non-conversational.
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S5: 17.1-2 Simplified Model of Transactions
- Transaction = execution of a user program accessing database
- multiuser database, interleaved concurrent user programs
- Transaction: operations and states
- 17.1.2 Database Operations: Ex. Fig. 17.2, pp 555
- ...read_item(X): read database item X into program variable X
- ...write_item(X), begin-trans, ....
- 17.2 Transaction states and other operations (Fig. 17.4, pp 560)
- ...States: active, partially committed, committed, failed, terminated
- ...Ops: begin/end transaction, commit, abort, read/write
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S6: 17.3-4 Characterizing Transactions and Schedules
- Desirable Properties (ACID)
- Atomicity: performed all actions or none
- Consistency Preserving: moves database across consistent states
- Isolation: T's updates are not visible to other Ts before commit
- Durability: committed changes to DB are permanent
- Schedule of concurrently exceuting interleaved transactions
- Def. schedule S (T1, ..., Tn) = an ordering of operations of Ti, s.t.
- ... order of operations of Ti in S = order of operations in Ti, forall Ti.
- ordering could be total or partial.
- Ex. Fig. 17.3(a), (b) :: shorthand ri(X), wj(Y), ci, aj, ...
- Committed projection: C(S) = operations from committed Ts
- Q1? How many possible schedules for following transaction sets
- T1: r1(X), w1(X) ; T2: r2(X), w2(X)
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S7: 17.4.1 Exercises
- Q1? How many possible schedules for
- Xactions in (a) Fig. 17.2 (pp. 555)?
- Xactions in (b) Fig. 17.8 (pp. 573)
- Q2? Count number of schedules for T1, ..., Tn,
- ...Where Ni = number of operations in Ti
- ...and N = N1 + N2 + ... + Nn
- Q3? Design a representation for the space of (partial) schedules
- Q4? Design a algorithm to enumerate the (partial) schedules
- A1: (a) (9!)/(6!*3!) = 84 ::: (b) (13!)/(5!*4!*4!) = 90900
- A2: The ordering of Ni actions within Ti preserved = >
- ...lose (Ni!) permutations for each valid schedule
- #schedules S(T1, ..., Tn) = N! / (N1! * N2! * ... Nn!)
- ... = #permutations / ((#permutation in T1)*...*(... in Tn))
- A3: Representation = Tree
- Nodes at level k = { Sk : partial schedule of length k }
- ...Root = empty schedule, leafs = complete schedules
- Edge from a S(k) to a S(k+1) : Max.#Children = #Ts = n
- A4: General idea = tree traversal
- BFS - take a S(k) and generate a S(k+1) for each non-empty Ti
- ...Start = < empty schedule, remaining actions of T1, ..., Tn >
- DFS - recursion, a recursive call for each non-empty Ti
- ...Base case = no remaining actions
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S8: Components of the TP Monitor, and Transactions
- Three Components
- -- > [ Transaction Scheduler ] -- > request grant/deny/delay
- -- > [ Resource/Data Manager] -- > carry out actions tentatively
- -- > [Commit Manager] -- > abort/commit tentative actions
- Debit-Credit Xact [A = account#, B= branch#, T = teller#]
- try{read 100 bytes from terminal;
- start_trans();
- ...A.account_balance =+ delta;
- ...rewtrite new balance to account A;
- ...log A, T, B, delta, timestamp
- ...T.teller_balance =+ delta;
- ...B.branch_balance =+ delta
- commit_trans(); /* onerror(){ put in batchfile / abort(); } */
- write 200 bytes to terminal;}
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S9: 17.1 Need for Concurrency Control and Recoveraby
- 17.1.3 Why Concurrency Control is needed?
- Problems due to uncontrolled concurrent exceution:
- ...Lost update [Fig. 17.3(a), pp 556-7]
- ...Incorrect summary [Fig. 17.3(c), pp 557]
- 17.1.4 Why Recovery is needed?
- Temporary update or dirty read problem [Fig. 17.3(b), pp 556-7)
- ...When T1 fails, T2 may have to be rolled back
- Avoid partial execution due to failures
- ...Atomicity: either all actions in a transaction are completed or
- ...it has no effect whatsoever on database or other transactions
- Failure types (pp 558): Catastrophe, system(hw / disk / OS),
- ...transaction error/exception, rollback due to concurrency control
- 17.2: Use system log, commit points
- Q1? Consider all schedules for following transaction sets
- T1: r1(X), w1(X) ; T2: r2(X), w2(X)
- Which schedules have lost update problem?
- Which schedules have Temporary update or dirty read problem?
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S10: 17.4.2 Recovery and Rollbacks
- Recovering from failure may lead to 3 kinds of rollbacks
- A. Undo actions of an uncommitted transaction
- B. Rollback of a committed transaction
- C. Cascased rollback of multiple committed transactions
- Why many applications avoid B or C ?
- B. ATMs - some actions (e.g. dispensed cash) can not be rolled back
- C. lead to slow recovery
- How to avoid B ?
- Strict schedule
- "Recoverable without rollback" schedule
- How to avoid C ?
- Note: some systems may allow other schedules
- Recoverable with cascaded rollback
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S11: 17.4.2 Recoverability Properties of Schedules
- reads_from(Ti, Tj, S)= Ti reads from Tj in S iff
- ...(wj(X) < ri(X)), (not(aj < ri(X))), (no wk(X) b/w wj(X) and ri(X) in S)
- 3 levels of recoverability of Schedule S
- Strict_schedule(S) iff (wj(X) < cj < ri(X)), (wj(X) < cj < wi(X))
- ...no concurrent read/write on any data item!
- ...recovery is easy = > restore before image of X
- "Recoverable" without any rollback
- ...iff forall Ti,Tj in S: reads_from(Ti,Tj,S) = > (cj < ci)
- ...Assumes abort(Tj) can not rollback commited transactions.
- Recoverable without any cascaded rollback
- ...Cascade_less(S) iff reads_from(Ti,Tj,S) = > (cj < ri(X))
- ...No dirty read (strong condition to avoid cascaded rollbacks)
- Ex. Identify the level of recoverability for following schedules
- Sa: r1(X), w1(X), r2(X), r1(Y), w2(X), w1(Y), c1, c2
- Sb: r1(X), w1(X), r2(X), r1(Y), w2(X), w1(Y), c2, c1
- Sc: r1(X), w1(X), r2(X), r1(Y), w2(X), c2, a1
- Se: r1(X), w1(X), r2(X), r1(Y), w2(X), w1(Y), a1
- Sg: r1(X), w1(X), r1(Y), w1(Y), c1, r2(X), w2(X), c2
- Strict: Sg
- Recoverable without rollback: Sa
- Recoverable without cascaded rollback: Sg
- Sc needs to rollback committed transaction T2
- Se needs to rollback T2 when T1 aborts
- Ex. Which of the following problems are present in each
- Lost update problem, temporary update problem
- Dirty read problem, non-repeatable read, phantom
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S12: 17.5 Concurrency Control and Serializability
- Uncontrolled Concurrency can lead to
- ...Lost update [Fig. 17.3(a), pp 556-7]
- ...Incorrect summary [Fig. 17.3(c), pp 557]
- How to avoid lost update and incorrect summary problems?
- Use Serial schedules
- ...serial schedules are correct by definition
- ...however, may severely limit degree of concurrency
- ...unlikely to get 1000 transactions per second
- Use serializable schedules
- ...schedules whose results are same as some serial schedules
- Q? How can we test for serializable schedules?
- Why is serializability interesting?
- It is the characterization of "correctness", legal interleavings
- Serial, Non-serial and serializable schedules (Fig. 17.5, pp 567)
- Serial S: no interleaving, i.e. all actions Ti together in S
- Nonserial S: interleaving, for some Ti, all actions are not together
- Serializable S, if S is eqv. to some serial schedule of same Ts
- ...used as correctness measure
- Equivalence of schedules S1, S2
- Result eqv.: if they produce the same final state of DB
- ...not enough for correctness, see Fig. 17.6 (pp 569)
- Conflict eqv. order of any two conflicting operation is identical
- ... < Wi(X),..., Wj(X > or < Wi(X), ...,Rj(X) > or < Ri(X), ...,Wj(X) >
- Other notions of serializability: View Eqv. (17.5.4)
- Conflict Serializable(S) iff conflict-equivalent(S, a serial schedule)
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S13: 17.5.2 Testing Conflict serializability (Algo. 17.1, pp 570)
- Step 1. Construct precedence graph
- Nodes = transactions T1, T2, ..., Tn
- Edges = conflists (read/write, write/read, write/write)
- ...(Ti -- > Tj) if before(wi(X),rj(X)) in S
- ...(Ti -- > Tj) if before(ri(X),wj(X)) in S
- ...(Ti -- > Tj) if before(wi(X),wj(X)) in S
- Step 2. No cycle in predence graph(S) IFF conflict serializable(S)
- Testing the algorithm
- Revisit Fig. 17.3(a) and 17.3(c), pp. 556-7
- Should these schedules be conflict serializable?
- Use test to check if these schedules are conflict serializable.
- Q? Is this algorithms correct?
- Proof: No cycle = > construct an eqv. serial schedule
- Revisit schedules Sc, Sd, ..., Sg on Recoverability slide.
- Which of these are conflict serializable?
- Does conflict serializable imply recoverability?
- Does recoverability imply conflict serializability ?
- Q?Can this algorithme be used in OLTP?
- Q? Can interleaving be scheduled? When does a schedule begin/end?
- ...What a new Tk arrives during the algorithms run?
- Practices: Use simple protocols like locking (Ch. 20)
- ...Protocol for Ti, checks local to Ti/data-items, (not global).
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S14: 17.6 Transactions in SQL
- Syntax (Example: pp. 655-6)
- Each SQL statement is a transaction
- Begin_Transaction : implicit (ie no syntax)
- End_Transaction : commit, rollback
- Concurrency control: Set Transaction
- ...Access Mode : read only, read write
- ...Diagnostic area size: number of exceptions cached
- ...Isolation level: Table 17.1 (pp. 578)
- Violations
- V1: Dirty read - reads uncommitted values
- V2: Nonrepeatable read -
- ...Ti: ri(X), ri(X) = > different values of X
- V3: Phantoms - scans part of table without predicate lock
- Isolation level
- SQL2 serializable = > V1, V2, V3 not allowed
- Repeatable read = > V1, V2 not allowed
- Committed Read = > V1 not allowed
- Uncommitted Read
- Q? Identify the schedule property for each Isolation level:
- strict, cascade_less, recoverable with cascaded rollback
- serial, non-serial but serializable, non-serializable
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S15: Exercises
- Ex. Consider the schedule in Fig. 17.5 (pp 567)
- Classify schedules in Fig. 17.5 as serial, serialable, nonserial
- Find pairs of eqv. schdules under (i) result eqv., (ii) conflict eqv.
- Ex. Fig. 17.5 : check schedules C and D for serializability.
- Solution and conflict graphs in Fig. 17.7 (pp 571)
- Ex. Fig. 17.8(a-c) (pp 573) check serilizability of schedules E anf F
- Solution and graphs in Fig. 17.8 (d-f)
- Q? Compare SQL-serializable and conflict-serializable
- Q? Define ACID properties for ideal transactions.
- Which ACID properties are guaranteed by SQL2 transactions?
- Q? Compare DBMS transactions with OS threads.
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S16: Components of an OLTP Application
- [User] -- [Forms, Menus, UI] -- input messages -- >
- -- > [Client program: group messages, add actions* ] -- >
- ...extra actions for logging, commit etc. (see TPC-A debit-credit )
- -- > transaction, i.e. request for a group of actions on DB/files -- >
- -- > [ TP monitor : forward/deny/ delay requests ] -- > sequence of actions
- ...check configuration spec, other active transactions
- -- > [Server routines: carry out actions ] -- > actions against DB/files
- Other Components:
- Forms management system
- Configuration specification facility
- Application configuration= system catalog, max. #processes, max #transactions, SubItem%...message queue (maxsize), max message size, security info. etc.
- Ex. Identify the following components in a banking ATM / SIS/ ...
- Forms/Menus - welcome, user identification, command-list, ...
- Transactions, i.e. message groups - withdraw, deposit, transfer, ...
- Additional actions added: start_trans, log, commit/abort, end_trans
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S17: Architecture of OLTA
- Four pieces: client, server, config. spec., TPM
- Architecture 1: Single process single thread: one process / user
- + simple secure TP, - overhead of memory/context switch
- concurrency control/recovery by OS/file system/DB
- Architecture 2: Single process/many threads
- One process for all users, one thread/user
- All I/O request via TPM, TPM does context switch for sync. I/O
- Complex TPM but less overheads
- DB transaction and Client program/TPM transaction may differ!
- ...in unit of lock granulairuty / recovery
- Architecture 3: Client / server
- Client may be single threaded or multi-threaded
- Server may be single threaded or multi-threaded
- Middleware: name server, message queues, IPC, ...
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)