Slides

S1: Ch. 19: Recovery In OLTP

Learning Goals:
- Learn about failures
- Learn about recovery techniques
- How to preserve ACID properties
Topics:
- 19.1 Recovery Concepts - outline, system concepts, rollback
- 19.2 Methods w/ Deferred Update - REDO
- 19.3 For Immediate (anytime) Updates - REDO/UNDO
- 19.4 Shadow Paging
- 19.5 ARIES Recovery Algorithm
- 19.6 Recovery in Multidatabase Transaction
- 19.7 Backup and Recoery from Catastrophic Failures

S2: Failure Types and Implications for OLTP

Reliable machines fail once a week
Harware Component failures
- Media Failures on Disk Drives
- ...Transient Error in read/write, no loss of data
- ...Hard failure, e.g. head crash, loss of data
- CPU Failures, Power outage = > data in inconsistent state
- ...Fail-stop processors assumption, no malicious behaviour
- ...Malicious failure causing unpredictable behaviour
Software failures
- Planned: Abort transaction due to CCT in OLTP, exceptions
- Others: design error, requirement errors
Complex failures - fire, catastrophe, combination of failures
Consequences for OLTP with ACID requirements
- Atomicity: no partial excution at CPU failures
- Consistency Preserving despite failures
- Isolation = > needs rollback
- ...Recall temporary update problem (Fig. 19.3c)
- Durability: no data loss from media failure

S3: Recovery Techniques

Media Failures: (CPU is alive, DBMS software keeps running)
- (i) Repeat operations for Transient Errors
- (ii) Replicate data to revoer from Hard Failures
- ...backups, RAID with varying replication level
- Avoid common mode of failures: disks, cities, ...
- ...Wells Fargo Bank: copies in Los Angeles and San Francisco
- ...19.7 Catastrophic failures
Fail-stop CPU failure: (lose program /main memory state)
- (i)In Persistent main memory (e.g. battery backup), Dual-CPU, ...
- ...= > If possible, move to a consistent state before shutdown
- (ii) See schemes for planned rollbacks
Planned rollbacks, e.g. due to CCTs (ch#20), Recover at restart
- ...= > Restore a past consistent state via backup + undo/redo using
- (i) Log operations on disk before carrying them out
- (ii) Isolation/redundant data-structure w/ both old + new state
- ...e.g. defer update past commit, shadow page

S4: 19.1 Recovery Concepts : Data-structures

Recovery= restore past state, reconstruct recent correct state
Recovery data-structure terminology
- System Log = history of values for in data-items
- After image= old value of data-item before update
- Before image= new value of data-item after update
- ...REDO-type log entry = < data-item, Tj, after image >
- ...UNDO-type log entry= < data-item, Ti, before image >
- General entry = < data-item, Ti, before image, after image >
- Ex. Fig. 19.1 (b) on pp 617
Checkpoints = > reduce Log size
- Periodically
- ...Stop accepting new Ti, Wait until active Tis finish
- ...Flush all log records in buffers to disk
- ...Flush all database records in buffers to disk
- ...Write "checkpoint" record on log on disk

S5: 19.1 Recovery Concepts : storage management

Memory Cache Flushing Terminology:
- Cache(Ti) = In-memory disk-pages updated by Ti
- Cache directory has < disk-page-id, memory page >
- Flush(buffer[j]): write buffer[j] to disk if (dirty bit = 1)
Write Ahead Logging = >
- record < data-tem, Ti, before image > in log
- and flush log-buffer to disk,
- before in-placeing update of data-item

S6: 19.1 Recovery Concepts : storage management

Buffering Policies : Steal/No-steal, Force/No-force
- Q? Can a buffer updated by Ti be flushed before commit(Ti)?
- ...Yes = Steal (before commit); No = No-steal
- Q? Can a buffer updated by Ti be flushed much after commit(Ti)?
- ...No = Force (at commit) ; Yes = No-Force
Disk block Update Terminology:
- Disk Location of Update
- ...In-place update = At flush, overwrite corresponding disk-page
- ...Shadowing = At flush, write buffer at a new location (2 copies)
- Time of Update
- ...Deferred Update = database update deferred till commit
- ...Immediate Update = database may be update before commit
- ...better name is anytime update!
Q? Does deferred update implies no-steal buffering policy?
Q? What does immediate update imply for buffering policy?
Q? Which buffering policy allows sharing across transactions?

S7: 19.1 Recovery Concepts : Rollbacks

Rollback/UNDO Ti = restore data-items changed by Ti
- ...to their previous values
- UNDO log entries can be used
- Cascaded rollbacks possible, if schedules are not strict
- ...= > Information about read(Ti, X) needed in system log
- ...No cascading = > read(Ti, X) not needed in log
Cascaded Rollback Example : Fig.19.1/pp 617
- Fig.19.1(a):Transactions; (b) log
- Note: log contains schedule (both read() and write())
- No transaction has committed = > all are rolled back
- Even if T1 and T2 had committed,
- ...rollback of T3 = > cascaded rollback of T2
- ...Test for cascade_less (17.4.2, pp. 565)

S8: Logging Rules

Write-ahead logging (WAL) logging
Redo logging rules
- For every action, generate redo log record
- Before X is modified on disk for actions of Ti,
- ...all log records (including commit) for Ti must be on disk
- Flush log records in buffer at commit
Redo/Undo logging rules
- Page X can be flushed before or after Ti commits
- Log records flushed before corresponding data page (WAL)
- Flush logs at commit
Deferred update = > assume WAL and redo logging rules
Immediate update = > assume WAL and redo/undo logging rules
Other: Undo logging rules
- For every action, generate undo log record
- Before X is modified on disk for actions of Ti,
- ...log records pertaining to X for Ti must be on disk
- Before commit is flushed to log,
- ...all writes of transaction must be reflected on disk

S9: 19.2.2 Recovery Assuming Deferred Update

Data-structure: Log has redo record < Ti, X, after_image >
- For every action, generate redo log record
- Before X is modified on disk for actions of Ti,
- ...all log records (including commit) for Ti must be on disk
- Flush log records in buffer at commit
Implementation Properties: (No-steal, No-Force)
- Defers actual updates to DB by Ti until after commit(Ti)
- During execution(Ti), updates recorded only in
- ...Log and memory buffer workspace(Ti)
- (19.2.3) Non-database actions: Spool and print only if Ti commits
Recoverable_write(Ti, X, bfim, afim)
- Write(log, memory buffer, < Ti, X, afim > )
- Write(data, memory buffer, X, afim)
Recoverable_commit(Ti)
- Flush(log) from memory buffers to disk
- Mark(data, memory buffers written by Ti)
Recoverable_abort(Ti)
- No action needed since Ti has not changed anything on disk
Database_Writer
- Flush "marked" memory buffers of committed transactions to disk
- Periodically, At checkpoints
Restart Algorithm RDU_M (pp 620)
- 0. Assume strict 2-phase locking (no cascaded rollback)
- 1. Restore database state to last checkpoint / backup
- 2. If commit(Tj) in Log = > REDO all write(Tj, X, ...) in Log order
- 3. Resubmit active Ti, which did not start commit before failure
Rationale
- commit(Tj) IN log = > Tj's completed but update may not be flushed
- commit(Tj) NOT IN log = > Tj did not update database
- ...No UNDO due to deferred update
Q? List actions of RDU_M for Fig. 19.4(pp 622), assume strict 2PL
Q? How will the recoverable_abort() and restart algorithm change if
- (a) recoverable_commit(Ti) flushes all data-items written by Ti.
- (b) the schedule is not strict

S10: 19.3.2 Recovery Based on Immediate (Anytime) Update

Data-structure: Log with undo/redo record < Ti, X, before, after >
- Page X can be flushed before or after Ti commits
- Log records flushed before corresponding data page (WAL)
- Flush logs at commit
Implementation Properties (Steal/ no-force)
- No cascaded rollback: strict 2-phase locking
- DB may be updated anytime after Ti write_item(X)
- ...UNDO needed if Ti fails/aborts or commit(Ti) not in Log
- DB updates may not be flushed when commit record goes to Log
- ...REDO is needed even if commit(Ti) is in Log
Recoverable_write(Ti, X, bfim, afim)
- Write(log, memory buffer, < Ti, X, bfim, afim > )
- Write(data, memory buffer, X, afim)
Recoverable_commit(Ti)
- Flush(log) from memory buffers to disk
- Mark(data, memory buffers written by Ti)
Recoverable_abort(Ti)
- Fetch log records of Ti
- Restore data-items modified by Ti to their before_images
Database_Writer
- Flush selected memory buffers to disk
- Periodically, At checkpoints,
- To free up buffers using a replacement policy
Restart Algorithm RIU_M (pp 623)
- 0. Assume strict schedule (no cascaded rollback)
- 1. CT = list of committed Ti ; AT = list of active Ti
- 2. For Ti in AT, UNDO writes of Ti in reverse Log order
- 3. For Tj in CT, REDO writes of Tj in Log order
Q? How will the restart algorithm change if
- (a) recoverable_commit(Ti) flushes all data-items written by Ti.
- (b) The schedule is not strict

S11: Exercises on Recovery Algorithms

Compare RIU_M and RDU_M for
- Memory buffer reuse interval
- Computation and I/O overheads
Can redo overheads be reduced using more log record types
- ...e.g. force_start(Ti), force_end(Ti)
- Justify using (revised) logging rules, examples, proofs.
Consider algorithm RDU_M (deferred update, strict schedule):
- Q?. Can deadlocks occur? (Yes?)
- Q? Do deadlocks affect recovery alg.?
- ...(No, as it only affects active Ti)
- Q?. How will recovery alg. change if 2PL is not strict?
- ...No change since = > cascaded rollback not possible!
Consider algorithm RIU_M (immediate update, strict schedule):
- Q?. Do undo/redo operations need concurrency control ?
- Q?. Can deadlocks occur?
- Q? Do deadlocks affect recovery alg.?
- ...Yes, Undo due to deadlock abort Ti = > cascade
Q? Develop Undo/No-Redo alg. assuming the following
- For every action, generate undo log record
- Before X is modified on disk for actions of Ti,
- ...log records pertaining to X for Ti must be on disk
- Before commit is flushed to log,
- ...all writes of transaction must be reflected on disk

S12: 19.4 Shadow Paging

Recovery without Logs for Files/Paged Structures
- Ex. Fig. 19.5 (pp 625)
Properties of Paged Structure (File)
- FileHeader block is always consistent (PT, datapages)
- Redundant data structures (both page table(PT), data pages)
- Shadow PT = old consistent state; current PT = > modified state
- At begin(Ti) = > copy shadow PT (disk) to current PT (memory)
- At write(X) = > copy page(X) to newpage(), update newpage(X)
- ...current PT entry[page(X)] = pointer to newpage(X)
- At commit(Ti) = > save current PT on disk;
- ...update fileHeader block to point to current PT;
- ...and garbage collect shadow PT
Failure Cases: before/after update of fileHeaderBlock
- No special Undo/Redo, just garbage collect
Pros: simple recovery ::
- Cons: garbage collection, destroys page-clusters

S13: 19.5 ARIES Recovery Algorithm

Will not be covered !

S14: Refinements - Performance Tuning

Issues: Conflicting goals
- Reduce restart time after a failure
- Reduce overheads during normal time
- Speed up commit, abort, etc.
Ideas to reduce restart time
- Reduce log-file length - Checkpoints
- Eliminate redos, e.g. deferred update (no-steal buffering)
- Eliminate undos, e.g. force buffering
- Eliminate undos and redos, e.g. shadow paging
Ideas to reduce overheads during normal times
- Reduce flush operations (and resulting disk I/Os)
- ...group-commits
- Speed-up processing of aborts
- ...strict schedules - avoid cascaded rollbacks
- ...separate undo log from redo log
- ...index undo log records by transaction-id
- ...deferred updates