S1: Distributed Databases
Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455.
S2: Distributed Databases - introduction
  • TODO: expand coverage of the Internet and mobile computing
  • Motivation
    • Distributed computing - LAN, WAN, internet
    • Opportunities to improve
    • ...Availability, Reliability via redundancy
    • ...Performance via division of work
  • Computing Environment (Fig. 25.1, pp. 806)
    • Criteria - relative homogeneity of nodes
    • ...Homogeneous or parallel: shared nothing / memory / disk
    • ...Heterogeneous: centralized, truly distributed
  • Challenges
    • Conceptual - schema heterogeneity, autonomy
    • Logical - transparency, SQL heterogeneity
    • Physical - replication, fragmentation, allocation
    • ...communication costs
    • OLTP - site and communication failures


S3: Conceptual DM: heterogeneity, autonomy
  • 25.3 Classification Criteria
    • A. Degree of homogeneity
    • ...Choices of Data models, constraints, query languages
    • ...Semantic - attributes of an entity, attribute names,
    • ...meaning of an attribute, isolation level choices, ...
    • B. Degree of local autonomy
    • ...Data model design autonomy
    • ...Communication autonomy
    • ...Execution autonomy
    • ...Association autonomy
  • 25.3 : Categories of Distributed Databases
    • A. Distribution Transparent - homogeneous, no local autonomy
    • ...looks like a single centralized DB to users (Fig. 25.1(b))
    • B. Multidatabase - no common schema
    • C. Federated - common schema, but local autonomy
    • ...hardware and software may be heterogeneous
  • Schema hierarchy in federated databases (Fig. 25.5 , pp. 819)
    • local, component (translation to canonical data model)
    • ...export (public view), federated (global view to DBAs)
    • ...external (view to user group)
  • Federated schema built with consensus
    • Data interchange standards complement federated schema
    • XML: syntax for federated schema and data in federated schema
    • ...has momentum with support of major IT vendors
    • ...e.g. MathML, OFX (financial data), etc.
    • ...Imagine TurboTax fetching your W-2s, 1099s, ...


S4: Conceptual DM: Schema Integration Example
  • Engineering Database - Relational Schema
    E(eno, ename, title), p.k. = eno
    J(jno, jname, budget, loc, cname), p.k. = jno
    G(eno, jno, resp, dur), p.k. = eno, jno
    S(title, sal), p.k. = title
  • Employee Database - CODASYL Schema
    Department(dept-name, budget, manager)
    Employee(e#, name, address, title, salary)
    Department Employs Employee (1:N relationship)
  • Database III - Entity Relationship Model
    • Entities
      Engineer(Engineer No, name, title, salary)
      Project(PNo, project name, budget, location)
      Client(Client Name, Address)
    • Relationships (N : 1)
      Engineer Works_In Project : (Responsibility, Duration)
      Project Contract_By Client : (Contract Date)
  • Q? Design an integrated schema for the common database.


S5: Conceptual DM: Schema Integration Example
  • One Solution (Entity Relationship Diagram)
    • Entities
      Department(dname, budget, manager)
      Employee(E#, name, title, address, salary)
      Engineer()
      Project(PNo, Pname, Budget, Location)
      Client(Client Name, Address)
    • Relationships
      Engineer() is a subtype of Employee
      Department Employs Employee (1:N)
      Engineer Works_In Project (M:N) : (Responsibility, Duration)
      Project Contracted by Clients (M:N) : (Contract Date)
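  • A rough Python sketch of this integrated schema (illustrative only; class and field names are assumptions, Engineer is modeled as a subtype of Employee, and the M:N relationships are association records):

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Department:
        dname: str
        budget: float
        manager: str

    @dataclass
    class Employee:
        eno: str                       # unifies E# / eno from the source schemas
        name: str
        title: str
        address: Optional[str] = None
        salary: Optional[float] = None
        department: Optional[Department] = None   # Department Employs Employee (1:N)

    @dataclass
    class Engineer(Employee):          # subtype of Employee, no extra attributes
        pass

    @dataclass
    class Project:
        pno: str
        pname: str
        budget: float
        location: str

    @dataclass
    class Client:
        client_name: str
        address: str

    @dataclass
    class WorksIn:                     # Engineer Works_In Project (M:N)
        engineer: Engineer
        project: Project
        responsibility: str
        duration: int

    @dataclass
    class ContractedBy:                # Project Contracted by Client (M:N)
        project: Project
        client: Client
        contract_date: str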


S6: Logical DM: links
  • Source: Section 25.7 (Oracle SQL)
  • Global Naming of data objects (e.g. tables):
    • <tablename>@<databasename>, e.g. emp@sales
  • Database Links for one way communication
    • database (e.g. sales) from domain (e.g. us.america)
      create database link sales.us.america;
  • Database Replication
    • Read-only : replicated master table
    • Read-Write : snapshot with periodic refresh
      create snapshot sales.orders
      as select * from sales.order@hq.us.americas;
  • Q? Do these facilities support the conceptual DM needs?
    • ...e.g. federated database management?


S7: Physical DM: storage
  • 25.2 Storage - Fragmentation, Allocation, Replication
    • Example - central data (Fig. 5.5-6, pp. 136-7)
    • ...distributed data (Fig. 25.3, 25.4, pp. 814-5)
  • Fragmentation of a table (see the Python sketch after this list)
    • Horizontal: divide rows (selection operation)
    • ...Fig. 25.7, pp. 823 using 2 sites
    • Vertical: divide columns (project operation)
    • Mixed: divide both rows and columns
  • Replication degrees
    • Non-redundant replication - each fragment at one site only
    • Full replication - all sites have all data
    • Selective replication - some fragments replicated at selected sites
    • Q? Determine replication degree for each fragment in example.
  • Allocation
    • Assign each copy of each fragment to a site
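  • A minimal Python sketch of fragmentation (relation, column, and site names are made up, not from the textbook): horizontal and vertical fragments, reconstruction by union and join, and a simple allocation map:

    # Hypothetical EMPLOYEE rows.
    employees = [
        {"ssn": "1", "fname": "John",   "lname": "Smith",  "salary": 30000, "dno": 5},
        {"ssn": "2", "fname": "Alicia", "lname": "Zelaya", "salary": 25000, "dno": 4},
    ]

    # Horizontal fragmentation: one selection condition (guard) per fragment.
    emp_d5 = [r for r in employees if r["dno"] == 5]   # guard: dno = 5
    emp_d4 = [r for r in employees if r["dno"] == 4]   # guard: dno = 4

    # Vertical fragmentation: projections that both keep the primary key (ssn).
    emp_names = [{"ssn": r["ssn"], "fname": r["fname"], "lname": r["lname"]} for r in employees]
    emp_pay   = [{"ssn": r["ssn"], "salary": r["salary"], "dno": r["dno"]}   for r in employees]

    # Reconstruction: union of horizontal fragments, join of vertical fragments.
    rebuilt_h = emp_d5 + emp_d4
    rebuilt_v = [{**n, **p} for n in emp_names for p in emp_pay if n["ssn"] == p["ssn"]]

    # Allocation with selective replication: each copy of each fragment gets a site.
    allocation = {"EMP_D5": ["site1"], "EMP_D4": ["site2"], "EMP_NAMES": ["site1", "site2"]}
    assert len(rebuilt_h) == len(rebuilt_v) == len(employees)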


S8: Physical DM: Query processing
  • Processing Query over distributed databases
    • 25.4.2 data transfer strategies
    • ...Communication cost in addition to CPU cost and I/O cost
    • 25.4.3 Query decomposition and allocation
  • Example (Fig. 25.6 pp. 820)
    • Employee table E at site 1, Department table D at site 2, query at site 3
      select fname, lname, dname from employee E, department D
      where E.dno = D.dnumber
  • Data transfer strategies - conventional
    • Send both tables to site 3
    • Send table E to site 2 and then result to site 3
    • Send table D to site 1 and then result to site 3
  • Semi-join data transfer strategy (sketched in code after this list)
    • Site 2: Send T1 = project(D, dnumber) to site 1
    • Site 1: T2 = project( join(T1, E), dnumber, fname, lname )
    • ...and send T2 to site 2
    • Site 2: T3 = join(T2, D)
    • ...Site 2: send final result = project(T3, fname, lname, dname) to site 3
  • semi-join(R, S, A=B) = project(columns of R, join(R, S, A=B) )
    • Q? Is semi-join(R, S, A=B) <> semi-join(S, R, A=B)?
    • When is semi-join strategy cheaper than conventional strategies?
    • ...tables are large but semi-join result is small
    • ...lots of irrelevant columns, very small join selectivity
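  • A small Python sketch of the semi-join strategy on toy data (rows and site assignments are illustrative, not the textbook's example):

    # Toy relations: employee at site 1, department at site 2.
    employee   = [{"fname": "John", "lname": "Smith", "dno": 5}]
    department = [{"dnumber": 5, "dname": "Research"}, {"dnumber": 1, "dname": "HQ"}]

    def project(rows, cols):
        return [{c: r[c] for c in cols} for r in rows]

    def join(r_rows, s_rows, r_col, s_col):
        return [{**r, **s} for r in r_rows for s in s_rows if r[r_col] == s[s_col]]

    # Semi-join transfer strategy for the query issued at site 3.
    t1 = project(department, ["dnumber"])                          # site 2 -> site 1
    t2 = project(join(t1, employee, "dnumber", "dno"),
                 ["dnumber", "fname", "lname"])                    # site 1 -> site 2
    t3 = join(t2, department, "dnumber", "dnumber")                # computed at site 2
    result = project(t3, ["fname", "lname", "dname"])              # site 2 -> site 3
    print(result)   # [{'fname': 'John', 'lname': 'Smith', 'dname': 'Research'}]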


S9: Physical DM: Query processing
  • 25.4.3 Query decomposition and allocation
    • Goal: support data distribution transparency
    • Task: divide query according to data fragmentation
  • Example
    • Horizontal data fragmentation - Fig. 25.7 pp. 823
    • Query Q (bottom of pp. 824)
    • decomposition of Q (pp. 824)
  • A Simple Method based on query rewriting (pruning step sketched after this list)
    • Assume: non-redundant replication
    • Assume: federated schema gives view definition for tables
    • ...it may involve joins among vertical fragments and
    • ...unions among horizontal fragments
    • 1. Rewrite query: replace tables by their view definition
    • ...New query has fragments in "from"/"where" clauses
    • 2. Prune fragment list in "from" clause
    • ...Omit fragment F if it cannot contribute tuples to the final result
    • ...Vertical F: columns(F) intersect columns(select clause) = empty
    • ...Horizontal F: contradiction(guard(F), selection_condition(Q))
    • 3. Generate execution plan for pruned query
    • ...Transfer strategy choices for each operation in query tree
    • ...choice 1: bring all fragments to one site
    • ...choice 2: semi-join transfer strategy for join operation
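  • A small Python sketch of the pruning step (fragment names, guards, and the query are illustrative; following the slide's rule, only select-clause columns are checked for vertical fragments):

    # Illustrative fragment catalog for a mixed-fragmented EMPLOYEE table.
    fragments = [
        {"name": "EMP_D5",    "kind": "horizontal", "guard": ("dno", "=", 5)},
        {"name": "EMP_D4",    "kind": "horizontal", "guard": ("dno", "=", 4)},
        {"name": "EMP_NAMES", "kind": "vertical",   "columns": {"ssn", "fname", "lname"}},
        {"name": "EMP_PAY",   "kind": "vertical",   "columns": {"ssn", "salary"}},
    ]

    # Query: SELECT fname, lname FROM EMPLOYEE WHERE dno = 5
    select_columns  = {"fname", "lname"}
    query_condition = ("dno", "=", 5)

    def contradicts(guard, cond):
        # Naive test: equality guards on the same column with different constants
        # can never both hold, so the fragment contributes no tuples.
        return guard[0] == cond[0] and guard[1] == cond[1] == "=" and guard[2] != cond[2]

    pruned = [f for f in fragments
              if not (f["kind"] == "horizontal" and contradicts(f["guard"], query_condition))
              and not (f["kind"] == "vertical" and not (f["columns"] & select_columns))]
    print([f["name"] for f in pruned])   # ['EMP_D5', 'EMP_NAMES']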


S10: OLTP : 25.5 Distributed Concurrency Control
  • Q? Identify new OLTP issues in
    • Distributed Locking
    • Distributed Deadlock Detection
  • Locking for replicated, fragmented records
    • How to manage locks for objects across sites?
    • Centralized - one site does all locking.
    • Primary Copy - locking of an object done at
    • ...site holding primary copy of the object
    • Fully distributed - locking for a copy done at
    • ...site holding specific copy of the object
    • ...Write_lock - must lock all copies
  • Deadlock Detection - Construction of wait-for graph
    • Each site maintains local wait-for graph
    • Global deadlock may occur even if every local graph is acyclic
    • Solutions:
    • ...Centralized - collect local WFGs at one site and test their union for cycles (sketched below)
    • ...Time-out - abort Xact if it waits too long for a lock
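  • A minimal Python sketch of centralized detection (transactions and local wait-for graphs are made up; each local graph is acyclic but their union has a cycle):

    # Local wait-for graphs: transaction -> set of transactions it waits for.
    local_wfg_site1 = {"T1": {"T2"}}          # T1 waits for T2 at site 1
    local_wfg_site2 = {"T2": {"T1"}}          # T2 waits for T1 at site 2

    def merge(*graphs):
        global_wfg = {}
        for g in graphs:
            for txn, waits_for in g.items():
                global_wfg.setdefault(txn, set()).update(waits_for)
        return global_wfg

    def has_cycle(graph):
        # Depth-first search with an explicit recursion stack.
        visited, on_stack = set(), set()
        def dfs(node):
            visited.add(node)
            on_stack.add(node)
            for nxt in graph.get(node, ()):
                if nxt in on_stack or (nxt not in visited and dfs(nxt)):
                    return True
            on_stack.discard(node)
            return False
        return any(dfs(n) for n in graph if n not in visited)

    print(has_cycle(merge(local_wfg_site1, local_wfg_site2)))   # True -> deadlock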


S11: OLTP : 25.5.3 Distributed Recovery
  • New Issues
    • New kinds of failures - link, remote sites
    • Distributed transactions - updates data at many sites
  • Assumptions
    • Logs are maintained at each site as before
    • New commit protocol for transaction T
    • Coordinator C = site where T originated
    • Subordinate sites Si = sites where T executes
    • ...S = {S1, S2, ... Si, ... }
  • Two phase commit protocol for commit(T) (coordinator side sketched after this list)
    • C sends prepare to each Si
    • Si : EITHER { force write prepare log record ; send yes to C}
    • ... OR { force write abort log record ; send no to C }
    • C : if all Si sent yes then
    • ...{ force write commit log record ; send commit to each Si }
    • ...else { force write abort log record ; send abort to each Si }
    • Si : if msg from C is commit then
    • ...{ force write commit log record ; send ack to C}
    • ... else { force write abort log record ; send ack to C }
    • C : if all Si sent ack then write end log record
  • Properties of Two phase commit protocol
    • 2 phases of communications : voting, termination
    • Any site can decide to abort a Xact in first phase
    • Every message reflects a decision by the sender
    • ...decision is recorded in log for fault-recovery
    • Si's 2PC log records have Xact-id and coordinator-id
    • C's 2PC log records have Xact-id and Si-ids for all Si
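  • A minimal coordinator-side sketch of 2PC in Python (the Site class, its vote()/deliver() methods, and the in-memory logs are illustrative stand-ins for real messages and force-written log records):

    class Site:
        def __init__(self, name, will_commit=True):
            self.name, self.will_commit, self.log = name, will_commit, []
        def vote(self, txn):                  # Phase I at subordinate Si
            self.log.append(("prepare" if self.will_commit else "abort", txn))
            return "yes" if self.will_commit else "no"
        def deliver(self, txn, decision):     # Phase II at subordinate Si
            self.log.append((decision, txn))
            return "ack"

    def two_phase_commit(txn, coordinator_log, subordinates):
        # Phase I (voting): send prepare to each Si and collect yes/no votes.
        votes = [s.vote(txn) for s in subordinates]
        decision = "commit" if all(v == "yes" for v in votes) else "abort"
        coordinator_log.append((decision, txn))        # force-write before sending
        # Phase II (termination): send decision, collect acks, then write end.
        acks = [s.deliver(txn, decision) for s in subordinates]
        if all(a == "ack" for a in acks):
            coordinator_log.append(("end", txn))
        return decision

    log = []
    print(two_phase_commit("T1", log, [Site("S1"), Site("S2", will_commit=False)]))
    # 'abort' because S2 votes no; log = [('abort', 'T1'), ('end', 'T1')]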


S12: OLTP : Distributed Recovery
  • Q? What happens for following failures:
    • A. if C fails during 2PC
    • B. if Si or link to Si fails during 2PC?
    • C. if C fails between sending prepare
    • ...and writing commit/abort log
    • A. Transaction T is blocked waiting for C
    • B. Phase I : C can abort T, Phase II: depends !
    • C. C aborts T during restart
  • Recovery during restart after a failure at Si (decision logic sketched after this list)
    • If log has prepare(T) but not commit(T)/abort(T)
    • ...contact C to find status of T,
    • ...write commit/abort log record, redo/undo T
    • ...write end(T) log record
    • If log has commit(T)/abort(T) but not end(T)
    • ...redo/undo T based on the logged commit/abort decision
    • ...(re)send ack to C
  • Recovery during restart after a failure at C
    • If log has no prepare(T) =>
    • ...abort T and undo its effect
    • If log has prepare(T) but not commit(T)/abort(T)
    • ...can abort T, send abort(T) to Si
    • If log has commit(T)/abort(T) but not end(T)
    • ...(re)send commit/abort msgs to Si to get acks
    • ...redo/undo T based on the logged commit/abort decision
    • ...watch for message from Si about T
  • Properties of the recovery algorithm
    • Ack messages let C know when it can forget T
    • ...C must track T until all acks are received
    • If T does not update data at Si
    • ...then its commit/abort status is irrelevant
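  • An illustrative Python sketch of the restart decision at a subordinate Si (the log is a plain list of (record, txn) pairs and ask_coordinator() is a made-up stand-in for contacting C):

    def recover_subordinate(log, txn, ask_coordinator):
        records = {rec for rec, t in log if t == txn}
        if "end" in records:
            return "nothing to do"
        if "commit" in records or "abort" in records:
            decision = "commit" if "commit" in records else "abort"
            # Redo (commit) or undo (abort) T, then resend the ack and write end(T).
            return f"{decision}: redo/undo T, resend ack, write end({txn})"
        if "prepare" in records:
            decision = ask_coordinator(txn)   # blocked until C is reachable
            return f"{decision}: write {decision}({txn}), redo/undo T, write end({txn})"
        return f"abort: {txn} never voted yes, undo its effects"

    log = [("prepare", "T1")]
    print(recover_subordinate(log, "T1", ask_coordinator=lambda t: "commit"))
    # commit: write commit(T1), redo/undo T, write end(T1)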

