S1: Distributed Databases
Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455.
S2: Distributed Databases - introduction
  • TODO: expand coverage of the Internet and mobile computing
  • Motivation
    • Distributed computing - LAN, WAN, internet
    • Opportunities to improve
    • ...Availability, Reliability via redundancy
    • ...Performance via division of work
  • Computing Environment (Fig. 25.1, pp. 806)
    • Criteria - relative homogeneity of nodes
    • ...Homogeneous or parallel: shared nothing / memory / disk
    • ...Heterogeneous: centralized, truly distributed
  • Challenges
    • Conceptual - schema heterogeneity, autonomy
    • Logical - transparency, SQL heterogeneity
    • Physical - replication, fragmentation, allocation
    • ...communication costs
    • OLTP - site and communication failures


S3: Conceptual DM: heterogeneity, autonomy
  • 25.3 Classification Criteria
    • A. Degree of homogeneity
    • ...Choices of Data models, constraints, query languages
    • ...Semantic - attributes of an entity, attribute names,
    • ...meaning of an attribute, isolation level choices, ...
    • B. Degree of local autonomy
    • ...Data model design autonomy
    • ...Communication autonomy
    • ...Execution autonomy
    • ...Association autonomy
  • 25.3 : Categories of Distributed Databases
    • A. Distribution Transparent - homogeneous, no local autonomy
    • ...looks like a single centralized DB to users (Fig. 25.1(b))
    • B. Multidatabase - no common schema
    • C. Federated - common schema, but local autonomy
    • ...hardware and software may be heterogeneous
  • Schema hierarchy in federated databases (Fig. 25.5 , pp. 819)
    • local, component (translation to canonical data model)
    • ...export (public view), federated (global view to DBAs)
    • ...external (view to user group)
  • Federated schema built with consensus
    • Data interchange standards complement federated schema
    • XML: syntax for federated schema and data in federated schema
    • ...has momentum with support of major IT vendors
    • ...e.g. MathML, OFX (financial data), etc.
    • ...Imagine TurboTax fetching your W-2s, 1099s, ...


S4: Conceptual DM: Schema Integration Example
  • Engineering Database - Relational Schema
    E(eno, ename, title), p.k. = eno
    J(jno, jname, budget, loc, cname), p.k. = jno
    G(eno, jno, resp, dur), p.k. = eno, jno
    S(title, sal), p.k. = title
  • Employee Database - CODASYL Schema
    Department(dept-name, budget, manager)
    Employee(e#, name, address, title, salary)
    Department Employs Employee (1:N relationship)
  • Database III - Entity Relationship Model
    • Entities
      Engineer(Engineer No, name, title, salary)
      Project(PNo, project name, budget, location)
      Client(Client Name, Address)
    • Relationships (N : 1)
      Engineer Works_In Project : (Responsibility, Duration)
      Project Contract_By Client : (Contract Date)
  • Q? Design an integrated schema for the common database.


S5: Conceptual DM: Schema Integration Example
  • One Solution (Entity Relationship Diagram)
    • Entities
      Department(dname, budget, manager)
      Employee(E#, name, title, address, salary)
      Engineer()
      Project(PNo, Pname, Budget, Location)
      Client(Client Name, Address)
    • Relationships
      Engineer() is a subtype of Employee
      Department Employs Employee (1:N)
      Engineer Works_In Project (M:N) : (Responsibility, Duration)
      Project Contracted by Clients (M:N) : (Contract Date)
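  • A rough Python sketch of this integrated schema (illustrative only; class and field names are assumptions, Engineer is modeled as a subtype of Employee, and the M:N relationships are association records):

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Department:
        dname: str
        budget: float
        manager: str

    @dataclass
    class Employee:
        eno: str                       # unifies E# / eno from the source schemas
        name: str
        title: str
        address: Optional[str] = None
        salary: Optional[float] = None
        department: Optional[Department] = None   # Department Employs Employee (1:N)

    @dataclass
    class Engineer(Employee):          # subtype of Employee, no extra attributes
        pass

    @dataclass
    class Project:
        pno: str
        pname: str
        budget: float
        location: str

    @dataclass
    class Client:
        client_name: str
        address: str

    @dataclass
    class WorksIn:                     # Engineer Works_In Project (M:N)
        engineer: Engineer
        project: Project
        responsibility: str
        duration: int

    @dataclass
    class ContractedBy:                # Project Contracted by Client (M:N)
        project: Project
        client: Client
        contract_date: str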


S6: Logical DM: links
  • Source: Section 25.7 (Oracle SQL)
  • Global Naming of data objects (e.g. tables):
    • <tablename>@<databasename>, e.g. emp@sales
  • Database Links for one way communication
    • database (e.g. sales) from domain (e.g. us.america)
      create database link sales.us.america;
  • Database Replication
    • Read-only : replicated master table
    • Read-Write : snapshot with periodic refresh
      create snapshot sales.orders
      as select * from sales.order@hq.us.americas;
  • Q? Do these facilities support the conceptual DM needs?
    • ...e.g. federated database management?


S7: Physical DM: storage
  • 25.2 Storage - Fragmentation, Allocation, Replication
    • Example - central data (Fig. 5.5-6, pp. 136-7)
    • ...distributed data (Fig. 25.3, 25.4, pp. 814-5)
  • Fragmentation of a table (see the Python sketch after this list)
    • Horizontal: divide rows (selection operation)
    • ...Fig. 25.7, pp. 823 using 2 sites
    • Vertical: divide columns (project operation)
    • Mixed: divide both rows and columns
  • Replication degrees
    • Non-redundant replication - each fragment at one site only
    • Full replication - all sites have all data
    • Selective replication - some fragments replicated at selected sites
    • Q? Determine replication degree for each fragment in example.
  • Allocation
    • Assign each copy of each fragment to a site
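  • A minimal Python sketch of fragmentation (relation, column, and site names are made up, not from the textbook): horizontal and vertical fragments, reconstruction by union and join, and a simple allocation map:

    # Hypothetical EMPLOYEE rows.
    employees = [
        {"ssn": "1", "fname": "John",   "lname": "Smith",  "salary": 30000, "dno": 5},
        {"ssn": "2", "fname": "Alicia", "lname": "Zelaya", "salary": 25000, "dno": 4},
    ]

    # Horizontal fragmentation: one selection condition (guard) per fragment.
    emp_d5 = [r for r in employees if r["dno"] == 5]   # guard: dno = 5
    emp_d4 = [r for r in employees if r["dno"] == 4]   # guard: dno = 4

    # Vertical fragmentation: projections that both keep the primary key (ssn).
    emp_names = [{"ssn": r["ssn"], "fname": r["fname"], "lname": r["lname"]} for r in employees]
    emp_pay   = [{"ssn": r["ssn"], "salary": r["salary"], "dno": r["dno"]}   for r in employees]

    # Reconstruction: union of horizontal fragments, join of vertical fragments.
    rebuilt_h = emp_d5 + emp_d4
    rebuilt_v = [{**n, **p} for n in emp_names for p in emp_pay if n["ssn"] == p["ssn"]]

    # Allocation with selective replication: each copy of each fragment gets a site.
    allocation = {"EMP_D5": ["site1"], "EMP_D4": ["site2"], "EMP_NAMES": ["site1", "site2"]}
    assert len(rebuilt_h) == len(rebuilt_v) == len(employees)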


S8: Physical DM: Query processing
  • Processing Query over distributed databases
    • 25.4.2 data transfer strategies
    • ...Communication cost in addition to CPU cost and I/O cost
    • 25.4.3 Query decomposition and allocation
  • Example (Fig. 25.6 pp. 820)
    • Employee table E at site 1, Department table D at site 2, query at site 3
      select fname, lname, dname from employee E, department D
      where E.dno = D.dnumber
  • Data transfer strategies - conventional
    • Send both tables to site 3
    • Send table E to site 2 and then result to site 3
    • Send table D to site 1 and then result to site 3
  • Semi-join data transfer strategy (sketched in code after this list)
    • Site 2: Send T1 = project(D, dnumber) to site 1
    • Site 1: T2 = project( join(T1, E), dnumber, fname, lname )
    • ...and send T2 to site 2
    • Site 2: T3 = join(T2, D)
    • ...Site 2: send final result = project(T3, fname, lname, dname) to site 3
  • semi-join(R, S, A=B) = project(columns of R, join(R, S, A=B) )
    • Q? Is semi-join(R, S, A=B) <> semi-join(S, R, A=B)?
    • When is semi-join strategy cheaper than conventional strategies?
    • ...tables are large but semi-join result is small
    • ...lots of irrelevant columns, very small join selectivity
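  • A small Python sketch of the semi-join strategy on toy data (rows and site assignments are illustrative, not the textbook's example):

    # Toy relations: employee at site 1, department at site 2.
    employee   = [{"fname": "John", "lname": "Smith", "dno": 5}]
    department = [{"dnumber": 5, "dname": "Research"}, {"dnumber": 1, "dname": "HQ"}]

    def project(rows, cols):
        return [{c: r[c] for c in cols} for r in rows]

    def join(r_rows, s_rows, r_col, s_col):
        return [{**r, **s} for r in r_rows for s in s_rows if r[r_col] == s[s_col]]

    # Semi-join transfer strategy for the query issued at site 3.
    t1 = project(department, ["dnumber"])                          # site 2 -> site 1
    t2 = project(join(t1, employee, "dnumber", "dno"),
                 ["dnumber", "fname", "lname"])                    # site 1 -> site 2
    t3 = join(t2, department, "dnumber", "dnumber")                # computed at site 2
    result = project(t3, ["fname", "lname", "dname"])              # site 2 -> site 3
    print(result)   # [{'fname': 'John', 'lname': 'Smith', 'dname': 'Research'}]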


S9: Physical DM: Query processing
  • 25.4.3 Query decomposition and allocation
    • Goal: support data distribution transparency
    • Task: divide query according to data fragmentation
  • Example
    • Horizontal data fragmentation - Fig. 25.7 pp. 823
    • Query Q (bottom of pp. 824)
    • decomposition of Q (pp. 824)
  • A Simple Method based on query rewriting (pruning step sketched after this list)
    • Assume: non-redundant replication
    • Assume: federated schema gives view definition for tables
    • ...it may involve joins among vertical fragments and
    • ...unions among horizontal fragments
    • 1. Rewrite query: replace tables by their view definition
    • ...New query has fragments in "from"/"where" clauses
    • 2. Prune fragment list in "from" clause
    • ...Omit fragment F if it cannot contribute tuples to the final result
    • ...Vertical F: columns(F) intersect columns(select clause) = empty
    • ...Horizontal F: contradiction(guard(F), selection_condition(Q))
    • 3. Generate execution plan for pruned query
    • ...Transfer strategy choices for each operation in query tree
    • ...choice 1: bring all fragments to one site
    • ...choice 2: semi-join transfer strategy for join operation
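  • A small Python sketch of the pruning step (fragment names, guards, and the query are illustrative; following the slide's rule, only select-clause columns are checked for vertical fragments):

    # Illustrative fragment catalog for a mixed-fragmented EMPLOYEE table.
    fragments = [
        {"name": "EMP_D5",    "kind": "horizontal", "guard": ("dno", "=", 5)},
        {"name": "EMP_D4",    "kind": "horizontal", "guard": ("dno", "=", 4)},
        {"name": "EMP_NAMES", "kind": "vertical",   "columns": {"ssn", "fname", "lname"}},
        {"name": "EMP_PAY",   "kind": "vertical",   "columns": {"ssn", "salary"}},
    ]

    # Query: SELECT fname, lname FROM EMPLOYEE WHERE dno = 5
    select_columns  = {"fname", "lname"}
    query_condition = ("dno", "=", 5)

    def contradicts(guard, cond):
        # Naive test: equality guards on the same column with different constants
        # can never both hold, so the fragment contributes no tuples.
        return guard[0] == cond[0] and guard[1] == cond[1] == "=" and guard[2] != cond[2]

    pruned = [f for f in fragments
              if not (f["kind"] == "horizontal" and contradicts(f["guard"], query_condition))
              and not (f["kind"] == "vertical" and not (f["columns"] & select_columns))]
    print([f["name"] for f in pruned])   # ['EMP_D5', 'EMP_NAMES']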


S10: OLTP : 25.5 Distributed Concurrency Control
  • Q? Identify new OLTP issues in
    • Distributed Locking
    • Distributed Deadlock Detection
  • Locking for replicated, fragmented records
    • How to manage locks for objects across sites?
    • Centralized - one site does all locking.
    • Primary Copy - locking of an object done at
    • ...site holding primary copy of the object
    • Fully distributed - locking for a copy done at
    • ...site holding specific copy of the object
    • ...Write_lock - must lock all copies
  • Deadlock Detection - Construction of wait-for graph
    • Each site maintains local wait-for graph
    • Global deadlock may occur even if every local graph is acyclic
    • Solutions:
    • ...Centralized - collect local WFGs at one site and test their union for cycles (sketched below)
    • ...Time-out - abort Xact if it waits too long for a lock
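  • A minimal Python sketch of centralized detection (transactions and local wait-for graphs are made up; each local graph is acyclic but their union has a cycle):

    # Local wait-for graphs: transaction -> set of transactions it waits for.
    local_wfg_site1 = {"T1": {"T2"}}          # T1 waits for T2 at site 1
    local_wfg_site2 = {"T2": {"T1"}}          # T2 waits for T1 at site 2

    def merge(*graphs):
        global_wfg = {}
        for g in graphs:
            for txn, waits_for in g.items():
                global_wfg.setdefault(txn, set()).update(waits_for)
        return global_wfg

    def has_cycle(graph):
        # Depth-first search with an explicit recursion stack.
        visited, on_stack = set(), set()
        def dfs(node):
            visited.add(node)
            on_stack.add(node)
            for nxt in graph.get(node, ()):
                if nxt in on_stack or (nxt not in visited and dfs(nxt)):
                    return True
            on_stack.discard(node)
            return False
        return any(dfs(n) for n in graph if n not in visited)

    print(has_cycle(merge(local_wfg_site1, local_wfg_site2)))   # True -> deadlock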


S11: OLTP : 25.5.3 Distributed Recovery
  • New Issues
    • New kinds of failures - link, remote sites
    • Distributed transactions - updates data at many sites
  • Assumptions
    • Logs are maintained at each site as before
    • New commit protocol for transaction T
    • Coordinator C = site where T originated
    • Subordinate sites Si = sites where T executes
    • ...S = {S1, S2, ... Si, ... }
  • Two phase commit protocol for commit(T) (coordinator side sketched after this list)
    • C sends prepare to each Si
    • Si : EITHER { force write prepare log record ; send yes to C}
    • ... OR { force write abort log record ; send no to C }
    • C : if all Si sent yes then
    • ...{ force write commit log record ; send commit to each Si }
    • ...else { force write abort log record ; send abort to each Si }
    • Si : if msg from C is commit then
    • ...{ force write commit log record ; send ack to C}
    • ... else { force write abort log record ; send ack to C }
    • C : if all Si sent ack then write end log record
  • Properties of Two phase commit protocol
    • 2 phases of communications : voting, termination
    • Any site can decide to abort a Xact in first phase
    • Every message reflects a decision by the sender
    • ...decision is recorded in log for fault-recovery
    • Si's 2PC log records have Xact-id and coordinator-id
    • C's 2PC log records have Xact-id and Si-ids for all Si
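  • A minimal coordinator-side sketch of 2PC in Python (the Site class, its vote()/deliver() methods, and the in-memory logs are illustrative stand-ins for real messages and force-written log records):

    class Site:
        def __init__(self, name, will_commit=True):
            self.name, self.will_commit, self.log = name, will_commit, []
        def vote(self, txn):                  # Phase I at subordinate Si
            self.log.append(("prepare" if self.will_commit else "abort", txn))
            return "yes" if self.will_commit else "no"
        def deliver(self, txn, decision):     # Phase II at subordinate Si
            self.log.append((decision, txn))
            return "ack"

    def two_phase_commit(txn, coordinator_log, subordinates):
        # Phase I (voting): send prepare to each Si and collect yes/no votes.
        votes = [s.vote(txn) for s in subordinates]
        decision = "commit" if all(v == "yes" for v in votes) else "abort"
        coordinator_log.append((decision, txn))        # force-write before sending
        # Phase II (termination): send decision, collect acks, then write end.
        acks = [s.deliver(txn, decision) for s in subordinates]
        if all(a == "ack" for a in acks):
            coordinator_log.append(("end", txn))
        return decision

    log = []
    print(two_phase_commit("T1", log, [Site("S1"), Site("S2", will_commit=False)]))
    # 'abort' because S2 votes no; log = [('abort', 'T1'), ('end', 'T1')]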


S12: OLTP : Distributed Recovery
  • Q? What happens for following failures:
    • A. if C fails during 2PC
    • B. if Si or link to Si fails during 2PC?
    • C. if C fails between sending prepare
    • ...and writing commit/abort log
    • A. Transaction T is blocked waiting for C
    • B. Phase I : C can abort T, Phase II: depends !
    • C. C aborts T during restart
  • Recovery during restart after a failure at Si (decision logic sketched after this list)
    • If log has prepare(T) but not commit(T)/abort(T)
    • ...contact C to find status of T,
    • ...write commit/abort log record, redo/undo T
    • ...write end(T) log record
    • If log has commit(T)/abort(T) but not end(T)
    • ...redo/undo T based on the logged commit/abort decision
    • ...(re)send ack to C
  • Recovery during restart after a failure at C
    • If log has no prepare(T) =>
    • ...abort T and undo its effect
    • If log has prepare(T) but not commit(T)/abort(T)
    • ...can abort T, send abort(T) to Si
    • If log has commit(T)/abort(T) but not end(T)
    • ...(re)send commit/abort msgs to Si to get acks
    • ...redo/undo T based on the logged commit/abort decision
    • ...watch for message from Si about T
  • Properties of the recovery algorithm
    • Ack messages let C know when it can forget T
    • ...C must track T until all acks are received
    • If T does not update data at Si
    • ...then its commit/abort status is irrelevant
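  • An illustrative Python sketch of the restart decision at a subordinate Si (the log is a plain list of (record, txn) pairs and ask_coordinator() is a made-up stand-in for contacting C):

    def recover_subordinate(log, txn, ask_coordinator):
        records = {rec for rec, t in log if t == txn}
        if "end" in records:
            return "nothing to do"
        if "commit" in records or "abort" in records:
            decision = "commit" if "commit" in records else "abort"
            # Redo (commit) or undo (abort) T, then resend the ack and write end(T).
            return f"{decision}: redo/undo T, resend ack, write end({txn})"
        if "prepare" in records:
            decision = ask_coordinator(txn)   # blocked until C is reachable
            return f"{decision}: write {decision}({txn}), redo/undo T, write end({txn})"
        return f"abort: {txn} never voted yes, undo its effects"

    log = [("prepare", "T1")]
    print(recover_subordinate(log, "T1", ask_coordinator=lambda t: "commit"))
    # commit: write commit(T1), redo/undo T, write end(T1)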

