S1: Distributed Databases
- Learning Goals:
- Understand Distributed Databases
- New environment, new challenges
- Solutions: New features in database server
- Outline
- Distributed Databases - motivation, challenges,
- Conceptual DM: heterogeneity, autonomy
- Logical DM: links
- Physical DM: storage, query processing
- OLTP : recovery, concurrency control
- Readings
Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455.
S2: Distributed Databases - introduction
- Motivation
- Distributed computing - LAN, WAN, internet
- Opportunities to improve
- ...Availability, Reliability via redundancy
- ...Performance via division of work
- Computing Environment (Fig. 25.1, pp. 806)
- Criteria - relative homogeneity of nodes
- ...Homogeneous or parallel: shared nothing / memory /disk
- ...Heterogeneous: centralized, truly distributed
- Challenges
- Conceptual - schema heterogeneity, autonomy
- Logical - transparency, SQL heterogeneity
- Physical - replication, fragmentation, allocation
- ...communication costs
- OLTP - site and communication failures
S3: Conceptual DM: heterogeneity, autonomy
- 25.3 Classification Criteria
- A. Degree of homogeneity
- ...Choices of Data models, constraints, query languages
- ...Semantic - attributes of an entity, attribute names,
- ...meaning of an attribute, isolation level choices, ...
- B. Degree of local autonomy
- ...Data model design autonomy
- ...Communication autonomy
- ...Execution autonomy
- ...Association autonomy
- 25.3 : Categories of Distributed Databases
- A. Distribution Transparent - homogeneous, no local autonomy
- ...looks like a single centralized DB to users (Fig. 25.1(b))
- B. Multidatabase - no common schema
- C. Federated - common schema, but local autonomy
- ...hardware and software may be heterogeneous
- Schema hierarchy in federated databases (Fig. 25.5 , pp. 819)
- local, component (translation to canonical data model)
- ...export (public view), federated (global view to DBAs)
- ...external (view to user group)
- Federated schema built with consensus
- Data interchange standards complement federated schema
- XML: syntax for federated schema and data in federated schema
- ...has momentum with support of major IT vendors
- ...e.g. MathML, OFX (financial data), etc.
- ...Imagine TurboTax fetching your W-2s, 1099s, ...
S4: Conceptual DM: Schema Integration Example
- Engineering Database - Relational Schema
E(eno, ename, title), p.k. = eno
J(jno, jname, budget, loc, cname), p.k. = jno
G(eno, jno, resp, dur), p.k. = eno, jno
S(title, sal), p.k. = title
- Employee Database - CODASYL Schema
Department(dept-name, budget, manager)
Employee(e#, name, address, title, salary)
Department Employs Employee (1:N relationship)
- Database III - Entity Relationship Model
- Entities
Engineer(Engineer No, name, title, salary)
Project(PNo, project name, budget, location)
Client(Client Name, Address)
- Relationships (N : 1)
Engineer Works_In Project : (Responsibility, Duration)
Project Contract_By Client : (Contract Date)
- Q? Design an integrated schema for the common database.
S5: Conceptual DM: Schema Integration Example
- One Solution (Entity Relationship Diagram)
- Entities
Department(dname, budget, manager)
Employee(E#, name, title, address, salary)
Engineer()
Project(PNo, Pname, Budget, Location)
Client(Client Name, Address)
- Relationships
Engineer() is a subtype of Employee
Department Employs Employee (1:N)
Engineer Works_In Project (M:N) : (Responsibility, Duration)
Project Contracted by Clients (M:N) : (Contract Date)
S6: Logical DM: links
- Source: Section 25.7 (Oracle SQL)
- Global Naming of data objects (e.g. tables):
- < tablename >@< databasename > , e.g. emp@sales
- Database Links for one way communication
- database (e.g. sales) from domain (e.g. us.america)
create database link sales.us.america;
- Database Replication
- Read-only : replicated master table
- Read-Write : snapshot with periodic refresh
create snapshot sales.orders
as select * from sales.order@hq.us.americas;
- Q? Do these facilities support conceptual DM needs?
- ...e.g. federated database management ?
S7: Physical DM: storage
- 25.2 Storage - Fragmentation, Allocation, Replication
- Example - central data (Fig. 5.5-6, pp. 136-7)
- ...distributed data (Fig. 25.3, 25.4, pp. 814-5)
- Fragmentation of a table
- Horizontal: divide rows (selection operation)
- ...Fig. 25.7, pp. 823 using 2 sites
- Vertical: divide columns (project operation)
- Mixed: divide both rows and columns
- Replication degrees
- Non-redundant replication - each fragment at one site only
- Full replication - all sites have all data
- Selective replication - some fragments replicated at a subset of sites
- Q? Determine replication degree for each fragment in example.
- Allocation
- Assign each copy of each fragment to a site
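The fragmentation operations above can be sketched in plain Python; the table and column names here are made up for illustration, not taken from the figures:

```python
# Sketch of table fragmentation: horizontal fragments via a selection
# predicate, vertical fragments via projection (keeping the key).
EMP = [
    {"eno": 1, "ename": "Ann",  "title": "Engineer", "loc": "MPLS"},
    {"eno": 2, "ename": "Bob",  "title": "Manager",  "loc": "NYC"},
    {"eno": 3, "ename": "Carl", "title": "Engineer", "loc": "NYC"},
]

def horizontal(table, guard):
    """Horizontal fragment: select the rows satisfying the guard predicate."""
    return [row for row in table if guard(row)]

def vertical(table, cols):
    """Vertical fragment: project columns, keeping the key (eno) for rejoining."""
    keep = {"eno"} | set(cols)
    return [{c: row[c] for c in keep} for row in table]

site1 = horizontal(EMP, lambda r: r["loc"] == "MPLS")   # allocated to site 1
site2 = horizontal(EMP, lambda r: r["loc"] == "NYC")    # allocated to site 2
names = vertical(EMP, ["ename"])                        # eno + ename only

# Non-redundant replication: each row is in exactly one horizontal
# fragment, so their union reconstructs EMP.
assert sorted(site1 + site2, key=lambda r: r["eno"]) == EMP
```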
S8: Physical DM: Query processing
- Processing Query over distributed databases
- 25.4.2 data transfer strategies
- ...Communication cost besides CPU cost, I/O cost
- 25.4.3 Query decomposition and allocation
- Example (Fig. 25.6 pp. 820)
- Data at site 1 and site 2, Query at site 3
select fname, lname, dname from employee E, department D
where E.dno = D.dnumber
- Data transfer strategies - conventional
- Send both tables to site 3
- Send table E to site 2 and then result to site 3
- Send table D to site 1 and then result to site 3
- Semi-join data transfer strategy -
- Site 2: Send T1 = project(D, dnumber) to site 1
- Site 1: T2 = project( join(T1, E), dnumber, fname, lname)
- ...and send T2 back to site 2
- Site 2: T3 = join(T2, D)
- ...Site 2: send final result = project(T3, ...) to site 3
- semi-join(R, S, A=B) = project(columns of R, join(R, S, A=B) )
- Q? Is semi-join(R, S, A=B) < > semi-join(S, R, A=B)?
- When is semi-join strategy cheaper than conventional strategies?
- ...tables are large but semi-join result is small
- ...lots of irrelevant columns, very small join selectivity
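The semi-join steps can be traced with toy data; the sample rows below are invented, only the query (fname, lname, dname over E.dno = D.dnumber) comes from the slide:

```python
# Semi-join transfer strategy: ship only the join column of D to site 1,
# join there, ship the small result back, finish the join at site 2.
EMPLOYEE = [  # stored at site 1
    {"fname": "J", "lname": "Smith", "dno": 5},
    {"fname": "A", "lname": "Wong",  "dno": 4},
]
DEPARTMENT = [  # stored at site 2
    {"dnumber": 5, "dname": "Research"},
    {"dnumber": 4, "dname": "Admin"},
    {"dnumber": 1, "dname": "HQ"},
]

def project(rows, cols):
    """Relational projection with duplicate elimination."""
    seen = {tuple(r[c] for c in cols) for r in rows}
    return [dict(zip(cols, t)) for t in sorted(seen)]

# Site 2 -> site 1: only dnumber values, not the whole DEPARTMENT table.
T1 = project(DEPARTMENT, ["dnumber"])

# Site 1: join E with T1, keep only the columns needed later; send to site 2.
T2 = [{"fname": e["fname"], "lname": e["lname"], "dnumber": e["dno"]}
      for e in EMPLOYEE for d in T1 if e["dno"] == d["dnumber"]]

# Site 2: join T2 with D, project, and ship the final result to site 3.
T3 = [{**t, "dname": d["dname"]}
      for t in T2 for d in DEPARTMENT if t["dnumber"] == d["dnumber"]]
result = project(T3, ["fname", "lname", "dname"])
```

Only T1 (one column) and T2 (matching tuples) cross the network, which is why the strategy wins when the join result is small relative to the tables.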
S9: Physical DM: Query processing
- 25.4.3 Query decomposition and allocation
- Goal: support data distribution transparency
- Task: divide query according to data fragmentation
- Example
- Horizontal data fragmentation - Fig. 25.7 pp. 823
- Query Q (bottom of pp. 824)
- decomposition of Q (pp. 824)
- A Simple Method based on query rewriting
- Assume: non-redundant replication
- Assume: federated schema gives view definition for tables
- ...it may involve joins among vertical fragments and
- ...unions among horizontal fragments
- 1. Rewrite query: replace tables by their view definition
- ...New query has fragments in "from"/"where" clauses
- 2. Prune fragment list in "from" clause
- ...Omit fragment F if it cannot contribute tuples to final result
- ...Vertical F: columns(F) intersect column(select clause) = null
- ...Horizontal F: contradiction(guard(F), selection_condition(Q))
- 3. Generate execution plan for pruned query
- ...Transfer strategy choices for each operation in query tree
- ...choice 1: bring all fragments to one site
- ...choice 2: semi-join transfer strategy for join operation
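Step 2 (pruning horizontal fragments) can be sketched as follows; modeling each guard and the query's selection condition as a range on one attribute is an assumption made here for illustration:

```python
# Prune a horizontal fragment when its guard contradicts the query's
# selection condition, i.e. no tuple can satisfy both.
# Guards and the condition are modeled as (lo, hi) ranges on dno.
fragments = {
    "EMP_1": (1, 3),    # guard: 1 <= dno <= 3, stored at site 1
    "EMP_2": (4, 9),    # guard: 4 <= dno <= 9, stored at site 2
}
query_cond = (5, 7)     # query selection: 5 <= dno <= 7

def contradicts(guard, cond):
    """Two ranges contradict exactly when they do not overlap."""
    (g_lo, g_hi), (c_lo, c_hi) = guard, cond
    return g_hi < c_lo or c_hi < g_lo

# Keep only fragments that can contribute tuples to the final result.
pruned = [name for name, g in fragments.items()
          if not contradicts(g, query_cond)]
# EMP_1's guard contradicts the condition, so it drops out of "from".
```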
S10: OLTP : 25.5 Distributed Concurrency Control
- Q? Identify new OLTP issues in
- Distributed Locking
- Distributed Deadlock Detection
- Locking for replicated, fragmented records
- How to manage locks for objects across sites?
- Centralized - one site does all locking.
- Primary Copy - locking of an object done at
- ...site holding primary copy of the object
- Fully distributed - locking for a copy done at
- ...site holding specific copy of the object
- ...Write_lock - must lock all copies
- Deadlock Detection - Construction of wait-for graph
- Each site maintains local wait-for graph
- Global deadlock may occur even if all local graphs are acyclic
- Solutions:
- ...Centralized - collect local WFGs at one site
- ...Time-out - abort Xact if it waits too long for a lock
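The centralized solution can be sketched as a cycle check on the union of local wait-for graphs; the site and transaction names below are made up:

```python
# Collect local wait-for graphs at one site and detect a cycle by DFS.
# A graph is a dict: transaction -> set of transactions it waits for.
def find_cycle(edges):
    """Return True iff the wait-for graph contains a cycle (a deadlock)."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {t: WHITE for t in edges}
    def dfs(t):
        color[t] = GRAY
        for u in edges.get(t, ()):
            if color.get(u, WHITE) == GRAY:
                return True                        # back edge => cycle
            if color.get(u, WHITE) == WHITE and u in edges and dfs(u):
                return True
        color[t] = BLACK
        return False
    return any(color[t] == WHITE and dfs(t) for t in edges)

# Each local graph is acyclic on its own...
site_A = {"T1": {"T2"}}          # at site A, T1 waits for T2
site_B = {"T2": {"T1"}}          # at site B, T2 waits for T1
assert not find_cycle(site_A) and not find_cycle(site_B)

# ...but their union, collected at one site, reveals a global deadlock.
global_wfg = {"T1": {"T2"}, "T2": {"T1"}}
assert find_cycle(global_wfg)
```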
S11: OLTP : 25.5.3 Distributed Recovery
- New Issues
- New kinds of failures - link, remote sites
- Distributed transactions - update data at many sites
- Assumptions
- Logs are maintained at each site as before
- New commit protocol for transaction T
- Coordinator C = site where T originated from
- Subordinate sites Si = sites where T executes
- ...S = {S1, S2, ... Si, ... }
- Two phase commit protocol for commit(T)
- C sends prepare to each Si
- Si : EITHER { force write prepare log record ; send yes to C}
- ... OR { force write abort log record ; send no to C }
- C : if all Si sent yes then
- ...{ force write commit log record ; send commit to each Si }
- ...else { force write abort log record ; send abort to each Si }
- Si : if msg from C is commit then
- ...{ force write commit log record ; send ack to C}
- ... else { force write abort log record ; send ack to C }
- C : if all Si sent ack then write end log record
- Properties of Two phase commit protocol
- 2 phases of communications : voting, termination
- Any site can decide to abort a Xact in first phase
- Every message reflects a decision by the sender
- ...decision is recorded in log for fault-recovery
- Si's 2PC log records have Xact-id and coordinator-id
- C's 2PC log records have Xact-id and Si-ids for all Si
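The message flow above can be simulated in a few lines; this is a toy sketch (in-memory lists stand in for force-written logs, and all messages arrive), not a fault-tolerant implementation:

```python
# Toy two-phase commit: phase 1 collects votes, phase 2 broadcasts the
# decision and collects acks. Force-writes are modeled as log appends.
def two_phase_commit(coordinator_log, sites):
    """sites: name -> (vote, log). Returns the coordinator's decision."""
    # Phase 1 (voting): C sends prepare; each Si force-writes and votes.
    votes = {}
    for name, (vote, log) in sites.items():
        log.append("prepare" if vote else "abort")
        votes[name] = vote                       # yes/no reply to C
    # C decides: commit only if every subordinate voted yes.
    decision = "commit" if all(votes.values()) else "abort"
    coordinator_log.append(decision)             # C force-writes the decision
    # Phase 2 (termination): each Si logs the decision and acks.
    acks = 0
    for name, (vote, log) in sites.items():
        log.append(decision)
        acks += 1
    if acks == len(sites):
        coordinator_log.append("end")            # C may now forget T
    return decision

c_log = []
assert two_phase_commit(c_log, {"S1": (True, []), "S2": (True, [])}) == "commit"
# One "no" vote in phase 1 is enough to abort the whole transaction.
assert two_phase_commit([], {"S1": (True, []), "S2": (False, [])}) == "abort"
```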
S12: OLTP : Distributed Recovery
- Q? What happens for following failures:
- A. if C fails during 2PC
- B. if Si or link to Si fails during 2PC?
- C. if C fails between sending prepare
- ...and writing commit/abort log
- A. Transaction T is blocked waiting for C
- B. Phase I : C can abort T, Phase II: depends !
- C. C aborts T during restart
- Recover during Restart after a failure at Si
- If log has prepare(T) but not commit(T)/abort(T)
- ...contact C to find status of T,
- ...write commit/abort log record, redo/undo T
- ...write end(T) log record
- If log has commit(T)/abort(T) but not end(T)
- ...redo/undo T based on consensus decision in Phase I
- ...send ack to C
- Recover during Restart after a failure at C
- If log has no prepare(T) =>
- ...abort T and undo its effect
- If log has prepare(T) but not commit(T)/abort(T)
- ...can abort T, send abort(T) to Si
- If log has commit(T)/abort(T) but not end(T)
- ...(re)send commit/abort msgs to Si to get acks
- ...redo/undo T based on consensus decision in Phase I
- ...watch for message from Si about T
- Properties of recovery algorithm
- Ack messages let C know when it can forget T
- ...C tracks T until all acks are received
- If T does not update data at Si
- ...then its commit/abort status is irrelevant
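The subordinate-side restart rules can be condensed into a lookup over the last 2PC records for T; the log format (list of (txn, record) pairs) is an assumption for this sketch:

```python
# Restart rule at a subordinate Si: the 2PC records for T in the local
# log determine the next action after a crash.
def restart_action(log, txn):
    """Return what Si should do for txn on restart, given its local log."""
    records = [r for (t, r) in log if t == txn]
    if "end" in records:
        return "nothing"                     # T fully finished at this site
    if "commit" in records:
        return "redo T, send ack to C"       # decision known: redo, re-ack
    if "abort" in records:
        return "undo T, send ack to C"
    if "prepare" in records:
        return "ask C for status of T"       # in doubt: must contact C
    return "abort T locally"                 # never prepared: safe to abort

log = [("T1", "prepare"), ("T2", "prepare"), ("T2", "commit")]
assert restart_action(log, "T1") == "ask C for status of T"
assert restart_action(log, "T2") == "redo T, send ack to C"
assert restart_action(log, "T3") == "abort T locally"
```

Note the in-doubt case (prepare but no commit/abort) is the one that blocks: Si cannot decide unilaterally and must wait for C.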