S1: Highlights of Database Systems
- Goals:
- Provide guidance on exam.
- Areas of problem solving, discussion questions
- Outline
- Main Concepts
- Problem Solving
- Important Terms
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S2: File Structures
- Main Concepts
- Secondary storage characterisitcs
- File organization - heap, sorted, hased
- Linear and Extedible Hashing
- Problem Solving
- Given some ofthe following, determine the rest:
- ...record size, block size, blocking factor, file size
- ...number of records, blocking factor, spanned/unspanned
- Given file organization parameters,
- ...determine I/O cost of search and updae operations
- Given a sequence of key values
- ...draw linear hashing or extendible hashing states
- Important Terms
- Secondary storage characterisitcs
- ...Disks, sector, block, track, cylinder, platter
- ...seek, rotational latency, transfer times
- ...RAIDs, records, blocks, files
- File organization - heap, sorted, hased
- Linear and Extedible Hashing
- File of mixed records
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S3: Index Structures
- Main Concepts
- Primary, secondary, clustering indexes
- B-tree, B+tree indexes
- Problem Solving
- Given a set of key-values and order of trees,
- ...draw B-tree or B+tree with different utilization
- Given a sequence of key-values and order of trees,
- ...draw a resulting sequence of B-tree or B+tree
- Important Terms
- Primary, secondary, clustering indexes
- ...key field, ordering field, dense index, sparse index,
- B-tree, B+tree indexes
- ...search trees of order p, balanced tree, average fan-out
- ...tree poitner, data pointer, thread,
- ...search, insert, delete operations
- Indexes on multiple keys - partitioned hash, Grid files,
- Logical vs. Physical Indexes
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S4: Query Processing
- Main Concepts
- Strategies for selection and join
- Algebraic optimization (tree transforms)
- Cost based optimization
- Problem Solving
- Given a SQL query, draw canonical query tree
- ...and perform algebraic optimizations
- Given a simple SQL query and relevant parameters,
- ...estimate costs of alternative execution plans
- Important Terms
- parser, optimizer, execution
- Query blocks, nested query
- Transformation rules for relational algebra operations
- Strategies for point query: scan, primary/secondary index
- ...... for conjunctive point query: 1 index, 2 indexes
- ... for range query: scan, primary index, secondary index scan
- ... for equi-join: nested loop, sort merge, hybrid hash
- Selectivity, join selectivity
- Cost models for strategies
- Join orders, cost based optimization
- Oracle hints
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S5: Transactions
- Main Concepts
- ACID properties
- Serializability of schedules
- Problem Solving
- Given a set of transactions,
- ...count or generate possible schedules
- Given a schedule, draw conflict-graph
- ...and check conflict-serializability
- Important Terms
- Transaction, states, commit point, Schedules
- Recoverability properties - strict, cascadeless
- Schedule equivalence: conflict, result, view
- Testing conflict serializability
- Transactions in SQL, Isolation level
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S6: Concurrency Control
- Main Concepts
- Conflict Graph Scheduler (CGS)
- Two phase locking (2PL)
- Timestamp ordering (TSO)
- Problem Solving
- Given an input schedule, a scheduler (e.g. CGS, 2PL, TSO)
- ...determine output schedule
- Important Terms
- locks (binary, read/write), lock granularity
- 2PL: basic, conservative, strict
- ...dealing with deadlocks, cascaded rollbacks
- Time stamps for transactions, data-items
- TSO: basic, strict, Thomas's write rule
- Index - tree locking protocols, B-link trees
- Phantom problem, predicate locking
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S7: Recovery
- Main Concepts
- Commit
- Log based Recovery
- Shadow paging based Recovery
- Problem Solving
- Given a log, determine actions of log based
- ...recovery schemes (scenarios: strict or non-strict schedules)
- ...accounting for cascaded rollbacks
- Important Terms
- log (redo, undo), checkpoints,
- write ahead logging, steal/no-steal, force/no-force
- Transaction rollbacks, cascading rollbacks
- Recovery algorithms (log based, shadow paging)
- ...database update: deferred, immediate (anytime)
- Write ahead loggin rules
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S8: Data Warehouses
- Main Concepts
- Data warehouse and its operations e.g. rollup, pivot
- Data Cube model, SQL cube and rollup operations
- Problem Solving
- Given a base (fact) table, compute result of
- ...rollup operation along a dimension
- ...or the Cube operation
- Given a base (fact) table, domains of its columns
- ...Draw a bitmap index for a column
- ...Estimate bitmap index size, I/O cost of CNF selection query
- ...Compare cost of a bitmap index and a B+Tree index
- Important Terms
- Star schema, fact table, dimension tables
- rollup, drill down, pivot, slice/dice, cube
- bitmap index, join index, star join strategy
- materialized views
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S9: Data Mining
- Main Concepts
- Pattern families - classification, clustering, associations
- Association Rules, apriori algorithm
- Problem Solving
- Given a set of pattern families and a set of DM application
- ...match pattern families to applications
- Given a set of items, a set of transactions, support threshold,
- ...find associations with high support and
- ...association rules (LHS, RHS, support, confidence)
- Important Terms
- Data mining, machine learning, statistics
- patterns: association, classification, clustering
- associations: transactions, items,
- ...support, confidence, association rules
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S10: Distributed Databases
- Main Concepts
- Semi-join data transfer strategy
- Two phase commit algorithm
- Problem Solving
- Given a distributed database and a simple query
- ...compute data transfer cost of different strategies
- ...compute data transfer cost of different strategies
- Semi-join operation and its algebra
- ...A. compute result of a given semi-join
- ...B. If a transformation rule (chapter 18.3.2) holds
- ...if join is replaced by semi-join
- Important Terms
- homogenity, autonomy
- distribution transparent, multi-database, federated
- fragments (vertical, horizontal), replication
- Query decomposition
- Distributed locking
- ...primary copy solutions, fully distributed solutions
- Distributed deadlock detection
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S11: Database Security
- Main Concepts
- SQL grant statement
- Bell LaPadula model
- Statistical inference control
- Problem Solving
- A. Given a set of SQL grant/revoke statements,
- ...determine if a SQL DML (e.g. select) statement is permissible
- ...draw corresponding authorization graph (sequence)
- B. Given Bell LaPadula model components
- ...(eg security classes of subjects and objects),
- ...determine if certain operatios are permissible
- Important Terms
- access matrix [subject, object] :- > previlege
- ...previlege - select, modify, references
- ...grant (grant option), revoke (cascade)
- Views, base tables
- Subject, object, security classes,
- ...poly-instantiation
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S12: Object Relational Databases
- Main Concepts
- Limitations of Relational Models
- Impact of object model on relational concepts (e.g. NFs)
- Impact of relation concepts on object model (e.g. encapsulation)
- Problem Solving
- A. Given a user requirements,
- ...Use group-by to aggregate information in SQL2 queries
- ...identify relevant ADT (with operations) to simplify SQL3 queries
- B. Given a set of SQL3 expressions and a set of SQL2 queries
- ...for a SQL3 schema and its corresponding 3NF relational schema,
- ...Match SQL3 expressions to SQL2 expressions
- Important Terms
- Object-relation DBMS, Object-oriented DBMS
- User-defined types (UDT), Abstract data types (ADT), operations
- Objects, classes, inheritance, polymorphism
- oids, collection types, reference types, BLOBs
- Extended Entity Relationship Models
- SQL3, OQL, path expression, dot notation,
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S13: XML Databases
- Main Concepts
- Graph data dodel of semi-structured XML data
- Querying semi-structured data via XQuery or SQL
- Problem Solving
- A. Given a XML document (and possibly its schema)
- ...design a 3NF relational schema to store its contents
- ...write simple FLWR expression to query its subset
- B. Given a FLWR query for a XML document (or possibly its schema)
- ...write SQL expression for the query
- C. Given a set of FLWR expressions and a set of SQL queries
- ...for a XML document (and its 3NF relational schema)
- ...Match SQL expressions to FLWR expressions
- Important Terms
- W3C, XML, XPath, XQuery, Path expression, FLWR expression,
- Semi-structured data,
- OEM graph data model, label, type, value, oid
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)