S1: Spatial Access Methods
- 4.1 Storage: Disks and Files
- Secondary Storage - Disks
- Field, Records, Files
- 4.2 Spatial Indexing Overview
- Clustering
- Grid Files
- R-trees
- 4.3 Related Topics
- Concurrency Control
- Join Index
- TR* tree for object decomposition
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S2: 4.1 Storage: Disks and Files
- Secondary Storage - Disks
- hardware: platters, arms, spindle, cache (?)
- data layout: sectors, blocks, tracks, cylinders
- access time: seek , rotational delay, transfer
- avg. access time is 10-15 msec vs. microsec for memory
- ... sequential I/O vs. random I/O
- File (Table) organization
- Field, Records, Files
- Blocking factor = number of Records per disk block
- Blobs for large fields (e.g. polygon, image)
- May factor out large fields in separate file
- ... to keep blocking factor high
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S3: Access Methods
- General Issues
- 2 types: file-structure, separate index file
- Access Method Interface (AMI):
- ... open, close, search, bulk load, insert, delete, update
- Functionality - partition records into buckets
- Evaluation is function of queries not data
- point/ range query, spatial queries (e.g. nearest neighbor)
- ... a sample of spatial queries on pp. 119
- unbuffered (cold) vs. buffered (hot) searches
- Metrics: # of I/Os to answer queries
- Implementation issues: concurrency & recovery,
- ... cost estimation model for query processing strategies
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S4: 4.2 Spatial Indexing Overview
- Clustering
- Access Methods
- B+ tree with a space filling curve
- Grid Files
- R-trees
- Hundreds of variations
- Spatial Queries from GIS, VLSI, Data Mining, ...
- OGIS predicates: inside, Overlap, Contains, Nearest ...
- Ex. of Point, Range and Nearest neighbor queries
- ... Get name of a city with given point coordinates
- ... Get all cities in given area
- ... Find nearest city to a given point
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S5: Clustering
- Levels of clustering
- A. Intra-object
- B. Within a disk page or a segment
- C. Across disk pages and or segments
- Relational solutions for 1-d keys
- A. OODBMS clusters objects in a part-hierarchy
- B. Sorting records
- C. Contiguous allocation by tracks, cylinders
- New Issues in Spatial Databases
- B. Sorting -- > space filling curves (e.g. Z, Hilbert)
- ... captures about half neighborhood relationships
- ... natural for point data, point queries
- ... line/polygons -- > MOBR -- > points in 4-d space
- ... rectangle query -- > a set of range queries
- ... Nearest neighbor query not explored.
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S6: B+trees Basics
- Query: point, range queries, 1-dim keys
- B tree data-structure
- tree: balanced, high fanout, page utilization (50 percent)
- Pointer to data only in leaf nodes
- Compression for keys, pointers
- B+tree: threaded leaf pages for range queries
- Concurrency & Recovery issues
- level: leaf, internal node (sub-tree), root (tree)
- predicate locking
- Solution similar to R-tree (Section 4.3.1)
- high concurrency locking: need not be 2PL
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S7: B+trees with a space filling curve
- Query: point, range queries, N-dim keys
- Two component to indexing spatial datasets
- A space filling curve (e.g. Z) orders spatial keys
- Keys in B+tree are Z-values for spatial keys
- Processing Point Query - Fast
- Compute z-order of query point
- Search B+tree with z-order value
- Processing Range Query [Orenstein] - Tedious
- Compute set of z-order ranges inside query rectangle
- Each z-order range is a range query on B+tree
- Processing Nearest Neighbor Query
- Z-order based may not effective (?)
- If MOBR(child node) is kept with keys then
- ... adapt algorithm for R-tree [Rossopolous]
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S8: R-trees
- Spatial queries (e.g. point/ range queries in N dim. )
- Data Structure: like B-tree
- But keys are MOBR(child), one key per child
- MOBR(child) inside MOBR(parent)
- Performance study
- RISC-II VLSI data set (Guttman 84)
- Sequoia 2000 (Postgres & SHORE) w/ M/2, quadratic split
- Content based retrieval, data mining
- Point/Range Search
- Recurse if overlap( MOBR(child), query point/MOBR)
- Traverse multiple paths! (DFS)
- Q? Will the algo. work for all OGIS operators?
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S9: R-trees - Spatial Query Processing
- OGIS Predicates - inside, touch, overlap, ...
- Overlap is a nice filter for many topological predicates
- Eact predicate checked in refine step
- General strategy for other OGIS predicates
- child J eliminates child I iff
- ... some condition C is satisfied
- Recurse on childs which can not be eliminated
- Nearest Neighbor Search for point QP
- C = min-dist(MOBR(I), QP) > max-dist(MOBR(J), QP)
- min-dist(rectangle R, QP) = distance(QP, closest point in R)
- min-dist(rectangle R, QP) = distance(QP, farthest point in R)
- ... consider corner point and sides of rectangle R
- Q? Will general strategy work for all OGIS predicates?
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S10: Update Processing with R-tree
- Insertion: Where: ChooseLeaf. Least enlargement.
- Ties broken by picking smallest key.
- Node splitting as in B+-tree.
- Q? what goes in each partition?
- AdjustTree required. Why not in B+-tree?
- Deletion: Seek and destroy
- CondenseTree: throw entries from under-full pages
- ... back into tree while preserving level.
- Why different from B+-tree?
- Update: delete, update, reinsert
- Split algorithms: exhaustive
- quadratic: start with most distant seeds,
- ... greedily add based on maximal preference
- linear: like quadratic, but picks seeds differently
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S11: R-tree Variants
- R*-tree - several optimizations
- Faster split algorithm (n * logn)
- Reduces sibling overlap, Smaller MOBR for parents
- Improves page utilization, reduces splits
- Out-performs R-tree on point/range search.
- But Concurrency problems (reinsertion of extreme entries).
- R+-Tree
- no sibling overlap
- But replicate items across leafs
- Faster search but slow insert/delete/update
- R-trees partition data. R+-trees partition space.
- ... like K-D-B trees, Quad trees
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S12: Grid Files
- Spatial queries (e.g. point/ range queries in N dim. )
- Data Structure: Grid directory
- N dimension vectors: cross-product is non-uniform grid
- Grid directory : cell to data page
- ... many cells may share a data page
- Grid directory can be quite large!
- Point Search
- Binary search of each dimension to identify a cell
- Fetch grid-cell and then the data-page pointed to
- 2 I/Os for poit query (very fast)
- Insert of a point key
- Binary search of each dimension to identify a cell
- If the page pointed has no page then split
- ... insert a whole new row or column
- ... induces more sharing of pages across cells
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S13: Grid Files
- Q? How does Grid Files deal with extended objects?
- ? Replicate across cells (pages)
- ? Decompose large objects into sub-objects
- Range Search
- Binary search of each dimension - > a set of cells
- Duplicate elimination on pages pointed to
- Fetch unique data-page pointed to
- #I/Os proportional to result size (very fast)
- Nearest Neighbor Search
- Extend R-tree algorithms on the unique pages in grid
- Q? Will the algo. work for all OGIS operators?
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S14: 4.3 Related Topics
- Topics
- Concurrency Control
- Join Index
- TR* tree for object decomposition
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)