S1: Spatial Access Methods
Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S2: 4.1 Storage: Disks and Files
  • Secondary Storage - Disks
    • hardware: platters, arms, spindle, cache (?)
    • data layout: sectors, blocks, tracks, cylinders
    • access time: seek , rotational delay, transfer
    • avg. access time is 10-15 msec vs. microsec for memory
    • ... sequential I/O vs. random I/O
  • File (Table) organization
    • Field, Records, Files
    • Blocking factor = number of Records per disk block
    • Blobs for large fields (e.g. polygon, image)
    • May factor out large fields in separate file
    • ... to keep blocking factor high


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S3: Access Methods
  • General Issues
    • 2 types: file-structure, separate index file
    • Access Method Interface (AMI):
    • ... open, close, search, bulk load, insert, delete, update
    • Functionality - partition records into buckets
  • Evaluation is function of queries not data
    • point/ range query, spatial queries (e.g. nearest neighbor)
    • ... a sample of spatial queries on pp. 119
    • unbuffered (cold) vs. buffered (hot) searches
    • Metrics: # of I/Os to answer queries
    • Implementation issues: concurrency & recovery,
    • ... cost estimation model for query processing strategies


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S4: 4.2 Spatial Indexing Overview
  • Clustering
    • Space Filling Curves
  • Access Methods
    • B+ tree with a space filling curve
    • Grid Files
    • R-trees
    • Hundreds of variations
  • Spatial Queries from GIS, VLSI, Data Mining, ...
    • OGIS predicates: inside, Overlap, Contains, Nearest ...
    • Ex. of Point, Range and Nearest neighbor queries
    • ... Get name of a city with given point coordinates
    • ... Get all cities in given area
    • ... Find nearest city to a given point


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S5: Clustering
  • Levels of clustering
    • A. Intra-object
    • B. Within a disk page or a segment
    • C. Across disk pages and or segments
  • Relational solutions for 1-d keys
    • A. OODBMS clusters objects in a part-hierarchy
    • B. Sorting records
    • C. Contiguous allocation by tracks, cylinders
  • New Issues in Spatial Databases
    • B. Sorting -- > space filling curves (e.g. Z, Hilbert)
    • ... captures about half neighborhood relationships
    • ... natural for point data, point queries
    • ... line/polygons -- > MOBR -- > points in 4-d space
    • ... rectangle query -- > a set of range queries
    • ... Nearest neighbor query not explored.


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S6: B+trees Basics
  • Query: point, range queries, 1-dim keys
  • B tree data-structure
    • tree: balanced, high fanout, page utilization (50 percent)
    • Pointer to data only in leaf nodes
    • Compression for keys, pointers
    • B+tree: threaded leaf pages for range queries
  • Concurrency & Recovery issues
    • level: leaf, internal node (sub-tree), root (tree)
    • predicate locking
    • Solution similar to R-tree (Section 4.3.1)
    • high concurrency locking: need not be 2PL


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S7: B+trees with a space filling curve
  • Query: point, range queries, N-dim keys
  • Two component to indexing spatial datasets
    • A space filling curve (e.g. Z) orders spatial keys
    • Keys in B+tree are Z-values for spatial keys
  • Processing Point Query - Fast
    • Compute z-order of query point
    • Search B+tree with z-order value
  • Processing Range Query [Orenstein] - Tedious
    • Compute set of z-order ranges inside query rectangle
    • Each z-order range is a range query on B+tree
  • Processing Nearest Neighbor Query
    • Z-order based may not effective (?)
    • If MOBR(child node) is kept with keys then
    • ... adapt algorithm for R-tree [Rossopolous]


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S8: R-trees
  • Spatial queries (e.g. point/ range queries in N dim. )
  • Data Structure: like B-tree
    • But keys are MOBR(child), one key per child
    • MOBR(child) inside MOBR(parent)
  • Performance study
    • RISC-II VLSI data set (Guttman 84)
    • Sequoia 2000 (Postgres & SHORE) w/ M/2, quadratic split
    • Content based retrieval, data mining
  • Point/Range Search
    • Recurse if overlap( MOBR(child), query point/MOBR)
    • Traverse multiple paths! (DFS)
  • Q? Will the algo. work for all OGIS operators?


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S9: R-trees - Spatial Query Processing
  • OGIS Predicates - inside, touch, overlap, ...
    • Overlap is a nice filter for many topological predicates
    • Eact predicate checked in refine step
  • General strategy for other OGIS predicates
    • child J eliminates child I iff
    • ... some condition C is satisfied
    • Recurse on childs which can not be eliminated
  • Nearest Neighbor Search for point QP
    • C = min-dist(MOBR(I), QP) > max-dist(MOBR(J), QP)
    • min-dist(rectangle R, QP) = distance(QP, closest point in R)
    • min-dist(rectangle R, QP) = distance(QP, farthest point in R)
    • ... consider corner point and sides of rectangle R
  • Q? Will general strategy work for all OGIS predicates?


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S10: Update Processing with R-tree
  • Insertion: Where: ChooseLeaf. Least enlargement.
    • Ties broken by picking smallest key.
    • Node splitting as in B+-tree.
    • Q? what goes in each partition?
    • AdjustTree required. Why not in B+-tree?
  • Deletion: Seek and destroy
    • CondenseTree: throw entries from under-full pages
    • ... back into tree while preserving level.
    • Why different from B+-tree?
  • Update: delete, update, reinsert
  • Split algorithms: exhaustive
    • quadratic: start with most distant seeds,
    • ... greedily add based on maximal preference
    • linear: like quadratic, but picks seeds differently


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S11: R-tree Variants
  • R*-tree - several optimizations
    • Faster split algorithm (n * logn)
    • Reduces sibling overlap, Smaller MOBR for parents
    • Improves page utilization, reduces splits
    • Out-performs R-tree on point/range search.
    • But Concurrency problems (reinsertion of extreme entries).
  • R+-Tree
    • no sibling overlap
    • But replicate items across leafs
    • Faster search but slow insert/delete/update
    • R-trees partition data. R+-trees partition space.
    • ... like K-D-B trees, Quad trees


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S12: Grid Files
  • Spatial queries (e.g. point/ range queries in N dim. )
  • Data Structure: Grid directory
    • N dimension vectors: cross-product is non-uniform grid
    • Grid directory : cell to data page
    • ... many cells may share a data page
    • Grid directory can be quite large!
  • Point Search
    • Binary search of each dimension to identify a cell
    • Fetch grid-cell and then the data-page pointed to
    • 2 I/Os for poit query (very fast)
  • Insert of a point key
    • Binary search of each dimension to identify a cell
    • If the page pointed has no page then split
    • ... insert a whole new row or column
    • ... induces more sharing of pages across cells


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S13: Grid Files
  • Q? How does Grid Files deal with extended objects?
    • ? Replicate across cells (pages)
    • ? Decompose large objects into sub-objects
  • Range Search
    • Binary search of each dimension - > a set of cells
    • Duplicate elimination on pages pointed to
    • Fetch unique data-page pointed to
    • #I/Os proportional to result size (very fast)
  • Nearest Neighbor Search
    • Extend R-tree algorithms on the unique pages in grid
  • Q? Will the algo. work for all OGIS operators?


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S14: 4.3 Related Topics
  • Topics
    • Concurrency Control
    • Join Index
    • TR* tree for object decomposition


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)