> I would love to learn from your work. I appreciate this opportunity.
> 
> Here are some things I already know. R-trees store ranges in each
> dimensions, e.g., rectangles or boxes in 2-D or 3-D, whereas Z-order B-trees
> (or Hilbert or any other space-filling curve) store points. R-trees have
> tricky search algorithms & cost functions because one might need to search
> multiple descendents from any given internal tree node, since ranges
> represented by child nodes might overlap. Splitting nodes can be quite
> tricky because the overlap must be minimized heuristically. Also, building
> an R-tree for a large pre-existing data set is tricky, whereas it is well
> known how to sort & append the entries in B-trees.
> 
> Hilbert is slightly more effective that Z because it doesn't have any
> diagonal lines, but I think the math is more complex, and computing
> precisely the ranges where the space filling curve enters and exits the
> query box is more tricky. In fact, I have worked the entry & exit points out
> for Z curves using a method different than the one patented by Bayer and
> Markl.
> 
> One can map ranges in N dimensions to points in twice as many (2N)
> dimensions. Doing so will result in a situation in which only half the
> domain is used. For example, if I have a 1-D space and want to store ranges,
> I need a 2-D Z-curve. Presuming that the (0,0) point is at the bottom right,
> and presuming that a 1-D range is mapped to a 2-D point by making the left
> end of the range the x-value in the 2-D space, there will be a diagonal
> through the indexed space from bottom left to top right. None of the ranges
> will map to point above left of this diagonal, since for all ranges in 1-D
> the left end point has a lower value than the right end point.
> 
> In order to execute a "contains" query, i.e., search for rectangles that are
> fully contained within a query range, say (lo, hi), I map the query range to
> two points in 2-D, namely (x=lo, y=hi) and (x=hi, y=lo), and then search for
> data in the square defined by these two points. This is a fairly efficient
> operation in a Z-order index. For range intersection, I search the rectangle
> defined by (x=hi, y=lo) and the bottom right corner of the 2-D space, again
> fairly efficiently. If I want to, I can limit the search by a line parallel
> to the diagonal discussed above, I think, but I am not sure about that yet.
> 
> For nearest-neighbor problems, one has to define an initial bounding box (a
> square) and search there. Actually, only the largest circle within the
> square can be used. If this search doesn't find a result, e.g., because all
> points in the square fail to meet some additional criterion (e.g., I am only
> looking for Italian restaurants, not for all possible tourist facilities),
> then I need to consider a larger bounding box (square). Within that box, I
> must exclude element considered within the earlier circle, and I can
> actually omit to fetch entries from the index that fall into an exclusion
> box, i.e., the largest square within the earlier circle.
> 
> Here is the key question we need to answer. Is it sufficient to implement
> Z-order B-trees only, or are R-trees truly required? What optimizer smarts
> are required for Z-order B-trees? What overall performance loss will we need
> to tolerate, i.e., possibly explain to potential customers?
> 
> A second question is this: Do we need transformations on the dimensions? For
> example, if all actual data point fall to the left, maybe I should scale the
> horizontal axis such that Z shapes are narrow on the left but wide on the
> right. I am quite sure that only linear and very simple non-linear
> transformations are realistically possible in a product. I am also quite
> sure that each transformation would apply to one dimension only, i.e.,
> transformations would orthogonal to each other, because otherwise, a
> rectangular query box would not be rectangular after the transformation.
> 
> If you have any ideas or suggestions or advice on these key questions, I
> would love to learn more from you. I greatly appreciate your help.
> 
> Thank you,
> Goetz
>  
> 
> 
> -----Original Message-----
> From: Chang-tien Lu [  <mailto:ctlu@cs.umn.edu> mailto:ctlu@cs.umn.edu]
> Sent: Tuesday, November 14, 2000 8:31 PM
> To: Goetz Graefe
> Subject: R-Tree v.s. B-Tree + Z-order
> 
> 
> Dear Dr. Graefe:
> 
>     First of all, I would like to thank you
> for the treat at the Old Spaghetti Factory yesterday.
> Right now, I am working with Prof. Shashi Shekhar
> on the spatial database group, and have done some
> work on join-index based spatial join processing
> as in the following link:
>  <http://www-users.cs.umn.edu/~ctlu/papertalk.html>
> http://www-users.cs.umn.edu/~ctlu/papertalk.html
> 
>     We are interested in the comparison between R-tree
> and Z-order + B-tree, or related spatial join processing.
> Could you give us some suggestion from your work
> at Microsoft SQL server?
> 
> 
>     Thank you very much!
> 
> 
>     Chang-tien Lu
> 
> 
> 
> 
> 
> 
>