Cascading spatio-temporal pattern discovery
Date of Submission: May 2, 2011
Given a collection of Boolean spatio-temporal (ST) event-types, the cascading spatio-temporal pattern (CSTP) discovery process finds partially ordered subsets of these event-types whose instances are located together and occur serially. For example, analysis of crime datasets may reveal frequent occurrence of misdemeanors and drunk driving after and near bar closings on weekends, as well as after and near large gatherings such as football games. Discovering CSTPs from ST datasets is important for application domains such as public safety (e.g., identifying crime attractors and generators) and natural disaster planning (e.g., preparing for hurricanes). However, CSTP discovery presents multiple challenges; three important ones are (1) the exponential cardinality of candidate patterns with respect to the number of event-types, (2) the computationally complex ST neighborhood enumeration required to evaluate the interest measure, and (3) the difficulty of balancing computational complexity and statistical interpretation. Current approaches for ST data mining focus on mining totally ordered sequences or unordered subsets. In contrast, our recent work explores partially ordered patterns. In that work, we represented CSTPs as directed acyclic graphs; proposed a new interest measure, the cascade participation index; outlined the general structure of a cascading spatio-temporal pattern miner (CSTPM); evaluated filtering strategies to enhance computational savings using a real-world crime dataset; and proposed a nested-loop-based CSTPM to address the challenge posed by the exponential cardinality of candidate patterns. This paper adds to our recent work by offering a new computational insight, namely, that the computational bottleneck for CSTP discovery lies in the interest measure evaluation. With this insight, we propose a new CSTPM based on spatio-temporal partitioning that significantly lowers the cost of interest measure evaluation.
Analytical evaluation shows that our new CSTPM is correct and complete. Results from a significant amount of new experimental evaluation with both synthetic and real data show that our new ST-partitioning-based CSTPM outperforms the CSTPM from our previous work. We also present a case study that verifies the applicability of the CSTP discovery process.