Discovering and Quantifying Mean Streets: A Summary of Results

Date of Submission: October 25, 2007
Mean streets are connected subsets of a spatial network whose attribute values are significantly higher than expected. Discovering and quantifying mean streets is an important problem with many applications, such as detecting high-crime-density streets and high-crash roads (or areas) for public safety, detecting urban cancer clusters for public health, detecting human activity patterns in asymmetric warfare scenarios, and detecting urban activity centers for consumer applications. However, discovering and quantifying mean streets in large spatial networks is computationally very expensive, because it is difficult to characterize and enumerate the population of streets needed to define a norm or expected activity level. Previous work either focuses on statistical rigor at prohibitive computational cost, or concentrates on computational efficiency without providing any statistical interpretation of the results. In contrast, this paper explores computationally efficient algorithms that produce statistically interpretable results. We describe alternative ways of defining and efficiently enumerating instances of subgraph families such as paths. We also use statistical models, such as the Poisson distribution and the sum of independent Poisson distributions, to provide interpretations for results. We define the problem of discovering and quantifying mean streets and propose a novel mean streets mining algorithm. Experimental evaluations using synthetic and real-world datasets show that the proposed method is computationally more efficient than naive alternatives.
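To make the statistical interpretation concrete, the following is a minimal sketch of how a sum of independent Poisson distributions could be used to score a single path. It is not the report's algorithm: the function names, the per-segment expected rates, and the significance threshold `alpha` are all illustrative assumptions. The sketch relies only on the standard fact that a sum of independent Poisson variables is itself Poisson, with rate equal to the sum of the individual rates.

```python
import math


def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    # Accumulate P(X <= k - 1) term by term, then take the complement.
    term = math.exp(-lam)
    cdf = term
    for i in range(1, k):
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)


def score_path(segment_rates, observed_count, alpha=0.05):
    """Score one path (hypothetical helper, not from the report).

    segment_rates: assumed expected activity count for each road segment.
    observed_count: total activity actually observed along the path.
    Returns (p_value, is_significant).
    """
    # Independent Poisson counts add up to a Poisson with the summed rate.
    lam = sum(segment_rates)
    p_value = poisson_upper_tail(observed_count, lam)
    return p_value, p_value < alpha


if __name__ == "__main__":
    # A three-segment path with made-up expected rates and observed counts.
    print(score_path([0.5, 0.5, 1.0], observed_count=8))
    print(score_path([0.5, 0.5, 1.0], observed_count=2))
```

Under these assumptions, a path would be flagged only when its observed total is unlikely to arise from the expected rates alone. How the expected rates are estimated, and how candidate paths are enumerated efficiently, are the parts the report itself addresses.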