Similarity-based Time-Profiled Association Mining: A Summary of Results

Date of Submission: December 30, 2005
Report Number: 05-043
Abstract: 
Given a time-stamped transaction database, time-profiled associations are those subsets of items whose support time sequence satisfies a given specification. For example, if the specification is similarity to the El Niño index sequence, the decreased fish catch in Peru is a time-profiled association. Classical association mining uses a simple numeric interest measure (e.g., support) and a simple subset specification, namely that support exceeds a given threshold. In contrast, time-profiled association mining uses a composite interest measure, i.e., a sequence of supports over a sequence of time slots, and can use a richer subset specification, such as time series correlation or Euclidean distance. Mining time-profiled associations is computationally challenging due to the large size of the datasets, the composite interest measure, and the richer subset specification. In this paper, we propose a monotonic lower bound on the Lp-norm-based similarity measure and two novel algorithms to mine time-profiled associations. We show that the proposed algorithms are correct and complete in finding time-profiled associations. Experimental results show that the proposed algorithms outperform the naive approach.
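The idea of a support time sequence and a monotonic lower bound on an Lp distance can be illustrated with a minimal Python sketch. It is not the report's algorithms, only one plausible reading of the abstract. The function names, the toy data, and the specific bound (summing only over time slots where the reference value exceeds the itemset's support) are assumptions made for illustration. The bound relies on the fact that adding items to an itemset can never raise its support in any time slot, so the gap in those slots can only grow.

```python
from itertools import combinations


def support_sequence(itemset, slots):
    """Support of `itemset` in each time slot (fraction of transactions containing it)."""
    items = set(itemset)
    return [
        sum(1 for t in slot if items <= t) / len(slot) if slot else 0.0
        for slot in slots
    ]


def lp_distance(ref, sup, p=2):
    """Ordinary Lp distance between the reference sequence and a support sequence."""
    return sum(abs(r - s) ** p for r, s in zip(ref, sup)) ** (1.0 / p)


def lower_bound_distance(ref, sup, p=2):
    """Assumed lower bound: only slots where the reference exceeds the support count.

    Supersets have support <= sup in every slot, so this value never decreases
    as items are added, and it never exceeds the true distance of any superset.
    """
    return sum((r - s) ** p for r, s in zip(ref, sup) if r > s) ** (1.0 / p)


def mine(slots, ref, threshold, p=2):
    """Level-wise search that prunes itemsets whose lower bound exceeds the threshold."""
    items = sorted({i for slot in slots for t in slot for i in t})
    results = []
    frontier = [(i,) for i in items]
    while frontier:
        survivors = []
        for itemset in frontier:
            sup = support_sequence(itemset, slots)
            if lower_bound_distance(ref, sup, p) > threshold:
                continue  # no superset can be within the threshold
            survivors.append(itemset)
            if lp_distance(ref, sup, p) <= threshold:
                results.append(itemset)
        # Build size k+1 candidates whose size-k subsets all survived.
        alive = set(survivors)
        frontier = []
        for itemset in survivors:
            for item in items:
                if item <= itemset[-1]:
                    continue
                cand = itemset + (item,)
                if all(sub in alive for sub in combinations(cand, len(itemset))):
                    frontier.append(cand)
    return results


# Hypothetical toy data: two time slots, four transactions each.
slots = [
    [{"a", "b"}, {"a", "b"}, {"c"}, {"a"}],
    [{"a", "b"}, {"c"}, {"c"}, {"b"}],
]
ref = [0.5, 0.25]  # hypothetical reference sequence
found = mine(slots, ref, threshold=0.1)
print(found)
```

In this toy run, item `c` is pruned at the first level because its lower bound already exceeds the threshold, so no candidate containing `c` is ever evaluated. Items `a` and `b` are not similar to the reference on their own, but they survive pruning and their combination is.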