Mining Valid-Time Indeterminate Events

Date of Submission: March 27, 2006
In many temporally oriented applications, events are known to have occurred, but the exact time of their occurrence is not known. For example, a blood test of a diabetic patient may show that the patient's blood glucose level rose above the safe threshold without revealing exactly when that happened. Such events are said to have valid-time indeterminacy. Extensions to SQL for supporting valid-time indeterminacy in temporal databases have been studied. However, no prior research has applied mining techniques to find interesting patterns in valid-time indeterminate events. In this paper, we first provide background on valid-time indeterminacy. We then propose a measure, "ordering probability", for computing the probability that an episode (an ordered list of items) occurs in a given temporal sequence of indeterminate events. We show the bounds of this measure and prove its anti-monotonic and asymmetric properties. Mining frequent patterns from indeterminate events requires computing this measure for many different sequences, so we propose an efficient algorithm for computing the ordering probability of a given episode in a sequence. Finally, we explain the use of this measure in two temporal data mining frameworks: (i) sequence mining and (ii) sequential pattern mining. We show how the frequency of an episode in sequence mining and the support for an episode in sequential pattern mining are extended accordingly. The research in this paper thus generalizes temporal data mining to allow valid-time indeterminacy.
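To make the idea of an ordering probability concrete, the following is a minimal sketch and not the definition or algorithm from the paper, which the abstract does not spell out. It assumes that each indeterminate event is described by an independent discrete probability mass function over integer time points (chronons), and that an episode "occurs" when its events fall in strictly increasing time order. The names `uniform` and `ordering_probability` are hypothetical.

```python
from fractions import Fraction


def uniform(lo, hi):
    """Assumed model: event time is uniform over integer chronons lo..hi."""
    n = hi - lo + 1
    return {t: Fraction(1, n) for t in range(lo, hi + 1)}


def ordering_probability(pmfs):
    """P(e1 < e2 < ... < ek) for independent events given as {chronon: prob}."""
    if not pmfs:
        return Fraction(1)
    # joint[t] = P(the events seen so far are strictly ordered
    #              and the most recent one occurs at chronon t)
    joint = dict(pmfs[0])
    for pmf in pmfs[1:]:
        nxt = {}
        cumulative = Fraction(0)  # mass of ordered prefixes ending before t
        for t in sorted(set(joint) | set(pmf)):
            if t in pmf:
                nxt[t] = pmf[t] * cumulative
            cumulative += joint.get(t, Fraction(0))
        joint = nxt
    return sum(joint.values(), Fraction(0))


a = uniform(1, 3)
b = uniform(2, 4)
c = uniform(3, 5)

print(ordering_probability([a, b]))     # probability that a precedes b
print(ordering_probability([b, a]))     # differs from the above (asymmetry)
print(ordering_probability([a, b, c]))  # no larger than for [a, b]
```

Under these assumptions the running prefix sum avoids enumerating every combination of event times, and the toy values illustrate the properties named in the abstract: the result lies between 0 and 1, swapping the order of two events changes it, and extending an episode cannot increase it.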