Two-Dimensional Association Analysis For Finding Constant Value Biclusters In Real-Valued Data
Date of Submission: July 7, 2009
Biclustering is a commonly used type of analysis for real-valued data sets, and several algorithms have been proposed for finding different types of biclusters. However, no systematic approach has been proposed for exhaustively enumerating all (nearly) constant value biclusters in such data sets, which is the problem addressed in this paper. Using a monotonic range measure to capture the coherence of values in a block (submatrix) of an input data matrix, we propose a two-step Apriori-based algorithm for discovering all nearly constant value biclusters, referred to as Range Constrained Blocks (RCBs). Through systematic evaluation on an extensive genetic interaction data set, we show that submatrices with similar values represent groups of genes that are more functionally related than biclusters with diverse values. We also show that our approach exhaustively finds all biclusters with a range less than a given threshold, whereas competing approaches cannot find all such biclusters.
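To make the role of the monotonic range measure concrete, the following is a minimal Python sketch, not the algorithm of this paper. It assumes the range of a block is the maximum value minus the minimum value in the block, and it shows only why such a measure permits Apriori-style pruning: enlarging a block can never decrease its range, so any block that violates the threshold can be discarded together with all of its superblocks. The function names `block_range` and `column_sets_within_range` and the toy matrix are hypothetical, and the sketch fixes the row set in advance rather than reproducing the two-step procedure.

```python
from itertools import combinations


def block_range(matrix, rows, cols):
    """Range (max minus min) of the submatrix selected by rows x cols."""
    values = [matrix[r][c] for r in rows for c in cols]
    return max(values) - min(values)


def column_sets_within_range(matrix, rows, threshold):
    """Level-wise (Apriori-style) search, for a FIXED row set, of every
    column set whose block has range strictly below the threshold."""
    n_cols = len(matrix[0])
    # Level 1: single columns that satisfy the range constraint.
    level = [frozenset([c]) for c in range(n_cols)
             if block_range(matrix, rows, [c]) < threshold]
    found = list(level)
    while level:
        survivors = set(level)
        size = len(level[0]) + 1
        # Join step: merge pairs of surviving sets into sets one column larger.
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        # Prune step: by monotonicity, a candidate can only qualify if every
        # subset with one column fewer qualified at the previous level.
        level = [
            cand for cand in candidates
            if all(frozenset(sub) in survivors
                   for sub in combinations(cand, size - 1))
            and block_range(matrix, rows, cand) < threshold
        ]
        found.extend(level)
    return found


# Hypothetical toy data: 3 rows x 4 columns of real values.
toy = [
    [1.0, 1.1, 5.0, 1.05],
    [1.2, 1.0, 4.8, 1.10],
    [3.0, 1.1, 5.1, 0.90],
]
blocks = column_sets_within_range(toy, rows=[0, 1], threshold=0.3)
```

On this toy matrix, column 2 satisfies the constraint on its own but fails in every pair, so no column set containing it beyond the singleton is ever generated as a candidate at higher levels. That candidate pruning is what the monotonic property provides; how rows and columns are combined in the two steps of the proposed algorithm is not specified here and is left to the body of the paper.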