GREW - A Scalable Frequent Subgraph Discovery Algorithm

Date of Submission: 
June 22, 2004
Report Number: 
04-024
Report PDF: 
Abstract: 
Existing algorithms that mine graph datasets to discover patterns corresponding to frequently occurring subgraphs can operate efficiently on graphs that are sparse, contain a large number of relatively small connected components, have vertices with low and bounded degrees, and contain well-labeled vertices and edges. However, there are a number of applications that lead to graphs that do not share these characteristics, for which these algorithms highly become unscalable. In this paper we propose a heuristic algorithm called GREW to overcome the limitations of existing complete or heuristic frequent subgraph discovery algorithms. GREW is designed to operate on a large graph and to find patterns corresponding to connected subgraphs that have a large number of vertex-disjoint embeddings. Our experimental evaluation shows that GREW is efficient, can scale to very large graphs, and find non-trivial patterns that cover large portions of the input graph and the lattice of frequent patterns.