A Graphical Interface to Clustering Algorithms and Visualizations

Date of Submission: 
May 25, 2004
Report Number: 
04-020
Abstract: 
Due to recent advances in information technology, there has been enormous growth in the amount of data generated in fields ranging from business to the sciences. This data has the potential to greatly benefit its owners by increasing their understanding of their own work. However, the growing size and complexity of data have introduced new challenges in extracting its meaning. To address these challenges, many data mining techniques have been developed. One technique in particular, clustering, has been successful in a wide range of applications. Clustering solves the general problem of identifying groups of related objects. Depending on the application, these objects may represent customers, documents, molecules, or genes. The ability to handle such diverse data in a general way has led to the popularity of clustering algorithms. In this paper, we introduce gCluto, a stand-alone clustering software package designed to make it easier to use clustering algorithms and to work with their results. gCluto offers improvements over existing tools with features such as an intuitive graphical user interface, interactive visualizations, and mechanisms for comparing multiple clustering solutions. In addition to introducing the tool, we present the underlying algorithms and design decisions of gCluto.