A visual analytics tool for interactively comparing multiple clustering results of bioinformatics data
Sehi L'Yi, Bongkyung Ko, DongHwa Shin, Young-Joon Cho, Jaeyong Lee, Bohyoung Kim, and Jinwook Seo / 2015
- Sehi L'Yi, Seoul National University, Seoul, Republic of Korea
- Bongkyung Ko, Seoul National University, Seoul, Republic of Korea
- DongHwa Shin, Seoul National University, Seoul, Republic of Korea
- Young-Joon Cho, ChunLab, Inc., Seoul, Republic of Korea
- Jaeyong Lee, Seoul National University, Seoul, Republic of Korea
- Bohyoung Kim, Hankuk University of Foreign Studies, Yongin-si, Republic of Korea
- Jinwook Seo, Seoul National University, Seoul, Republic of Korea
- Background: Though cluster analysis has become a routine analytic task for bioinformatics research, it is still arduous for researchers to assess the quality of a clustering result. To select the best clustering method and its parameters for a dataset, researchers have to run multiple clustering algorithms and compare them. However, such a comparison task with multiple clustering results is cognitively demanding and laborious.
- Results: In this paper, we present XCluSim, a visual analytics tool that enables users to interactively compare multiple clustering results based on the Visual Information Seeking Mantra. We build a taxonomy for categorizing existing techniques of clustering results visualization in terms of the Gestalt principles of grouping. Using the taxonomy, we choose the most appropriate interactive visualizations for presenting individual clustering results from different types of clustering algorithms. The efficacy of XCluSim is shown through case studies with a bioinformatician.
- Conclusions: Compared to other relevant tools, XCluSim enables users to compare multiple clustering results in a more scalable manner. Moreover, XCluSim supports diverse clustering algorithms and dedicated visualizations and interactions for different types of clustering results, allowing more effective exploration of details on demand. Through case studies with a bioinformatics researcher, we received positive feedback on the functionalities of XCluSim, including its ability to help identify stably clustered items across multiple clustering results.
- Software availability: We are currently improving the stability of XCluSim. As soon as the changes are finished, we will upload the software to this page with a user manual and test datasets. If you are interested in trying the current beta version of XCluSim (0.8.1v), please email email@example.com.
- 0.8.1v (12/26/2016): The stability of the import and export features has been improved, and these features are now enabled.
This work and its publication were partly supported by National Research Foundation of Korea (NRF) grants funded by the Korean government (MSIP) (No. NRF-2014R1A2A2A03006998) and by the Korean government (MEST) (No. 2011-0030813). BK (Kim) was supported by the Seoul National University Bundang Hospital Research Fund (No. 12-2013-017).
- Sehi L'Yi, Bongkyung Ko, DongHwa Shin, Young-Joon Cho, Jaeyong Lee, Bohyoung Kim, and Jinwook Seo, XCluSim: A visual analytics tool for interactively comparing multiple clustering results of bioinformatics data, [PDF], BMC Bioinformatics and 5th Symposium on Biological Data Visualization (BioVis '15)