A Python Library for Evaluating the Reliability of Dimensionality Reduction Embeddings
Hyeon Jeon, Aeri Cho, Jinhwa Jang, Soohyun Lee, Jake Hyun, Hyung-Kwon Ko, Jaemin Jo, and Jinwook Seo / 2023
PARTICIPANTS
- Hyeon Jeon, Seoul National University
- Aeri Cho, Seoul National University
- Jinhwa Jang, Seoul National University, Samsung Electronics
- Soohyun Lee, Seoul National University
- Jake Hyun, Seoul National University
- Hyung-Kwon Ko, KAIST
- Jaemin Jo, Sungkyunkwan University
- Jinwook Seo, Seoul National University
ABSTRACT
Dimensionality reduction (DR) techniques inherently distort the original structure of high-dimensional input data, producing imperfect low-dimensional embeddings. Diverse distortion measures have therefore been proposed to evaluate the reliability of DR embeddings. However, implementing and executing these measures in practice has so far been time-consuming and tedious. To address this issue, we present ZADU, a Python library that is easy to install and execute while also enabling comprehensive evaluation of DR embeddings through three key features. First, the library covers a wide range of distortion measures. Second, it automatically optimizes the execution of distortion measures, substantially reducing the running time required to execute multiple measures. Last, the library reports how individual points contribute to the overall distortion, facilitating detailed analysis of DR embeddings. By simulating a real-world scenario of optimizing DR embeddings, we verify that the optimized execution scheduling substantially reduces the time required to execute distortion measures. Finally, as an application of ZADU, we present another library called ZADUVis that allows users to easily create distortion visualizations that depict the extent to which each region of an embedding suffers from distortions.
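To make the notions of a distortion measure and of pointwise contributions concrete, the following is a minimal sketch in plain Python of one well-known measure, trustworthiness, together with the per-point penalties that sum up to it. It is an illustration only and does not use or reproduce ZADU's interface: the function names `knn_ranks` and `pointwise_trustworthiness` and the toy data are hypothetical, and the brute-force neighbor search is chosen for readability rather than speed.

```python
import math


def knn_ranks(points):
    """For each point i, map every other point j to its distance rank (1 = nearest).

    Ties are broken by index so that the ranking is deterministic.
    """
    n = len(points)
    ranks = []
    for i in range(n):
        order = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (math.dist(points[i], points[j]), j),
        )
        ranks.append({j: r for r, j in enumerate(order, start=1)})
    return ranks


def pointwise_trustworthiness(high, low, k):
    """Return (global trustworthiness, per-point penalties).

    A point j is a "false neighbor" of i if it is among the k nearest
    neighbors of i in the embedding but not in the original space. Each
    false neighbor adds (original-space rank - k) to the penalty of i.
    """
    n = len(high)
    if len(low) != n:
        raise ValueError("high and low must contain the same number of points")
    if not 0 < k < n / 2:
        raise ValueError("k must satisfy 0 < k < n / 2")

    hd_ranks = knn_ranks(high)
    ld_ranks = knn_ranks(low)

    penalties = []
    for i in range(n):
        penalty = sum(
            hd_ranks[i][j] - k
            for j, ld_rank in ld_ranks[i].items()
            if ld_rank <= k and hd_ranks[i][j] > k
        )
        penalties.append(penalty)

    # Standard normalization so that a perfect embedding scores 1.0.
    norm = 2.0 / (n * k * (2 * n - 3 * k - 1))
    return 1.0 - norm * sum(penalties), penalties


# Hypothetical toy data: six points on a line, and a 1-D "embedding"
# in which points 1 and 5 have swapped places.
high = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
low = [(0,), (5,), (2,), (3,), (4,), (1,)]

score, penalties = pointwise_trustworthiness(high, low, k=2)
print(score, penalties)
```

Points with a large penalty are those whose embedded neighborhoods contain many false neighbors, which is the kind of per-point information a distortion visualization can display.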
SUPPLEMENTAL MATERIALS
- to appear