PANENE

A Progressive Algorithm for Indexing and Querying Approximate k-Nearest Neighbors

Jaemin Jo, Jinwook Seo, and Jean-Daniel Fekete / 2018

responsive t-SNE

Participants

Jaemin Jo, Seoul National University, Seoul, Korea
Jinwook Seo, Seoul National University, Seoul, Korea
Jean-Daniel Fekete, Inria, France

Abstract

We present PANENE, a progressive algorithm for approximate nearest neighbor indexing and querying. Although k-nearest neighbor (KNN) libraries are common in many data analysis methods, most KNN algorithms can only be queried after the whole dataset has been indexed, i.e., they are not online. Even the few online implementations are not progressive, in the sense that the time to index incoming data is not bounded and cannot satisfy the latency requirements of progressive systems. This long latency has significantly limited the use of many machine learning methods, such as t-SNE, in interactive visual analytics. PANENE is a novel algorithm for Progressive Approximate k-NEarest NEighbors, enabling fast KNN queries while continuously indexing new batches of data. Following the progressive computation paradigm, PANENE operations can be bounded in time, allowing analysts to access running results within an interactive latency. PANENE can also incrementally build and maintain a cache data structure, a KNN lookup table, to enable constant-time lookups for KNN queries. Finally, we present three progressive applications of PANENE (regression, density estimation, and responsive t-SNE), opening up new opportunities to use complex algorithms in interactive systems.
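
The two ideas in the abstract, indexing in bounded batches and keeping a KNN lookup table up to date, can be illustrated with a toy sketch. This is not the PANENE implementation or its API: the class name `ProgressiveKNNSketch` and its methods `run` and `neighbors` are hypothetical, and an exact brute-force search stands in for PANENE's approximate index. The sketch also bounds the number of points indexed per call rather than the elapsed time, which is a simplification.

```python
import heapq
import math


class ProgressiveKNNSketch:
    """Toy illustration of progressive indexing; NOT the PANENE implementation."""

    def __init__(self, data, k):
        self.data = data      # full dataset: list of coordinate tuples
        self.k = k            # number of neighbors kept per point (k >= 1)
        self.n_indexed = 0    # points with id < n_indexed are queryable
        self.table = {}       # point id -> sorted list of (distance, neighbor id)

    def run(self, ops):
        """Index at most `ops` new points; return the number indexed so far."""
        end = min(self.n_indexed + ops, len(self.data))
        for new_id in range(self.n_indexed, end):
            # Brute-force distances to every previously indexed point
            # (assumption: stands in for an approximate index lookup).
            dists = [(math.dist(self.data[new_id], self.data[j]), j)
                     for j in range(new_id)]
            self.table[new_id] = heapq.nsmallest(self.k, dists)
            # Refresh table entries of older points that the new point improves.
            for d, j in dists:
                entry = self.table[j]
                if len(entry) < self.k or d < entry[-1][0]:
                    entry.append((d, new_id))
                    entry.sort()
                    del entry[self.k:]
        self.n_indexed = end
        return self.n_indexed

    def neighbors(self, point_id):
        """Look up the cached neighbor ids of an already indexed point."""
        return [j for _, j in self.table[point_id]]


points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
knn = ProgressiveKNNSketch(points, k=1)
knn.run(2)                    # index the first batch of two points
first_batch = knn.neighbors(1)
knn.run(2)                    # index the remaining two points
second_batch = knn.neighbors(2)
```

After the first call to `run`, the table can already be queried for the points indexed so far; after the second call, entries for earlier points have been refreshed wherever a newly indexed point is closer. A caller in an interactive system would choose `ops` small enough that each call returns within its latency budget.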

Source Code

https://github.com/e-/PANENE

Publications

Jaemin Jo, Jinwook Seo, and Jean-Daniel Fekete, PANENE: A Progressive Algorithm for Indexing and Querying Approximate k-Nearest Neighbors, IEEE Transactions on Visualization and Computer Graphics (TVCG)