Abstract
The nearest neighbor problem is that of preprocessing a set P of n data points in R^d so that, given any query point q, the closest point in P to q can be determined efficiently. In the chromatic nearest neighbor problem, each point of P is assigned a color, and the problem is to determine the color of the nearest point to the query point. More generally, given k≥1, the problem is to determine the color occurring most frequently among the k nearest neighbors. The chromatic version of the nearest neighbor problem is used in many applications in pattern recognition and learning. In this paper we present a simple algorithm for solving the chromatic k nearest neighbor problem. We provide a query sensitive analysis, which shows that if the color classes form spatially well separated clusters (as often happens in practice), then queries can be answered quite efficiently. We also allow the user to specify an error bound ε≥0, and consider the same problem in the context of approximate nearest neighbor searching. We present empirical evidence that for well clustered data sets, this approach leads to significant improvements in efficiency.
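
To make the problem statement concrete, the following is a minimal brute-force sketch of a chromatic k nearest neighbor query. It is not the algorithm presented in the paper, which the keywords indicate relies on BBD trees and branch-and-bound search; it only illustrates the definition. The function name `chromatic_knn`, the use of Euclidean distance, and the tie-breaking rule are assumptions made for this illustration.

```python
from collections import Counter
import heapq
import math


def chromatic_knn(points, colors, q, k=1):
    """Return the most frequent color among the k points nearest to q.

    Brute-force illustration only: every point is examined on each query.
    Assumes Euclidean distance; ties in color count go to the color whose
    first occurrence is nearest to q (an arbitrary choice for this sketch).
    """
    if len(points) != len(colors):
        raise ValueError("points and colors must have the same length")
    if not 1 <= k <= len(points):
        raise ValueError("k must satisfy 1 <= k <= len(points)")

    # Indices of the k nearest points, ordered by increasing distance to q.
    nearest = heapq.nsmallest(
        k, range(len(points)), key=lambda i: math.dist(points[i], q)
    )

    # Count colors among those k neighbors and report the most frequent one.
    counts = Counter(colors[i] for i in nearest)
    return counts.most_common(1)[0][0]
```

With k = 1 this reduces to reporting the color of the single nearest point. Each query costs time linear in n, which is the cost a preprocessed search structure is meant to avoid.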
Original language | English |
---|---|
Pages (from-to) | 97-119 |
Number of pages | 23 |
Journal | Computational Geometry: Theory and Applications |
Volume | 17 |
Issue number | 3-4 |
State | Published - Dec 2000 |
Bibliographical note
Funding Information: A preliminary version of this paper appeared in the Proceedings of the 7th Canadian Conference on Computational Geometry, 1995, pp. 261–266. Corresponding author: D.M. Mount (e-mail: mount@cs.umd.edu). The support of the National Science Foundation under grant CCR-9310705 is gratefully acknowledged. This research was carried out while the author was also affiliated with the Center of Excellence in Space Data and Information Sciences at NASA Goddard Space Flight Center, Greenbelt, MD 20771, USA.
Keywords
- BBD trees
- Branch-and-bound search
- Chromatic nearest neighbors
- Classification algorithms
- Multidimensional searching
- Pattern recognition
- Query sensitive analysis