Indexing in large scale image collections: Scaling properties and benchmark
Indexing a large collection of images quickly and accurately has become an important problem with many applications. Given a query image, the goal is to retrieve the matching images in the collection. We compare the structure and properties of seven methods, some of them new, based on the two leading approaches: voting from matches of local descriptors and matching histograms of visual words. We derive theoretical estimates of how memory and computational cost scale with the number of images in the database, and we evaluate these properties empirically on four real-world datasets with different statistics. We discuss the pros and cons of the different methods and suggest promising directions for future research.