Speaker: Harsha Simhadri
Location: Soda 430-438, Woz Lounge
Date: May 12, 2023
Time: 12-1pm PST
Title: DiskANN: A Library for Web-Scale Vector Search Systems
Web-scale search, recommendation and generative AI scenarios increasingly use Vector search or Approximate Nearest Neighbor Search (ANNS) indices to retrieve semantically similar objects based on the distance of their learnt representations in a geometric space. Since these scenarios often span billions or trillions of objects, efficient and scalable ANNS algorithms are critical to making these systems practical. However, most algorithms studied in literature either focus on small million-scale datasets or lack features critical for practical deployments, e.g., external memory indices or support for real-time updates.
In this talk, we present the DiskANN library which is widely deployed at Microsoft at costs order(s) of magnitude lower than existing products. This includes the first external memory ANNS algorithm that can index a billion points –an order of magnitude larger than previous work — on a commodity machine with <64GB DRAM and serve queries at latencies of a few milliseconds. The library allows real-time updates and its in-memory performance compares well with other state of the art indices. The library also supports features critical to deployment such as hybrid queries that combine vector search and hard matches such as language or author. DiskANN and its adaptations proved to be the state of the art on both standard and specialized hardware in the Billion-scale ANNS challenge at NeurIPS’21.
Finally, we will highlight some open problems in building on DiskANN to develop a “Vector Database”. A few of these include support for replication, crash recovery and serializability.
Joint work with Ravishankar Krishnaswamy, Suhas J Subramanya, Aditi Singh, Rohan Kadekodi, Devvrit, Shikhar Jaiswal, Magdalen Dobson, Siddharth Gollapudi, Neel Karia, Varun Sivasankaran.
Harsha Simhadri is a Principal Researcher at Microsoft Research. He enjoys developing new algorithms to build extreme scale practical systems. Recent examples include algorithms for web-scale nearest-neighbor search deployed widely in Microsoft, and new ML operators and architectures for tiny IoT and edge devices. He previously worked on parallel algorithms and run-times with provable guarantees for multi-core processors for his PhD thesis at Carnegie Mellon University.