Vol. III · Issue 06 · May 2026 · ISSN 2814-9921
Vector Search At Scale
PDF · EPUB · Lifetime updates
Machine Learning Systems · 1st Edition · January 2026

ANN indexes, recall budgets, and the database under your RAG

4.7 (134 ratings)
intermediate
296 pages

RAG demos are easy. RAG that serves 10k QPS without melting your bill is engineering. This manual covers HNSW vs IVF vs DiskANN tradeoffs, recall budgets, hybrid search with BM25, sharding strategies that survive index rebuilds, and the failure modes specific to filtered ANN queries. Includes benchmarks across Qdrant, Weaviate, pgvector, and a from-scratch HNSW implementation.

Dr. Priya Anand
Author
ML Platform Engineer

Priya runs the kind of ML platforms where a 200ms regression costs more than your annual cloud bill. Her writing focuses on the boring infrastructure that makes models actually serve traffic.

$19.99
Instant PDF + EPUB delivery
DRM-free, copy onto any device
Free chapter updates for the life of the edition
Specifications
Pages: 296
Edition: 1st Edition
Language: English
Level: intermediate
ISBN: 978-1-99999-010-4
Published: January 2026
Editorial review

Reviewed by three working engineers at peer publications before publication. We do not publish first drafts.

Table of contents

What you'll find inside.

  1. The ANN Landscape
  2. HNSW From Scratch
  3. IVF and PQ Tradeoffs
  4. DiskANN for Bigger Indexes
  5. Recall vs Latency Curves
  6. Hybrid Search That Actually Helps
  7. Sharding and Rebuilds
  8. Filtered ANN: The Hard Problem