Filtered Approximate Nearest Neighbor Search in Vector Databases: System Design and Performance Analysis
- URL: http://arxiv.org/abs/2602.11443v1
- Date: Wed, 11 Feb 2026 23:40:26 GMT
- Title: Filtered Approximate Nearest Neighbor Search in Vector Databases: System Design and Performance Analysis
- Authors: Abylay Amanbayev, Brian Tsan, Tri Dang, Florin Rusu,
- Abstract summary: Filtered Approximate Nearest Neighbor Search (FANNS) is used to combine semantic retrieval with metadata constraints.<n>In this work, we systematize the taxonomy of filtering strategies and evaluate their integration into vectors.<n>Our experiments reveal that algorithmic adaptations within the engine often override raw index performance.
- Score: 0.5249805590164902
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retrieval-Augmented Generation (RAG) applications increasingly rely on Filtered Approximate Nearest Neighbor Search (FANNS) to combine semantic retrieval with metadata constraints. While algorithmic innovations for FANNS have been proposed, there remains a lack of understanding regarding how generic filtering strategies perform within Vector Databases. In this work, we systematize the taxonomy of filtering strategies and evaluate their integration into FAISS, Milvus, and pgvector. To provide a robust benchmarking framework, we introduce a new relational dataset, \textit{MoReVec}, consisting of two tables, featuring 768-dimensional text embeddings and a rich schema of metadata attributes. We further propose the \textit{Global-Local Selectivity (GLS)} correlation metric to quantify the relationship between filters and query vectors. Our experiments reveal that algorithmic adaptations within the engine often override raw index performance. Specifically, we find that: (1) \textit{Milvus} achieves superior recall stability through hybrid approximate/exact execution; (2) \textit{pgvector}'s cost-based query optimizer frequently selects suboptimal execution plans, favoring approximate index scans even when exact sequential scans would yield perfect recall at comparable latency; and (3) partition-based indexes (IVFFlat) outperform graph-based indexes (HNSW) for low-selectivity queries. To facilitate this analysis, we extend the widely-used \textit{ANN-Benchmarks} to support filtered vector search and make it available online. Finally, we synthesize our findings into a set of practical guidelines for selecting index types and configuring query optimizers for hybrid search workloads.
Related papers
- JAG: Joint Attribute Graphs for Filtered Nearest Neighbor Search [8.398505388767248]
JAG (Joint Attribute Graphs) is a graph-based algorithm designed to deliver robust performance across the entire selectivity spectrum.<n>Our experimental results demonstrate that JAG significantly outperforms existing state-of-the-art baselines in both throughput and recall robustness.
arXiv Detail & Related papers (2026-02-10T20:00:51Z) - Curator: Efficient Vector Search with Low-Selectivity Filters [12.774238654446032]
Graph-based indexes achieve state-of-the-art performance on unfiltered ANNS but encounter connectivity breakdown on low-selectivity filtered queries.<n>Recent research proposes specialized graph indexes that address this issue by expanding graph degree.<n>We present Curator, a partition-based index that complements existing graph-based approaches for low-selectivity filtered ANNS.
arXiv Detail & Related papers (2026-01-03T21:35:01Z) - Rethinking On-policy Optimization for Query Augmentation [49.87723664806526]
We present the first systematic comparison of prompting-based and RL-based query augmentation across diverse benchmarks.<n>We introduce a novel hybrid method, On-policy Pseudo-document Query Expansion (OPQE), which learns to generate a pseudo-document that maximizes retrieval performance.
arXiv Detail & Related papers (2025-10-20T04:16:28Z) - Benchmarking Filtered Approximate Nearest Neighbor Search Algorithms on Transformer-based Embedding Vectors [18.796661826646616]
Filtered Approximate Nearest Neighbor Search is a problem known as Filtered Approximate Nearest Neighbor Search (FANNS)<n>We present a comprehensive survey and taxonomy of FANNS methods and analyze how they are benchmarked in the literature.<n>We introduce a novel dataset consisting of embedding vectors for the abstracts of over 2.7 million research articles from the arXiv repository, accompanied by 11 real-world attributes such as authors and categories.
arXiv Detail & Related papers (2025-07-29T16:39:54Z) - NaviX: A Native Vector Index Design for Graph DBMSs With Robust Predicate-Agnostic Search Performance [7.108581652658526]
We present NaviX, a native vector index for graphs (GDBMSs)<n> NaviX is built on the Hierarchical Navigable Small-World (HNSW) graph, which itself is a graph-based structure.
arXiv Detail & Related papers (2025-06-29T21:16:07Z) - Billion-scale Similarity Search Using a Hybrid Indexing Approach with Advanced Filtering [49.1574468325115]
This paper presents a novel approach for similarity search with complex filtering capabilities on billion-scale datasets, optimized for CPU inference.<n>Our method extends the classical IVF-Flat index structure to integrate multi-dimensional filters.<n>The proposed algorithm combines dense embeddings with discrete filtering attributes, enabling fast retrieval in high-dimensional spaces.
arXiv Detail & Related papers (2025-01-23T07:47:00Z) - Operational Advice for Dense and Sparse Retrievers: HNSW, Flat, or Inverted Indexes? [62.57689536630933]
We provide experimental results on the BEIR dataset using the open-source Lucene search library.
Our results provide guidance for today's search practitioner in understanding the design space of dense and sparse retrievers.
arXiv Detail & Related papers (2024-09-10T12:46:23Z) - Efficient Data Access Paths for Mixed Vector-Relational Search [8.80592433569832]
Machine learning and the adoption of data processing methods using vector embeddings sparked a great interest in creating systems for vector data management.
While the predominant approach of vector data management is to use specialized index structures for fast search over the entirety of the vector embeddings, once combined with other (meta)data, the search queries can also become selective on relational attributes.
As using vector indexes differs from traditional relational data access, we revisit and analyze alternative access paths for efficient mixed vector-relational search.
arXiv Detail & Related papers (2024-03-23T11:34:17Z) - JoinGym: An Efficient Query Optimization Environment for Reinforcement
Learning [58.71541261221863]
Join order selection (JOS) is the problem of ordering join operations to minimize total query execution cost.
We present JoinGym, a query optimization environment for bushy reinforcement learning (RL)
Under the hood, JoinGym simulates a query plan's cost by looking up intermediate result cardinalities from a pre-computed dataset.
arXiv Detail & Related papers (2023-07-21T17:00:06Z) - Efficient Non-Parametric Optimizer Search for Diverse Tasks [93.64739408827604]
We present the first efficient scalable and general framework that can directly search on the tasks of interest.
Inspired by the innate tree structure of the underlying math expressions, we re-arrange the spaces into a super-tree.
We adopt an adaptation of the Monte Carlo method to tree search, equipped with rejection sampling and equivalent- form detection.
arXiv Detail & Related papers (2022-09-27T17:51:31Z) - Reinforcement Learning Based Query Vertex Ordering Model for Subgraph
Matching [58.39970828272366]
Subgraph matching algorithms enumerate all is embeddings of a query graph in a data graph G.
matching order plays a critical role in time efficiency of these backtracking based subgraph matching algorithms.
In this paper, for the first time we apply the Reinforcement Learning (RL) and Graph Neural Networks (GNNs) techniques to generate the high-quality matching order for subgraph matching algorithms.
arXiv Detail & Related papers (2022-01-25T00:10:03Z) - The Case for Learned Spatial Indexes [62.88514422115702]
We use techniques proposed from a state-of-the art learned multi-dimensional index structure (namely, Flood) to answer spatial range queries.
We show that (i) machine learned search within a partition is faster by 11.79% to 39.51% than binary search when using filtering on one dimension.
We also refine using machine learned indexes is 1.23x to 1.83x times faster than closest competitor which filters on two dimensions.
arXiv Detail & Related papers (2020-08-24T12:09:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.