Compass: General Filtered Search across Vector and Structured Data
- URL: http://arxiv.org/abs/2510.27141v2
- Date: Sat, 08 Nov 2025 03:29:25 GMT
- Title: Compass: General Filtered Search across Vector and Structured Data
- Authors: Chunxiao Ye, Xiao Yan, Eric Lo,
- Abstract summary: This work introduces textsc, a unified framework that enables general filtered search across vector and structured data.<n> Compass maintains generality by allowing arbitrary conjunctions, disjunctions, and range predicates.<n> Compass consistently outperforms NaviX, the only existing performant general framework.
- Score: 4.006306763185373
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The increasing prevalence of hybrid vector and relational data necessitates efficient, general support for queries that combine high-dimensional vector search with complex relational filtering. However, existing filtered search solutions are fundamentally limited by specialized indices, which restrict arbitrary filtering and hinder integration with general-purpose DBMSs. This work introduces \textsc{Compass}, a unified framework that enables general filtered search across vector and structured data without relying on new index designs. Compass leverages established index structures -- such as HNSW and IVF for vector attributes, and B+-trees for relational attributes -- implementing a principled cooperative query execution strategy that coordinates candidate generation and predicate evaluation across modalities. Uniquely, Compass maintains generality by allowing arbitrary conjunctions, disjunctions, and range predicates, while ensuring robustness even with highly-selective or multi-attribute filters. Comprehensive empirical evaluations demonstrate that Compass consistently outperforms NaviX, the only existing performant general framework, across diverse hybrid query workloads. It also matches the query throughput of specialized single-attribute indices in their favorite settings with only a single attribute involved, all while maintaining full generality and DBMS compatibility. Overall, Compass offers a practical and robust solution for achieving truly general filtered search in vector database systems.
Related papers
- Filtered Approximate Nearest Neighbor Search in Vector Databases: System Design and Performance Analysis [0.5249805590164902]
Filtered Approximate Nearest Neighbor Search (FANNS) is used to combine semantic retrieval with metadata constraints.<n>In this work, we systematize the taxonomy of filtering strategies and evaluate their integration into vectors.<n>Our experiments reveal that algorithmic adaptations within the engine often override raw index performance.
arXiv Detail & Related papers (2026-02-11T23:40:26Z) - The Hybrid Multimodal Graph Index (HMGI): A Comprehensive Framework for Integrated Relational and Vector Search [6.821769033209393]
This paper introduces the Hybrid Multimodal Graph Index (HMGI), a novel framework designed to bridge the gap between vector databases and graph databases.<n>By integrating semantic similarity search directly with relational context, HMGI aims to outperform pure vector databases in complex, relational-heavy query scenarios.
arXiv Detail & Related papers (2025-10-11T09:06:26Z) - HyST: LLM-Powered Hybrid Retrieval over Semi-Structured Tabular Data [0.4779196219827507]
HyST (Hybrid retrieval over Semi-structured Tabular data) is a hybrid retrieval framework that combines structured filtering with semantic embedding search.<n>We show that HyST consistently outperforms tradtional baselines on a semi-structured benchmark.
arXiv Detail & Related papers (2025-08-25T14:06:27Z) - Chain of Retrieval: Multi-Aspect Iterative Search Expansion and Post-Order Search Aggregation for Full Paper Retrieval [68.71038700559195]
Chain of Retrieval(COR) is a novel iterative framework for full-paper retrieval.<n>We present SCIBENCH, a benchmark providing both complete and segmented contexts of full papers for queries and candidates.
arXiv Detail & Related papers (2025-07-14T08:41:53Z) - NaviX: A Native Vector Index Design for Graph DBMSs With Robust Predicate-Agnostic Search Performance [7.108581652658526]
We present NaviX, a native vector index for graphs (GDBMSs)<n> NaviX is built on the Hierarchical Navigable Small-World (HNSW) graph, which itself is a graph-based structure.
arXiv Detail & Related papers (2025-06-29T21:16:07Z) - ReTreever: Tree-based Coarse-to-Fine Representations for Retrieval [64.44265315244579]
We propose a tree-based method for organizing and representing reference documents at various granular levels.<n>Our method, called ReTreever, jointly learns a routing function per internal node of a binary tree such that query and reference documents are assigned to similar tree branches.<n>Our evaluations show that ReTreever generally preserves full representation accuracy.
arXiv Detail & Related papers (2025-02-11T21:35:13Z) - CART: A Generative Cross-Modal Retrieval Framework with Coarse-To-Fine Semantic Modeling [53.97609687516371]
Cross-modal retrieval aims to search for instances, which are semantically related to the query through the interaction of different modal data.<n>Traditional solutions utilize a single-tower or dual-tower framework to explicitly compute the score between queries and candidates.<n>We propose a generative cross-modal retrieval framework (CART) based on coarse-to-fine semantic modeling.
arXiv Detail & Related papers (2024-06-25T12:47:04Z) - UQE: A Query Engine for Unstructured Databases [71.49289088592842]
We investigate the potential of Large Language Models to enable unstructured data analytics.
We propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections.
arXiv Detail & Related papers (2024-06-23T06:58:55Z) - Generative Retrieval as Multi-Vector Dense Retrieval [71.75503049199897]
Generative retrieval generates identifiers of relevant documents in an end-to-end manner.
Prior work has demonstrated that generative retrieval with atomic identifiers is equivalent to single-vector dense retrieval.
We show that generative retrieval and multi-vector dense retrieval share the same framework for measuring the relevance to a query of a document.
arXiv Detail & Related papers (2024-03-31T13:29:43Z) - Efficient Data Access Paths for Mixed Vector-Relational Search [8.80592433569832]
Machine learning and the adoption of data processing methods using vector embeddings sparked a great interest in creating systems for vector data management.
While the predominant approach of vector data management is to use specialized index structures for fast search over the entirety of the vector embeddings, once combined with other (meta)data, the search queries can also become selective on relational attributes.
As using vector indexes differs from traditional relational data access, we revisit and analyze alternative access paths for efficient mixed vector-relational search.
arXiv Detail & Related papers (2024-03-23T11:34:17Z) - Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity [59.57065228857247]
Retrieval-augmented Large Language Models (LLMs) have emerged as a promising approach to enhancing response accuracy in several tasks, such as Question-Answering (QA)
We propose a novel adaptive QA framework, that can dynamically select the most suitable strategy for (retrieval-augmented) LLMs based on the query complexity.
We validate our model on a set of open-domain QA datasets, covering multiple query complexities, and show that ours enhances the overall efficiency and accuracy of QA systems.
arXiv Detail & Related papers (2024-03-21T13:52:30Z) - Navigable Proximity Graph-Driven Native Hybrid Queries with Structured
and Unstructured Constraints [10.842138336245384]
We propose a native hybrid query (NHQ) framework based on proximity graph (PG), which provides the specialized textitcomposite index and joint pruning modules for hybrid queries.
We present two novel navigable PGs with optimized edge selection and routing strategies, which obtain better overall performance than existing PGs.
arXiv Detail & Related papers (2022-03-25T12:02:37Z) - The Stereotyping Problem in Collaboratively Filtered Recommender Systems [77.56225819389773]
We show that matrix factorization-based collaborative filtering algorithms induce a kind of stereotyping.
If preferences for a textitset of items are anti-correlated in the general user population, then those items may not be recommended together to a user.
We propose an alternative modelling fix, which is designed to capture the diverse multiple interests of each user.
arXiv Detail & Related papers (2021-06-23T18:37:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.