Vextra: A Unified Middleware Abstraction for Heterogeneous Vector Database Systems
- URL: http://arxiv.org/abs/2601.06727v1
- Date: Sun, 11 Jan 2026 00:35:35 GMT
- Title: Vextra: A Unified Middleware Abstraction for Heterogeneous Vector Database Systems
- Authors: Chandan Suri, Gursifath Bhasin
- Abstract summary: This paper introduces Vextra, a novel abstraction layer designed to address API fragmentation. Vextra presents a unified, high-level API for core database operations, including data upsertion, similarity search, and metadata filtering. It employs a pluggable adapter architecture to translate these unified API calls into the native protocols of various backend databases.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rapid integration of vector search into AI applications, particularly for Retrieval Augmented Generation (RAG), has catalyzed the emergence of a diverse ecosystem of specialized vector databases. While this innovation offers a rich choice of features and performance characteristics, it has simultaneously introduced a significant challenge: severe API fragmentation. Developers face a landscape of disparate, proprietary, and often volatile API contracts, which hinders application portability, increases maintenance overhead, and leads to vendor lock-in. This paper introduces Vextra, a novel middleware abstraction layer designed to address this fragmentation. Vextra presents a unified, high-level API for core database operations, including data upsertion, similarity search, and metadata filtering. It employs a pluggable adapter architecture to translate these unified API calls into the native protocols of various backend databases. We argue that such an abstraction layer is a critical step towards maturing the vector database ecosystem, fostering interoperability, and enabling higher-level query optimization, while imposing minimal performance overhead.
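The pluggable adapter architecture described in the abstract can be illustrated with a minimal sketch. This is not Vextra's actual API: the names `Record`, `VectorStoreAdapter`, `InMemoryAdapter`, and `UnifiedClient` are hypothetical, and the in-memory backend with brute-force cosine similarity merely stands in for a real vector database adapter.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from math import sqrt
from typing import Any, Dict, List, Optional, Sequence, Tuple


@dataclass
class Record:
    """One vector with an id and optional metadata (hypothetical schema)."""
    id: str
    vector: Sequence[float]
    metadata: Dict[str, Any] = field(default_factory=dict)


class VectorStoreAdapter(ABC):
    """Interface each backend adapter would implement (assumed, not from the paper)."""

    @abstractmethod
    def upsert(self, records: List[Record]) -> None: ...

    @abstractmethod
    def search(self, query: Sequence[float], top_k: int,
               where: Optional[Dict[str, Any]] = None) -> List[Tuple[str, float]]: ...


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryAdapter(VectorStoreAdapter):
    """Toy backend: a dict plus brute-force search with equality filters."""

    def __init__(self) -> None:
        self._rows: Dict[str, Record] = {}

    def upsert(self, records: List[Record]) -> None:
        for record in records:
            self._rows[record.id] = record  # insert, or overwrite an existing id

    def search(self, query, top_k, where=None):
        where = where or {}
        hits = [
            (r.id, cosine(query, r.vector))
            for r in self._rows.values()
            if all(r.metadata.get(k) == v for k, v in where.items())
        ]
        hits.sort(key=lambda hit: hit[1], reverse=True)
        return hits[:top_k]


class UnifiedClient:
    """Application-facing API; delegates every call to the configured adapter."""

    def __init__(self, adapter: VectorStoreAdapter) -> None:
        self._adapter = adapter

    def upsert(self, records: List[Record]) -> None:
        self._adapter.upsert(records)

    def search(self, query, top_k=5, where=None):
        return self._adapter.search(query, top_k, where)


client = UnifiedClient(InMemoryAdapter())
client.upsert([
    Record("a", [1.0, 0.0], {"lang": "en"}),
    Record("b", [0.0, 1.0], {"lang": "en"}),
    Record("c", [1.0, 0.1], {"lang": "de"}),
])
results = client.search([1.0, 0.0], top_k=2, where={"lang": "en"})
print(results)
```

Under this design, switching backends would mean passing a different adapter to `UnifiedClient`; the application code that calls `upsert` and `search` stays unchanged, which is the portability benefit the abstract argues for.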
Related papers
- UniPAR: A Unified Framework for Pedestrian Attribute Recognition [14.613498516126498]
We propose UniPAR, a unified Transformer-based framework for Pedestrian Attribute Recognition.
By incorporating a unified data scheduling strategy and a dynamic classification head, UniPAR enables a single model to simultaneously process diverse datasets.
Experimental results on the widely used benchmark datasets, including MSP60K, DukeMTMC, and EventPAR, demonstrate that UniPAR achieves performance comparable to specialized SOTA methods.
arXiv Detail & Related papers (2026-03-05T12:34:35Z)
- Stacked from One: Multi-Scale Self-Injection for Context Window Extension [69.24689919827817]
The paper proposes a novel framework based on multi-grained context compression and query-aware information acquisition.
It achieves performance superior or comparable to strong baselines.
arXiv Detail & Related papers (2026-03-05T03:16:16Z)
- Architecture-Aware Multi-Design Generation for Repository-Level Feature Addition [53.50448142467294]
RAIM is a multi-design and architecture-aware framework for repository-level feature addition.
It shifts away from linear patching by generating multiple diverse implementation designs.
Experiments on the NoCode-bench Verified dataset demonstrate that RAIM establishes a new state-of-the-art performance.
arXiv Detail & Related papers (2026-03-02T12:50:40Z)
- AgentSkiller: Scaling Generalist Agent Intelligence through Semantically Integrated Cross-Domain Data Synthesis [30.512393568258105]
Large Language Model agents demonstrate potential in solving real-world problems via tools, yet generalist intelligence is bottlenecked by scarce high-quality, long-horizon data.
We propose AgentSkiller, a fully automated framework synthesizing multi-turn interaction data across realistic, semantically linked domains.
arXiv Detail & Related papers (2026-02-10T03:21:42Z)
- Refer-Agent: A Collaborative Multi-Agent System with Reasoning and Reflection for Referring Video Object Segmentation [50.22481337087162]
Referring Video Object Segmentation (RVOS) aims to segment objects in videos based on textual queries.
Refer-Agent is a collaborative multi-agent system with alternating reasoning-reflection mechanisms.
arXiv Detail & Related papers (2026-02-03T14:48:12Z)
- SPAR: Session-based Pipeline for Adaptive Retrieval on Legacy File Systems [6.5637131627375505]
SPAR (Session-based Pipeline for Adaptive Retrieval) is a conceptual framework that integrates Large Language Models into a Retrieval-Augmented Generation (RAG) architecture specifically designed for legacy enterprise environments.
Unlike conventional RAG pipelines, SPAR employs a lightweight two-stage process: a semantic Metadata Index is first created, after which session-specific vector databases are dynamically generated on demand.
This design reduces computational overhead while improving transparency, controllability, and relevance in retrieval.
arXiv Detail & Related papers (2025-12-15T02:54:10Z)
- RAGdb: A Zero-Dependency, Embeddable Architecture for Multimodal Retrieval-Augmented Generation on the Edge [0.0]
Retrieval-Augmented Generation (RAG) has established itself as the standard paradigm for grounding Large Language Models (LLMs) in domain-specific, up-to-date data.
RAGdb consolidates automated multimodal ingestion, ONNX-based extraction, and hybrid vector retrieval into a single, portable container.
The system reduces disk footprint by approximately 99.5% compared to standard Docker-based RAG stacks.
arXiv Detail & Related papers (2025-12-09T15:12:13Z)
- Towards Hyper-Efficient RAG Systems in VecDBs: Distributed Parallel Multi-Resolution Vector Search [5.216774377033164]
We propose Semantic Pyramid Indexing (SPI), a novel multi-resolution vector indexing framework for RAG in VecDBs.
Unlike existing hierarchical methods that require offline tuning or separate model training, SPI constructs a semantic pyramid over document embeddings and dynamically selects the optimal resolution level per query.
We implement SPI as a plugin for both FAISS and Qdrant backends and evaluate it across multiple RAG tasks including MS MARCO, Natural Questions, and multimodal retrieval benchmarks.
arXiv Detail & Related papers (2025-11-12T09:31:08Z)
- The Hybrid Multimodal Graph Index (HMGI): A Comprehensive Framework for Integrated Relational and Vector Search [6.821769033209393]
This paper introduces the Hybrid Multimodal Graph Index (HMGI), a novel framework designed to bridge the gap between vector databases and graph databases.
By integrating semantic similarity search directly with relational context, HMGI aims to outperform pure vector databases in complex, relational-heavy query scenarios.
arXiv Detail & Related papers (2025-10-11T09:06:26Z)
- HetaRAG: Hybrid Deep Retrieval-Augmented Generation across Heterogeneous Data Stores [33.795387138571286]
HetaRAG is a hybrid, deep-retrieval augmented generation framework that orchestrates cross-modal evidence from heterogeneous data stores.
HetaRAG unifies vector indices, knowledge graphs, full-text engines, and structured databases into a single retrieval plane.
arXiv Detail & Related papers (2025-09-12T06:12:59Z)
- Data-Juicer 2.0: Cloud-Scale Adaptive Data Processing for and with Foundation Models [83.65386456026441]
Data-Juicer 2.0 is a data processing system backed by 100+ data processing operators spanning text, image, video, and audio modalities.
It supports more critical tasks including data analysis, synthesis, annotation, and foundation model post-training.
The system is publicly available and has been widely adopted in diverse research fields and real-world products such as Alibaba Cloud PAI.
arXiv Detail & Related papers (2024-12-23T08:29:57Z)
- A Collaborative Multi-Agent Approach to Retrieval-Augmented Generation Across Diverse Data [0.0]
Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs).
Traditional RAG systems typically use a single-agent architecture to handle query generation, data retrieval, and response synthesis.
This paper proposes a multi-agent RAG system to address these limitations.
arXiv Detail & Related papers (2024-12-08T07:18:19Z)
- FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking [57.53742155914176]
API call generation is the cornerstone of large language models' tool-using ability.
Existing supervised and in-context learning approaches suffer from high training costs, poor data efficiency, and generated API calls that can be unfaithful to the API documentation and the user's request.
We propose an output-side optimization approach called FANTASE to address these limitations.
arXiv Detail & Related papers (2024-07-18T23:44:02Z)
- A Refreshed Similarity-based Upsampler for Direct High-Ratio Feature Upsampling [54.05517338122698]
A popular similarity-based feature upsampling pipeline has been proposed, which utilizes a high-resolution feature as guidance.
We propose an explicitly controllable query-key feature alignment from both semantic-aware and detail-aware perspectives.
We develop a fine-grained neighbor selection strategy on HR features, which is simple yet effective for alleviating mosaic artifacts.
arXiv Detail & Related papers (2024-07-02T14:12:21Z)
- DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries [60.09774333024783]
We introduce Dynamic Anchor Queries (DAQ) to shorten the transition gap between the anchor and target queries.
We also introduce a query-level object Emergence and Disappearance Simulation (EDS) strategy, which unleashes DAQ's potential without any additional cost.
Experiments demonstrate that DVIS-DAQ achieves a new state-of-the-art (SOTA) performance on five mainstream video segmentation benchmarks.
arXiv Detail & Related papers (2024-03-29T17:58:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.