DGAI: Decoupled On-Disk Graph-Based ANN Index for Efficient Updates and Queries
- URL: http://arxiv.org/abs/2510.25401v1
- Date: Wed, 29 Oct 2025 11:20:10 GMT
- Title: DGAI: Decoupled On-Disk Graph-Based ANN Index for Efficient Updates and Queries
- Authors: Jiahao Lou, Quan Yu, Shufeng Gong, Song Yu, Yanfeng Zhang, Ge Yu,
- Abstract summary: On-disk graph-based indexes are widely used in approximate nearest neighbor (ANN) search systems for large-scale, high-dimensional vectors. Traditional coupled storage methods, which store vectors within the index, are inefficient for index updates. We propose a decoupled storage architecture and compensate for the query-performance loss that decoupling alone introduces.
- Score: 14.147837695174722
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: On-disk graph-based indexes are widely used in approximate nearest neighbor (ANN) search systems for large-scale, high-dimensional vectors. However, traditional coupled storage methods, which store vectors within the index, are inefficient for index updates: coupled storage incurs excessive redundant vector reads and writes when updating the graph topology, leading to significant invalid I/O. To address this issue, we propose a decoupled storage architecture. However, decoupling on its own reduces query performance. To overcome this limitation, we design two tailored strategies: (i) a three-stage query mechanism that leverages multiple PQ-compressed vectors to filter invalid I/O and computations, and (ii) an incremental page-level topological reordering strategy that inserts new nodes into pages containing their most similar neighbors to mitigate read amplification. Together, these techniques substantially reduce both I/O and computational overhead during ANN search. Experimental results show that the decoupled architecture improves update speed by 10.05x for insertions and 6.89x for deletions, while the three-stage query and incremental reordering improve query efficiency by 2.66x over the traditional coupled architecture.
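The abstract's core filtering idea (cheap PQ-approximate distances prune most candidates, so exact vectors are read from disk only for a small shortlist) can be sketched as a toy in pure Python. This is an illustrative sketch, not DGAI's actual implementation: all names (`pq_encode`, `adc_distance`, `three_stage_search`), the "training" by random centroid sampling, and the tiny codebook sizes are assumptions made for brevity.

```python
# Hedged sketch of PQ-based candidate filtering before exact reranking.
# Hypothetical names and toy parameters; real systems train codebooks
# with k-means and store exact vectors on disk.
import random

random.seed(0)

DIM, SUBS = 8, 4        # vector dimension, number of PQ subspaces
SUBDIM = DIM // SUBS    # dimensions per subspace
K = 4                   # centroids per subspace (tiny, for the sketch)

def _subvectors(v):
    return [v[i * SUBDIM:(i + 1) * SUBDIM] for i in range(SUBS)]

def train_codebooks(data):
    # Toy "training": sample K subvectors per subspace as centroids.
    books = []
    for s in range(SUBS):
        subs = [_subvectors(v)[s] for v in data]
        books.append(random.sample(subs, K))
    return books

def pq_encode(v, books):
    # Code = index of the nearest centroid in each subspace.
    code = []
    for s, sub in enumerate(_subvectors(v)):
        dists = [sum((a - b) ** 2 for a, b in zip(sub, c)) for c in books[s]]
        code.append(min(range(K), key=dists.__getitem__))
    return tuple(code)

def adc_distance(query, code, books):
    # Asymmetric distance: raw query subvector vs. stored centroid.
    total = 0.0
    for s, sub in enumerate(_subvectors(query)):
        c = books[s][code[s]]
        total += sum((a - b) ** 2 for a, b in zip(sub, c))
    return total

def three_stage_search(query, codes, vectors, books, shortlist=6, k=2):
    # Stages 1-2: rank all points by the cheap PQ distance; keep a shortlist.
    approx = sorted(range(len(codes)),
                    key=lambda i: adc_distance(query, codes[i], books))
    cand = approx[:shortlist]
    # Stage 3: fetch exact vectors only for the shortlist (this is the
    # disk I/O the PQ filter avoids for every pruned candidate).
    exact = sorted(cand, key=lambda i: sum(
        (a - b) ** 2 for a, b in zip(query, vectors[i])))
    return exact[:k]
```

The design point the sketch illustrates is that `adc_distance` needs only the compact codes (which fit in memory), so the expensive exact-distance pass in stage 3 touches a bounded number of on-disk vectors per query.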
Related papers
- Multiple Index Merge for Approximate Nearest Neighbor Search [14.386466486046814]
This paper focuses on efficient two-index merging and the merge order of multiple indexes for AKNN search. We propose a reverse neighbor sliding merge (RNSM) that exploits structural information to boost merging efficiency. Experiments show that our approach yields up to a 5.48× speedup over existing index merge methods and a 9.92× speedup over index reconstruction.
arXiv Detail & Related papers (2026-02-19T05:50:34Z) - Curator: Efficient Vector Search with Low-Selectivity Filters [12.774238654446032]
Graph-based indexes achieve state-of-the-art performance on unfiltered ANNS but encounter connectivity breakdown on low-selectivity filtered queries. Recent research proposes specialized graph indexes that address this issue by expanding graph degree. We present Curator, a partition-based index that complements existing graph-based approaches for low-selectivity filtered ANNS.
arXiv Detail & Related papers (2026-01-03T21:35:01Z) - B+ANN: A Fast Billion-Scale Disk-based Nearest-Neighbor Index [3.4720326275852]
We present a novel disk-based ANN index, B+ANN, to address issues of the HNSW algorithm. It partitions input data into blocks containing semantically similar items, then builds a B+ tree variant to store blocks both in memory and on disk. It improves both quality (recall) and execution performance (queries per second, QPS) over HNSW.
arXiv Detail & Related papers (2025-11-19T15:50:28Z) - TeaRAG: A Token-Efficient Agentic Retrieval-Augmented Generation Framework [62.66056331998838]
TeaRAG is a token-efficient agentic RAG framework capable of compressing both retrieval content and reasoning steps. Our reward function evaluates knowledge sufficiency via a knowledge-matching mechanism while penalizing excessive reasoning steps.
arXiv Detail & Related papers (2025-11-07T16:08:34Z) - Scalable Disk-Based Approximate Nearest Neighbor Search with Page-Aligned Graph [3.994346326254537]
We propose PageANN, a disk-based approximate nearest neighbor search (ANNS) framework for high performance and scalability. Results show that PageANN significantly outperforms state-of-the-art (SOTA) disk-based ANNS methods, achieving 1.85x-10.83x higher throughput and 51.7%-91.9% lower latency across different datasets and memory budgets.
arXiv Detail & Related papers (2025-09-29T20:44:13Z) - HAKES: Scalable Vector Database for Embedding Search Service [16.034584281180006]
We build a vector database that achieves high throughput and high recall under concurrent read-write workloads. Our index outperforms index baselines in the high-recall region and under concurrent read-write workloads. HAKES is scalable and achieves up to 16× higher throughput than the baselines.
arXiv Detail & Related papers (2025-05-18T19:26:29Z) - Semi-Parametric Retrieval via Binary Bag-of-Tokens Index [71.78109794895065]
Semi-parametric Disentangled Retrieval (SiDR) is a bi-encoder retrieval framework that decouples the retrieval index from neural parameters. SiDR supports a non-parametric tokenization index for search, achieving BM25-like indexing complexity with significantly better effectiveness.
arXiv Detail & Related papers (2024-05-03T08:34:13Z) - Similarity search in the blink of an eye with compressed indices [3.39271933237479]
Graph-based indices are currently the best-performing techniques for billion-scale similarity search.
We present new techniques and systems for creating faster and smaller graph-based indices.
arXiv Detail & Related papers (2023-04-07T23:10:39Z) - Generalizing Few-Shot NAS with Gradient Matching [165.5690495295074]
One-Shot methods train one supernet to approximate the performance of every architecture in the search space via weight-sharing.
Few-Shot NAS reduces the level of weight-sharing by splitting the One-Shot supernet into multiple separated sub-supernets.
The proposed method significantly outperforms its Few-Shot counterparts while surpassing previous comparable methods in the accuracy of derived architectures.
arXiv Detail & Related papers (2022-03-29T03:06:16Z) - iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients [75.41173109807735]
Differentiable ARchiTecture Search (DARTS) has recently become the mainstream approach to neural architecture search (NAS).
We tackle the hypergradient computation in DARTS based on the implicit function theorem.
We show that the architecture optimisation with the proposed method, named iDARTS, is expected to converge to a stationary point.
arXiv Detail & Related papers (2021-06-21T00:44:11Z) - Towards Improving the Consistency, Efficiency, and Flexibility of Differentiable Neural Architecture Search [84.4140192638394]
Most differentiable neural architecture search methods construct a super-net for search and derive a target-net as its sub-graph for evaluation.
In this paper, we introduce EnTranNAS that is composed of Engine-cells and Transit-cells.
Our method also spares much memory and computation cost, which speeds up the search process.
arXiv Detail & Related papers (2021-01-27T12:16:47Z) - ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse Coding [86.40042104698792]
We formulate neural architecture search as a sparse coding problem.
In experiments, our two-stage method on CIFAR-10 requires only 0.05 GPU-day for search.
Our one-stage method produces state-of-the-art performances on both CIFAR-10 and ImageNet at the cost of only evaluation time.
arXiv Detail & Related papers (2020-10-13T04:34:24Z) - The Case for Learned Spatial Indexes [62.88514422115702]
We use techniques proposed from a state-of-the art learned multi-dimensional index structure (namely, Flood) to answer spatial range queries.
We show that (i) machine-learned search within a partition is 11.79% to 39.51% faster than binary search when filtering on one dimension, and (ii) refinement using machine-learned indexes is 1.23x to 1.83x faster than the closest competitor, which filters on two dimensions.
arXiv Detail & Related papers (2020-08-24T12:09:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.