Related papers: B+ANN: A Fast Billion-Scale Disk-based Nearest-Neighbor Index

B+ANN: A Fast Billion-Scale Disk-based Nearest-Neighbor Index

URL: http://arxiv.org/abs/2511.15557v1
Date: Wed, 19 Nov 2025 15:50:28 GMT
Title: B+ANN: A Fast Billion-Scale Disk-based Nearest-Neighbor Index
Authors: Selim Furkan Tekin, Rajesh Bordawekar,
Abstract summary: We present a novel disk-based ANN index, B+ANN, to address issues of the HNSW algorithm.<n>It partitions input data into blocks containing semantically similar items, then builds an B+ tree variant to store blocks both in-memory and on disks.<n>It improves both quality (Recall value), and execution performance (Queries per second/QPS) over HNSW.
Score: 3.4720326275852
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Storing and processing of embedding vectors by specialized Vector databases (VDBs) has become the linchpin in building modern AI pipelines. Most current VDBs employ variants of a graph-based ap- proximate nearest-neighbor (ANN) index algorithm, HNSW, to an- swer semantic queries over stored vectors. Inspite of its wide-spread use, the HNSW algorithm suffers from several issues: in-memory design and implementation, random memory accesses leading to degradation in cache behavior, limited acceleration scope due to fine-grained pairwise computations, and support of only semantic similarity queries. In this paper, we present a novel disk-based ANN index, B+ANN, to address these issues: it first partitions input data into blocks containing semantically similar items, then builds an B+ tree variant to store blocks both in-memory and on disks, and finally, enables hybrid edge- and block-based in-memory traversals. As demonstrated by our experimantal evaluation, the proposed B+ANN disk-based index improves both quality (Recall value), and execution performance (Queries per second/QPS) over HNSW, by improving spatial and temporal locality for semantic operations, reducing cache misses (19.23% relative gain), and decreasing the memory consumption and disk-based build time by 24x over the DiskANN algorithm. Finally, it enables dissimilarity queries, which are not supported by similarity-oriented ANN indices.

Related papers

Multi-Vector Index Compression in Any Modality [73.7330345057813]
Late interaction has emerged as a dominant paradigm for information retrieval in text, images, visual documents, and videos.<n>We introduce four approaches for index compression: sequence resizing, memory tokens, hierarchical pooling, and a novel attention-guided clustering (AGC)<n>AGC uses an attention-guided mechanism to identify the most semantically salient regions of a document as cluster centroids and to weight token aggregation.
arXiv Detail & Related papers (2026-02-24T18:57:33Z)
PiPNN: Ultra-Scalable Graph-Based Nearest Neighbor Indexing [6.243421401579046]
PiPNN is an ultra-scalable graph construction algorithm that avoids the search bottleneck'' that existing graph-based methods suffer from.<n>HashPrune enables PiPNN to partition the dataset into overlapping sub-problems, efficiently perform bulk distance comparisons, and stream a subset of the edges into HashPrune.<n>PiPNN builds state-of-the-art indexes up to 11.6x faster than Vamana and up to 12.9x faster than HNSW.
arXiv Detail & Related papers (2026-02-17T02:18:17Z)
DGAI: Decoupled On-Disk Graph-Based ANN Index for Efficient Updates and Queries [14.147837695174722]
On-disk graph-based indexes are widely used in approximate nearest neighbor (ANN) search systems for large-scale, high-dimensional vectors.<n>Traditional coupled storage methods, which store vectors within the index, are inefficient for index updates.<n>We propose a decoupled storage architecture, while a decoupled architecture reduces query performance.
arXiv Detail & Related papers (2025-10-29T11:20:10Z)
Scalable Disk-Based Approximate Nearest Neighbor Search with Page-Aligned Graph [3.994346326254537]
We propose PageANN, a disk-based approximate nearest neighbor search (ANNS) framework for high performance and scalability.<n>Results show that PageANN significantly outperforms state-of-the-art (SOTA) disk-based ANNS methods, achieving 1.85x-10.83x higher throughput and 51.7%-91.9% lower latency across different datasets and memory budgets.
arXiv Detail & Related papers (2025-09-29T20:44:13Z)
HAKES: Scalable Vector Database for Embedding Search Service [16.034584281180006]
We build a vector database that achieves high throughput and high recall under concurrent read-write workloads.<n>Our index outperforms index baselines in the high recall region and under concurrent read-write workloads.<n>namesys is scalable and achieves up to $16times$ higher throughputs than the baselines.
arXiv Detail & Related papers (2025-05-18T19:26:29Z)
On Storage Neural Network Augmented Approximate Nearest Neighbor Search [1.3654846342364308]
It is necessary to search for the most similar vectors to a given query vector from the data stored in storage devices, not from that in memory.<n>In this paper, we propose a method to predict the correct clusters by means of a neural network.<n>Compared to state-of-the-art SPANN and an exhaustive method using k-means clustering and linear search, the proposed method achieves 90% recall on SIFT1M with 80% and 58% less data fetched from storage, respectively.
arXiv Detail & Related papers (2025-01-23T06:56:18Z)
FUSELOC: Fusing Global and Local Descriptors to Disambiguate 2D-3D Matching in Visual Localization [52.57327385675752]
Direct 2D-3D matching requires significantly less memory but suffers from lower accuracy due to the larger and more ambiguous search space.<n>We address this ambiguity by fusing local and global descriptors using a weighted average operator.<n>We achieve performance close to hierarchical methods while using 43% less memory and running 1.6 times faster.
arXiv Detail & Related papers (2024-08-21T23:42:16Z)
fVDB: A Deep-Learning Framework for Sparse, Large-Scale, and High-Performance Spatial Intelligence [55.582429009401956]
fVDB is a novel framework for deep learning on large-scale 3D data.<n>Our framework is fully integrated with PyTorch enabling interoperability with existing pipelines.
arXiv Detail & Related papers (2024-07-01T20:20:33Z)
Semi-Parametric Retrieval via Binary Bag-of-Tokens Index [71.78109794895065]
SemI-parametric Disentangled Retrieval (SiDR) is a bi-encoder retrieval framework that decouples retrieval index from neural parameters.<n>SiDR supports a non-parametric tokenization index for search, achieving BM25-like indexing complexity with significantly better effectiveness.
arXiv Detail & Related papers (2024-05-03T08:34:13Z)
AiSAQ: All-in-Storage ANNS with Product Quantization for DRAM-free Information Retrieval [1.099532646524593]
We propose All-in-Storage ANNS with Product Quantization (AiSAQ), which offloads compressed vectors to the SSD index.<n>Our method achieves $sim$10 MB memory usage in query search with billion-scale datasets without critical latency degradation.
arXiv Detail & Related papers (2024-04-09T04:20:27Z)
Compacting Binary Neural Networks by Sparse Kernel Selection [58.84313343190488]
This paper is motivated by a previously revealed phenomenon that the binary kernels in successful BNNs are nearly power-law distributed. We develop the Permutation Straight-Through Estimator (PSTE) that is able to not only optimize the selection process end-to-end but also maintain the non-repetitive occupancy of selected codewords. Experiments verify that our method reduces both the model size and bit-wise computational costs, and achieves accuracy improvements compared with state-of-the-art BNNs under comparable budgets.
arXiv Detail & Related papers (2023-03-25T13:53:02Z)
Accelerating Barnes-Hut t-SNE Algorithm by Efficient Parallelization on Multi-Core CPUs [59.18990342943095]
t-SNE remains one of the most popular embedding techniques for visualizing high-dimensional data. BH t-SNE algorithm is inefficient on existing CPU implementations. Acc-t-SNE is up to 261x and 4x faster than scikit-learn and the state-of-the-art BH t-SNE implementation from daal4py.
arXiv Detail & Related papers (2022-12-22T06:38:40Z)
ASH: A Modern Framework for Parallel Spatial Hashing in 3D Perception [91.24236600199542]
ASH is a modern and high-performance framework for parallel spatial hashing on GPU. ASH achieves higher performance, supports richer functionality, and requires fewer lines of code. ASH and its example applications are open sourced in Open3D.
arXiv Detail & Related papers (2021-10-01T16:25:40Z)
Rethinking Space-Time Networks with Improved Memory Coverage for Efficient Video Object Segmentation [68.45737688496654]
We establish correspondences directly between frames without re-encoding the mask features for every object. With the correspondences, every node in the current query frame is inferred by aggregating features from the past in an associative fashion. We validated that every memory node now has a chance to contribute, and experimentally showed that such diversified voting is beneficial to both memory efficiency and inference accuracy.
arXiv Detail & Related papers (2021-06-09T16:50:57Z)

This list is automatically generated from the titles and abstracts of the papers in this site.