Related papers: VIBE: Vector Index Benchmark for Embeddings

VIBE: Vector Index Benchmark for Embeddings

URL: http://arxiv.org/abs/2505.17810v1
Date: Fri, 23 May 2025 12:28:10 GMT
Title: VIBE: Vector Index Benchmark for Embeddings
Authors: Elias Jääsaari, Ville Hyvönen, Matteo Ceccarello, Teemu Roos, Martin Aumüller,
Abstract summary: We introduce Vector Index Benchmark for Embeddings (VIBE), an open source project for benchmarking ANN algorithms.<n>VIBE contains a pipeline for creating benchmark datasets using dense embedding models characteristic of modern applications.<n>We use VIBE to conduct a comprehensive evaluation of SOTA vector indexes, benchmarking 21 implementations on 12 in-distribution and 6 out-of-distribution datasets.
Score: 5.449089394751681
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Approximate nearest neighbor (ANN) search is a performance-critical component of many machine learning pipelines. Rigorous benchmarking is essential for evaluating the performance of vector indexes for ANN search. However, the datasets of the existing benchmarks are no longer representative of the current applications of ANN search. Hence, there is an urgent need for an up-to-date set of benchmarks. To this end, we introduce Vector Index Benchmark for Embeddings (VIBE), an open source project for benchmarking ANN algorithms. VIBE contains a pipeline for creating benchmark datasets using dense embedding models characteristic of modern applications, such as retrieval-augmented generation (RAG). To replicate real-world workloads, we also include out-of-distribution (OOD) datasets where the queries and the corpus are drawn from different distributions. We use VIBE to conduct a comprehensive evaluation of SOTA vector indexes, benchmarking 21 implementations on 12 in-distribution and 6 out-of-distribution datasets.

Related papers

NaviX: A Native Vector Index Design for Graph DBMSs With Robust Predicate-Agnostic Search Performance [7.108581652658526]
We present NaviX, a native vector index for graphs (GDBMSs)<n> NaviX is built on the Hierarchical Navigable Small-World (HNSW) graph, which itself is a graph-based structure.
arXiv Detail & Related papers (2025-06-29T21:16:07Z)
HAKES: Scalable Vector Database for Embedding Search Service [16.034584281180006]
We build a vector database that achieves high throughput and high recall under concurrent read-write workloads.<n>Our index outperforms index baselines in the high recall region and under concurrent read-write workloads.<n>namesys is scalable and achieves up to $16times$ higher throughputs than the baselines.
arXiv Detail & Related papers (2025-05-18T19:26:29Z)
Operational Advice for Dense and Sparse Retrievers: HNSW, Flat, or Inverted Indexes? [62.57689536630933]
We provide experimental results on the BEIR dataset using the open-source Lucene search library. Our results provide guidance for today's search practitioner in understanding the design space of dense and sparse retrievers.
arXiv Detail & Related papers (2024-09-10T12:46:23Z)
Do Text-to-Vis Benchmarks Test Real Use of Visualisations? [11.442971909006657]
This paper investigates whether benchmarks reflect real-world use through an empirical study comparing benchmark datasets with code from public repositories. Our findings reveal a substantial gap, with evaluations not testing the same distribution of chart types, attributes, and actions as real-world examples. One dataset is representative, but requires extensive modification to become a practical end-to-end benchmark. This shows that new benchmarks are needed to support the development of systems that truly address users' visualisation needs.
arXiv Detail & Related papers (2024-07-29T06:13:28Z)
The Impacts of Data, Ordering, and Intrinsic Dimensionality on Recall in Hierarchical Navigable Small Worlds [0.09208007322096533]
Investigation focuses on HNSW's efficacy across a spectrum of datasets. We discover that the recall of approximate HNSW search, in comparison to exact K Nearest Neighbours (KNN) search, is linked to the vector space's intrinsic dimensionality. We observe that running popular benchmark datasets with HNSW instead of KNN can shift rankings by up to three positions for some models.
arXiv Detail & Related papers (2024-05-28T04:16:43Z)
Semi-Parametric Retrieval via Binary Bag-of-Tokens Index [71.78109794895065]
SemI-parametric Disentangled Retrieval (SiDR) is a bi-encoder retrieval framework that decouples retrieval index from neural parameters.<n>SiDR supports a non-parametric tokenization index for search, achieving BM25-like indexing complexity with significantly better effectiveness.
arXiv Detail & Related papers (2024-05-03T08:34:13Z)
GEMv2: Multilingual NLG Benchmarking in a Single Line of Code [161.1761414080574]
Generation, Evaluation, and Metrics Benchmark introduces a modular infrastructure for dataset, model, and metric developers. GEMv2 supports 40 documented datasets in 51 languages. Models for all datasets can be evaluated online and our interactive data card creation and rendering tools make it easier to add new datasets to the living benchmark.
arXiv Detail & Related papers (2022-06-22T17:52:30Z)
Benchmarking Node Outlier Detection on Graphs [90.29966986023403]
Graph outlier detection is an emerging but crucial machine learning task with numerous applications. We present the first comprehensive unsupervised node outlier detection benchmark for graphs called UNOD.
arXiv Detail & Related papers (2022-06-21T01:46:38Z)
GRecX: An Efficient and Unified Benchmark for GNN-based Recommendation [55.55523188090938]
We present GRecX, an open-source framework for benchmarking GNN-based recommendation models. GRecX consists of core libraries for building GNN-based recommendation benchmarks, as well as the implementations of popular GNN-based recommendation models. We conduct experiments with GRecX, and the experimental results show that GRecX allows us to train and benchmark GNN-based recommendation baselines in an efficient and unified way.
arXiv Detail & Related papers (2021-11-19T17:45:46Z)
Searching towards Class-Aware Generators for Conditional Generative Adversarial Networks [132.29772160843825]
Conditional Generative Adversarial Networks (cGAN) were designed to generate images based on the provided conditions. Existing methods have used the same generating architecture for all classes. This paper presents a novel idea that adopts NAS to find a distinct architecture for each class.
arXiv Detail & Related papers (2020-06-25T07:05:28Z)
NWPU-Crowd: A Large-Scale Benchmark for Crowd Counting and Localization [101.13851473792334]
We construct a large-scale congested crowd counting and localization dataset, NWPU-Crowd, consisting of 5,109 images, in a total of 2,133,375 annotated heads with points and boxes. Compared with other real-world datasets, it contains various illumination scenes and has the largest density range (020,033) We describe the data characteristics, evaluate the performance of some mainstream state-of-the-art (SOTA) methods, and analyze the new problems that arise on the new data.
arXiv Detail & Related papers (2020-01-10T09:26:04Z)

This list is automatically generated from the titles and abstracts of the papers in this site.