Scaling Laws for Embedding Dimension in Information Retrieval
- URL: http://arxiv.org/abs/2602.05062v1
- Date: Wed, 04 Feb 2026 21:27:12 GMT
- Title: Scaling Laws for Embedding Dimension in Information Retrieval
- Authors: Julian Killingback, Mahta Rafiee, Madine Manas, Hamed Zamani
- Abstract summary: We conduct a comprehensive analysis of the relationship between embedding dimension and retrieval performance. We find that the scaling behavior fits a power law, allowing us to derive scaling laws for performance given only embedding dimension. Our analysis shows that for evaluation tasks aligned with the training task, performance continues to improve as embedding size increases.
- Score: 26.21690287784803
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dense retrieval, which encodes queries and documents into a single dense vector, has become the dominant neural retrieval approach due to its simplicity and compatibility with fast approximate nearest neighbor algorithms. As the tasks dense retrieval is applied to grow in complexity, the fundamental limitations of the underlying data structure and similarity metric -- namely vectors and inner-products -- become more apparent. Recent work has shown theoretical limitations inherent to single vectors and inner-products that are generally tied to the embedding dimension. Given the importance of embedding dimension for retrieval capacity, understanding how dense retrieval performance changes as embedding dimension is scaled is fundamental to building next-generation retrieval models that balance effectiveness and efficiency. In this work, we conduct a comprehensive analysis of the relationship between embedding dimension and retrieval performance. Our experiments include two model families and a range of model sizes from each to construct a detailed picture of embedding scaling behavior. We find that the scaling behavior fits a power law, allowing us to derive scaling laws for performance given only embedding dimension, as well as a joint law accounting for embedding dimension and model size. Our analysis shows that for evaluation tasks aligned with the training task, performance continues to improve as embedding size increases, though with diminishing returns. For evaluation data that is less aligned with the training task, we find that performance is less predictable, with performance degrading at larger embedding dimensions for certain tasks. We hope our work provides additional insight into the limitations of embeddings and their behavior, as well as offering a practical guide for selecting model and embedding dimension to achieve optimal performance with reduced storage and compute costs.
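As a rough illustration of how a dimension-only scaling law of this kind can be fit in practice, the sketch below fits a saturating power law of the form NDCG(d) = e_inf - a * d^(-b) to retrieval scores measured at several embedding dimensions. The functional form, data points, and fitted values here are illustrative assumptions for demonstration only, not the paper's reported law or results.

```python
# A minimal sketch (not the paper's method): fit a saturating power law
# NDCG(d) ~ e_inf - a * d**(-b) to retrieval quality measured at several
# embedding dimensions. All numbers below are made up for illustration.
import numpy as np
from scipy.optimize import curve_fit

def power_law(d, e_inf, a, b):
    # e_inf: asymptotic performance; a, b: scale and rate of the remaining deficit.
    return e_inf - a * np.power(d, -b)

# Hypothetical NDCG@10 scores at increasing embedding dimensions.
dims = np.array([64, 128, 256, 512, 768, 1024], dtype=float)
ndcg = np.array([0.31, 0.36, 0.40, 0.42, 0.43, 0.435])

params, _ = curve_fit(power_law, dims, ndcg, p0=[0.45, 1.0, 0.5], maxfev=10000)
e_inf, a, b = params
print(f"fitted: NDCG(d) = {e_inf:.3f} - {a:.3f} * d^(-{b:.3f})")

# Extrapolate to larger dimensions to see the diminishing returns the law implies.
for d in (2048.0, 4096.0):
    print(f"d={int(d):>5}: predicted NDCG@10 = {power_law(d, e_inf, a, b):.4f}")
```

In an actual study one would fit against measured retrieval metrics at each dimension and, as the abstract describes, also fit a joint law over embedding dimension and model size.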
Related papers
- Scaling Laws for Reranking in Information Retrieval [24.00475965133032]
We present the first systematic study of scaling laws for rerankers. Using a detailed case study with cross-encoder rerankers, we demonstrate that performance follows a predictable power law. Our results establish scaling principles for reranking and provide actionable insights for building industrial-grade retrieval systems.
arXiv Detail & Related papers (2026-03-05T05:03:07Z)
- Neural Scaling Laws for Boosted Jet Tagging [0.22399170518036912]
Scaling compute, through joint increases in model capacity and dataset size, is the primary driver of performance in modern machine learning. We derive compute-optimal scaling laws and identify an effective performance limit that can be consistently approached through increased compute. We then study how the scaling coefficients and performance limits vary with the choice of input features and particle multiplicity.
arXiv Detail & Related papers (2026-02-17T18:13:01Z)
- Complexity Scaling Laws for Neural Models using Combinatorial Optimization [5.291101237151254]
We develop scaling laws based on problem complexity. We analyze two fundamental complexity measures: solution space size and representation space size. We show that optimization promotes smooth cost trends, and therefore meaningful scaling laws can be obtained even in the absence of an interpretable loss.
arXiv Detail & Related papers (2025-06-15T18:20:35Z)
- Unified Scaling Laws for Compressed Representations [69.72517034565467]
We investigate whether a unified scaling framework can accurately predict model performance when training occurs over various compressed representations. Our main finding, demonstrated both theoretically and empirically, is that there exists a simple "capacity" metric. We extend our formulation to directly compare the accuracy potential of different compressed formats, and to derive better algorithms for training over sparse-quantized formats.
arXiv Detail & Related papers (2025-06-02T16:52:51Z)
- Exploring Training and Inference Scaling Laws in Generative Retrieval [50.82554729023865]
Generative retrieval reformulates retrieval as an autoregressive generation task, where large language models generate target documents directly from a query. We systematically investigate training and inference scaling laws in generative retrieval, exploring how model size, training data scale, and inference-time compute jointly influence performance.
arXiv Detail & Related papers (2025-03-24T17:59:03Z)
- A Closer Look at Deep Learning Methods on Tabular Datasets [78.61845513154502]
We present an extensive study on TALENT, a collection of 300+ datasets spanning broad ranges of size. Our evaluation shows that ensembling benefits both tree-based and neural approaches.
arXiv Detail & Related papers (2024-07-01T04:24:07Z)
- Scaling Laws For Dense Retrieval [22.76001461620846]
We investigate whether the performance of dense retrieval models follows the same scaling laws observed for other neural models.
Results indicate that, under our settings, the performance of dense retrieval models follows a precise power-law scaling related to the model size and the number of annotations.
arXiv Detail & Related papers (2024-03-27T15:27:36Z)
- Refined Coreset Selection: Towards Minimal Coreset Size under Model Performance Constraints [69.27190330994635]
Coreset selection is powerful in reducing computational costs and accelerating data processing for deep learning algorithms.
We propose an innovative method, which maintains optimization priority order over the model performance and coreset size.
Empirically, extensive experiments confirm its superiority, often yielding better model performance with smaller coreset sizes.
arXiv Detail & Related papers (2023-11-15T03:43:04Z)
- RGM: A Robust Generalizable Matching Model [49.60975442871967]
We propose a deep model for sparse and dense matching, termed RGM (Robust Generalist Matching).
To narrow the gap between synthetic training samples and real-world scenarios, we build a new, large-scale dataset with sparse correspondence ground truth.
We are able to mix up various dense and sparse matching datasets, significantly improving the training diversity.
arXiv Detail & Related papers (2023-10-18T07:30:08Z)
- A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference.
Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
arXiv Detail & Related papers (2020-06-22T08:35:58Z)