Efficiency Optimizations for Superblock-based Sparse Retrieval
- URL: http://arxiv.org/abs/2602.02883v1
- Date: Mon, 02 Feb 2026 22:42:22 GMT
- Title: Efficiency Optimizations for Superblock-based Sparse Retrieval
- Authors: Parker Carlson, Wentai Xie, Rohil Shah, Tao Yang
- Abstract summary: This paper proposes a simple and effective superblock pruning scheme that reduces the overhead of superblock score computation while preserving competitive relevance. It combines this scheme with a compact index structure and a robust zero-shot configuration that is effective across LSR models and multiple datasets.
- Score: 3.8177503173791756
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learned sparse retrieval (LSR) is a popular method for first-stage retrieval because it combines the semantic matching of language models with efficient CPU-friendly algorithms. Previous work aggregates blocks into "superblocks" so that query processing can quickly skip blocks using an advanced pruning heuristic. This paper proposes a simple and effective superblock pruning scheme that reduces the overhead of superblock score computation while preserving competitive relevance. It combines this scheme with a compact index structure and a robust zero-shot configuration that is effective across LSR models and multiple datasets. The paper provides an analytical justification and an evaluation on the MS MARCO and BEIR datasets, demonstrating that the proposed scheme can be a strong alternative for efficient sparse retrieval.
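The abstract describes the approach only at a high level. Purely as a reading aid, the sketch below illustrates two-level, bound-based pruning in the spirit the abstract describes; the data layout, names, and scoring are hypothetical assumptions, not the authors' implementation.

```python
import heapq

def search(query, superblocks, k=10):
    """Two-level pruning sketch (hypothetical structures, not the paper's code).

    query       : dict term_id -> query weight
    superblocks : list of dicts with keys
                  'max_w'  : dict term_id -> max term weight over the superblock
                  'blocks' : list of (block_max_w, postings) pairs, where
                             postings maps doc_id -> {term_id: weight}
    """
    def bound(max_w):
        # cheap additive upper bound: query weight times per-partition max weight
        return sum(qw * max_w.get(t, 0.0) for t, qw in query.items())

    heap, threshold = [], 0.0
    for sb in sorted(superblocks, key=lambda s: -bound(s['max_w'])):
        if bound(sb['max_w']) <= threshold:
            break                        # every remaining superblock is pruned
        for block_max_w, postings in sb['blocks']:
            if bound(block_max_w) <= threshold:
                continue                 # skip just this block
            for doc_id, w in postings.items():
                s = sum(qw * w.get(t, 0.0) for t, qw in query.items())
                if len(heap) < k:
                    heapq.heappush(heap, (s, doc_id))
                elif s > heap[0][0]:
                    heapq.heapreplace(heap, (s, doc_id))
                if len(heap) == k:
                    threshold = heap[0][0]
    return sorted(heap, reverse=True)
```

Visiting superblocks in decreasing bound order lets the top-k threshold rise quickly, which is what makes the early break safe.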
Related papers
- SimpleMem: Efficient Lifelong Memory for LLM Agents [73.74399447715052]
We introduce SimpleMem, an efficient memory framework based on semantic lossless compression. We propose a three-stage pipeline designed to maximize information density and token utilization. Experiments on benchmark datasets show that our method consistently outperforms baseline approaches in accuracy, retrieval efficiency, and inference cost.
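The summary names a three-stage pipeline without implementation detail. The skeleton below is only a guess at what such a pipeline could look like; the stage contents and the `summarize`/`embed` hooks are invented for illustration, not SimpleMem's actual design.

```python
def build_memory(turns, summarize, embed):
    """Hypothetical three-stage memory pipeline (stage contents and the
    summarize/embed hooks are invented for illustration).

    turns: list of dialogue strings."""
    # Stage 1: drop verbatim-duplicate turns to raise information density.
    seen, unique = set(), []
    for t in turns:
        key = t.strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(t)
    # Stage 2: compress each retained turn into a compact semantic note.
    notes = [summarize(t) for t in unique]
    # Stage 3: index the notes by embedding for later retrieval.
    return [(embed(note), note) for note in notes]
```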
arXiv Detail & Related papers (2026-01-05T21:02:49Z)
- An Enhanced Model-based Approach for Short Text Clustering [58.60681789677676]
Short text clustering has become increasingly important with the popularity of social media like Twitter, Google+, and Facebook. Existing methods can be broadly categorized into two paradigms: topic model-based approaches and deep representation learning-based approaches. We propose a collapsed Gibbs Sampling algorithm for the Dirichlet Multinomial Mixture model (GSDMM), which effectively handles the sparsity and high dimensionality of short texts. Based on several aspects of GSDMM that warrant further refinement, we propose an improved approach, GSDMM+, designed to further optimize its performance.
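GSDMM itself is a published algorithm (a collapsed Gibbs sampler for the Dirichlet Multinomial Mixture), so its core update can be sketched; the GSDMM+ refinements this paper proposes are not shown here.

```python
import numpy as np
from collections import Counter

def gsdmm(docs, vocab_size, K=20, alpha=0.1, beta=0.1, iters=15, seed=0):
    """Collapsed Gibbs sampler for the Dirichlet Multinomial Mixture (GSDMM).
    docs: list of lists of integer word ids in [0, vocab_size)."""
    rng = np.random.default_rng(seed)
    freqs = [Counter(doc) for doc in docs]
    z = rng.integers(K, size=len(docs))       # cluster label per document
    m = np.zeros(K)                           # documents per cluster
    n = np.zeros(K)                           # tokens per cluster
    nw = np.zeros((K, vocab_size))            # per-cluster word counts
    for d, doc in enumerate(docs):
        m[z[d]] += 1
        n[z[d]] += len(doc)
        for w in doc:
            nw[z[d], w] += 1

    Vb = vocab_size * beta
    for _ in range(iters):
        for d, doc in enumerate(docs):
            k = z[d]                          # remove doc d from its cluster
            m[k] -= 1
            n[k] -= len(doc)
            for w in doc:
                nw[k, w] -= 1
            # log p(z_d = k | everything else), up to an additive constant
            logp = np.log(m + alpha)
            for w, c in freqs[d].items():
                for j in range(c):
                    logp += np.log(nw[:, w] + beta + j)
            for i in range(len(doc)):
                logp -= np.log(n + Vb + i)
            p = np.exp(logp - logp.max())
            k = rng.choice(K, p=p / p.sum())  # resample the cluster label
            z[d] = k
            m[k] += 1
            n[k] += len(doc)
            for w in doc:
                nw[k, w] += 1
    return z
```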
arXiv Detail & Related papers (2025-07-18T10:07:42Z)
- HASH-RAG: Bridging Deep Hashing with Retriever for Efficient, Fine Retrieval and Augmented Generation [16.147618749631103]
Hash-RAG is a framework that integrates deep hashing techniques with systematic optimizations. Building upon this hash-based efficient retrieval framework, we establish the foundation for fine-grained chunking.
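The summary does not spell out the retrieval step. A common deep-hashing pattern, sign binarization plus Hamming-distance ranking, is one plausible reading and is sketched below; the paper's actual hashing layer and chunking stage may differ.

```python
import numpy as np

def to_codes(embeddings):
    """Binarize float embeddings into packed hash codes. Sign thresholding is
    a common deep-hashing choice; the paper's hashing layer may differ."""
    bits = (embeddings > 0).astype(np.uint8)
    return np.packbits(bits, axis=1)          # (n, ceil(dim/8)) uint8 codes

def hamming_topk(query_code, corpus_codes, k=100):
    """Coarse candidate retrieval by Hamming distance (XOR + bit count); the
    candidates would then be re-scored and finely chunked downstream."""
    xor = np.bitwise_xor(corpus_codes, query_code)
    dists = np.unpackbits(xor, axis=1).sum(axis=1)
    return np.argsort(dists)[:k]
```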
arXiv Detail & Related papers (2025-05-22T02:22:11Z)
- Billion-scale Similarity Search Using a Hybrid Indexing Approach with Advanced Filtering [49.1574468325115]
This paper presents a novel approach for similarity search with complex filtering capabilities on billion-scale datasets, optimized for CPU inference. Our method extends the classical IVF-Flat index structure to integrate multi-dimensional filters. The proposed algorithm combines dense embeddings with discrete filtering attributes, enabling fast retrieval in high-dimensional spaces.
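Since the method extends IVF-Flat with attribute filters, a toy version of that combination is easy to sketch; the clustering, probing, and single-attribute filter below are generic assumptions, not the paper's hybrid index.

```python
import numpy as np

def build_ivf(vectors, n_lists=16, iters=10, seed=0):
    """Tiny IVF-Flat index: k-means centroids plus one id list per centroid."""
    rng = np.random.default_rng(seed)
    centroids = vectors[rng.choice(len(vectors), n_lists, replace=False)].copy()
    for _ in range(iters):
        assign = ((vectors[:, None] - centroids[None]) ** 2).sum(-1).argmin(1)
        for c in range(n_lists):
            pts = vectors[assign == c]
            if len(pts):
                centroids[c] = pts.mean(axis=0)
    assign = ((vectors[:, None] - centroids[None]) ** 2).sum(-1).argmin(1)
    lists = [np.where(assign == c)[0] for c in range(n_lists)]
    return centroids, lists

def filtered_search(query, vectors, attrs, allowed, centroids, lists,
                    n_probe=4, k=10):
    """Probe the nearest lists, apply the discrete attribute filter inline,
    then rank the surviving candidates by exact distance."""
    order = ((centroids - query) ** 2).sum(-1).argsort()[:n_probe]
    cand = np.concatenate([lists[c] for c in order])
    cand = cand[np.isin(attrs[cand], list(allowed))]    # attribute filter
    dists = ((vectors[cand] - query) ** 2).sum(-1)
    return cand[dists.argsort()[:k]]
```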
arXiv Detail & Related papers (2025-01-23T07:47:00Z)
- Efficient Long Context Language Model Retrieval with Compression [57.09163579304332]
Long Context Language Models (LCLMs) have emerged as a new paradigm to perform Information Retrieval (IR). We propose a new compression approach tailored for LCLM retrieval, which is trained to maximize retrieval performance while minimizing the length of the compressed passages. We show that CoLoR improves retrieval performance by 6% while compressing the in-context size by a factor of 1.91.
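The summary describes compressing passages before they enter a long context. As a purely illustrative helper (the function names and budget logic are assumptions, not CoLoR's implementation), compressed passages might be packed into a token budget like this:

```python
def pack_context(ranked_passages, compress, count_tokens, budget):
    """Hypothetical helper: compress passages and pack as many as fit into a
    long-context window. A learned compressor like CoLoR would play the role
    of `compress`; all names here are illustrative."""
    packed, used = [], 0
    for passage in ranked_passages:      # assumed ordered by retrieval score
        short = compress(passage)
        cost = count_tokens(short)
        if used + cost > budget:
            break
        packed.append(short)
        used += cost
    return packed
```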
arXiv Detail & Related papers (2024-12-24T07:30:55Z)
- Retrieval with Learned Similarities [2.729516456192901]
State-of-the-art retrieval algorithms have migrated to learned similarities. We show that Mixture-of-Logits (MoL) can be realized empirically to achieve superior performance on diverse retrieval scenarios.
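Mixture-of-Logits combines per-component dot products with adaptive gates. The sketch below uses a deliberately simplified gating form (a learned per-component scale on the logit), which is an assumption; the paper's gating network is more general.

```python
import numpy as np

def mol_similarity(q, x, gate_w):
    """Mixture-of-Logits sketch: one dot-product logit per embedding component,
    combined with softmax gates.

    q, x   : (P, d) component embeddings for one query and one item
    gate_w : (P,) learned gate parameters (hypothetical simple form)
    """
    logits = (q * x).sum(axis=1)                  # P dot-product logits
    g = np.exp(gate_w * logits - (gate_w * logits).max())
    gates = g / g.sum()                           # softmax gating weights
    return float((gates * logits).sum())
```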
arXiv Detail & Related papers (2024-07-22T08:19:34Z)
- Faster Learned Sparse Retrieval with Block-Max Pruning [11.080810272211906]
This paper introduces Block-Max Pruning (BMP), an innovative dynamic pruning strategy tailored for indexes arising in learned sparse retrieval environments.
BMP substantially outperforms existing dynamic pruning strategies, offering unparalleled efficiency in safe retrieval contexts.
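The core loop of block-max pruning, ranking small document ranges by a cheap upper bound and evaluating them best-first with safe early termination, can be sketched as follows; the array layout and the `score_doc` hook are illustrative assumptions, not BMP's implementation.

```python
import heapq
import numpy as np

def bmp_query(query, block_maxes, block_docs, score_doc, k=10):
    """Best-first block evaluation with safe early termination.

    query       : dict term_id -> weight
    block_maxes : (num_blocks, vocab_size) array of per-block max term weights
    block_docs  : list of doc-id lists, one per block
    score_doc   : callable doc_id -> exact document score (hypothetical hook)
    """
    terms = np.array(list(query.keys()))
    weights = np.array(list(query.values()))
    bounds = block_maxes[:, terms] @ weights      # cheap upper bound per block
    heap = []                                     # min-heap of (score, doc_id)
    for b in np.argsort(-bounds):                 # visit best blocks first
        if len(heap) == k and bounds[b] <= heap[0][0]:
            break                                 # no later block can help
        for doc_id in block_docs[b]:
            s = score_doc(doc_id)
            if len(heap) < k:
                heapq.heappush(heap, (s, doc_id))
            elif s > heap[0][0]:
                heapq.heapreplace(heap, (s, doc_id))
    return sorted(heap, reverse=True)
```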
arXiv Detail & Related papers (2024-05-02T09:26:30Z)
- Constructing Tree-based Index for Efficient and Effective Dense Retrieval [26.706985694158384]
JTR stands for Joint optimization of TRee-based index and query encoding.
We design a new unified contrastive learning loss to train tree-based index and query encoder in an end-to-end manner.
Experimental results show that JTR achieves better retrieval performance while retaining high system efficiency.
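The summary mentions a unified contrastive loss over the tree-based index and query encoder without giving its form. The function below is one plausible InfoNCE-style reading, in which a query is pulled toward its positive leaf and that leaf's ancestors; it is a guess, not JTR's exact formulation.

```python
import numpy as np

def tree_contrastive_loss(q, pos_leaf, ancestors, neg_leaves):
    """InfoNCE-style loss over a tree path: the query should score higher
    against its positive leaf and that leaf's ancestors than against negative
    leaves, so the index nodes and encoder receive a joint training signal.

    q          : (d,) query embedding
    pos_leaf   : (d,) embedding of the leaf containing a relevant document
    ancestors  : (a, d) embeddings of that leaf's ancestor nodes
    neg_leaves : (n, d) embeddings of sampled negative leaves
    """
    pos = np.vstack([pos_leaf[None], ancestors]) @ q   # positive scores
    neg = neg_leaves @ q                               # negative scores
    scores = np.concatenate([pos, neg])
    logz = np.log(np.exp(scores - scores.max()).sum()) + scores.max()
    return float((logz - pos).mean())   # average -log softmax of positives
```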
arXiv Detail & Related papers (2023-04-24T09:25:39Z)
- Shapley-NAS: Discovering Operation Contribution for Neural Architecture Search [96.20505710087392]
We propose a Shapley value based method to evaluate operation contribution (Shapley-NAS) for neural architecture search.
We show that our method outperforms the state-of-the-art methods by a considerable margin with light search cost.
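Shapley values over a set of operations are typically estimated by Monte Carlo permutation sampling; the sketch below shows that generic estimator, with the supernet evaluation hook `eval_acc` left as a hypothetical callable rather than Shapley-NAS's actual procedure.

```python
import numpy as np

def shapley_estimates(ops, eval_acc, rounds=50, seed=0):
    """Monte Carlo permutation estimate of each operation's Shapley value.

    ops      : list of operation names on a supernet edge
    eval_acc : callable(frozenset of enabled ops) -> validation accuracy
               (a hypothetical evaluation hook into the supernet)
    """
    rng = np.random.default_rng(seed)
    phi = {o: 0.0 for o in ops}
    for _ in range(rounds):
        enabled = set()
        prev = eval_acc(frozenset(enabled))        # empty coalition
        for o in rng.permutation(ops):             # random arrival order
            enabled.add(o)
            cur = eval_acc(frozenset(enabled))
            phi[o] += cur - prev                   # marginal contribution
            prev = cur
    return {o: v / rounds for o, v in phi.items()}
```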
arXiv Detail & Related papers (2022-06-20T14:41:49Z)
- Distributed Dynamic Safe Screening Algorithms for Sparse Regularization [73.85961005970222]
We propose a new distributed dynamic safe screening (DDSS) method for sparsity-regularized models and apply it to shared-memory and distributed-memory architectures, respectively.
We prove that the proposed method achieves a linear convergence rate with lower overall complexity and can eliminate almost all inactive features in a finite number of iterations almost surely.
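The distributed and dynamic aspects are this paper's contribution and are not reproduced here; the sketch below shows only the underlying single-machine ingredient, a gap-safe screening test for the Lasso, which is a standard published rule.

```python
import numpy as np

def gap_safe_screen(X, y, w, lam):
    """Gap-safe screening test for the Lasso
    min_w 0.5 * ||y - X w||^2 + lam * ||w||_1.
    Returns a boolean mask of features that are provably inactive at the
    optimum and can therefore be eliminated."""
    r = y - X @ w                                    # residual
    theta = r / max(lam, np.abs(X.T @ r).max())      # dual-feasible point
    primal = 0.5 * r @ r + lam * np.abs(w).sum()
    diff = theta - y / lam
    dual = 0.5 * y @ y - 0.5 * lam**2 * (diff @ diff)
    gap = max(primal - dual, 0.0)                    # duality gap
    radius = np.sqrt(2.0 * gap) / lam                # safe-sphere radius
    col_norms = np.linalg.norm(X, axis=0)
    # feature j is provably inactive if its dual correlation stays below 1
    return np.abs(X.T @ theta) + radius * col_norms < 1.0
```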
arXiv Detail & Related papers (2022-04-23T02:45:55Z)
- Structured Inverted-File k-Means Clustering for High-Dimensional Sparse Data [2.487445341407889]
This paper presents an architecture-friendly k-means clustering algorithm called SIVF for large-scale, high-dimensional sparse data sets.
Our performance analysis reveals that SIVF achieves its higher speed by suppressing two performance-degradation factors: the number of cache misses and branch mispredictions.
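The key idea, computing sparse similarities through an inverted file so that only nonzero entries are touched, can be sketched for the assignment step; SIVF's cache- and branch-friendly data layout (its actual contribution) is not reproduced here.

```python
import numpy as np
from collections import defaultdict

def assign_with_inverted_file(docs, centroids):
    """Assignment step of inverted-file k-means on sparse data: build an
    inverted file over the centroids' nonzero entries and accumulate partial
    similarities term by term (a generic sketch of the idea).

    docs      : list of dicts term_id -> weight (sparse documents)
    centroids : list of dicts term_id -> weight (sparse mean vectors)
    """
    inv = defaultdict(list)                 # term -> [(centroid_id, weight)]
    for c, cent in enumerate(centroids):
        for t, w in cent.items():
            inv[t].append((c, w))

    labels = np.zeros(len(docs), dtype=int)
    for d, doc in enumerate(docs):
        scores = np.zeros(len(centroids))
        for t, w in doc.items():
            for c, cw in inv[t]:
                scores[c] += w * cw         # accumulate sparse dot products
        labels[d] = int(scores.argmax())
    return labels
```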
arXiv Detail & Related papers (2021-03-30T07:54:02Z)