Related papers: DRAMA: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers

DRAMA: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers

URL: http://arxiv.org/abs/2502.18460v1
Date: Tue, 25 Feb 2025 18:59:07 GMT
Title: DRAMA: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers
Authors: Xueguang Ma, Xi Victoria Lin, Barlas Oguz, Jimmy Lin, Wen-tau Yih, Xilun Chen,
Abstract summary: Large language models (LLMs) have demonstrated strong effectiveness and robustness while fine-tuned as dense retrievers.<n>LLMs offer better efficiency, but they often fail to generalize effectively with limited supervised fine-tuning data.<n>We introduce DRAMA, a training framework that leverages LLMs to train smaller generalizable dense retrievers.
Score: 86.54316283425001
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) have demonstrated strong effectiveness and robustness while fine-tuned as dense retrievers. However, their large parameter size brings significant inference time computational challenges, including high encoding costs for large-scale corpora and increased query latency, limiting their practical deployment. While smaller retrievers offer better efficiency, they often fail to generalize effectively with limited supervised fine-tuning data. In this work, we introduce DRAMA, a training framework that leverages LLMs to train smaller generalizable dense retrievers. In particular, we adopt pruned LLMs as the backbone and train on diverse LLM-augmented data in a single-stage contrastive learning setup. Experiments show that DRAMA offers better multilingual and long-context capabilities than traditional encoder-based retrievers, and achieves strong performance across multiple tasks and languages. These highlight the potential of connecting the training of smaller retrievers with the growing advancements in LLMs, bridging the gap between efficiency and generalization.

Related papers

Making Large Language Models Efficient Dense Retrievers [24.74245913953312]
EffiR is a framework for developing efficient retrievers that performs large-scale compression through a coarse-to-fine strategy.<n>EffiR achieves substantial reductions in model size and inference cost while preserving the performance of full-size models.
arXiv Detail & Related papers (2025-12-23T18:58:25Z)
LLM2IR: simple unsupervised contrastive learning makes long-context LLM great retriever [2.1550607598300617]
In this paper, we introduce LLM2IR, an efficient unsupervised contrastive learning framework to convert any decoder-only large language model to an information retrieval model.<n>Despite its simplicity, the effectiveness is proven among different LLMs on multiple IR benchmarks including LoCo, LongEmbed and BEIR.<n>We also find that models with a longer context length tend to have a stronger IR capacity by comparing task performances of models in the same model family.
arXiv Detail & Related papers (2025-10-31T21:30:37Z)
Cost-Optimal Grouped-Query Attention for Long-Context LLMs [64.90662568387683]
Building effective Transformer-based large language models (LLMs) has recently become a research focus. We compare models with different parameter sizes, context lengths, and attention head configurations in terms of model performance, computational cost, and memory cost. Our studies show that, when processing sufficiently long sequences, a larger model with fewer attention heads can achieve a lower loss while incurring lower computational and memory costs.
arXiv Detail & Related papers (2025-03-12T17:50:42Z)
ExpandR: Teaching Dense Retrievers Beyond Queries with LLM Guidance [21.777817032607405]
Large language models (LLMs) have demonstrated significant potential in enhancing dense retrieval through query augmentation.<n>In this work, we propose ExpandR, a unified LLM-augmented dense retrieval framework.<n> Experimental results on multiple benchmarks show that ExpandR consistently outperforms strong baselines.
arXiv Detail & Related papers (2025-02-24T11:15:41Z)
Small Models, Big Impact: Efficient Corpus and Graph-Based Adaptation of Small Multilingual Language Models for Low-Resource Languages [10.418542753869433]
Low-resource languages (LRLs) face significant challenges in natural language processing (NLP) due to limited data.<n>Current state-of-the-art large language models (LLMs) still struggle with LRLs.<n>Small multilingual models (mLMs) such as mBERT and XLM-R offer greater promise due to a better fit of their capacity to low training data sizes.
arXiv Detail & Related papers (2025-02-14T13:10:39Z)
Enhancing Code Generation for Low-Resource Languages: No Silver Bullet [55.39571645315926]
Large Language Models (LLMs) rely on large and diverse datasets to learn syntax, semantics, and usage patterns of programming languages.<n>For low-resource languages, the limited availability of such data hampers the models' ability to generalize effectively.<n>We present an empirical study investigating the effectiveness of several approaches for boosting LLMs' performance on low-resource languages.
arXiv Detail & Related papers (2025-01-31T12:23:28Z)
Fine-Grained Guidance for Retrievers: Leveraging LLMs' Feedback in Retrieval-Augmented Generation [20.420575358183687]
Retrieval-Augmented Generation (RAG) has proven to be an effective method for mitigating hallucination issues inherent in large language models (LLMs) Previous approaches typically train retrievers based on semantic similarity, lacking optimization for RAG. We propose a novel framework, FiGRet, which leverages the language capabilities of LLMs to construct examples from a more granular, information-centric perspective.
arXiv Detail & Related papers (2024-11-06T14:42:39Z)
Why Does the Effective Context Length of LLMs Fall Short? [68.34573617977013]
In this work, we introduce ShifTed Rotray position embeddING (STRING) STRING shifts well-trained positions to overwrite the original ineffective positions during inference, enhancing performance within their existing training lengths. Experimental results show that STRING dramatically improves the performance of the latest large-scale models.
arXiv Detail & Related papers (2024-10-24T13:51:50Z)
Large Language Models as Foundations for Next-Gen Dense Retrieval: A Comprehensive Empirical Assessment [16.39696580487218]
Pretrained language models like BERT and T5 serve as crucial backbone encoders for dense retrieval. Recent research has explored using large language models (LLMs) as retrievers, achieving SOTA performance across various tasks.
arXiv Detail & Related papers (2024-08-22T08:16:07Z)
Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models [90.14693869269519]
MoE LLMs can achieve higher performance with fewer parameters, but it is still hard to deploy them due to their immense parameter sizes. This paper mainly aims to enhance the deployment efficiency of MoE LLMs by introducing plug-and-play expert-level sparsification techniques.
arXiv Detail & Related papers (2024-02-22T18:56:07Z)
ARL2: Aligning Retrievers for Black-box Large Language Models via Self-guided Adaptive Relevance Labeling [20.022332182475672]
ARL2 is a retriever learning technique that harnesses large language models as labelers. ARL2 uses an adaptive self-training strategy for curating high-quality and diverse relevance data. Experiments demonstrate the effectiveness of ARL2, achieving accuracy improvements of 5.4% on NQ and 4.6% on MMLU.
arXiv Detail & Related papers (2024-02-21T05:41:34Z)
Efficient Multimodal Learning from Data-centric Perspective [21.35857180519653]
We introduce Bunny, a family of lightweight MLLMs with flexible vision and language backbones for efficient multimodal learning. Experiments show that our Bunny-4B/8B outperforms the state-of-the-art large MLLMs on multiple benchmarks.
arXiv Detail & Related papers (2024-02-18T10:09:10Z)
Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs [67.38165028487242]
We introduce Dynamic Sparse No Training (DSnoT), a training-free fine-tuning approach to fine-tune large language models (LLMs) Inspired by the Dynamic Sparse Training, DSnoT minimizes the reconstruction error between the dense and sparse LLMs. Our paper offers fresh insights into how to fine-tune sparse LLMs in an efficient training-free manner and open new venues to scale the great potential of sparsity to LLMs.
arXiv Detail & Related papers (2023-10-13T07:38:52Z)
LLM-Pruner: On the Structural Pruning of Large Language Models [65.02607075556742]
Large language models (LLMs) have shown remarkable capabilities in language understanding and generation. We tackle the compression of LLMs within the bound of two constraints: being task-agnostic and minimizing the reliance on the original training dataset. Our method, named LLM-Pruner, adopts structural pruning that selectively removes non-critical coupled structures.
arXiv Detail & Related papers (2023-05-19T12:10:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.