LLM2IR: simple unsupervised contrastive learning makes long-context LLM great retriever
- URL: http://arxiv.org/abs/2601.05262v1
- Date: Fri, 31 Oct 2025 21:30:37 GMT
- Title: LLM2IR: simple unsupervised contrastive learning makes long-context LLM great retriever
- Authors: Xiaocong Yang,
- Abstract summary: In this paper, we introduce LLM2IR, an efficient unsupervised contrastive learning framework to convert any decoder-only large language model to an information retrieval model.<n>Despite its simplicity, the effectiveness is proven among different LLMs on multiple IR benchmarks including LoCo, LongEmbed and BEIR.<n>We also find that models with a longer context length tend to have a stronger IR capacity by comparing task performances of models in the same model family.
- Score: 2.1550607598300617
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern dense information retrieval (IR) models usually rely on costly large-scale pretraining. In this paper, we introduce LLM2IR, an efficient unsupervised contrastive learning framework to convert any decoder-only large language model (LLM) to an information retrieval model. Despite its simplicity, the effectiveness is proven among different LLMs on multiple IR benchmarks including LoCo, LongEmbed and BEIR. We also find that models with a longer context length tend to have a stronger IR capacity by comparing task performances of models in the same model family. Our work not only provides an effective way to build IR models on the state-of-the-art LLMs, but also shed light on the relationship between information retrieval ability and model context length, which helps the design of better information retrievers.
Related papers
- LightPROF: A Lightweight Reasoning Framework for Large Language Model on Knowledge Graph [57.382255728234064]
Large Language Models (LLMs) have impressive capabilities in text understanding and zero-shot reasoning.<n> Knowledge Graphs (KGs) provide rich and reliable contextual information for the reasoning process of LLMs.<n>We propose a novel Lightweight and efficient Prompt learning-ReasOning Framework for KGQA (LightPROF)
arXiv Detail & Related papers (2025-04-04T03:03:47Z) - DRAMA: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers [86.54316283425001]
Large language models (LLMs) have demonstrated strong effectiveness and robustness while fine-tuned as dense retrievers.<n>LLMs offer better efficiency, but they often fail to generalize effectively with limited supervised fine-tuning data.<n>We introduce DRAMA, a training framework that leverages LLMs to train smaller generalizable dense retrievers.
arXiv Detail & Related papers (2025-02-25T18:59:07Z) - Learning to Reduce: Towards Improving Performance of Large Language Models on Structured Data [39.29778853025738]
Large Language Models (LLMs) have been achieving competent performance on a wide range of downstream tasks.
This paper proposes a framework, Learning to Reduce, that fine-tunes a language model with On-Policy Learning to generate a reduced version of an input structured data.
arXiv Detail & Related papers (2024-07-03T01:51:50Z) - Self-Retrieval: End-to-End Information Retrieval with One Large Language Model [97.71181484082663]
We introduce Self-Retrieval, a novel end-to-end LLM-driven information retrieval architecture.
Self-Retrieval internalizes the retrieval corpus through self-supervised learning, transforms the retrieval process into sequential passage generation, and performs relevance assessment for reranking.
arXiv Detail & Related papers (2024-02-23T18:45:35Z) - Small Language Models as Effective Guides for Large Language Models in Chinese Relation Extraction [7.8274692517258435]
In this paper, we propose SLCoLM, a model collaboration framework, to mitigate the data long-tail problem.<n>In our framework, we use the textitTraining-Guide-Predict'' strategy to combine the strengths of small pre-trained language models (SLMs) and large language models (LLMs)<n>Our experiments on an ancient Chinese RE dataset rich in relation types show that the approach facilitates RE of long-tail relation types.
arXiv Detail & Related papers (2024-02-22T08:26:56Z) - Learning to Reduce: Optimal Representations of Structured Data in
Prompting Large Language Models [42.16047343029512]
Large Language Models (LLMs) have been widely used as general-purpose AI agents.
We propose a framework, Learning to Reduce, that fine-tunes a language model to generate a reduced version of an input context.
We show that our model achieves comparable accuracies in selecting the relevant evidence from an input context.
arXiv Detail & Related papers (2024-02-22T00:41:23Z) - Knowledge Fusion of Large Language Models [73.28202188100646]
This paper introduces the notion of knowledge fusion for large language models (LLMs)
We externalize their collective knowledge and unique strengths, thereby elevating the capabilities of the target model beyond those of any individual source LLM.
Our findings confirm that the fusion of LLMs can improve the performance of the target model across a range of capabilities such as reasoning, commonsense, and code generation.
arXiv Detail & Related papers (2024-01-19T05:02:46Z) - Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering.
The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored.
We propose a framework that enhances the reliability of LLMs as it: 1) generalizes out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z) - LLM-augmented Preference Learning from Natural Language [19.700169351688768]
Large Language Models (LLMs) are equipped to deal with larger context lengths.
LLMs can consistently outperform the SotA when the target text is large.
Few-shot learning yields better performance than zero-shot learning.
arXiv Detail & Related papers (2023-10-12T17:17:27Z) - Small Language Models Improve Giants by Rewriting Their Outputs [18.025736098795296]
We tackle the problem of leveraging training data to improve the performance of large language models (LLMs) without fine-tuning.
We create a pool of candidates from the LLM through few-shot prompting and we employ a compact model, the LM-corrector (LMCor), specifically trained to merge these candidates to produce an enhanced output.
Experiments on four natural language generation tasks demonstrate that even a small LMCor model (250M) substantially improves the few-shot performance of LLMs (62B), matching and even outperforming standard fine-tuning.
arXiv Detail & Related papers (2023-05-22T22:07:50Z) - Augmenting Interpretable Models with LLMs during Training [73.40079895413861]
We propose Augmented Interpretable Models (Aug-imodels) to build efficient and interpretable models.
Aug-imodels use LLMs during fitting but not during inference, allowing complete transparency.
We explore two instantiations of Aug-imodels in natural-language processing: (i) Aug-GAM, which augments a generalized additive model with decoupled embeddings from an LLM and (ii) Aug-Tree, which augments a decision tree with LLM feature expansions.
arXiv Detail & Related papers (2022-09-23T18:36:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.