MST-R: Multi-Stage Tuning for Retrieval Systems and Metric Evaluation
- URL: http://arxiv.org/abs/2412.10313v1
- Date: Fri, 13 Dec 2024 17:53:29 GMT
- Title: MST-R: Multi-Stage Tuning for Retrieval Systems and Metric Evaluation
- Authors: Yash Malviya, Karan Dhingra, Maneesh Singh
- Abstract summary: We present a system that adapts the retriever performance to the target domain using a multi-stage tuning strategy.
We benchmark the system performance on the dataset released for the RIRAG challenge.
We achieve significant performance gains, obtaining a top rank on the RegNLP challenge leaderboard.
- Score: 7.552430488883876
- License:
- Abstract: Regulatory documents are rich in nuanced terminology and specialized semantics. FRAG systems, i.e., frozen retrieval-augmented generators built from pre-trained (frozen) components, consequently struggle with both retrieval and answering performance. We present a system that adapts retriever performance to the target domain using a multi-stage tuning (MST) strategy. Our retrieval approach, called MST-R, (a) first fine-tunes the encoders used in the vector stores via hard negative mining, (b) then uses a hybrid retriever that combines sparse and dense retrievers with reciprocal rank fusion, and (c) finally adapts the cross-attention encoder by fine-tuning it on only the top-k retrieved results. We benchmark the system on the dataset released for the RIRAG challenge (part of the RegNLP workshop at COLING 2025). We achieve significant performance gains, obtaining a top rank on the RegNLP challenge leaderboard. We also show that a trivial answering approach games the RePASs metric, outscoring all baselines and a pre-trained Llama model. Analyzing this anomaly, we present important takeaways for future research.
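Stage (b) of the abstract names reciprocal rank fusion (RRF) for combining the sparse and dense rankings. Below is a minimal RRF sketch; the constant k=60, the function name, and the example passage ids are illustrative assumptions, not the authors' implementation.

```python
# Minimal reciprocal rank fusion (RRF) sketch for combining a sparse (e.g. BM25)
# ranking with a dense (vector-store) ranking. k=60 is the value commonly used
# in the RRF literature; the paper may use different settings.

def reciprocal_rank_fusion(rankings, k=60):
    """rankings: list of ranked lists of passage ids (best first)."""
    scores = {}
    for ranking in rankings:
        for rank, passage_id in enumerate(ranking, start=1):
            scores[passage_id] = scores.get(passage_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical usage: fuse the two retrievers' lists, then hand the fused top-k
# to the fine-tuned cross-attention encoder for reranking (stage (c)).
sparse_ranking = ["p7", "p2", "p9"]   # e.g. from BM25
dense_ranking = ["p2", "p4", "p7"]    # e.g. from the fine-tuned dense encoder
print(reciprocal_rank_fusion([sparse_ranking, dense_ranking]))
```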
Related papers
- Improving Retrieval-Augmented Generation through Multi-Agent Reinforcement Learning [51.54046200512198]
Retrieval-augmented generation (RAG) is extensively utilized to incorporate external, current knowledge into large language models.
A standard RAG pipeline may comprise several components, such as query rewriting, document retrieval, document filtering, and answer generation.
To jointly optimize these components, we propose treating the RAG pipeline as a multi-agent cooperative task, with each component regarded as an RL agent.
arXiv Detail & Related papers (2025-01-25T14:24:50Z) - Chain-of-Retrieval Augmented Generation [72.06205327186069]
This paper introduces an approach for training o1-like RAG models that retrieve and reason over relevant information step by step before generating the final answer.
Our proposed method, CoRAG, allows the model to dynamically reformulate the query based on the evolving state.
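A hedged sketch of the chain-of-retrieval idea summarized above: the model alternates retrieval and reasoning, rewriting the query from the evolving state before producing the final answer. The `retrieve`, `reason`, and `generate_answer` helpers are hypothetical placeholders, not the CoRAG implementation.

```python
# Illustrative chain-of-retrieval loop (not the CoRAG implementation): the query
# is reformulated from the evolving reasoning state before each retrieval step.

def chain_of_retrieval(question, retrieve, reason, generate_answer, max_steps=4):
    state = {"question": question, "evidence": [], "query": question}
    for _ in range(max_steps):
        passages = retrieve(state["query"])      # retrieve with the current query
        state["evidence"].extend(passages)
        step = reason(state)                     # intermediate reasoning over evidence
        if step.get("done"):                     # model decides it has enough evidence
            break
        state["query"] = step["next_query"]      # dynamic query reformulation
    return generate_answer(state)
```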
arXiv Detail & Related papers (2025-01-24T09:12:52Z) - Inference Scaling for Bridging Retrieval and Augmented Generation [47.091086803980765]
Retrieval-augmented generation (RAG) has emerged as a popular approach to steering the output of a large language model (LLM).
We show that such bias can be mitigated through inference scaling: aggregating inference calls over permuted orderings of the retrieved contexts.
We showcase the effectiveness of MOI on diverse RAG tasks, improving ROUGE-L on MS MARCO and EM on HotpotQA benchmarks by 7 points.
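A minimal sketch of the inference-scaling idea above: run the generator over several permutations of the retrieved contexts and aggregate the answers, here by majority vote. The helper names and the voting rule are assumptions, not the paper's MOI procedure.

```python
import itertools
from collections import Counter

# Mitigate ordering bias by aggregating answers generated from permuted
# orderings of the retrieved contexts (majority vote is an illustrative
# aggregation rule; the paper's aggregation may differ).

def permute_and_aggregate(question, contexts, generate, max_calls=6):
    answers = []
    for perm in itertools.islice(itertools.permutations(contexts), max_calls):
        answers.append(generate(question, list(perm)))  # one inference call per ordering
    return Counter(answers).most_common(1)[0][0]        # majority-voted answer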
arXiv Detail & Related papers (2024-12-14T05:06:43Z) - Towards Competitive Search Relevance For Inference-Free Learned Sparse Retrievers [6.773411876899064]
Inference-free sparse models lag far behind both sparse and dense siamese models in terms of search relevance.
We propose two different approaches for performance improvement. First, we introduce the IDF-aware FLOPS loss, which incorporates Inverted Document Frequency (IDF) into the sparsification of representations.
We find that it mitigates the negative impact of the FLOPS regularization on search relevance, allowing the model to achieve a better balance between accuracy and efficiency.
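A heavily hedged sketch of how IDF might enter the FLOPS regularizer, assuming the penalty on each vocabulary term is scaled down by its IDF so that rare, informative terms are pruned less aggressively; the exact weighting in the paper may differ.

```python
import torch

# Standard FLOPS regularization penalizes the squared mean activation of every
# vocabulary term over a batch of sparse representations. This sketch assumes an
# IDF-aware variant that divides each term's penalty by its IDF.

def idf_aware_flops_loss(term_weights, idf):
    """term_weights: (batch, vocab) non-negative sparse activations.
    idf: (vocab,) inverse document frequencies."""
    mean_activation = term_weights.mean(dim=0)       # a_j = mean_i w_ij
    flops = mean_activation.pow(2)                   # standard FLOPS term per vocab entry
    return (flops / idf.clamp(min=1e-6)).sum()       # assumed: rarer terms penalized less
```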
arXiv Detail & Related papers (2024-11-07T03:46:43Z) - FunnelRAG: A Coarse-to-Fine Progressive Retrieval Paradigm for RAG [22.4664221738095]
The widely used retrieval paradigm remains flat: it treats retrieval as a one-off procedure with constant granularity.
We propose a progressive retrieval paradigm with coarse-to-fine granularity for RAG, termed FunnelRAG.
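A hedged sketch of a coarse-to-fine retrieval funnel in the spirit of the summary above: a cheap stage narrows the corpus at document granularity, and a finer stage re-retrieves passages only within the surviving documents. The stage functions, cut-offs, and result schema are illustrative assumptions, not FunnelRAG's actual pipeline.

```python
# Illustrative coarse-to-fine retrieval funnel (not FunnelRAG's pipeline).
# fine_retriever is assumed to return dicts with a "score" field.

def funnel_retrieve(query, coarse_retriever, fine_retriever, coarse_k=100, fine_k=10):
    docs = coarse_retriever(query, top_k=coarse_k)           # coarse granularity: documents
    passages = []
    for doc in docs:
        passages.extend(fine_retriever(query, within=doc))   # fine granularity: passages
    passages.sort(key=lambda p: p["score"], reverse=True)    # keep the best passages overall
    return passages[:fine_k]
```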
arXiv Detail & Related papers (2024-10-14T08:47:21Z) - FIRST: Faster Improved Listwise Reranking with Single Token Decoding [56.727761901751194]
First, we introduce FIRST, a novel listwise LLM reranking approach leveraging the output logits of the first generated identifier to directly obtain a ranked ordering of the candidates.
Empirical results demonstrate that FIRST accelerates inference by 50% while maintaining a robust ranking performance with gains across the BEIR benchmark.
Our results show that LLM rerankers can provide a stronger distillation signal compared to cross-encoders, yielding substantial improvements in retriever recall after relevance feedback.
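A hedged sketch of single-token listwise reranking as summarized above: candidates are labeled with identifiers, and the logits of the first generated identifier token induce a ranking in one forward pass. The prompt template and the model name are illustrative assumptions, not the released FIRST setup.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Rank candidates from the logits of the first generated identifier token,
# avoiding full autoregressive decoding of a ranked list.
model_name = "gpt2"  # placeholder; FIRST fine-tunes larger instruction-tuned LLMs
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def rerank_single_token(query, passages):
    labels = [chr(ord("A") + i) for i in range(len(passages))]
    prompt = f"Query: {query}\n" + "".join(
        f"[{label}] {passage}\n" for label, passage in zip(labels, passages)
    ) + "The most relevant passage is ["
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]                  # next-token logits
    label_ids = [tokenizer.encode(l, add_special_tokens=False)[0] for l in labels]
    order = torch.argsort(logits[label_ids], descending=True)   # rank by identifier logit
    return [passages[i] for i in order.tolist()]
```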
arXiv Detail & Related papers (2024-06-21T21:27:50Z) - Blended RAG: Improving RAG (Retriever-Augmented Generation) Accuracy with Semantic Search and Hybrid Query-Based Retrievers [0.0]
Retrieval-Augmented Generation (RAG) is a prevalent approach for infusing Large Language Models (LLMs) with a private knowledge base of documents to build Generative Q&A (Question-Answering) systems.
We propose the 'Blended RAG' method of leveraging semantic search techniques, such as Vector indexes and Sparse indexes, blended with hybrid query strategies.
Our study achieves better retrieval results and sets new benchmarks for IR (Information Retrieval) datasets like NQ and TREC-COVID datasets.
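A minimal sketch of blending a sparse index with a dense vector index at the score level, as a complement to the rank-level RRF shown earlier: min-max normalize each score list, then take a convex combination. The weight alpha and the normalization are assumptions, not the specific hybrid query strategies evaluated in the Blended RAG paper.

```python
# Illustrative score-level blend of a sparse (BM25-style) index and a dense
# vector index; both inputs map passage id -> raw score.

def blend_scores(sparse_scores, dense_scores, alpha=0.5):
    def normalize(scores):
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {pid: (s - lo) / span for pid, s in scores.items()}
    sparse, dense = normalize(sparse_scores), normalize(dense_scores)
    ids = set(sparse) | set(dense)
    blended = {pid: alpha * sparse.get(pid, 0.0) + (1 - alpha) * dense.get(pid, 0.0)
               for pid in ids}
    return sorted(blended, key=blended.get, reverse=True)
```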
arXiv Detail & Related papers (2024-03-22T17:13:46Z) - Boot and Switch: Alternating Distillation for Zero-Shot Dense Retrieval [50.47192086219752]
ABEL is a simple but effective unsupervised method to enhance passage retrieval in zero-shot settings.
By either fine-tuning ABEL on labelled data or integrating it with existing supervised dense retrievers, we achieve state-of-the-art results.
arXiv Detail & Related papers (2023-11-27T06:22:57Z) - ReFIT: Relevance Feedback from a Reranker during Inference [109.33278799999582]
Retrieve-and-rerank is a prevalent framework in neural information retrieval.
We propose to leverage the reranker to improve recall by making it provide relevance feedback to the retriever at inference time.
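A hedged sketch of reranker-to-retriever relevance feedback at inference time, in the spirit of the summary above: the query embedding is nudged with a few gradient steps so the retriever's score distribution over the first-pass candidates moves toward the reranker's, and retrieval is then repeated with the updated query. The KL objective, step size, and step count are illustrative choices, not necessarily the ReFIT recipe.

```python
import torch
import torch.nn.functional as F

# Inference-time relevance feedback: refine the query embedding so the
# retriever's scores over the candidates match the reranker's, then run a
# second retrieval pass with the refined vector to improve recall.

def refine_query(query_emb, cand_embs, reranker_scores, steps=3, lr=0.1):
    """query_emb: (d,), cand_embs: (k, d), reranker_scores: (k,) from the cross-encoder."""
    q = query_emb.clone().detach().requires_grad_(True)
    target = F.softmax(reranker_scores, dim=-1)                 # reranker distribution
    optimizer = torch.optim.Adam([q], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        retriever_logp = F.log_softmax(cand_embs @ q, dim=-1)   # retriever distribution
        loss = F.kl_div(retriever_logp, target, reduction="sum")
        loss.backward()
        optimizer.step()
    return q.detach()
```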
arXiv Detail & Related papers (2023-05-19T15:30:33Z) - Adversarial Retriever-Ranker for dense text retrieval [51.87158529880056]
We present Adversarial Retriever-Ranker (AR2), which consists of a dual-encoder retriever plus a cross-encoder ranker.
AR2 consistently and significantly outperforms existing dense retriever methods.
This includes improvements on Natural Questions R@5 to 77.9% (+2.1%), TriviaQA R@5 to 78.2% (+1.4%), and MS-MARCO MRR@10 to 39.5% (+1.3%).
arXiv Detail & Related papers (2021-10-07T16:41:15Z)