RMIT-ADM+S at the SIGIR 2025 LiveRAG Challenge
- URL: http://arxiv.org/abs/2506.14516v2
- Date: Wed, 23 Jul 2025 21:41:29 GMT
- Title: RMIT-ADM+S at the SIGIR 2025 LiveRAG Challenge
- Authors: Kun Ran, Shuoqi Sun, Khoi Nguyen Dinh Anh, Damiano Spina, Oleg Zendel
- Abstract summary: This paper presents the RMIT-ADM+S winning system in the SIGIR 2025 LiveRAG Challenge. Our Generation-Retrieval-Augmented Generation (G-RAG) approach generates a hypothetical answer that is used during the retrieval phase, alongside the original question. G-RAG also incorporates a pointwise large language model (LLM)-based re-ranking step prior to final answer generation.
- Score: 4.364909807482374
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: This paper presents the RMIT-ADM+S winning system in the SIGIR 2025 LiveRAG Challenge. Our Generation-Retrieval-Augmented Generation (G-RAG) approach generates a hypothetical answer that is used during the retrieval phase, alongside the original question. G-RAG also incorporates a pointwise large language model (LLM)-based re-ranking step prior to final answer generation. We describe the system architecture and the rationale behind our design choices. In particular, a systematic evaluation using the Grid of Points approach and N-way ANOVA enabled a controlled comparison of multiple configurations, including query variant generation, question decomposition, rank fusion strategies, and prompting techniques for answer generation. The submitted system achieved the highest Borda score based on the aggregation of Coverage, Relatedness, and Quality scores from manual evaluations, ranking first in the SIGIR 2025 LiveRAG Challenge.
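As a concrete illustration of the pipeline the abstract describes (hypothetical-answer retrieval followed by pointwise LLM re-ranking), here is a minimal sketch. The `generate` and `search` callables, prompts, and cutoffs are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the G-RAG flow described in the abstract: a hypothetical
# answer is generated first and used alongside the question for retrieval,
# followed by pointwise LLM re-ranking. `generate` and `search` are
# hypothetical placeholders, not the authors' actual interfaces.
from typing import Callable

def g_rag(question: str,
          generate: Callable[[str], str],
          search: Callable[[str, int], list[str]],
          k: int = 50, top_n: int = 5) -> str:
    # 1. Generate a hypothetical answer to enrich the retrieval query.
    hypothetical = generate(f"Write a short plausible answer to: {question}")

    # 2. Retrieve with both the original question and the hypothetical
    #    answer, then deduplicate while preserving order.
    candidates = search(question, k) + search(hypothetical, k)
    passages = list(dict.fromkeys(candidates))

    # 3. Pointwise re-ranking: score each passage independently with the LLM.
    def score(p: str) -> float:
        reply = generate(
            f"On a scale of 0-10, how relevant is this passage to the "
            f"question?\nQuestion: {question}\nPassage: {p}\nScore:"
        )
        try:
            return float(reply.strip().split()[0])
        except (ValueError, IndexError):
            return 0.0

    reranked = sorted(passages, key=score, reverse=True)[:top_n]

    # 4. Final answer generation grounded in the top-ranked passages.
    context = "\n\n".join(reranked)
    return generate(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
```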
Related papers
- Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs [69.10441885629787]
Retrieval-Augmented Generation (RAG) lifts the factuality of Large Language Models (LLMs) by injecting external knowledge. It falls short on problems that demand multi-step inference; conversely, purely reasoning-oriented approaches often hallucinate or mis-ground facts. This survey synthesizes both strands under a unified reasoning-retrieval perspective.
arXiv Detail & Related papers (2025-07-13T03:29:41Z) - PreQRAG -- Classify and Rewrite for Enhanced RAG [1.652907918484303]
We introduce PreQRAG, a Retrieval Augmented Generation architecture designed to improve retrieval and generation quality. PreQRAG incorporates a pipeline that first classifies each input question as either single-document or multi-document type. For single-document questions, we employ question rewriting techniques to improve retrieval precision and generation relevance. For multi-document questions, we decompose complex queries into focused sub-questions that can be processed more effectively.
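A minimal sketch of the classify-and-rewrite idea summarized above; the classifier prompt and the `llm` callable are assumptions for illustration, not PreQRAG's actual components.

```python
# Sketch of a classify-then-rewrite/decompose preprocessing step. The
# prompts and the `llm` callable are illustrative assumptions.
from typing import Callable

def preprocess_question(question: str, llm: Callable[[str], str]) -> list[str]:
    label = llm(
        "Does answering this question require one document or several? "
        f"Reply 'single' or 'multi'.\nQuestion: {question}"
    ).strip().lower()

    if label.startswith("single"):
        # Single-document: rewrite the question for retrieval precision.
        return [llm(f"Rewrite this question to be precise and self-contained: {question}")]

    # Multi-document: decompose into focused sub-questions, one per line.
    subs = llm(f"Decompose into simple sub-questions, one per line: {question}")
    return [s.strip() for s in subs.splitlines() if s.strip()]
```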
arXiv Detail & Related papers (2025-06-20T22:02:05Z) - TopClustRAG at SIGIR 2025 LiveRAG Challenge [2.56711111236449]
TopClustRAG is a retrieval-augmented generation (RAG) system developed for the LiveRAG Challenge. Our system employs a hybrid retrieval strategy combining sparse and dense indices, followed by K-Means clustering to group semantically similar passages.
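The clustering step might look like the following sketch, which merges sparse and dense results and keeps one representative passage per K-Means cluster; `sparse_search`, `dense_search`, and `embed` are hypothetical interfaces, not TopClustRAG's actual code.

```python
# Illustrative sketch: merge sparse and dense results, embed the passages,
# cluster with K-Means, and keep the passage nearest each centroid.
import numpy as np
from sklearn.cluster import KMeans

def cluster_representatives(question, sparse_search, dense_search, embed,
                            k=8, per_index=50):
    # Union of both retrievers, deduplicated with order preserved.
    passages = list(dict.fromkeys(
        sparse_search(question, per_index) + dense_search(question, per_index)
    ))
    vecs = np.array([embed(p) for p in passages])
    km = KMeans(n_clusters=min(k, len(passages)), n_init=10).fit(vecs)
    reps = []
    for c in range(km.n_clusters):
        idx = np.where(km.labels_ == c)[0]
        # Representative = passage closest to the cluster centroid.
        dists = np.linalg.norm(vecs[idx] - km.cluster_centers_[c], axis=1)
        reps.append(passages[idx[np.argmin(dists)]])
    return reps
```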
arXiv Detail & Related papers (2025-06-18T08:24:27Z) - RAGtifier: Evaluating RAG Generation Approaches of State-of-the-Art RAG Systems for the SIGIR LiveRAG Competition [0.0]
The LiveRAG 2025 challenge explores RAG solutions to maximize accuracy on DataMorgana's QA pairs. The challenge provides access to sparse OpenSearch and dense Pinecone indices of the Fineweb 10BT dataset. Our solution achieved a correctness score of 1.13 and a faithfulness score of 0.55, placing fourth in the SIGIR 2025 LiveRAG Challenge.
arXiv Detail & Related papers (2025-06-17T11:14:22Z) - CIIR@LiveRAG 2025: Optimizing Multi-Agent Retrieval Augmented Generation through Self-Training [18.787703082459046]
mRAG is a multi-agent retrieval-augmented generation framework composed of specialized agents for subtasks such as planning, searching, reasoning, and coordination. Evaluated on DataMorgana-derived datasets during the SIGIR 2025 LiveRAG competition, mRAG outperforms conventional RAG baselines.
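A loose sketch of a coordinator-driven multi-agent loop in the spirit of this summary; the agent callables, action schema, and stopping rule are assumptions, not mRAG's actual design.

```python
# Hypothetical multi-agent RAG loop: a planning agent routes between
# searching and answering; a reasoning agent produces the final answer.
def mrag_answer(question, plan_agent, search_agent, reason_agent, max_steps=4):
    notes = []
    for _ in range(max_steps):
        step = plan_agent(question, notes)   # e.g. {"action": "search", "query": ...}
        if step["action"] == "search":
            notes.append(search_agent(step["query"]))
        elif step["action"] == "answer":
            break
    return reason_agent(question, notes)     # final grounded answer
```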
arXiv Detail & Related papers (2025-06-12T16:02:29Z) - ESGenius: Benchmarking LLMs on Environmental, Social, and Governance (ESG) and Sustainability Knowledge [53.18163869901266]
ESGenius is a benchmark for evaluating and enhancing the proficiency of Large Language Models (LLMs) in Environmental, Social, and Governance (ESG) and sustainability knowledge. ESGenius comprises two key components: ESGenius-QA and ESGenius-Corpus.
arXiv Detail & Related papers (2025-06-02T13:19:09Z) - NTIRE 2025 Challenge on Image Super-Resolution ($\times$4): Methods and Results [159.15538432295656]
The NTIRE 2025 image super-resolution ($\times$4) challenge is one of the associated competitions of the 10th NTIRE Workshop at CVPR 2025. The challenge aims to recover high-resolution (HR) images from low-resolution (LR) counterparts generated through bicubic downsampling with a $\times$4 scaling factor. A total of 286 participants registered for the competition, with 25 teams submitting valid entries.
arXiv Detail & Related papers (2025-04-20T12:08:22Z) - Chain-of-Retrieval Augmented Generation [72.06205327186069]
This paper introduces an approach for training o1-like RAG models that retrieve and reason over relevant information step by step before generating the final answer. Our proposed method, CoRAG, allows the model to dynamically reformulate the query based on the evolving state.
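One way to realize such a chain-of-retrieval loop at inference time is sketched below; the prompts, the `llm`/`retrieve` callables, and the hop limit are illustrative assumptions, not CoRAG's training procedure.

```python
# Sketch of an iterative retrieve-reason loop: after each retrieval hop,
# the model either answers or reformulates the query from the evidence so far.
def chain_of_retrieval(question, llm, retrieve, max_hops=3):
    state, query = [], question
    for _ in range(max_hops):
        state.extend(retrieve(query, 5))
        thought = llm(
            "Given the question and evidence so far, output either "
            "'ANSWER: <answer>' or 'QUERY: <next search query>'.\n"
            f"Question: {question}\nEvidence: {' '.join(state)}"
        )
        if thought.startswith("ANSWER:"):
            return thought[len("ANSWER:"):].strip()
        query = thought.removeprefix("QUERY:").strip()
    # Hop budget exhausted: answer from accumulated evidence.
    return llm(f"Answer using evidence: {' '.join(state)}\nQuestion: {question}")
```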
arXiv Detail & Related papers (2025-01-24T09:12:52Z) - MST-R: Multi-Stage Tuning for Retrieval Systems and Metric Evaluation [7.552430488883876]
We present a system that adapts the retriever performance to the target domain using a multi-stage tuning strategy. We benchmark the system performance on the dataset released for the RIRAG challenge. We achieve significant performance gains, obtaining a top rank on the RegNLP challenge leaderboard.
arXiv Detail & Related papers (2024-12-13T17:53:29Z) - Retrieval-Augmented Generation for Domain-Specific Question Answering: A Case Study on Pittsburgh and CMU [3.1787418271023404]
We designed a Retrieval-Augmented Generation (RAG) system to provide large language models with relevant documents for answering domain-specific questions.
We extracted over 1,800 subpages using a greedy scraping strategy and employed a hybrid annotation process, combining manual and Mistral-generated question-answer pairs.
Our RAG framework integrates BM25 and FAISS retrievers, enhanced with a reranker for improved document retrieval accuracy.
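Merging rankings from BM25 and FAISS is commonly done with reciprocal rank fusion (RRF); the sketch below shows RRF only, and whether this system uses RRF (as opposed to another fusion scheme) is an assumption for illustration.

```python
# Reciprocal rank fusion: each document's fused score is the sum of
# 1 / (k + rank) over every ranking it appears in (ranks are 0-based here).
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Usage: fused = reciprocal_rank_fusion([bm25_ids, faiss_ids])
```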
arXiv Detail & Related papers (2024-11-20T20:10:43Z) - Do RAG Systems Cover What Matters? Evaluating and Optimizing Responses with Sub-Question Coverage [74.70255719194819]
We introduce a novel framework based on sub-question coverage, which measures how well a RAG system addresses different facets of a question.
We use this framework to evaluate three commercial generative answer engines: You.com, Perplexity AI, and Bing Chat.
We find that while all answer engines cover core sub-questions more often than background or follow-up ones, they still miss around 50% of core sub-questions.
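The coverage measurement could be approximated as below; the LLM-based decomposition and yes/no check are assumptions standing in for the paper's actual protocol.

```python
# Sketch of sub-question coverage: decompose the question, then count the
# fraction of sub-questions the answer addresses. Prompts are illustrative.
def sub_question_coverage(question, answer, llm) -> float:
    subs = [s.strip() for s in
            llm(f"List the core sub-questions of: {question}").splitlines()
            if s.strip()]
    if not subs:
        return 0.0
    covered = sum(
        llm(f"Does this answer address the sub-question? Reply yes/no.\n"
            f"Sub-question: {s}\nAnswer: {answer}").strip().lower().startswith("yes")
        for s in subs
    )
    return covered / len(subs)
```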
arXiv Detail & Related papers (2024-10-20T22:59:34Z) - Toward General Instruction-Following Alignment for Retrieval-Augmented Generation [63.611024451010316]
Following natural instructions is crucial for the effective application of Retrieval-Augmented Generation (RAG) systems.
We propose VIF-RAG, the first automated, scalable, and verifiable synthetic pipeline for instruction-following alignment in RAG systems.
arXiv Detail & Related papers (2024-10-12T16:30:51Z) - RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework [66.93260816493553]
This paper introduces RAGEval, a framework designed to assess RAG systems across diverse scenarios. With a focus on factual accuracy, we propose three novel metrics: Completeness, Hallucination, and Irrelevance. Experimental results show that RAGEval outperforms zero-shot and one-shot methods in terms of clarity, safety, conformity, and richness of generated samples.
arXiv Detail & Related papers (2024-08-02T13:35:11Z) - CRAG -- Comprehensive RAG Benchmark [58.15980697921195]
Retrieval-Augmented Generation (RAG) has recently emerged as a promising solution to alleviate Large Language Models' (LLMs) lack of knowledge.
Existing RAG datasets do not adequately represent the diverse and dynamic nature of real-world Question Answering (QA) tasks.
To bridge this gap, we introduce the Comprehensive RAG Benchmark (CRAG).
CRAG is a factual question answering benchmark of 4,409 question-answer pairs and mock APIs to simulate web and Knowledge Graph (KG) search.
arXiv Detail & Related papers (2024-06-07T08:43:07Z) - DuetRAG: Collaborative Retrieval-Augmented Generation [57.440772556318926]
We propose DuetRAG, a Collaborative Retrieval-Augmented Generation framework. Its bootstrapping philosophy is to simultaneously integrate domain fine-tuning and RAG models.
arXiv Detail & Related papers (2024-05-12T09:48:28Z) - Sequencing Matters: A Generate-Retrieve-Generate Model for Building Conversational Agents [9.191944519634111]
This paper describes the work the Georgetown InfoSense group has done to address the challenges presented by TREC iKAT 2023.
Our submitted runs outperform the median runs by a significant margin, exhibiting superior performance in nDCG at various rank cutoffs and in overall success rate.
Our solution involves the use of Large Language Models (LLMs) for initial answers, answer grounding by BM25, passage quality filtering by logistic regression, and answer generation by LLMs again.
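The four-stage sequence reads naturally as a pipeline; below is a minimal sketch in which `llm`, `bm25_search`, `featurize`, and the pre-trained `quality_clf` are hypothetical placeholders, not the group's actual components.

```python
# Sketch of the generate-retrieve-generate sequence: an initial LLM answer
# seeds BM25 retrieval, a logistic-regression filter screens passage quality,
# and the LLM answers again grounded in the surviving passages.
from sklearn.linear_model import LogisticRegression

def generate_retrieve_generate(question, llm, bm25_search, featurize,
                               quality_clf: LogisticRegression, k=20):
    draft = llm(question)                # 1. initial (ungrounded) answer
    passages = bm25_search(draft, k)     # 2. ground the draft via BM25
    kept = [p for p in passages          # 3. keep passages the trained
            if quality_clf.predict([featurize(p)])[0] == 1]  # classifier accepts
    context = "\n".join(kept)
    return llm(f"Context:\n{context}\nQuestion: {question}")  # 4. final answer
```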
arXiv Detail & Related papers (2023-11-16T02:37:58Z) - The IMS-CUBoulder System for the SIGMORPHON 2020 Shared Task on Unsupervised Morphological Paradigm Completion [27.37360427124081]
We present the systems of the University of Stuttgart IMS and the University of Colorado Boulder for SIGMORPHON 2020 Task 2 on unsupervised morphological paradigm completion.
The task consists of generating the morphological paradigms of a set of lemmas, given only the lemmas themselves and unlabeled text.
Our pointer-generator system obtains the best score of all seven submitted systems on average over all languages, and outperforms the official baseline, which was best overall, on Bulgarian and Kannada.
arXiv Detail & Related papers (2020-05-25T21:23:52Z)