Related papers: Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training

Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training

URL: http://arxiv.org/abs/2405.20978v1
Date: Fri, 31 May 2024 16:24:53 GMT
Title: Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training
Authors: Feiteng Fang, Yuelin Bai, Shiwen Ni, Min Yang, Xiaojun Chen, Ruifeng Xu,
Abstract summary: Large Language Models (LLMs) exhibit substantial capabilities yet encounter challenges, including hallucination, outdated knowledge, and untraceable reasoning processes. Retrieval-augmented generation (RAG) has emerged as a promising solution, integrating knowledge from external databases to mitigate these challenges. We propose a novel RAG approach known as Retrieval-augmented Adaptive Adrial Training (RAAT)
Score: 39.21885486667879
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) exhibit substantial capabilities yet encounter challenges, including hallucination, outdated knowledge, and untraceable reasoning processes. Retrieval-augmented generation (RAG) has emerged as a promising solution, integrating knowledge from external databases to mitigate these challenges. However, inappropriate retrieved passages can potentially hinder the LLMs' capacity to generate comprehensive and high-quality responses. Prior RAG studies on the robustness of retrieval noises often confine themselves to a limited set of noise types, deviating from real-world retrieval environments and limiting practical applicability. In this study, we initially investigate retrieval noises and categorize them into three distinct types, reflecting real-world environments. We analyze the impact of these various retrieval noises on the robustness of LLMs. Subsequently, we propose a novel RAG approach known as Retrieval-augmented Adaptive Adversarial Training (RAAT). RAAT leverages adaptive adversarial training to dynamically adjust the model's training process in response to retrieval noises. Concurrently, it employs multi-task learning to ensure the model's capacity to internally recognize noisy contexts. Extensive experiments demonstrate that the LLaMA-2 7B model trained using RAAT exhibits significant improvements in F1 and EM scores under diverse noise conditions. For reproducibility, we release our code and data at: https://github.com/calubkk/RAAT.

Related papers

Hide and Seek in Noise Labels: Noise-Robust Collaborative Active Learning with LLM-Powered Assistance [17.359530437698723]
Learning from noisy labels (LNL) is a challenge that arises in many real-world scenarios where collected training data can contain incorrect or corrupted labels. Most existing solutions identify noisy labels and adopt active learning to query human experts on them for denoising. In this paper, we propose an innovative collaborative learning framework NoiseAL based on active learning to combine large language models with small models for learning from noisy labels.
arXiv Detail & Related papers (2025-04-03T04:36:39Z)
Exploring Knowledge Boundaries in Large Language Models for Retrieval Judgment [56.87031484108484]
Large Language Models (LLMs) are increasingly recognized for their practical applications. Retrieval-Augmented Generation (RAG) tackles this challenge and has shown a significant impact on LLMs. By minimizing retrieval requests that yield neutral or harmful results, we can effectively reduce both time and computational costs.
arXiv Detail & Related papers (2024-11-09T15:12:28Z)
Towards Building a Robust Knowledge Intensive Question Answering Model with Large Language Models [4.4849006637642805]
Presence of noise and errors in retrieved information poses challenges to the robustness of LLMs. To address the issue of model accuracy decline caused by noisy external information, we propose a data augmentation-based fine-tuning method. We have conducted experiments on both existing LLMs and our approach, the results are evaluated by GPT-4.
arXiv Detail & Related papers (2024-09-09T07:32:30Z)
Pandora's Box or Aladdin's Lamp: A Comprehensive Analysis Revealing the Role of RAG Noise in Large Language Models [25.850830204451363]
Retrieval-Augmented Generation (RAG) has emerged as a crucial method for addressing hallucinations in large language models (LLMs) In this paper, we define seven distinct noise types from a linguistic perspective and establish a Noise RAG Benchmark (NoiserBench) Our analysis offers insights for developing more robust, adaptable RAG solutions and mitigating hallucinations across diverse retrieval scenarios.
arXiv Detail & Related papers (2024-08-24T09:23:01Z)
MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced Reranking and Noise-injected Training [9.023648972811458]
RagVL is a novel framework with knowledge-enhanced reranking and noise-injected training. We instruction-tune the MLLM with a simple yet effective instruction template to induce its ranking ability. For generation, we inject visual noise during training at the data and token levels to enhance the generator's robustness.
arXiv Detail & Related papers (2024-07-31T08:43:17Z)
InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales [14.655518998487237]
We propose InstructRAG, where LMs explicitly learn the denoising process through self-synthesized rationales. InstructRAG requires no additional supervision, allows for easier verification of the predicted answers. Experiments show InstructRAG consistently outperforms existing RAG methods in both training-free and trainable scenarios.
arXiv Detail & Related papers (2024-06-19T15:25:29Z)
ROPO: Robust Preference Optimization for Large Language Models [59.10763211091664]
We propose an iterative alignment approach that integrates noise-tolerance and filtering of noisy samples without the aid of external models. Experiments on three widely-used datasets with Mistral-7B and Llama-2-7B demonstrate that ROPO significantly outperforms existing preference alignment methods.
arXiv Detail & Related papers (2024-04-05T13:58:51Z)
Large Language Models are Efficient Learners of Noise-Robust Speech Recognition [65.95847272465124]
Recent advances in large language models (LLMs) have promoted generative error correction (GER) for automatic speech recognition (ASR) In this work, we extend the benchmark to noisy conditions and investigate if we can teach LLMs to perform denoising for GER. Experiments on various latest LLMs demonstrate our approach achieves a new breakthrough with up to 53.9% correction improvement in terms of word error rate.
arXiv Detail & Related papers (2024-01-19T01:29:27Z)
ReEval: Automatic Hallucination Evaluation for Retrieval-Augmented Large Language Models via Transferable Adversarial Attacks [91.55895047448249]
This paper presents ReEval, an LLM-based framework using prompt chaining to perturb the original evidence for generating new test cases. We implement ReEval using ChatGPT and evaluate the resulting variants of two popular open-domain QA datasets. Our generated data is human-readable and useful to trigger hallucination in large language models.
arXiv Detail & Related papers (2023-10-19T06:37:32Z)
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection [74.51523859064802]
We introduce a new framework called Self-Reflective Retrieval-Augmented Generation (Self-RAG) Self-RAG enhances an LM's quality and factuality through retrieval and self-reflection. It significantly outperforms state-of-the-art LLMs and retrieval-augmented models on a diverse set of tasks.
arXiv Detail & Related papers (2023-10-17T18:18:32Z)
Revisit Input Perturbation Problems for LLMs: A Unified Robustness Evaluation Framework for Noisy Slot Filling Task [18.623619585980688]
We propose a unified robustness evaluation framework based on the slot-filling task to evaluate the dialogue understanding capability of large language models. Specifically, we construct a input perturbation evaluation dataset, Noise-LLM, which contains five types of single perturbation and four types of mixed perturbation data. Our aim is to assess how well various robustness methods of LLMs perform in real-world noisy scenarios.
arXiv Detail & Related papers (2023-10-10T10:22:05Z)

This list is automatically generated from the titles and abstracts of the papers in this site.