Evidence-Driven Retrieval Augmented Response Generation for Online Misinformation
- URL: http://arxiv.org/abs/2403.14952v1
- Date: Fri, 22 Mar 2024 05:05:45 GMT
- Title: Evidence-Driven Retrieval Augmented Response Generation for Online Misinformation
- Authors: Zhenrui Yue, Huimin Zeng, Yimeng Lu, Lanyu Shang, Yang Zhang, Dong Wang
- Abstract summary: We propose retrieval augmented response generation for online misinformation (RARG).
RARG collects supporting evidence from scientific sources and generates counter-misinformation responses based on that evidence.
We propose a reward function to maximize the utilization of the retrieved evidence while maintaining the quality of the generated text.
- Score: 18.18205773056388
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The proliferation of online misinformation poses significant threats to the public interest. While numerous online users actively participate in combating misinformation, many of their responses lack politeness and supporting facts. As a solution, text generation approaches have been proposed to automatically produce counter-misinformation responses. Nevertheless, existing methods are often trained end-to-end without leveraging external knowledge, resulting in subpar text quality and excessively repetitive responses. In this paper, we propose retrieval augmented response generation for online misinformation (RARG), which collects supporting evidence from scientific sources and generates counter-misinformation responses based on that evidence. In particular, RARG consists of two stages: (1) evidence collection, where we design a retrieval pipeline to retrieve and rerank evidence documents using a database comprising over 1M academic articles; (2) response generation, in which we align large language models (LLMs) to generate evidence-based responses via reinforcement learning from human feedback (RLHF). We propose a reward function to maximize the utilization of the retrieved evidence while maintaining the quality of the generated text, which yields polite and factual responses that clearly refute misinformation. To demonstrate the effectiveness of our method, we study the case of COVID-19 and perform extensive experiments with both in- and cross-domain datasets, where RARG consistently outperforms baselines by generating high-quality counter-misinformation responses.
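The two stages named in the abstract (retrieve and rerank evidence, then score a response by how much evidence it uses and how well it reads) can be illustrated with a toy sketch. This is not the authors' implementation: the token-overlap retriever, the Jaccard reranker, the distinct-token quality proxy, and every function name and the `weight` parameter below are assumptions chosen only to make the idea concrete. The paper itself uses a retrieval pipeline over 1M+ articles and an RLHF-trained LLM.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query, doc):
    """Fraction of distinct query tokens that also occur in doc (toy relevance)."""
    q, d = set(tokenize(query)), set(tokenize(doc))
    return len(q & d) / len(q) if q else 0.0


def retrieve(claim, corpus, k=3):
    """Hypothetical stage 1a: first-pass retrieval by token overlap."""
    ranked = sorted(corpus, key=lambda doc: overlap_score(claim, doc), reverse=True)
    return ranked[:k]


def rerank(claim, candidates, k=1):
    """Hypothetical stage 1b: rerank candidates with a second scorer (Jaccard)."""
    def jaccard(doc):
        q, d = set(tokenize(claim)), set(tokenize(doc))
        union = q | d
        return len(q & d) / len(union) if union else 0.0

    return sorted(candidates, key=jaccard, reverse=True)[:k]


def reward(response, evidence, weight=0.5):
    """Toy reward: weighted mix of evidence utilization and a quality proxy.

    utilization: share of distinct evidence tokens that the response reuses.
    quality: distinct-token ratio of the response, which penalizes repetition.
    Both are stand-ins (assumptions), not the reward defined in the paper.
    """
    utilization = overlap_score(" ".join(evidence), response)
    tokens = tokenize(response)
    quality = len(set(tokens)) / len(tokens) if tokens else 0.0
    return weight * utilization + (1 - weight) * quality


if __name__ == "__main__":
    corpus = [
        "mRNA vaccines do not alter human DNA",
        "Masks reduce transmission of respiratory viruses",
        "Vaccines are tested in clinical trials",
    ]
    claim = "COVID vaccines alter human DNA"
    evidence = rerank(claim, retrieve(claim, corpus, k=2), k=1)
    print(evidence)
    print(reward("Studies show mRNA vaccines do not alter human DNA", evidence))
```

In an RLHF setup, a scalar like the one returned by `reward` would be the training signal for the generator; the weighted sum here simply shows how evidence use and text quality can be traded off in one number.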
Related papers
- Contrastive Learning to Improve Retrieval for Real-world Fact Checking [84.57583869042791]
We present Contrastive Fact-Checking Reranker (CFR), an improved retriever for fact-checking complex claims.
We leverage the AVeriTeC dataset, which annotates subquestions for claims with human written answers from evidence documents.
We find a 6% improvement in veracity classification accuracy on the dataset.
arXiv Detail & Related papers (2024-10-07T00:09:50Z)
- Retrieval Augmented Fact Verification by Synthesizing Contrastive Arguments [23.639378586798884]
We propose retrieval augmented fact verification through the synthesis of contrasting arguments.
Our method effectively retrieves relevant documents as evidence and evaluates arguments from varying perspectives.
We demonstrate the effectiveness of our method through extensive experiments, where RAFTS can outperform GPT-based methods with a significantly smaller 7B LLM.
arXiv Detail & Related papers (2024-06-14T08:13:34Z)
- Understanding Retrieval Augmentation for Long-Form Question Answering [44.19142029392175]
We present a study of retrieval-augmented language models (LMs) on long-form question answering.
We analyze how retrieval augmentation impacts different LMs, by comparing answers generated from models while using the same evidence documents.
arXiv Detail & Related papers (2023-10-18T17:59:10Z)
- Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection [74.51523859064802]
We introduce a new framework called Self-Reflective Retrieval-Augmented Generation (Self-RAG).
Self-RAG enhances an LM's quality and factuality through retrieval and self-reflection.
It significantly outperforms state-of-the-art LLMs and retrieval-augmented models on a diverse set of tasks.
arXiv Detail & Related papers (2023-10-17T18:18:32Z)
- Promoting Open-domain Dialogue Generation through Learning Pattern Information between Contexts and Responses [5.936682548344234]
This paper improves the quality of generated responses by learning the implicit pattern information between contexts and responses in the training samples.
We also design a response-aware mechanism for mining the implicit pattern information between contexts and responses so that the generated replies are more diverse and closer to human replies.
arXiv Detail & Related papers (2023-09-06T08:11:39Z)
- Reinforcement Learning-based Counter-Misinformation Response Generation: A Case Study of COVID-19 Vaccine Misinformation [19.245814221211415]
Non-expert ordinary users act as eyes-on-the-ground who proactively counter misinformation.
In this work, we create two novel datasets of misinformation and counter-misinformation response pairs.
We propose MisinfoCorrect, a reinforcement learning-based framework that learns to generate counter-misinformation responses.
arXiv Detail & Related papers (2023-03-11T15:55:01Z)
- Reranking Overgenerated Responses for End-to-End Task-Oriented Dialogue Systems [71.33737787564966]
End-to-end (E2E) task-oriented dialogue (ToD) systems are prone to falling into the so-called 'likelihood trap'.
We propose a reranking method which aims to select high-quality items from the lists of responses initially overgenerated by the system.
Our methods improve a state-of-the-art E2E ToD system by 2.4 BLEU, 3.2 ROUGE, and 2.8 METEOR scores, achieving new peak results.
arXiv Detail & Related papers (2022-11-07T15:59:49Z)
- Rome was built in 1776: A Case Study on Factual Correctness in Knowledge-Grounded Response Generation [18.63673852470077]
We present a human annotation setup to identify three different response types.
We automatically create a new corpus called Conv-FEVER that is adapted from the Wizard of Wikipedia dataset.
arXiv Detail & Related papers (2021-10-11T17:48:11Z)
- REM-Net: Recursive Erasure Memory Network for Commonsense Evidence Refinement [130.8875535449478]
REM-Net is equipped with a module to refine the evidence by erasing the low-quality evidence that does not explain the question answering.
Instead of retrieving evidence from existing knowledge bases, REM-Net leverages a pre-trained generative model to generate candidate evidence customized for the question.
The results demonstrate the performance of REM-Net and show that the refined evidence is explainable.
arXiv Detail & Related papers (2020-12-24T10:07:32Z)
- A Controllable Model of Grounded Response Generation [122.7121624884747]
Current end-to-end neural conversation models inherently lack the flexibility to impose semantic control in the response generation process.
We propose a framework that we call controllable grounded response generation (CGRG).
We show that using this framework, a transformer based model with a novel inductive attention mechanism, trained on a conversation-like Reddit dataset, outperforms strong generation baselines.
arXiv Detail & Related papers (2020-05-01T21:22:08Z)
- Counterfactual Off-Policy Training for Neural Response Generation [94.76649147381232]
We propose to explore potential responses by counterfactual reasoning.
Training on the counterfactual responses under the adversarial learning framework helps to explore the high-reward area of the potential response space.
An empirical study on the DailyDialog dataset shows that our approach significantly outperforms the HRED model.
arXiv Detail & Related papers (2020-04-29T22:46:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.