Related papers: RAEmoLLM: Retrieval Augmented LLMs for Cross-Domain Misinformation Detection Using In-Context Learning Based on Emotional Information

URL: http://arxiv.org/abs/2406.11093v2
Date: Sat, 31 May 2025 09:54:19 GMT
Title: RAEmoLLM: Retrieval Augmented LLMs for Cross-Domain Misinformation Detection Using In-Context Learning Based on Emotional Information
Authors: Zhiwei Liu, Kailai Yang, Qianqian Xie, Christine de Kock, Sophia Ananiadou, Eduard Hovy
Abstract summary: Methods for cross-domain misinformation detection rely on effort- and resource-intensive fine-tuning and complex model structures. We propose RAEmoLLM, the first retrieval augmented (RAG) LLMs framework to address cross-domain misinformation detection using in-context learning based on affective information. RAEmoLLM achieves significant improvements compared to the other few-shot methods on three datasets.
Score: 36.059869205457815
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Misinformation is prevalent in various fields such as education, politics, and health, causing significant harm to society. However, current methods for cross-domain misinformation detection rely on effort- and resource-intensive fine-tuning and complex model structures. With the outstanding performance of LLMs, many studies have employed them for misinformation detection. Unfortunately, they focus on in-domain tasks and do not incorporate significant sentiment and emotion features (which we jointly call affect). In this paper, we propose RAEmoLLM, the first retrieval augmented (RAG) LLMs framework to address cross-domain misinformation detection using in-context learning based on affective information. RAEmoLLM includes three modules. (1) In the index construction module, we apply an emotional LLM to obtain affective embeddings from all domains to construct a retrieval database. (2) The retrieval module uses the database to recommend top-K examples (text-label pairs) from source domain data for target domain contents. (3) These examples are adopted as few-shot demonstrations for the inference module to process the target domain content. RAEmoLLM can effectively enhance the general performance of LLMs in cross-domain misinformation detection tasks through affect-based retrieval, without fine-tuning. We evaluate our framework on three misinformation benchmarks. Results show that RAEmoLLM achieves significant improvements compared to the other few-shot methods on three datasets, with the highest increases of 15.64%, 31.18%, and 15.73% respectively. This project is available at https://github.com/lzw108/RAEmoLLM.
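
Illustrative sketch (not the authors' code): the three modules described in the abstract can be pictured as a small retrieve-then-prompt pipeline. The snippet below is a minimal sketch under stated assumptions. The affective embeddings are hand-written toy vectors standing in for the output of an emotional LLM, cosine similarity is assumed as the retrieval metric, only source-domain items are indexed, and the function names, the 'real'/'fake' label set, and the prompt wording are all hypothetical. The final LLM call is omitted.

```python
import numpy as np


def build_index(embeddings):
    """Index construction: L2-normalise rows so a dot product equals cosine similarity."""
    emb = np.asarray(embeddings, dtype=float)
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    return emb / np.clip(norms, 1e-12, None)


def retrieve_top_k(index, query_embedding, k):
    """Retrieval: positions of the k indexed examples most similar to the query."""
    q = np.asarray(query_embedding, dtype=float)
    q = q / max(np.linalg.norm(q), 1e-12)
    scores = index @ q
    order = np.argsort(-scores, kind="stable")  # highest similarity first
    return order[:k].tolist()


def build_prompt(demos, target_text):
    """Inference input: format retrieved text-label pairs as few-shot demonstrations."""
    lines = ["Classify each text as 'real' or 'fake'.", ""]
    for text, label in demos:
        lines += [f"Text: {text}", f"Label: {label}", ""]
    lines += [f"Text: {target_text}", "Label:"]
    return "\n".join(lines)


# Toy source-domain data (assumption: texts, labels, and vectors are invented).
source = [
    ("Miracle cure shocks doctors!", "fake"),
    ("Council approves new budget.", "real"),
    ("Outrage as secret plot revealed!", "fake"),
]
source_embeddings = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.3]]  # stand-in affective vectors

index = build_index(source_embeddings)
top = retrieve_top_k(index, [1.0, 0.2], k=2)  # stand-in vector for the target text
demos = [source[i] for i in top]
prompt = build_prompt(demos, "Scientists stunned by hidden truth!")
print(prompt)
```

The resulting prompt would be passed to an LLM for the final prediction. Because the demonstrations are chosen at inference time, no model weights are updated, which matches the abstract's claim of improving cross-domain detection without fine-tuning.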

Related papers

Reinforcement Learning for Long-Horizon Interactive LLM Agents [56.9860859585028]
Interactive digital agents (IDAs) leverage APIs of stateful digital environments to perform tasks in response to user requests. We present a reinforcement learning (RL) approach that trains IDAs directly in their target environments. We derive LOOP, a data- and memory-efficient variant of proximal policy optimization.
arXiv Detail & Related papers (2025-02-03T18:35:42Z)
Harnessing Large Language Models for Knowledge Graph Question Answering via Adaptive Multi-Aspect Retrieval-Augmentation [81.18701211912779]
We introduce an Adaptive Multi-Aspect Retrieval-augmented over KGs (Amar) framework. This method retrieves knowledge including entities, relations, and subgraphs, and converts each piece of retrieved text into prompt embeddings. Our method has achieved state-of-the-art performance on two common datasets.
arXiv Detail & Related papers (2024-12-24T16:38:04Z)
BANER: Boundary-Aware LLMs for Few-Shot Named Entity Recognition [12.57768435856206]
We propose an approach called Boundary-Aware LLMs for Few-Shot Named Entity Recognition. We introduce a boundary-aware contrastive learning strategy to enhance the LLM's ability to perceive entity boundaries for generalized entity spans. We utilize LoRAHub to align information from the target domain to the source domain, thereby enhancing adaptive cross-domain classification capabilities.
arXiv Detail & Related papers (2024-12-03T07:51:14Z)
Does Unlearning Truly Unlearn? A Black Box Evaluation of LLM Unlearning Methods [1.9799527196428242]
Large language model unlearning aims to remove harmful information that LLMs have learnt to prevent their use for malicious purposes. We show that unlearning has a notable impact on general model capabilities. We show that doing 5-shot prompting or rephrasing the question in simple ways can lead to an over ten-fold increase in accuracy on unlearning benchmarks.
arXiv Detail & Related papers (2024-11-18T22:31:17Z)
Invar-RAG: Invariant LLM-aligned Retrieval for Better Generation [43.630437906898635]
We propose a novel two-stage fine-tuning architecture called Invar-RAG. In the retrieval stage, an LLM-based retriever is constructed by integrating LoRA-based representation learning. In the generation stage, a refined fine-tuning method is employed to improve LLM accuracy in generating answers based on retrieved information.
arXiv Detail & Related papers (2024-11-11T14:25:37Z)
Exploring Language Model Generalization in Low-Resource Extractive QA [57.14068405860034]
We investigate Extractive Question Answering (EQA) with Large Language Models (LLMs) under domain drift. We devise a series of experiments to empirically explain the performance gap.
arXiv Detail & Related papers (2024-09-27T05:06:43Z)
RUIE: Retrieval-based Unified Information Extraction using Large Language Model [6.788855739199981]
Unified information extraction aims to complete all information extraction tasks using a single model or framework. We propose RUIE (Retrieval-based Unified Information Extraction), a framework that leverages in-context learning to enable rapid generalization. Experimental results on 8 held-out datasets demonstrate RUIE's effectiveness in generalizing to unseen tasks.
arXiv Detail & Related papers (2024-09-18T03:20:04Z)
Task Oriented In-Domain Data Augmentation [38.525017729123114]
Large Language Models (LLMs) have shown superior performance in various applications and fields. To achieve better performance on specialized domains such as law and advertisement, LLMs are often continually pre-trained on in-domain data. We propose TRAIT, a task-oriented in-domain data augmentation framework.
arXiv Detail & Related papers (2024-06-24T14:58:11Z)
R-Eval: A Unified Toolkit for Evaluating Domain Knowledge of Retrieval Augmented Large Language Models [51.468732121824125]
Large language models have achieved remarkable success on general NLP tasks, but they may fall short for domain-specific problems. Existing evaluation tools only provide a few baselines and evaluate them on various domains without mining the depth of domain knowledge. In this paper, we address the challenges of evaluating RALLMs by introducing the R-Eval toolkit, a Python toolkit designed to streamline the evaluation of different RAGs.
arXiv Detail & Related papers (2024-06-17T15:59:49Z)
Exploring User Retrieval Integration towards Large Language Models for Cross-Domain Sequential Recommendation [66.72195610471624]
Cross-Domain Sequential Recommendation aims to mine and transfer users' sequential preferences across different domains. We propose a novel framework named URLLM, which aims to improve the CDSR performance by exploring the User Retrieval approach.
arXiv Detail & Related papers (2024-06-05T09:19:54Z)
Are you still on track!? Catching LLM Task Drift with Activations [55.75645403965326]
Task drift allows attackers to exfiltrate data or influence the LLM's output for other users. We show that a simple linear classifier can detect drift with near-perfect ROC AUC on an out-of-distribution test set. We observe that this approach generalizes surprisingly well to unseen task domains, such as prompt injections, jailbreaks, and malicious instructions.
arXiv Detail & Related papers (2024-06-02T16:53:21Z)
Zero-Shot Topic Classification of Column Headers: Leveraging LLMs for Metadata Enrichment [0.0]
We propose a method to support metadata enrichment using topic annotations generated by three Large Language Models (LLMs): ChatGPT-3.5, Google Bard, and Google Gemini. We evaluate the impact of contextual information (i.e., dataset description) on the classification outcomes.
arXiv Detail & Related papers (2024-03-01T10:01:36Z)
Knowledge Plugins: Enhancing Large Language Models for Domain-Specific Recommendations [50.81844184210381]
We propose a general paradigm that augments large language models with DOmain-specific KnowledgE to enhance their performance on practical applications, namely DOKE. This paradigm relies on a domain knowledge extractor, working in three steps: 1) preparing effective knowledge for the task; 2) selecting the knowledge for each specific sample; and 3) expressing the knowledge in an LLM-understandable way.
arXiv Detail & Related papers (2023-11-16T07:09:38Z)
Synergistic Interplay between Search and Large Language Models for Information Retrieval [141.18083677333848]
InteR allows retrieval models (RMs) to expand knowledge in queries using LLM-generated knowledge collections. InteR achieves overall superior zero-shot retrieval performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-12T11:58:15Z)
Multi-Modal Cross-Domain Alignment Network for Video Moment Retrieval [55.122020263319634]
Video moment retrieval (VMR) aims to localize the target moment from an untrimmed video according to a given language query. In this paper, we focus on a novel task: cross-domain VMR, where fully-annotated datasets are available in one domain but the domain of interest only contains unannotated datasets. We propose a novel Multi-Modal Cross-Domain Alignment network to transfer the annotation knowledge from the source domain to the target domain.
arXiv Detail & Related papers (2022-09-23T12:58:20Z)

This list is automatically generated from the titles and abstracts of the papers on this site.