Fighting Fire with Fire: The Dual Role of LLMs in Crafting and Detecting
Elusive Disinformation
- URL: http://arxiv.org/abs/2310.15515v1
- Date: Tue, 24 Oct 2023 04:50:29 GMT
- Title: Fighting Fire with Fire: The Dual Role of LLMs in Crafting and Detecting
Elusive Disinformation
- Authors: Jason Lucas, Adaku Uchendu, Michiharu Yamashita, Jooyoung Lee, Shaurya
Rohatgi, Dongwon Lee
- Abstract summary: Recent ubiquity and disruptive impacts of large language models (LLMs) have raised concerns about their potential to be misused.
We propose a novel "Fighting Fire with Fire" (F3) strategy that harnesses modern LLMs' generative and emergent reasoning capabilities.
In our experiments, GPT-3.5-turbo consistently achieved 68-72% accuracy, unlike the decline observed in previous customized and fine-tuned disinformation detectors.
- Score: 7.782551258221384
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Recent ubiquity and disruptive impacts of large language models (LLMs) have
raised concerns about their potential to be misused (i.e., generating
large-scale harmful and misleading content). To combat this emerging risk of
LLMs, we propose a novel "Fighting Fire with Fire" (F3) strategy that harnesses
modern LLMs' generative and emergent reasoning capabilities to counter
human-written and LLM-generated disinformation. First, we leverage
GPT-3.5-turbo to synthesize authentic and deceptive LLM-generated content
through paraphrase-based and perturbation-based prefix-style prompts,
respectively. Second, we apply zero-shot in-context semantic reasoning
techniques with cloze-style prompts to discern genuine from deceptive posts and
news articles. In our extensive experiments, we observe GPT-3.5-turbo's
zero-shot superiority for both in-distribution and out-of-distribution
datasets, where GPT-3.5-turbo consistently achieved 68-72% accuracy, unlike
the decline observed in previous customized and fine-tuned disinformation
detectors. Our codebase and dataset are available at
https://github.com/mickeymst/F3.
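The abstract describes two prompt families: prefix-style prompts (paraphrase-based for authentic content, perturbation-based for deceptive content) and cloze-style prompts for zero-shot detection. A minimal sketch of how such templates might be constructed is shown below; the template wording is an illustrative assumption, not the authors' actual prompts, which are available in the linked repository.

```python
# Hypothetical sketch of the two F3 prompt families described above.
# The exact templates live at https://github.com/mickeymst/F3; the
# wording here is assumed for illustration only.

def prefix_prompt(article: str, deceptive: bool) -> str:
    """Prefix-style generation prompt: paraphrasing yields authentic
    LLM-generated content; perturbation yields deceptive content."""
    if deceptive:
        instruction = ("Rewrite the following article, subtly altering "
                       "key facts so that the result is misleading:")
    else:
        instruction = ("Paraphrase the following article, preserving "
                       "all facts:")
    return f"{instruction}\n\n{article}"

def cloze_detection_prompt(post: str) -> str:
    """Cloze-style zero-shot detection prompt: the model is asked to
    fill the blank with 'genuine' or 'deceptive'."""
    return (f"Article: {post}\n\n"
            "Based on its semantic consistency and factuality, "
            "this article is ____ (genuine/deceptive).")

if __name__ == "__main__":
    sample = "The city council approved the new transit budget on Monday."
    print(prefix_prompt(sample, deceptive=True))
    print(cloze_detection_prompt(sample))
```

In practice each prompt string would be sent to GPT-3.5-turbo (or a comparable model), and the detection output parsed for the word filling the blank.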
Related papers
- LLM Robustness Against Misinformation in Biomedical Question Answering [50.98256373698759]
The retrieval-augmented generation (RAG) approach is used to reduce the confabulation of large language models (LLMs) for question answering.
We evaluate the effectiveness and robustness of four LLMs against misinformation in answering biomedical questions.
arXiv Detail & Related papers (2024-10-27T16:23:26Z)
- Are LLMs Good Zero-Shot Fallacy Classifiers? [24.3005882003251]
We focus on leveraging Large Language Models (LLMs) for zero-shot fallacy classification.
With comprehensive experiments on benchmark datasets, we suggest that LLMs could be potential zero-shot fallacy classifiers.
Our novel multi-round prompting schemes can effectively bring about more improvements, especially for small LLMs.
arXiv Detail & Related papers (2024-10-19T09:38:55Z)
- GIVE: Structured Reasoning of Large Language Models with Knowledge Graph Inspired Veracity Extrapolation [108.2008975785364]
Graph Inspired Veracity Extrapolation (GIVE) is a novel reasoning method that merges parametric and non-parametric memories to improve accurate reasoning with minimal external input.
GIVE guides the LLM agent to select the most pertinent expert data (observe), engage in query-specific divergent thinking (reflect), and then synthesize this information to produce the final output (speak).
arXiv Detail & Related papers (2024-10-11T03:05:06Z)
- PlagBench: Exploring the Duality of Large Language Models in Plagiarism Generation and Detection [26.191836276118696]
We introduce PlagBench, a dataset of 46.5K synthetic text pairs that represent three major types of plagiarism.
PlagBench is validated through a combination of fine-grained automatic evaluation and human annotation.
We show GPT-3.5 Turbo can produce high-quality paraphrases and summaries without significantly increasing text complexity compared to GPT-4 Turbo.
arXiv Detail & Related papers (2024-06-24T03:29:53Z)
- Rumour Evaluation with Very Large Language Models [2.6861033447765217]
This work proposes to leverage the advancement of prompting-dependent large language models to combat misinformation.
We employ two prompting-based LLM variants to extend the two RumourEval subtasks.
For veracity prediction, three classification schemes are evaluated per GPT variant. Each scheme is tested in zero-, one- and few-shot settings.
For stance classification, prompting-based-approaches show comparable performance to prior results, with no improvement over finetuning methods.
arXiv Detail & Related papers (2024-04-11T19:38:22Z)
- Mitigating Object Hallucination in Large Vision-Language Models via Classifier-Free Guidance [56.04768229686853]
Large Vision-Language Models (LVLMs) tend to hallucinate non-existing objects in the images.
We introduce a framework called Mitigating hallucinAtion via classifieR-Free guIdaNcE (MARINE).
MARINE is both training-free and API-free, and can effectively and efficiently reduce object hallucinations during the generation process.
arXiv Detail & Related papers (2024-02-13T18:59:05Z)
- Fighting Fire with Fire: Adversarial Prompting to Generate a Misinformation Detection Dataset [10.860133543817659]
We propose an LLM-based approach of creating silver-standard ground-truth datasets for identifying misinformation.
Specifically, given a trusted news article, our approach prompts LLMs to automatically generate a summarised version of the original article.
To investigate the usefulness of this dataset, we conduct a set of experiments where we train a range of supervised models for the task of misinformation detection.
arXiv Detail & Related papers (2024-01-09T10:38:13Z)
- DeepInception: Hypnotize Large Language Model to Be Jailbreaker [70.34096187718941]
Large language models (LLMs) have succeeded significantly in various applications but remain susceptible to adversarial jailbreaks.
We present a method that takes advantage of the LLMs' personification capabilities to construct a virtual, nested scene.
Empirically, the contents induced by our approach achieve leading harmfulness rates compared with previous counterparts.
arXiv Detail & Related papers (2023-11-06T15:29:30Z)
- RankVicuna: Zero-Shot Listwise Document Reranking with Open-Source Large Language Models [56.51705482912727]
We present RankVicuna, the first fully open-source LLM capable of performing high-quality listwise reranking in a zero-shot setting.
Experimental results on the TREC 2019 and 2020 Deep Learning Tracks show that we can achieve effectiveness comparable to zero-shot reranking with GPT-3.5 with a much smaller 7B parameter model, although our effectiveness remains slightly behind reranking with GPT-4.
arXiv Detail & Related papers (2023-09-26T17:31:57Z)
- Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agents [53.78782375511531]
Large Language Models (LLMs) have demonstrated remarkable zero-shot generalization across various language-related tasks.
This paper investigates generative LLMs for relevance ranking in Information Retrieval (IR).
To address concerns about data contamination of LLMs, we collect a new test set called NovelEval.
To improve efficiency in real-world applications, we delve into the potential for distilling the ranking capabilities of ChatGPT into small specialized models.
arXiv Detail & Related papers (2023-04-19T10:16:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.