Fighting Fire with Fire: The Dual Role of LLMs in Crafting and Detecting
Elusive Disinformation
- URL: http://arxiv.org/abs/2310.15515v1
- Date: Tue, 24 Oct 2023 04:50:29 GMT
- Title: Fighting Fire with Fire: The Dual Role of LLMs in Crafting and Detecting
Elusive Disinformation
- Authors: Jason Lucas, Adaku Uchendu, Michiharu Yamashita, Jooyoung Lee, Shaurya
Rohatgi, Dongwon Lee
- Abstract summary: Recent ubiquity and disruptive impacts of large language models (LLMs) have raised concerns about their potential to be misused.
We propose a novel "Fighting Fire with Fire" (F3) strategy that harnesses modern LLMs' generative and emergent reasoning capabilities.
In our experiments, GPT-3.5-turbo consistently achieved 68-72% accuracy, unlike the decline observed in previous customized and fine-tuned disinformation detectors.
- Score: 7.782551258221384
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Recent ubiquity and disruptive impacts of large language models (LLMs) have
raised concerns about their potential to be misused (i.e., generating
large-scale harmful and misleading content). To combat this emerging risk of
LLMs, we propose a novel "Fighting Fire with Fire" (F3) strategy that harnesses
modern LLMs' generative and emergent reasoning capabilities to counter
human-written and LLM-generated disinformation. First, we leverage
GPT-3.5-turbo to synthesize authentic and deceptive LLM-generated content
through paraphrase-based and perturbation-based prefix-style prompts,
respectively. Second, we apply zero-shot in-context semantic reasoning
techniques with cloze-style prompts to discern genuine from deceptive posts and
news articles. In our extensive experiments, we observe GPT-3.5-turbo's
zero-shot superiority for both in-distribution and out-of-distribution
datasets, where GPT-3.5-turbo consistently achieved 68-72% accuracy, unlike
the decline observed in previous customized and fine-tuned disinformation
detectors. Our codebase and dataset are available at
https://github.com/mickeymst/F3.
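The abstract describes two prompt families: prefix-style prompts (paraphrase-based for authentic content, perturbation-based for deceptive content) and cloze-style prompts for zero-shot detection. A minimal sketch of how such templates might be constructed is shown below; the template wording is an illustrative assumption, not the authors' actual prompts, which are available in the linked repository.

```python
# Hypothetical sketch of the two F3 prompt families described above.
# The exact templates live at https://github.com/mickeymst/F3; the
# wording here is assumed for illustration only.

def prefix_prompt(article: str, deceptive: bool) -> str:
    """Prefix-style generation prompt: paraphrasing yields authentic
    LLM-generated content; perturbation yields deceptive content."""
    if deceptive:
        instruction = ("Rewrite the following article, subtly altering "
                       "key facts so that the result is misleading:")
    else:
        instruction = ("Paraphrase the following article, preserving "
                       "all facts:")
    return f"{instruction}\n\n{article}"

def cloze_detection_prompt(post: str) -> str:
    """Cloze-style zero-shot detection prompt: the model is asked to
    fill the blank with 'genuine' or 'deceptive'."""
    return (f"Article: {post}\n\n"
            "Based on its semantic consistency and factuality, "
            "this article is ____ (genuine/deceptive).")

if __name__ == "__main__":
    sample = "The city council approved the new transit budget on Monday."
    print(prefix_prompt(sample, deceptive=True))
    print(cloze_detection_prompt(sample))
```

In practice each prompt string would be sent to GPT-3.5-turbo (or a comparable model), and the detection output parsed for the word filling the blank.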
Related papers
- LLM Robustness Against Misinformation in Biomedical Question Answering [50.98256373698759]
The retrieval-augmented generation (RAG) approach is used to reduce the confabulation of large language models (LLMs) for question answering.
We evaluate the effectiveness and robustness of four LLMs against misinformation in answering biomedical questions.
arXiv Detail & Related papers (2024-10-27T16:23:26Z)
- Are LLMs Good Zero-Shot Fallacy Classifiers? [24.3005882003251]
We focus on leveraging Large Language Models (LLMs) for zero-shot fallacy classification.
With comprehensive experiments on benchmark datasets, we suggest that LLMs could be potential zero-shot fallacy classifiers.
Our novel multi-round prompting schemes can effectively bring about more improvements, especially for small LLMs.
arXiv Detail & Related papers (2024-10-19T09:38:55Z)
- GIVE: Structured Reasoning of Large Language Models with Knowledge Graph Inspired Veracity Extrapolation [108.2008975785364]
Graph Inspired Veracity Extrapolation (GIVE) is a novel reasoning method that merges parametric and non-parametric memories to improve accurate reasoning with minimal external input.
GIVE guides the LLM agent to select the most pertinent expert data (observe), engage in query-specific divergent thinking (reflect), and then synthesize this information to produce the final output (speak).
arXiv Detail & Related papers (2024-10-11T03:05:06Z)
- PlagBench: Exploring the Duality of Large Language Models in Plagiarism Generation and Detection [26.191836276118696]
We introduce PlagBench, a dataset of 46.5K synthetic text pairs that represent three major types of plagiarism.
PlagBench is validated through a combination of fine-grained automatic evaluation and human annotation.
We show GPT-3.5 Turbo can produce high-quality paraphrases and summaries without significantly increasing text complexity compared to GPT-4 Turbo.
arXiv Detail & Related papers (2024-06-24T03:29:53Z)
- Rumour Evaluation with Very Large Language Models [2.6861033447765217]
This work proposes to leverage the advancement of prompting-dependent large language models to combat misinformation.
We employ two prompting-based LLM variants to extend the two RumourEval subtasks.
For veracity prediction, three classification schemes are evaluated per GPT variant. Each scheme is tested in zero-, one- and few-shot settings.
For stance classification, prompting-based-approaches show comparable performance to prior results, with no improvement over finetuning methods.
arXiv Detail & Related papers (2024-04-11T19:38:22Z)
- Mitigating Object Hallucination in Large Vision-Language Models via Classifier-Free Guidance [56.04768229686853]
Large Vision-Language Models (LVLMs) tend to hallucinate non-existing objects in the images.
We introduce a framework called Mitigating hallucinAtion via classifieR-Free guIdaNcE (MARINE).
MARINE is both training-free and API-free, and can effectively and efficiently reduce object hallucinations during the generation process.
arXiv Detail & Related papers (2024-02-13T18:59:05Z)
- Fighting Fire with Fire: Adversarial Prompting to Generate a Misinformation Detection Dataset [10.860133543817659]
We propose an LLM-based approach of creating silver-standard ground-truth datasets for identifying misinformation.
Specifically, given a trusted news article, our approach prompts LLMs to automatically generate a summarised version of the original article.
To investigate the usefulness of this dataset, we conduct a set of experiments where we train a range of supervised models for the task of misinformation detection.
arXiv Detail & Related papers (2024-01-09T10:38:13Z)
- DeepInception: Hypnotize Large Language Model to Be Jailbreaker [70.34096187718941]
Large language models (LLMs) have succeeded significantly in various applications but remain susceptible to adversarial jailbreaks.
We present a method that takes advantage of the LLMs' personification capabilities to construct a virtual, nested scene.
Empirically, the contents induced by our approach achieve leading harmfulness rates compared with previous counterparts.
arXiv Detail & Related papers (2023-11-06T15:29:30Z)
- RankVicuna: Zero-Shot Listwise Document Reranking with Open-Source Large Language Models [56.51705482912727]
We present RankVicuna, the first fully open-source LLM capable of performing high-quality listwise reranking in a zero-shot setting.
Experimental results on the TREC 2019 and 2020 Deep Learning Tracks show that we can achieve effectiveness comparable to zero-shot reranking with GPT-3.5 with a much smaller 7B parameter model, although our effectiveness remains slightly behind reranking with GPT-4.
arXiv Detail & Related papers (2023-09-26T17:31:57Z)
- Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agents [53.78782375511531]
Large Language Models (LLMs) have demonstrated remarkable zero-shot generalization across various language-related tasks.
This paper investigates generative LLMs for relevance ranking in Information Retrieval (IR).
To address concerns about data contamination of LLMs, we collect a new test set called NovelEval.
To improve efficiency in real-world applications, we delve into the potential for distilling the ranking capabilities of ChatGPT into small specialized models.
arXiv Detail & Related papers (2023-04-19T10:16:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.