SEPSIS: I Can Catch Your Lies -- A New Paradigm for Deception Detection
- URL: http://arxiv.org/abs/2312.00292v1
- Date: Fri, 1 Dec 2023 02:13:25 GMT
- Title: SEPSIS: I Can Catch Your Lies -- A New Paradigm for Deception Detection
- Authors: Anku Rani, Dwip Dalal, Shreya Gautam, Pankaj Gupta, Vinija Jain, Aman
Chadha, Amit Sheth, Amitava Das
- Abstract summary: This research explores the problem of deception through the lens of psychology.
We propose a novel framework for deception detection leveraging NLP techniques.
We present a novel multi-task learning pipeline that leverages the dataless merging of fine-tuned language models.
- Score: 9.20397189600732
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deception is the intentional practice of twisting information. It is a
nuanced practice deeply intertwined with human social evolution and
characterized by a multitude of facets. This research explores the problem of
deception through the lens of psychology, employing a framework that
categorizes deception into three forms: lies of omission, lies of commission,
and lies of influence. This study focuses specifically on lies of omission. We
propose a novel framework for deception
detection leveraging NLP techniques. We curated an annotated dataset of 876,784
samples by amalgamating a popular large-scale fake news dataset and scraped
news headlines from the Twitter handle of Times of India, a well-known Indian
news media house. Each sample is labeled along four layers: (i) the type of
omission (speculation, bias, distortion, sounds factual, and opinion), (ii) the
color of the lie (black, white, etc.), (iii) the intention of the lie (to
influence, etc.), and (iv) the topic of the lie (political, educational,
religious, etc.); a minimal schema sketch follows this abstract. We present a
novel multi-task learning pipeline that leverages the dataless merging of
fine-tuned language models to address this deception detection task (an
illustrative merging sketch also appears below). Our proposed model achieved an
F1 score of 0.87, demonstrating strong performance across all four layers: the
type, color, intent, and topic of deceptive content. Finally, our research
explores the relationship between lies of omission and propaganda techniques.
To accomplish this, we conducted an in-depth analysis, uncovering compelling
findings. For instance, our analysis revealed a significant correlation between
loaded language and the opinion type of omission (a sketch of this association
measure also follows below). To encourage further research in this field, we
will release the models and dataset under the MIT License, making them
favorable for open-source research.
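
The four annotation layers map naturally onto a small typed record. Below is a
minimal, illustrative Python sketch of one labeled sample; the category values
come from the abstract, while the class and field names (and the exact schema
of the released dataset) are assumptions for illustration.

```python
from dataclasses import dataclass
from enum import Enum

# Category values follow the abstract; the released dataset's exact schema
# may differ -- this is an illustrative sketch only.

class OmissionType(Enum):
    SPECULATION = "speculation"
    BIAS = "bias"
    DISTORTION = "distortion"
    SOUNDS_FACTUAL = "sounds factual"
    OPINION = "opinion"

class LieColor(Enum):
    BLACK = "black"
    WHITE = "white"
    # The abstract's "etc." implies further colors; omitted here.

@dataclass
class DeceptionSample:
    text: str                    # news headline or statement
    omission_type: OmissionType  # layer (i)
    color: LieColor              # layer (ii)
    intention: str               # layer (iii), e.g. "to influence"
    topic: str                   # layer (iv), e.g. "political"

sample = DeceptionSample(
    text="Example headline text",
    omission_type=OmissionType.OPINION,
    color=LieColor.WHITE,
    intention="to influence",
    topic="political",
)
print(sample)
```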
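A natural reading of "dataless merging of fine-tuned language models" is
parameter averaging of several checkpoints fine-tuned from the same base model
(in the spirit of model soups), which requires no extra training data. The
PyTorch sketch below illustrates that assumed reading; the function name,
uniform weighting, and the averaging itself are assumptions, not the paper's
confirmed procedure.

```python
import torch

def merge_state_dicts(state_dicts, weights=None):
    """Dataless merge: average parameters of models fine-tuned from the
    same base checkpoint (uniform weights by default).

    NOTE: plain weight averaging is an *assumed* realization of the
    paper's "dataless merging"; the authors' exact method may differ.
    """
    n = len(state_dicts)
    weights = weights or [1.0 / n] * n
    return {
        key: sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
        for key in state_dicts[0]
    }

# Usage sketch: one checkpoint per annotation layer (type, color, intent,
# topic), merged into a single multi-task model with no further training.
# task_ckpts = [torch.load(p) for p in ("type.pt", "color.pt", "intent.pt", "topic.pt")]
# base_model.load_state_dict(merge_state_dicts(task_ckpts))
```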
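The reported association between the "loaded language" propaganda technique and
the "opinion" omission type can be quantified from a 2x2 co-occurrence table of
the two binary labels. A minimal sketch using the phi coefficient is shown
below; the counts are placeholders, not the paper's data.

```python
import math

def phi_coefficient(n11, n10, n01, n00):
    """Phi coefficient for two binary labels; n11 counts samples tagged
    with both loaded language and the opinion omission type, etc."""
    num = n11 * n00 - n10 * n01
    den = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return num / den if den else 0.0

# Placeholder counts (NOT from the paper), purely to show the calculation.
print(phi_coefficient(n11=400, n10=100, n01=150, n00=350))  # ~0.50
```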
Related papers
- PropaInsight: Toward Deeper Understanding of Propaganda in Terms of Techniques, Appeals, and Intent [71.20471076045916]
Propaganda plays a critical role in shaping public opinion and fueling disinformation.
PropaInsight systematically dissects propaganda into techniques, arousal appeals, and underlying intent.
PropaGaze combines human-annotated data with high-quality synthetic data.
arXiv Detail & Related papers (2024-09-19T06:28:18Z)
- Exploring the Deceptive Power of LLM-Generated Fake News: A Study of Real-World Detection Challenges [21.425647152424585]
We propose a strong fake news attack method called conditional Variational-autoencoder-Like Prompt (VLPrompt).
Unlike current methods, VLPrompt eliminates the need for additional data collection while maintaining contextual coherence.
Our experiments, including various detection methods and novel human study metrics, were conducted to assess their performance on our dataset.
arXiv Detail & Related papers (2024-03-27T04:39:18Z)
- Can Large Language Models Detect Misinformation in Scientific News Reporting? [1.0344642971058586]
This paper investigates whether it is possible to use large language models (LLMs) to detect misinformation in scientific reporting.
We first present a new labeled dataset, SciNews, containing 2.4k scientific news stories drawn from trustworthy and untrustworthy sources.
We identify dimensions of scientific validity in science news articles and explore how this can be integrated into the automated detection of scientific misinformation.
arXiv Detail & Related papers (2024-02-22T04:07:00Z)
- Analyzing the Impact of Fake News on the Anticipated Outcome of the 2024 Election Ahead of Time [7.1970442944315245]
Despite increasing awareness and research around fake news, there is still a significant need for datasets that specifically target racial slurs and biases within North American political speeches.
This study introduces a comprehensive dataset that illuminates these critical aspects of misinformation.
arXiv Detail & Related papers (2023-12-01T20:14:16Z)
- ManiTweet: A New Benchmark for Identifying Manipulation of News on Social Media [74.93847489218008]
We present a novel task, identifying manipulation of news on social media, which aims to detect manipulation in social media posts and identify manipulated or inserted information.
To study this task, we have proposed a data collection schema and curated a dataset called ManiTweet, consisting of 3.6K pairs of tweets and corresponding articles.
Our analysis demonstrates that this task is highly challenging, with large language models (LLMs) yielding unsatisfactory performance.
arXiv Detail & Related papers (2023-05-23T16:40:07Z)
- Rumor Detection with Self-supervised Learning on Texts and Social Graph [101.94546286960642]
We propose contrastive self-supervised learning on heterogeneous information sources, so as to reveal their relations and characterize rumors better.
We term this framework Self-supervised Rumor Detection (SRD).
Extensive experiments on three real-world datasets validate the effectiveness of SRD for automatic rumor detection on social media.
arXiv Detail & Related papers (2022-04-19T12:10:03Z)
- Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation [105.20743048379387]
We propose a novel framework for generating training examples informed by the known styles and strategies of human-authored propaganda.
Specifically, we perform self-critical sequence training guided by natural language inference to ensure the validity of the generated articles.
Our experimental results show that fake news detectors trained on PropaNews are better at detecting human-written disinformation by 3.62 - 7.69% F1 score on two public datasets.
arXiv Detail & Related papers (2022-03-10T14:24:19Z)
- Mining Fine-grained Semantics via Graph Neural Networks for Evidence-based Fake News Detection [20.282527436527765]
We propose a unified Graph-based sEmantic sTructure mining framework, GET for short.
We model claims and evidence as graph-structured data and capture long-distance semantic dependencies.
After obtaining contextual semantic information, our model reduces information redundancy by performing graph structure learning.
arXiv Detail & Related papers (2022-01-18T11:28:36Z)
- Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News [57.9843300852526]
We introduce the more realistic and challenging task of defending against machine-generated news that also includes images and captions.
To identify the possible weaknesses that adversaries can exploit, we create a NeuralNews dataset composed of 4 different types of generated articles.
In addition to the valuable insights gleaned from our user study experiments, we provide a relatively effective approach based on detecting visual-semantic inconsistencies.
arXiv Detail & Related papers (2020-09-16T14:13:15Z)
- Machine Learning Explanations to Prevent Overtrust in Fake News Detection [64.46876057393703]
This research investigates the effects of an Explainable AI assistant embedded in news review platforms for combating the propagation of fake news.
We design a news reviewing and sharing interface, create a dataset of news stories, and train four interpretable fake news detection algorithms.
For a deeper understanding of Explainable AI systems, we discuss interactions between user engagement, mental model, trust, and performance measures in the process of explaining.
arXiv Detail & Related papers (2020-07-24T05:42:29Z)