When lies are mostly truthful: automated verbal deception detection for embedded lies
- URL: http://arxiv.org/abs/2501.07217v1
- Date: Mon, 13 Jan 2025 11:16:05 GMT
- Title: When lies are mostly truthful: automated verbal deception detection for embedded lies
- Authors: Riccardo Loconte, Bennett Kleinberg
- Abstract summary: We collected a novel dataset of 2,088 truthful and deceptive statements with annotated embedded lies.
We show that a fine-tuned language model (Llama-3-8B) can classify truthful statements and those containing embedded lies with 64% accuracy.
- Abstract: Background: Verbal deception detection research relies on narratives and commonly assumes statements as truthful or deceptive. A more realistic perspective acknowledges that the veracity of statements exists on a continuum, with truthful and deceptive parts being embedded within the same statement. However, research on embedded lies has been lagging behind. Methods: We collected a novel dataset of 2,088 truthful and deceptive statements with annotated embedded lies. Using a within-subjects design, participants provided a truthful account of an autobiographical event. They then rewrote their statement in a deceptive manner by including embedded lies, which they highlighted afterwards and judged on lie centrality, deceptiveness, and source. Results: We show that a fine-tuned language model (Llama-3-8B) can classify truthful statements and those containing embedded lies with 64% accuracy. Individual differences, linguistic properties and explainability analysis suggest that the challenge of moving the dial towards embedded lies stems from their resemblance to truthful statements. Typical deceptive statements consisted of 2/3 truthful information and 1/3 embedded lies, largely derived from past personal experiences and with minimal linguistic differences from their truthful counterparts. Conclusion: We present this dataset as a novel resource to address this challenge and foster research on embedded lies in verbal deception detection.
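The 64% figure comes from fine-tuning Llama-3-8B as a binary classifier over whole statements (truthful vs. containing embedded lies). As a minimal sketch of what such a setup might look like, assuming a Hugging Face sequence-classification head with LoRA adapters; the dataset file, column names, LoRA settings, and hyperparameters below are illustrative assumptions and are not taken from the paper:

```python
# Illustrative sketch only: fine-tune Llama-3-8B for binary classification of
# statements (truthful vs. containing embedded lies). The paper's actual
# pipeline is not given in the abstract; file name, columns, and settings are assumed.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          DataCollatorWithPadding, Trainer, TrainingArguments)

MODEL = "meta-llama/Meta-Llama-3-8B"  # base model named in the abstract

# Hypothetical CSV with a "statement" column and a binary "label" column
# (0 = truthful, 1 = contains embedded lies).
data = load_dataset("csv", data_files="embedded_lies.csv")

tokenizer = AutoTokenizer.from_pretrained(MODEL)
tokenizer.pad_token = tokenizer.eos_token  # Llama ships without a pad token

def tokenize(batch):
    return tokenizer(batch["statement"], truncation=True, max_length=512)

data = data.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    MODEL, num_labels=2, torch_dtype=torch.bfloat16)
model.config.pad_token_id = tokenizer.pad_token_id

# LoRA keeps the 8B backbone frozen and trains small adapter matrices plus the
# classification head ("score" is the head module in Llama classification models).
lora = LoraConfig(task_type="SEQ_CLS", r=16, lora_alpha=32,
                  target_modules=["q_proj", "v_proj"], modules_to_save=["score"])
model = get_peft_model(model, lora)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama3-embedded-lies",
                           per_device_train_batch_size=2,
                           num_train_epochs=3,
                           learning_rate=2e-4),
    train_dataset=data["train"],
    data_collator=DataCollatorWithPadding(tokenizer),  # pad dynamically per batch
)
trainer.train()
```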
Related papers
- How Entangled is Factuality and Deception in German?
Research on deception detection and fact checking often conflates factual accuracy with the truthfulness of statements.
The belief-based deception framework disentangles these properties by defining texts as deceptive when there is a mismatch between what people say and what they truly believe.
We test the effectiveness of computational models in detecting deception using an established corpus of belief-based argumentation.
arXiv Detail & Related papers (2024-09-30T10:23:13Z)
- Grounding Fallacies Misrepresenting Scientific Publications in Evidence
We introduce MissciPlus, an extension of the fallacy detection dataset Missci.
MissciPlus pairs real-world misrepresented evidence with incorrect claims, matching the input format of evidence-based fact-checking models.
Our findings show that current fact-checking models struggle to use misrepresented scientific passages to refute misinformation.
arXiv Detail & Related papers (2024-08-23T03:16:26Z)
- Missci: Reconstructing Fallacies in Misrepresented Science
Health-related misinformation on social networks can lead to poor decision-making and real-world dangers.
Missci is a novel argumentation-theoretical model of fallacious reasoning.
We present Missci as a dataset to test the critical reasoning abilities of large language models.
arXiv Detail & Related papers (2024-06-05T12:11:10Z)
- Truth-Aware Context Selection: Mitigating Hallucinations of Large Language Models Being Misled by Untruthful Contexts
Large Language Models (LLMs) are easily misled by untruthful contexts provided by users or knowledge augmentation tools.
We propose Truth-Aware Context Selection (TACS) to adaptively recognize and mask untruthful context from the inputs.
We show that TACS can effectively filter untruthful context and significantly improve the overall quality of LLMs' responses when presented with misleading information.
arXiv Detail & Related papers (2024-03-12T11:40:44Z)
- Cognitive Dissonance: Why Do Language Model Outputs Disagree with Internal Representations of Truthfulness?
Neural language models (LMs) can be used to evaluate the truth of factual statements.
They can be queried for statement probabilities, or probed for internal representations of truthfulness.
Past work has found that these two procedures sometimes disagree, and that probes tend to be more accurate than LM outputs.
This has led some researchers to conclude that LMs "lie" or otherwise encode non-cooperative communicative intents.
arXiv Detail & Related papers (2023-11-27T18:59:14Z)
- To Tell The Truth: Language of Deception and Language Models
We analyze novel data from a TV game show where conversations in a high-stakes environment result in lies.
We investigate the manifestation of potentially verifiable language cues of deception in the presence of objective truth.
We show that there exists a class of detectors (algorithms) whose truth-detection performance is comparable to that of human subjects.
arXiv Detail & Related papers (2023-11-13T05:40:11Z)
- Lost in Translation -- Multilingual Misinformation and its Evolution
This paper investigates the prevalence and dynamics of multilingual misinformation through an analysis of over 250,000 unique fact-checks spanning 95 languages.
We find that while the majority of misinformation claims are only fact-checked once, 11.7%, corresponding to more than 21,000 claims, are checked multiple times.
Using fact-checks as a proxy for the spread of misinformation, we find 33% of repeated claims cross linguistic boundaries.
arXiv Detail & Related papers (2023-10-27T12:21:55Z)
- Truth Machines: Synthesizing Veracity in AI Language Models
We discuss the struggle for truth in AI systems and the general responses to date.
We then investigate the production of truth in InstructGPT, a large language model.
We argue that these same logics and inconsistencies play out in ChatGPT, reiterating truth as a non-trivial problem.
arXiv Detail & Related papers (2023-01-28T02:47:50Z)
- Machine Learning based Lie Detector applied to a Collected and Annotated Dataset
We have collected a dataset that contains annotated images and 3D information of different participants' faces during a card game that incentivises lying.
Using our collected dataset, we evaluated several types of machine learning-based lie detectors through generalized, personal, and cross-lie experiments.
In these experiments, we showed the superiority of deep learning-based models in recognizing lies, with a best accuracy of 57% for the generalized task and 63% when dealing with a single participant.
arXiv Detail & Related papers (2021-04-26T04:48:42Z)
- Did they answer? Subjective acts and intents in conversational discourse
We present the first discourse dataset with multiple and subjective interpretations of English conversation.
We show disagreements are nuanced and require a deeper understanding of the different contextual factors.
arXiv Detail & Related papers (2021-04-09T16:34:19Z)
- AmbiFC: Fact-Checking Ambiguous Claims with Evidence
We present AmbiFC, a fact-checking dataset with 10k claims derived from real-world information needs.
We analyze disagreements arising from ambiguity when comparing claims against evidence in AmbiFC.
We develop models for predicting veracity that handle this ambiguity via soft labels.
arXiv Detail & Related papers (2021-04-01T17:40:08Z)