To Tell The Truth: Language of Deception and Language Models
- URL: http://arxiv.org/abs/2311.07092v3
- Date: Mon, 8 Apr 2024 05:54:40 GMT
- Title: To Tell The Truth: Language of Deception and Language Models
- Authors: Sanchaita Hazra, Bodhisattwa Prasad Majumder,
- Abstract summary: We analyze a novel TV game show data where conversations in a high-stake environment result in lies.
We investigate the manifestation of potentially verifiable language cues of deception in the presence of objective truth.
We show that there exists a class of detectors (algorithms) that have similar truth detection performance compared to human subjects.
- Score: 6.80186731352488
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text-based misinformation permeates online discourses, yet evidence of people's ability to discern truth from such deceptive textual content is scarce. We analyze a novel TV game show data where conversations in a high-stake environment between individuals with conflicting objectives result in lies. We investigate the manifestation of potentially verifiable language cues of deception in the presence of objective truth, a distinguishing feature absent in previous text-based deception datasets. We show that there exists a class of detectors (algorithms) that have similar truth detection performance compared to human subjects, even when the former accesses only the language cues while the latter engages in conversations with complete access to all potential sources of cues (language and audio-visual). Our model, built on a large language model, employs a bottleneck framework to learn discernible cues to determine truth, an act of reasoning in which human subjects often perform poorly, even with incentives. Our model detects novel but accurate language cues in many cases where humans failed to detect deception, opening up the possibility of humans collaborating with algorithms and ameliorating their ability to detect the truth.
Related papers
- Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning [84.94709351266557]
We focus on the trustworthiness of language models with respect to retrieval augmentation.
We deem that retrieval-augmented language models have the inherent capabilities of supplying response according to both contextual and parametric knowledge.
Inspired by aligning language models with human preference, we take the first step towards aligning retrieval-augmented language models to a status where it responds relying merely on the external evidence.
arXiv Detail & Related papers (2024-10-22T09:25:21Z) - Learning Phonotactics from Linguistic Informants [54.086544221761486]
Our model iteratively selects or synthesizes a data-point according to one of a range of information-theoretic policies.
We find that the information-theoretic policies that our model uses to select items to query the informant achieve sample efficiency comparable to, or greater than, fully supervised approaches.
arXiv Detail & Related papers (2024-05-08T00:18:56Z) - A Roadmap for Multilingual, Multimodal Domain Independent Deception Detection [2.1506382989223782]
Deception, a prevalent aspect of human communication, has undergone a significant transformation in the digital age.
Recent studies have shown the possibility of the existence of universal linguistic cues to deception across domains within the English language.
The practical task of deception detection in low-resource languages is not a well-studied problem due to the lack of labeled data.
arXiv Detail & Related papers (2024-05-07T00:38:34Z) - Language Generation from Brain Recordings [68.97414452707103]
We propose a generative language BCI that utilizes the capacity of a large language model and a semantic brain decoder.
The proposed model can generate coherent language sequences aligned with the semantic content of visual or auditory language stimuli.
Our findings demonstrate the potential and feasibility of employing BCIs in direct language generation.
arXiv Detail & Related papers (2023-11-16T13:37:21Z) - Contextual information integration for stance detection via
cross-attention [59.662413798388485]
Stance detection deals with identifying an author's stance towards a target.
Most existing stance detection models are limited because they do not consider relevant contextual information.
We propose an approach to integrate contextual information as text.
arXiv Detail & Related papers (2022-11-03T15:04:29Z) - Entailment Semantics Can Be Extracted from an Ideal Language Model [32.5500309433108]
We prove that entailment judgments between sentences can be extracted from an ideal language model.
We also show entailment judgments can be decoded from the predictions of a language model trained on such Gricean data.
arXiv Detail & Related papers (2022-09-26T04:16:02Z) - A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z) - Bridging the Gap: Using Deep Acoustic Representations to Learn Grounded
Language from Percepts and Raw Speech [26.076534338576234]
Learning to understand grounded language, which connects natural language to percepts, is a critical research area.
In this work we demonstrate the feasibility of performing grounded language acquisition on paired visual percepts and raw speech inputs.
arXiv Detail & Related papers (2021-12-27T16:12:30Z) - Towards Zero-shot Language Modeling [90.80124496312274]
We construct a neural model that is inductively biased towards learning human languages.
We infer this distribution from a sample of typologically diverse training languages.
We harness additional language-specific side information as distant supervision for held-out languages.
arXiv Detail & Related papers (2021-08-06T23:49:18Z) - The Rediscovery Hypothesis: Language Models Need to Meet Linguistics [8.293055016429863]
We study whether linguistic knowledge is a necessary condition for good performance of modern language models.
We show that language models that are significantly compressed but perform well on their pretraining objectives retain good scores when probed for linguistic structures.
This result supports the rediscovery hypothesis and leads to the second contribution of our paper: an information-theoretic framework that relates language modeling objective with linguistic information.
arXiv Detail & Related papers (2021-03-02T15:57:39Z) - Words are the Window to the Soul: Language-based User Representations
for Fake News Detection [5.876243339384605]
We introduce a model that creates representations of individuals on social media based only on the language they produce.
We show that language-based user representations are beneficial for this task.
arXiv Detail & Related papers (2020-11-14T21:14:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.