Related papers: How human judgment impairs automated deception detection performance

URL: http://arxiv.org/abs/2003.13316v1
Date: Mon, 30 Mar 2020 10:06:36 GMT
Title: How human judgment impairs automated deception detection performance
Authors: Bennett Kleinberg and Bruno Verschuere
Abstract summary: We tested whether a combination of supervised machine learning and human judgment could improve deception detection accuracy. Human involvement through hybrid-overrule decisions brought the accuracy back to the chance level. The decision-making strategies of humans suggest that the truth bias - the tendency to assume the other is telling the truth - could explain the detrimental effect.
Score: 0.5660207256468972
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Background: Deception detection is a prevalent problem for security practitioners. With a need for more large-scale approaches, automated methods using machine learning have gained traction. However, detection performance still implies considerable error rates. Findings from other domains suggest that hybrid human-machine integrations could offer a viable path in deception detection tasks. Method: We collected a corpus of truthful and deceptive answers about participants' autobiographical intentions (n=1640) and tested whether a combination of supervised machine learning and human judgment could improve deception detection accuracy. Human judges were presented with the outcome of the automated credibility judgment of truthful and deceptive statements. They could either fully overrule it (hybrid-overrule condition) or adjust it within a given boundary (hybrid-adjust condition). Results: The data suggest that in neither of the hybrid conditions did the human judgment add a meaningful contribution. Machine learning in isolation identified truth-tellers and liars with an overall accuracy of 69%. Human involvement through hybrid-overrule decisions brought the accuracy back to the chance level. The hybrid-adjust condition did not improve deception detection performance. The decision-making strategies of humans suggest that the truth bias - the tendency to assume the other is telling the truth - could explain the detrimental effect. Conclusion: The current study does not support the notion that humans can meaningfully add to the deception detection performance of a machine learning system.
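
The two hybrid conditions described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 0-to-1 credibility scale, the 0.5 decision threshold, and the boundary width of 0.1 are all hypothetical choices made for illustration only.

```python
def hybrid_overrule(machine_score: float, human_score: float) -> float:
    """Hybrid-overrule: the human judgment fully replaces the machine score."""
    return human_score


def hybrid_adjust(machine_score: float, human_score: float,
                  boundary: float = 0.1) -> float:
    """Hybrid-adjust: the human may move the machine score, but only
    within +/- `boundary` of it (the width is an assumed value)."""
    low = max(0.0, machine_score - boundary)
    high = min(1.0, machine_score + boundary)
    # Clamp the human's preferred score into the allowed window.
    return min(max(human_score, low), high)


def classify(score: float, threshold: float = 0.5) -> str:
    """Assumed convention: scores at or above the threshold mean 'truthful'."""
    return "truthful" if score >= threshold else "deceptive"


if __name__ == "__main__":
    machine, human = 0.30, 0.80  # machine leans deceptive, human leans truthful
    print(classify(hybrid_overrule(machine, human)))
    print(classify(hybrid_adjust(machine, human)))
```

Under these assumptions, a human judge who leans toward "truthful" flips the machine's decision in the overrule condition but cannot do so in the adjust condition, which is one way to picture how a truth bias could matter more when humans have the final say.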

Related papers

Bounded Minds, Generative Machines: Envisioning Conversational AI that Works with Human Heuristics and Reduces Bias Risk [6.879756503058167]
This article outlines a research pathway grounded in rationality, and argues that conversational AI should be designed to work with humans rather than against them. It identifies key directions for detecting cognitive vulnerability, supporting judgment under uncertainty, and evaluating conversational systems beyond factual accuracy, toward decision quality and cognitive robustness.
arXiv Detail & Related papers (2026-01-19T20:23:28Z)
The Evaluation Gap in Medicine, AI and LLMs: Navigating Elusive Ground Truth & Uncertainty via a Probabilistic Paradigm [49.287792149338976]
We introduce a probabilistic paradigm to theoretically explain how high certainty in ground truth answers is almost always necessary for even an expert to achieve high scores. We thus bring forth the concepts of expected accuracy and expected F1 to estimate the score an expert human or system can achieve given ground truth answer variability.
arXiv Detail & Related papers (2026-01-09T03:19:37Z)
Unlocking the power of partnership: How humans and machines can work together to improve face recognition [0.6157382820537719]
We examined the benefits of human-human and human-machine collaborations. We implemented "intelligent human-machine fusion" by selecting people with the potential to increase the accuracy of a high-performing machine. The highest system-wide accuracy achievable with human-only partnerships was found by graph theory.
arXiv Detail & Related papers (2025-10-02T21:19:56Z)
Is Uncertainty Quantification a Viable Alternative to Learned Deferral? [1.533133219129073]
One aspect of AI safety is the models' ability to defer decisions to a human expert. During clinical translation, models often face challenges such as data shift. Uncertainty quantification methods may be a promising choice for AI deferral.
arXiv Detail & Related papers (2025-08-04T11:37:59Z)
AI Debate Aids Assessment of Controversial Claims [86.47978525513236]
We study whether AI debate can guide biased judges toward the truth by having two AI systems debate opposing sides of controversial COVID-19 factuality claims. In our human study, we find that debate, where two AI advisor systems present opposing evidence-based arguments, consistently improves judgment accuracy and confidence calibration. In our AI judge study, we find that AI judges with human-like personas achieve even higher accuracy (78.5%) than human judges (70.1%) and default AI judges without personas (69.8%).
arXiv Detail & Related papers (2025-06-02T19:01:53Z)
ExaGPT: Example-Based Machine-Generated Text Detection for Human Interpretability [62.285407189502216]
Detecting texts generated by Large Language Models (LLMs) could cause grave mistakes due to incorrect decisions. We introduce ExaGPT, an interpretable detection approach grounded in the human decision-making process. We show that ExaGPT massively outperforms prior powerful detectors by up to +40.9 points of accuracy at a false positive rate of 1%.
arXiv Detail & Related papers (2025-02-17T01:15:07Z)
Effective faking of verbal deception detection with target-aligned adversarial attacks [0.3441021278275805]
Automated adversarial attacks that rewrite deceptive statements to appear truthful pose a serious threat. We used a dataset of 243 truthful and 262 fabricated autobiographical stories in a deception detection task for humans and machine learning models.
arXiv Detail & Related papers (2025-01-10T13:42:40Z)
Online Decision Mediation [72.80902932543474]
Consider learning a decision support assistant to serve as an intermediary between (oracle) expert behavior and (imperfect) human behavior. In clinical diagnosis, fully-autonomous machine behavior is often beyond ethical affordances.
arXiv Detail & Related papers (2023-10-28T05:59:43Z)
Auditing for Human Expertise [12.967730957018688]
We develop a statistical framework under which we can pose this question as a natural hypothesis test. We propose a simple procedure which tests whether expert predictions are statistically independent from the outcomes of interest. A rejection of our test thus suggests that human experts may add value to any algorithm trained on the available data.
arXiv Detail & Related papers (2023-06-02T16:15:24Z)
Who Should Predict? Exact Algorithms For Learning to Defer to Humans [40.22768241509553]
We show that prior approaches can fail to find a human-AI system with low misclassification error. We give a mixed-integer-linear-programming (MILP) formulation that can optimally solve the problem in the linear setting. We provide a novel surrogate loss function that is realizable-consistent and performs well empirically.
arXiv Detail & Related papers (2023-01-15T21:57:36Z)
D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases. A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network. For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
Empirical Estimates on Hand Manipulation are Recoverable: A Step Towards Individualized and Explainable Robotic Support in Everyday Activities [80.37857025201036]
A key challenge for robotic systems is to figure out the behavior of another agent. Processing correct inferences is especially challenging when (confounding) factors are not controlled experimentally. We propose equipping robots with the necessary tools to conduct observational studies on people.
arXiv Detail & Related papers (2022-01-27T22:15:56Z)
Probabilistic Human Motion Prediction via A Bayesian Neural Network [71.16277790708529]
We propose a probabilistic model for human motion prediction in this paper. Our model could generate several future motions when given an observed motion sequence. We extensively validate our approach on a large scale benchmark dataset Human3.6m.
arXiv Detail & Related papers (2021-07-14T09:05:33Z)
The effectiveness of feature attribution methods and its correlation with automatic evaluation scores [19.71360639210631]
We conduct the first large-scale user study on 320 lay and 11 expert users to shed light on the effectiveness of state-of-the-art attribution methods. We found that, overall, feature attribution is surprisingly not more effective than showing humans nearest training-set examples.
arXiv Detail & Related papers (2021-05-31T13:23:50Z)
On complementing end-to-end human motion predictors with planning [31.025766804649464]
High capacity end-to-end approaches for human motion prediction have the ability to represent subtle nuances in human behavior, but struggle with robustness to out of distribution inputs and tail events. Planning-based prediction, on the other hand, can reliably output decent-but-not-great predictions.
arXiv Detail & Related papers (2021-03-09T19:02:45Z)
Joint Inference of States, Robot Knowledge, and Human (False-)Beliefs [90.20235972293801]
Aiming to understand how human (false-)belief, a core socio-cognitive ability, would affect human interactions with robots, this paper proposes to adopt a graphical model to unify the representation of object states, robot knowledge, and human (false-)beliefs. An inference algorithm is derived to fuse individual pg from all robots across multi-views into a joint pg, which affords more effective reasoning inference capability to overcome the errors originated from a single view.
arXiv Detail & Related papers (2020-04-25T23:02:04Z)
A Case for Humans-in-the-Loop: Decisions in the Presence of Erroneous Algorithmic Scores [85.12096045419686]
We study the adoption of an algorithmic tool used to assist child maltreatment hotline screening decisions. We first show that humans do alter their behavior when the tool is deployed. We show that humans are less likely to adhere to the machine's recommendation when the score displayed is an incorrect estimate of risk.
arXiv Detail & Related papers (2020-02-19T07:27:32Z)
Deceptive AI Explanations: Creation and Detection [3.197020142231916]
We investigate how AI models can be used to create and detect deceptive explanations. As an empirical evaluation, we focus on text classification and alter the explanations generated by GradCAM. We evaluate the effect of deceptive explanations on users in an experiment with 200 participants.
arXiv Detail & Related papers (2020-01-21T16:41:22Z)

This list is automatically generated from the titles and abstracts of the papers on this site.