Related papers: The Challenges of Machine Learning for Trust and Safety: A Case Study on Misinformation Detection

URL: http://arxiv.org/abs/2308.12215v3
Date: Wed, 19 Jun 2024 20:33:53 GMT
Title: The Challenges of Machine Learning for Trust and Safety: A Case Study on Misinformation Detection
Authors: Madelyne Xiao, Jonathan Mayer
Abstract summary: We examine the disconnect between scholarship and practice in applying machine learning to trust and safety problems. We survey literature on automated detection of misinformation across a corpus of 248 well-cited papers in the field. We conclude that the current state-of-the-art in fully-automated detection has limited efficacy in detecting human-generated misinformation.
Score: 0.8057006406834466
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We examine the disconnect between scholarship and practice in applying machine learning to trust and safety problems, using misinformation detection as a case study. We survey literature on automated detection of misinformation across a corpus of 248 well-cited papers in the field. We then examine subsets of papers for data and code availability, design missteps, reproducibility, and generalizability. Our paper corpus includes published work in security, natural language processing, and computational social science. Across these disparate disciplines, we identify common errors in dataset and method design. In general, detection tasks are often meaningfully distinct from the challenges that online services actually face. Datasets and model evaluation are often non-representative of real-world contexts, and evaluation frequently is not independent of model training. We demonstrate the limitations of current detection methods in a series of three representative replication studies. Based on the results of these analyses and our literature survey, we conclude that the current state-of-the-art in fully-automated misinformation detection has limited efficacy in detecting human-generated misinformation. We offer recommendations for evaluating applications of machine learning to trust and safety problems and recommend future directions for research.

Related papers

PaperAudit-Bench: Benchmarking Error Detection in Research Papers for Critical Automated Peer Review [54.141490756509306]
We introduce PaperAudit-Bench, which consists of two components: PaperAudit-Dataset, an error dataset, and PaperAudit-Review, an automated review framework. Experiments on PaperAudit-Bench reveal large variability in error detectability across models and detection depths. We show that the dataset supports training lightweight LLM detectors via SFT and RL, enabling effective error detection at reduced computational cost.
arXiv Detail & Related papers (2026-01-07T04:26:12Z)
An Investigation on How AI-Generated Responses Affect Software Engineering Surveys [3.183470571353323]
This study explores how large language models (LLMs) are being misused in software engineering surveys. We analyzed data from two survey deployments conducted in 2025 through the Prolific platform. We identify data authenticity as an emerging dimension of validity in software engineering surveys.
arXiv Detail & Related papers (2025-12-19T11:17:05Z)
A Systematic Literature Review on Detecting Software Vulnerabilities with Large Language Models [2.518519330408713]
Large Language Models (LLMs) in software engineering have sparked interest in their use for software vulnerability detection. The rapid development of this field has resulted in a fragmented research landscape. This fragmentation makes it difficult to obtain a clear overview of the state-of-the-art or compare and categorize studies meaningfully.
arXiv Detail & Related papers (2025-07-30T13:17:16Z)
Out-of-distribution detection in 3D applications: a review [1.188705980058767]
Object recognition methods assume that all object categories encountered during inference belong to a closed set of classes present in the training data. This assumption limits generalization to the real world, as objects not seen during training may be misclassified or entirely ignored. This paper provides a comprehensive overview of OOD detection within the broader scope of trustworthy and uncertain AI.
arXiv Detail & Related papers (2025-07-01T08:43:13Z)
From Tea Leaves to System Maps: Context-awareness in Monitoring Operational Machine Learning Models [10.17792666432021]
This paper presents a systematic review to characterize and structure the various types of contextual information in this domain. We introduce the Contextual System--Aspect--Representation (C-SAR) framework, a conceptual model that synthesizes our findings. We also identify 20 recurring and potentially reusable patterns of specific system, aspect, and representation triplets, and map them to the monitoring activities they support.
arXiv Detail & Related papers (2025-06-12T14:49:42Z)
Does Machine Unlearning Truly Remove Model Knowledge? A Framework for Auditing Unlearning in LLMs [58.24692529185971]
We introduce a comprehensive auditing framework for unlearning evaluation comprising three benchmark datasets, six unlearning algorithms, and five prompt-based auditing methods. We evaluate the effectiveness and robustness of different unlearning strategies.
arXiv Detail & Related papers (2025-05-29T09:19:07Z)
Online Model-based Anomaly Detection in Multivariate Time Series: Taxonomy, Survey, Research Challenges and Future Directions [0.017476232824732776]
Time-series anomaly detection plays an important role in engineering processes. This survey introduces a novel taxonomy where a distinction between online and offline, and training and inference is made. It presents the most popular data sets and evaluation metrics used in the literature, as well as a detailed analysis.
arXiv Detail & Related papers (2024-08-07T13:01:10Z)
Verification of Machine Unlearning is Fragile [48.71651033308842]
We introduce two novel adversarial unlearning processes capable of circumventing both types of verification strategies. This study highlights the vulnerabilities and limitations in machine unlearning verification, paving the way for further research into the safety of machine unlearning.
arXiv Detail & Related papers (2024-08-01T21:37:10Z)
A Survey of Defenses against AI-generated Visual Media: Detection, Disruption, and Authentication [15.879482578829489]
Deep generative models have demonstrated impressive performance in various computer vision applications. These models may be used for malicious purposes, such as misinformation, deception, and copyright violation. This paper provides a systematic and timely review of research efforts on defenses against AI-generated visual media.
arXiv Detail & Related papers (2024-07-15T09:46:02Z)
Navigating the Shadows: Unveiling Effective Disturbances for Modern AI Content Detectors [24.954755569786396]
AI-text detection has emerged to distinguish between human and machine-generated content. Recent research indicates that these detection systems often lack robustness and struggle to effectively differentiate perturbed texts. Our work simulates real-world scenarios in both informal and professional writing, exploring the out-of-the-box performance of current detectors.
arXiv Detail & Related papers (2024-06-13T08:37:01Z)
A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods. The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics. We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z)
Gone but Not Forgotten: Improved Benchmarks for Machine Unlearning [0.0]
We describe and propose alternative evaluation methods for machine unlearning algorithms. We show the utility of our alternative evaluations via a series of experiments of state-of-the-art unlearning algorithms on different computer vision datasets.
arXiv Detail & Related papers (2024-05-29T15:53:23Z)
Assaying on the Robustness of Zero-Shot Machine-Generated Text Detectors [57.7003399760813]
We explore advanced Large Language Models (LLMs) and their specialized variants, contributing to this field in several ways. We uncover a significant correlation between topics and detection performance. These investigations shed light on the adaptability and robustness of these detection methods across diverse topics.
arXiv Detail & Related papers (2023-12-20T10:53:53Z)
Managing the unknown: a survey on Open Set Recognition and tangential areas [7.345136916791223]
Open Set Recognition models are capable of detecting unknown classes from samples arriving during the testing phase, while maintaining a good level of performance in the classification of samples belonging to known classes. This review comprehensively overviews the recent literature related to Open Set Recognition, identifying common practices, limitations, and connections of this field with other machine learning research areas. Our work also uncovers open problems and suggests several research directions that may motivate and articulate future efforts towards safer Artificial Intelligence methods.
arXiv Detail & Related papers (2023-12-14T10:08:12Z)
Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future [63.99570204416711]
We reimplement 18 methods for detecting potential annotation errors and evaluate them on 9 English datasets. We define a uniform evaluation setup including a new formalization of the annotation error detection task. We release our datasets and implementations in an easy-to-use and open source software package.
arXiv Detail & Related papers (2022-06-05T22:31:45Z)
Poisoning Attacks and Defenses on Artificial Intelligence: A Survey [3.706481388415728]
Data poisoning attacks represent a type of attack that consists of tampering with the data samples fed to the model during the training phase, leading to a degradation in the model's accuracy during the inference phase. This work compiles the most relevant insights and findings from the latest literature addressing this type of attack. A thorough assessment is performed on the reviewed works, comparing the effects of data poisoning on a wide range of ML models in real-world conditions.
arXiv Detail & Related papers (2022-02-21T14:43:38Z)
Human-in-the-Loop Disinformation Detection: Stance, Sentiment, or Something Else? [93.91375268580806]
Both politics and pandemics have recently provided ample motivation for the development of machine learning-enabled disinformation (a.k.a. fake news) detection algorithms. Existing literature has focused primarily on the fully-automated case, but the resulting techniques cannot reliably detect disinformation on the varied topics, sources, and time scales required for military applications. By leveraging an already-available analyst as a human-in-the-loop, canonical machine learning techniques of sentiment analysis, aspect-based sentiment analysis, and stance detection become plausible methods to use for a partially-automated disinformation detection system.
arXiv Detail & Related papers (2021-11-09T13:30:34Z)
Individual Explanations in Machine Learning Models: A Survey for Practitioners [69.02688684221265]
The use of sophisticated statistical models that influence decisions in domains of high societal relevance is on the rise. Many governments, institutions, and companies are reluctant to adopt them, as their output is often difficult to explain in human-interpretable ways. Recently, the academic literature has proposed a substantial number of methods for providing interpretable explanations of machine learning models.
arXiv Detail & Related papers (2021-04-09T01:46:34Z)
Survey of Network Intrusion Detection Methods from the Perspective of the Knowledge Discovery in Databases Process [63.75363908696257]
We review the methods that have been applied to network data with the purpose of developing an intrusion detector. We discuss the techniques used for the capture, preparation and transformation of the data, as well as the data mining and evaluation methods. As a result of this literature review, we investigate some open issues which will need to be considered for further research in the area of network security.
arXiv Detail & Related papers (2020-01-27T11:21:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.