Related papers: A Bayesian Approach to Identifying Representational Errors

URL: http://arxiv.org/abs/2103.15171v1
Date: Sun, 28 Mar 2021 16:43:01 GMT
Title: A Bayesian Approach to Identifying Representational Errors
Authors: Ramya Ramakrishnan, Vaibhav Unhelkar, Ece Kamar, Julie Shah
Abstract summary: We present a generative model for inferring representational errors based on observations of an actor's behavior. We show that our approach can recover blind spots of both reinforcement learning agents as well as human users.
Score: 19.539720986687524
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Trained AI systems and expert decision makers can make errors that are often difficult to identify and understand. Determining the root cause for these errors can improve future decisions. This work presents Generative Error Model (GEM), a generative model for inferring representational errors based on observations of an actor's behavior (either simulated agent, robot, or human). The model considers two sources of error: those that occur due to representational limitations -- "blind spots" -- and non-representational errors, such as those caused by noise in execution or systematic errors present in the actor's policy. Disambiguating these two error types allows for targeted refinement of the actor's policy (i.e., representational errors require perceptual augmentation, while other errors can be reduced through methods such as improved training or attention support). We present a Bayesian inference algorithm for GEM and evaluate its utility in recovering representational errors on multiple domains. Results show that our approach can recover blind spots of both reinforcement learning agents as well as human users.

Related papers

Technical Report for Egocentric Mistake Detection for the HoloAssist Challenge [5.257305312436567]
We introduce an online mistake detection framework that handles both procedural and execution errors. Upon detecting an error, we use a large language model (LLM) to generate explanatory feedback. Experiments on the HoloAssist benchmark confirm the effectiveness of our approach.
arXiv Detail & Related papers (2025-06-06T15:39:09Z)
Classification Error Bound for Low Bayes Error Conditions in Machine Learning [50.25063912757367]
We study the relationship between the error mismatch and the Kullback-Leibler divergence in machine learning. Motivated by recent observations of low model-based classification errors in many machine learning tasks, we propose a linear approximation of the classification error bound for low Bayes error conditions.
arXiv Detail & Related papers (2025-01-27T11:57:21Z)
Automatic Discovery and Assessment of Interpretable Systematic Errors in Semantic Segmentation [0.5242869847419834]
This paper presents a novel method for discovering systematic errors in segmentation models. We leverage multimodal foundation models to retrieve errors and use conceptual linkage along with erroneous nature to study the systematic nature of these errors. Our work opens up the avenue to model analysis and intervention that have so far been underexplored in semantic segmentation.
arXiv Detail & Related papers (2024-11-16T17:31:37Z)
Subtle Errors Matter: Preference Learning via Error-injected Self-editing [59.405145971637204]
We propose a novel preference learning framework called eRror-Injected Self-Editing (RISE). RISE injects predefined subtle errors into partial tokens of correct solutions to construct hard pairs for error mitigation. Experiments validate the effectiveness of RISE, with preference learning on Qwen2-7B-Instruct yielding notable improvements of 3.0% on GSM8K and 7.9% on MATH.
arXiv Detail & Related papers (2024-10-09T07:43:38Z)
Unveiling AI's Blind Spots: An Oracle for In-Domain, Out-of-Domain, and Adversarial Errors [4.525077884001726]
We conduct empirical evaluations using a "mentor" model, a deep neural network designed to predict another model's errors. We develop an "oracle" mentor model, dubbed SuperMentor, that achieves 78% accuracy in predicting errors across different error types.
arXiv Detail & Related papers (2024-10-03T11:02:39Z)
A Coin Has Two Sides: A Novel Detector-Corrector Framework for Chinese Spelling Correction [79.52464132360618]
Chinese Spelling Correction (CSC) stands as a foundational Natural Language Processing (NLP) task. We introduce a novel approach based on an error detector-corrector framework. Our detector is designed to yield two error detection results, each characterized by high precision and recall.
arXiv Detail & Related papers (2024-09-06T09:26:45Z)
Understanding and Mitigating Classification Errors Through Interpretable Token Patterns [58.91023283103762]
Characterizing errors in easily interpretable terms gives insight into whether a classifier is prone to making systematic errors. We propose to discover those patterns of tokens that distinguish correct and erroneous predictions. We show that our method, Premise, performs well in practice.
arXiv Detail & Related papers (2023-11-18T00:24:26Z)
Accountable Error Characterization [7.830479195591646]
We propose an accountable error characterization method, AEC, to understand when and where errors occur. We perform error detection for a sentiment analysis task using AEC as a case study.
arXiv Detail & Related papers (2021-05-10T23:40:01Z)
Neural Text Generation with Artificial Negative Examples [7.187858820534111]
We propose to suppress an arbitrary type of errors by training the text generation model in a reinforcement learning framework. We use a trainable reward function that is capable of discriminating between references and sentences containing the targeted type of errors. The experimental results show that our method can suppress the generation errors and achieve significant improvements on two machine translation and two image captioning tasks.
arXiv Detail & Related papers (2020-12-28T07:25:10Z)
Learning outside the Black-Box: The pursuit of interpretable models [78.32475359554395]
This paper proposes an algorithm that produces a continuous global interpretation of any given continuous black-box function. Our interpretation represents a leap forward from the previous state of the art.
arXiv Detail & Related papers (2020-11-17T12:39:44Z)
Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle. In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize. Our approach is agnostic to class labels from the training set which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z)
Estimating Generalization under Distribution Shifts via Domain-Invariant Representations [75.74928159249225]
We use a set of domain-invariant predictors as a proxy for the unknown, true target labels. The error of the resulting risk estimate depends on the target risk of the proxy model.
arXiv Detail & Related papers (2020-07-06T17:21:24Z)
A Unified Weight Learning and Low-Rank Regression Model for Robust Complex Error Modeling [12.287346997617542]
One of the most important problems in regression-based error models is modeling the complex representation error caused by various corruptions and environment changes in images. In this paper, we propose a unified weight learning and low-rank approximation regression model, which enables random noise and contiguous occlusions in images to be treated simultaneously.
arXiv Detail & Related papers (2020-05-10T09:50:14Z)
