When and Why does a Model Fail? A Human-in-the-loop Error Detection
Framework for Sentiment Analysis
- URL: http://arxiv.org/abs/2106.00954v1
- Date: Wed, 2 Jun 2021 05:45:42 GMT
- Title: When and Why does a Model Fail? A Human-in-the-loop Error Detection
Framework for Sentiment Analysis
- Authors: Zhe Liu, Yufan Guo, Jalal Mahmud
- Abstract summary: We propose an error detection framework for sentiment analysis based on explainable features.
Experimental results show that, given limited human-in-the-loop intervention, our method is able to identify erroneous model predictions on unseen data with high precision.
- Score: 12.23497603132782
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although deep neural networks have been widely employed and proven effective
in sentiment analysis tasks, it remains challenging for model developers to
assess their models for erroneous predictions that might exist prior to
deployment. Once deployed, emergent errors can be hard to identify in
prediction run-time and impossible to trace back to their sources. To address
such gaps, in this paper we propose an error detection framework for sentiment
analysis based on explainable features. We perform global-level feature
validation with human-in-the-loop assessment, followed by an integration of
global and local-level feature contribution analysis. Experimental results show
that, given limited human-in-the-loop intervention, our method is able to
identify erroneous model predictions on unseen data with high precision.
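To make the abstract's idea concrete, here is a minimal sketch, assuming a linear bag-of-words sentiment classifier: a human reviewer first validates (or rejects) the globally important features, and each prediction is then flagged when its local feature contributions rely mostly on rejected features. All function names, variables, and the threshold are hypothetical illustrations, not the authors' implementation.

```python
# Hypothetical sketch: human-in-the-loop error flagging for a linear
# bag-of-words sentiment model. Illustrative only, not the paper's method.
import numpy as np

def local_contributions(weights, x):
    """Per-feature contribution for one example: weight * feature value."""
    return weights * x

def flag_prediction(weights, x, rejected_features, threshold=0.5):
    """Flag a prediction as suspect when most of its absolute evidence
    comes from globally important features a human reviewer rejected."""
    contrib = np.abs(local_contributions(weights, x))
    total = contrib.sum()
    if total == 0:
        return False
    rejected_mass = contrib[sorted(rejected_features)].sum()
    return (rejected_mass / total) > threshold

# Toy usage: 5 vocabulary features; the human rejected features 1 and 3
# (e.g. spurious tokens that the global analysis surfaced as important).
weights = np.array([0.2, 1.5, -0.3, 2.0, 0.1])
x = np.array([0, 1, 0, 1, 0])               # evidence comes only from features 1 and 3
print(flag_prediction(weights, x, {1, 3}))  # True -> likely erroneous prediction
```

In this toy setting a prediction is flagged when more than half of its absolute evidence comes from human-rejected features; the paper's framework integrates global validation with local-level contribution analysis in a more principled way.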
Related papers
- Causal-discovery-based root-cause analysis and its application in time-series prediction error diagnosis [8.309366167066278]
Heuristic attribution methods, while helpful, often fail to capture true causal relationships, leading to inaccurate error attributions.
We introduce the Causal-Discovery-based Root-Cause Analysis (CD-RCA) method that estimates causal relationships between the prediction error and the explanatory variables.
By simulating synthetic error data, CD-RCA attributes outliers in prediction errors to individual explanatory variables via Shapley values (a generic Shapley-attribution sketch appears after this list).
arXiv Detail & Related papers (2024-11-11T13:48:13Z)
- Unsupervised Model Diagnosis [49.36194740479798]
This paper proposes Unsupervised Model Diagnosis (UMO) to produce semantic counterfactual explanations without any user guidance.
Our approach identifies and visualizes changes in semantics, and then matches these changes to attributes from wide-ranging text sources.
arXiv Detail & Related papers (2024-10-08T17:59:03Z)
- An Ambiguity Measure for Recognizing the Unknowns in Deep Learning [0.0]
We study the understanding of deep neural networks in relation to the scope of the data on which they are trained.
We propose a measure for quantifying the ambiguity of inputs for any given model.
arXiv Detail & Related papers (2023-12-11T02:57:12Z)
- PAGER: A Framework for Failure Analysis of Deep Regression Models [27.80057763697904]
We introduce PAGER (Principled Analysis of Generalization Errors in Regressors), a framework to systematically detect and characterize failures in deep regressors.
Built upon the principle of anchored training in deep models, PAGER unifies both epistemic uncertainty and complementary manifold non-conformity scores to accurately organize samples into different risk regimes.
arXiv Detail & Related papers (2023-09-20T00:37:35Z)
- Modeling Uncertain Feature Representation for Domain Generalization [49.129544670700525]
We show that our method consistently improves the network generalization ability on multiple vision tasks.
Our method is simple yet effective and can be readily integrated into networks without additional trainable parameters or loss constraints.
arXiv Detail & Related papers (2023-01-16T14:25:02Z)
- Generalizability Analysis of Graph-based Trajectory Predictor with Vectorized Representation [29.623692599892365]
Trajectory prediction is one of the essential tasks for autonomous vehicles.
Recent progress in machine learning has given rise to a series of advanced trajectory prediction algorithms.
arXiv Detail & Related papers (2022-08-06T20:19:52Z)
- Residual Error: a New Performance Measure for Adversarial Robustness [85.0371352689919]
A major challenge limiting the widespread adoption of deep neural networks has been their fragility to adversarial attacks.
This study presents the concept of residual error, a new performance measure for assessing the adversarial robustness of a deep neural network.
Experimental results using the case of image classification demonstrate the effectiveness and efficacy of the proposed residual error metric.
arXiv Detail & Related papers (2021-06-18T16:34:23Z)
- Predicting Unreliable Predictions by Shattering a Neural Network [145.3823991041987]
Piecewise linear neural networks can be split into subfunctions.
Subfunctions have their own activation pattern, domain, and empirical error.
Empirical error for the full network can be written as an expectation over subfunctions.
arXiv Detail & Related papers (2021-06-15T18:34:41Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
However, they are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
- Estimating Generalization under Distribution Shifts via Domain-Invariant Representations [75.74928159249225]
We use a set of domain-invariant predictors as a proxy for the unknown, true target labels.
The error of the resulting risk estimate depends on the target risk of the proxy model.
arXiv Detail & Related papers (2020-07-06T17:21:24Z)
- A comprehensive study on the prediction reliability of graph neural networks for virtual screening [0.0]
We investigate the effects of model architectures, regularization methods, and loss functions on the prediction performance and reliability of classification results.
Our results highlight that the correct choice of regularization and inference methods is important for achieving a high success rate.
arXiv Detail & Related papers (2020-03-17T10:13:31Z)
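Following up on the CD-RCA entry above, the sketch below illustrates the generic idea of attributing a prediction-error outlier to input variables with exact Shapley values, toggling each variable between its observed value and a baseline. All names are hypothetical and the enumeration is exponential in the number of variables; this is a small model-agnostic illustration, not the CD-RCA implementation.

```python
# Hypothetical exact Shapley attribution of a prediction error to input
# variables. Illustrative only; not the CD-RCA implementation.
from itertools import combinations
from math import factorial

def error_fn(active, x, baseline, predict, y_true):
    """Absolute prediction error when only the 'active' variables take
    their observed values and the rest are held at the baseline."""
    z = [x[i] if i in active else baseline[i] for i in range(len(x))]
    return abs(predict(z) - y_true)

def shapley_error_attribution(x, baseline, predict, y_true):
    """Shapley value of each variable for the prediction error."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for subset in combinations(others, k):
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                with_i = error_fn(set(subset) | {i}, x, baseline, predict, y_true)
                without_i = error_fn(set(subset), x, baseline, predict, y_true)
                phi[i] += w * (with_i - without_i)
    return phi

# Toy usage: a linear "model" whose error spike is driven mainly by variable 0.
predict = lambda z: 2.0 * z[0] + 0.5 * z[1]
print(shapley_error_attribution(x=[3.0, 1.0], baseline=[0.0, 0.0],
                                predict=predict, y_true=0.5))
# -> [5.5, 0.0]; the attributions sum to error(x) - error(baseline) = 6.0 - 0.5
```

The efficiency property of Shapley values guarantees that the per-variable attributions sum to the difference between the observed error and the baseline error, which is what makes them suitable for root-cause-style diagnosis of error outliers.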