When and Why does a Model Fail? A Human-in-the-loop Error Detection Framework for Sentiment Analysis
- URL: http://arxiv.org/abs/2106.00954v1
- Date: Wed, 2 Jun 2021 05:45:42 GMT
- Title: When and Why does a Model Fail? A Human-in-the-loop Error Detection Framework for Sentiment Analysis
- Authors: Zhe Liu, Yufan Guo, Jalal Mahmud
- Abstract summary: We propose an error detection framework for sentiment analysis based on explainable features.
Experimental results show that, given limited human-in-the-loop intervention, our method is able to identify erroneous model predictions on unseen data with high precision.
- Score: 12.23497603132782
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although deep neural networks have been widely employed and proven effective
in sentiment analysis tasks, it remains challenging for model developers to
assess their models for erroneous predictions that might exist prior to
deployment. Once deployed, emergent errors can be hard to identify in
prediction run-time and impossible to trace back to their sources. To address
such gaps, in this paper we propose an error detection framework for sentiment
analysis based on explainable features. We perform global-level feature
validation with human-in-the-loop assessment, followed by an integration of
global and local-level feature contribution analysis. Experimental results show
that, given limited human-in-the-loop intervention, our method is able to
identify erroneous model predictions on unseen data with high precision.
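The workflow described above (a human reviewer validates globally important features, and predictions whose local explanations lean on invalidated features are flagged) can be sketched in a few lines. The snippet below is a hedged illustration, not the authors' implementation: the toy vocabulary, the linear sentiment weights, and the `flag_prediction` heuristic are all assumptions.
```python
# Hedged sketch of the global/local feature-contribution idea described above.
# The toy vocabulary, linear weights, and flagging heuristic are illustrative
# assumptions, not the authors' implementation.
import numpy as np

vocab = ["great", "terrible", "plot", "not", "boring"]
# Toy weights of a linear sentiment model (positive = pushes toward "positive").
weights = np.array([1.8, -2.1, 0.1, -0.4, 1.2])  # "boring" has learned the wrong sign

def human_validate(vocab, weights, judgments):
    """Global-level, human-in-the-loop step: mark features whose learned
    polarity contradicts the reviewer's judgment."""
    return {w for w, coef in zip(vocab, weights)
            if w in judgments and np.sign(coef) != judgments[w]}

def flag_prediction(x, weights, vocab, invalid, ratio=0.5):
    """Local-level step: flag a prediction as likely erroneous if invalidated
    features account for more than `ratio` of its absolute contribution."""
    contrib = np.abs(weights * x)                        # per-feature local contribution
    bad = sum(c for c, w in zip(contrib, vocab) if w in invalid)
    return bad > ratio * contrib.sum()

judgments = {"great": +1, "terrible": -1, "boring": -1}  # reviewer's expected polarities
invalid = human_validate(vocab, weights, judgments)       # -> {"boring"}

x = np.array([0.0, 0.0, 1.0, 0.0, 2.0])                   # a review dominated by "boring"
print(flag_prediction(x, weights, vocab, invalid))        # True -> route for inspection
```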
Related papers
- Unsupervised Model Diagnosis [49.36194740479798]
This paper proposes Unsupervised Model Diagnosis (UMO) to produce semantic counterfactual explanations without any user guidance.
Our approach identifies and visualizes changes in semantics, and then matches these changes to attributes from wide-ranging text sources.
arXiv Detail & Related papers (2024-10-08T17:59:03Z)
- Parameter uncertainties for imperfect surrogate models in the low-noise regime [0.3069335774032178]
We analyze the generalization error of misspecified, near-deterministic surrogate models.
We show that posterior distributions must cover every training point to avoid a divergent generalization error.
This is demonstrated on model problems before application to thousand-dimensional datasets in atomistic machine learning.
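A minimal sketch of the coverage criterion summarized above, assuming a Bayesian linear surrogate fitted to near-deterministic data; the prior, noise level, and the 2-sigma band are illustrative choices rather than the paper's exact analysis.
```python
# Hedged sketch: fit a (misspecified) Bayesian linear surrogate and check
# whether the posterior predictive interval covers every training target.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(30, 1))
y = np.sin(3 * X[:, 0]) + 0.01 * rng.normal(size=30)   # near-deterministic data

Phi = np.hstack([np.ones((30, 1)), X])                  # linear features: misspecified model
alpha, beta = 1.0, 1.0 / 0.01**2                        # prior precision, noise precision

# Closed-form Gaussian posterior over the weights.
S = np.linalg.inv(alpha * np.eye(2) + beta * Phi.T @ Phi)
m = beta * S @ Phi.T @ y

# Posterior predictive mean and std at the training points.
mean = Phi @ m
std = np.sqrt(1.0 / beta + np.einsum("ij,jk,ik->i", Phi, S, Phi))

covered = np.abs(y - mean) <= 2 * std                   # 2-sigma coverage check
print(f"{covered.sum()}/{len(y)} training points covered by the posterior")
```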
arXiv Detail & Related papers (2024-02-02T11:41:21Z)
- An Ambiguity Measure for Recognizing the Unknowns in Deep Learning [0.0]
We study deep neural networks with respect to the scope of the data on which they are trained.
We propose a measure for quantifying the ambiguity of inputs for any given model.
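The paper defines its own ambiguity measure, which is not reproduced here; as a point of reference, the sketch below uses normalized predictive entropy as a simple, commonly used proxy for input ambiguity.
```python
# Illustrative proxy only, not the paper's measure: normalized entropy of the
# model's predictive distribution as an ambiguity score in [0, 1].
import numpy as np

def ambiguity(probs, eps=1e-12):
    """0 = confident prediction, 1 = maximally ambiguous input."""
    p = np.clip(probs, eps, 1.0)
    return float(-(p * np.log(p)).sum() / np.log(len(p)))

print(ambiguity(np.array([0.98, 0.01, 0.01])))  # low ambiguity, likely within training scope
print(ambiguity(np.array([0.34, 0.33, 0.33])))  # high ambiguity, treat as an "unknown"
```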
arXiv Detail & Related papers (2023-12-11T02:57:12Z)
- PAGER: A Framework for Failure Analysis of Deep Regression Models [27.80057763697904]
We introduce PAGER (Principled Analysis of Generalization Errors in Regressors), a framework to systematically detect and characterize failures in deep regressors.
Built upon the principle of anchored training in deep models, PAGER unifies both epistemic uncertainty and complementary manifold non-conformity scores to accurately organize samples into different risk regimes.
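A hedged sketch of the score-combination idea: each sample is assigned to a risk regime from an uncertainty score and a non-conformity score. How PAGER actually computes these scores via anchored training is not reproduced here; the thresholds and regime names below are assumptions.
```python
# Illustrative assignment of samples to risk regimes from two scores.
def risk_regime(uncertainty, nonconformity, u_thr=0.5, n_thr=0.5):
    if uncertainty <= u_thr and nonconformity <= n_thr:
        return "in-distribution"          # both signals low: trust the regressor
    if uncertainty > u_thr and nonconformity > n_thr:
        return "high-risk"                # both signals high: likely failure
    return "moderate-risk"                # the two signals disagree

samples = [(0.1, 0.2), (0.8, 0.9), (0.2, 0.7)]
print([risk_regime(u, n) for u, n in samples])
```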
arXiv Detail & Related papers (2023-09-20T00:37:35Z)
- Modeling Uncertain Feature Representation for Domain Generalization [49.129544670700525]
We show that our method consistently improves the network generalization ability on multiple vision tasks.
Our methods are simple yet effective and can be readily integrated into networks without additional trainable parameters or loss constraints.
arXiv Detail & Related papers (2023-01-16T14:25:02Z)
- Generalizability Analysis of Graph-based Trajectory Predictor with Vectorized Representation [29.623692599892365]
Trajectory prediction is one of the essential tasks for autonomous vehicles.
Recent progress in machine learning has given rise to a series of advanced trajectory prediction algorithms.
arXiv Detail & Related papers (2022-08-06T20:19:52Z)
- Residual Error: a New Performance Measure for Adversarial Robustness [85.0371352689919]
A major challenge that limits the widespread adoption of deep neural networks has been their fragility to adversarial attacks.
This study presents the concept of residual error, a new performance measure for assessing the adversarial robustness of a deep neural network.
Experimental results on image classification demonstrate the effectiveness of the proposed residual error metric.
arXiv Detail & Related papers (2021-06-18T16:34:23Z)
- Predicting Unreliable Predictions by Shattering a Neural Network [145.3823991041987]
Piecewise linear neural networks can be split into subfunctions.
Subfunctions have their own activation pattern, domain, and empirical error.
Empirical error for the full network can be written as an expectation over subfunctions.
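The subfunction decomposition summarized above can be checked numerically: group samples by their ReLU activation pattern and verify that the full empirical error equals the pattern-frequency-weighted average of per-pattern errors. The tiny random network and squared-error loss below are illustrative assumptions.
```python
# Hedged sketch: empirical error of a piecewise linear network as an
# expectation over its activation-pattern subfunctions.
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(3, 2)), rng.normal(size=3)
W2, b2 = rng.normal(size=3), 0.0

def forward(x):
    h = np.maximum(W1 @ x + b1, 0.0)          # ReLU hidden layer
    return W2 @ h + b2, tuple(h > 0)          # prediction and activation pattern

X = rng.normal(size=(200, 2))
y = X[:, 0] - X[:, 1]                          # toy regression target

per_pattern = defaultdict(list)                # squared errors grouped by subfunction
for x_i, y_i in zip(X, y):
    pred, pattern = forward(x_i)
    per_pattern[pattern].append((pred - y_i) ** 2)

total_err = np.mean([(forward(x_i)[0] - y_i) ** 2 for x_i, y_i in zip(X, y)])
weighted = sum(len(errs) / len(X) * np.mean(errs) for errs in per_pattern.values())
print(np.isclose(total_err, weighted))         # True: error is an expectation over subfunctions
```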
arXiv Detail & Related papers (2021-06-15T18:34:41Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
However, they are often overconfident, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
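A minimal sketch of the entropy-raising step described above: predictions flagged as unjustifiably overconfident are interpolated toward the label prior. How such regions of feature space are identified, and how strongly to interpolate, are the paper's contribution and are only stubbed out here as assumptions.
```python
# Hedged sketch: soften flagged overconfident predictions toward the label prior.
import numpy as np

label_prior = np.array([0.5, 0.3, 0.2])       # empirical class frequencies

def raise_entropy(probs, flagged, strength=0.7):
    """Move a flagged prediction toward the prior; leave trusted ones unchanged."""
    if not flagged:
        return probs
    return (1 - strength) * probs + strength * label_prior

overconfident = np.array([0.99, 0.005, 0.005])
print(raise_entropy(overconfident, flagged=True))   # softened, higher-entropy prediction
```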
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
- Estimating Generalization under Distribution Shifts via Domain-Invariant Representations [75.74928159249225]
We use a set of domain-invariant predictors as a proxy for the unknown, true target labels.
The error of the resulting risk estimate depends on the target risk of the proxy model.
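A hedged sketch of the proxy idea: on unlabeled target-domain data, the model's risk is estimated by its disagreement with an assumed domain-invariant proxy predictor. The stand-in functions below are illustrative, and the estimate's error is bounded by the proxy's own (unknown) target risk.
```python
# Illustrative risk estimate on unlabeled target data via a proxy predictor.
import numpy as np

rng = np.random.default_rng(2)
X_target = rng.normal(size=(1000, 2))                        # unlabeled target-domain inputs

model = lambda X: (X[:, 0] + 0.3 * X[:, 1] > 0).astype(int)  # deployed model (stand-in)
proxy = lambda X: (X[:, 0] > 0).astype(int)                  # assumed domain-invariant proxy

estimated_risk = np.mean(model(X_target) != proxy(X_target))
print(f"estimated target risk (disagreement with proxy): {estimated_risk:.3f}")
```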
arXiv Detail & Related papers (2020-07-06T17:21:24Z)
- A comprehensive study on the prediction reliability of graph neural networks for virtual screening [0.0]
We investigate the effects of model architectures, regularization methods, and loss functions on the prediction performance and reliability of classification results.
Our results highlight that the correct choice of regularization and inference methods is important for achieving a high success rate.
arXiv Detail & Related papers (2020-03-17T10:13:31Z)