Repairing Group-Level Errors for DNNs Using Weighted Regularization
- URL: http://arxiv.org/abs/2203.13612v1
- Date: Thu, 24 Mar 2022 15:45:23 GMT
- Title: Repairing Group-Level Errors for DNNs Using Weighted Regularization
- Authors: Ziyuan Zhong, Yuchi Tian, Conor J. Sweeney, Vicente Ordonez-Roman,
Baishakhi Ray
- Abstract summary: Deep Neural Networks (DNNs) have been widely used in software that makes decisions impacting people's lives.
They have been found to exhibit severe erroneous behaviors that may lead to unfortunate outcomes.
Previous work shows that such misbehaviors often occur due to class property violations rather than errors on a single image.
Here, we propose a generic method called Weighted Regularization consisting of five concrete methods targeting the error-producing classes to fix the DNNs.
- Score: 15.180437840817785
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks (DNNs) have been widely used in software that
makes decisions impacting people's lives. However, they have been found to exhibit
severe erroneous behaviors that may lead to unfortunate outcomes. Previous work
shows that such misbehaviors often occur due to class property violations
rather than errors on a single image. Although methods for detecting such
errors have been proposed, fixing them has not been studied so far. Here, we
propose a generic method called Weighted Regularization (WR) consisting of five
concrete methods targeting the error-producing classes to fix the DNNs. In
particular, it can repair confusion error and bias error of DNN models for both
single-label and multi-label image classification. A confusion error happens
when a given DNN model tends to confuse two classes. Each method in WR assigns
more weight at a stage of DNN retraining or inference to mitigate the confusion
between the target pair. A bias error can be fixed similarly. We evaluate and
compare the proposed methods along with baselines on six widely used
dataset/architecture combinations. The results suggest that the WR methods have
different trade-offs, but under each setting at least one WR method can greatly
reduce confusion/bias errors at a very limited cost to overall performance.
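The paper evaluates several WR variants; as a rough, hypothetical sketch of the retraining-stage idea only (the class pair, boost factor, and stand-in tensors below are placeholders, not the authors' implementation), one of the simplest instantiations is a class-weighted cross-entropy that up-weights errors involving the confused pair:

```python
import torch
import torch.nn as nn

def weighted_ce_for_confused_pair(num_classes, pair, boost=5.0):
    """Cross-entropy whose per-class weights up-weight a confused pair.

    Sketch of the 'assign more weight during retraining' idea: samples
    whose true label is in the pair contribute `boost` times more loss.
    `pair` and `boost` are hypothetical hyperparameters.
    """
    weights = torch.ones(num_classes)
    weights[pair[0]] = boost  # up-weight the first class of the pair
    weights[pair[1]] = boost  # up-weight the second class of the pair
    return nn.CrossEntropyLoss(weight=weights)

# Retraining with this loss makes mistakes involving the confused
# pair (e.g., classes 3 and 5) cost more than mistakes elsewhere.
criterion = weighted_ce_for_confused_pair(num_classes=10, pair=(3, 5))
logits = torch.randn(8, 10)            # stand-in model outputs
labels = torch.randint(0, 10, (8,))    # stand-in ground-truth labels
loss = criterion(logits, labels)
```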
Related papers
- Subtle Errors Matter: Preference Learning via Error-injected Self-editing [59.405145971637204]
We propose a novel preference learning framework called eRror-Injected Self-Editing (RISE).
RISE injects predefined subtle errors into partial tokens of correct solutions to construct hard pairs for error mitigation.
Experiments validate the effectiveness of RISE, with preference learning on Qwen2-7B-Instruct yielding notable improvements of 3.0% on GSM8K and 7.9% on MATH.
arXiv Detail & Related papers (2024-10-09T07:43:38Z)
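As a toy illustration of the error-injection idea (RISE targets predefined subtle errors; the random token substitution below is a simplifying assumption, not the paper's procedure):

```python
import random

def make_preference_pair(solution_tokens, vocab, n_errors=2, seed=0):
    """Build a (chosen, rejected) pair by injecting small token errors.

    A simplified stand-in for RISE-style error injection: corrupt a few
    tokens of a correct solution to create a hard negative for
    preference learning.
    """
    rng = random.Random(seed)
    rejected = list(solution_tokens)
    for idx in rng.sample(range(len(rejected)),
                          k=min(n_errors, len(rejected))):
        rejected[idx] = rng.choice(vocab)  # swap in a plausible wrong token
    return list(solution_tokens), rejected

chosen, rejected = make_preference_pair(
    ["2", "+", "3", "=", "5"], vocab=["4", "6", "-", "*"])
```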
- Understanding and Mitigating Classification Errors Through Interpretable Token Patterns [58.91023283103762]
Characterizing errors in easily interpretable terms gives insight into whether a classifier is prone to making systematic errors.
We propose to discover those patterns of tokens that distinguish correct and erroneous predictions.
We show that our method, Premise, performs well in practice.
arXiv Detail & Related papers (2023-11-18T00:24:26Z)
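A crude stand-in for this kind of analysis (Premise mines token *patterns* with a principled objective; the single-token frequency gap below is only an approximation):

```python
from collections import Counter

def discriminative_tokens(correct_docs, error_docs, top_k=5):
    """Rank tokens by frequency gap between erroneous and correct inputs.

    Toy approximation of Premise-style pattern discovery: tokens that
    appear far more often in misclassified inputs hint at systematic
    errors.
    """
    c_freq = Counter(t for doc in correct_docs for t in set(doc))
    e_freq = Counter(t for doc in error_docs for t in set(doc))
    n_c, n_e = len(correct_docs), len(error_docs)
    score = {t: e_freq[t] / n_e - c_freq[t] / n_c
             for t in set(c_freq) | set(e_freq)}
    return sorted(score, key=score.get, reverse=True)[:top_k]
```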
- Semi-Supervised Learning with Multiple Imputations on Non-Random Missing Labels [0.0]
Semi-Supervised Learning (SSL) trains algorithms on both labeled and unlabeled data.
This paper proposes two new methods of combining multiple imputation models to achieve higher accuracy and less bias.
arXiv Detail & Related papers (2023-08-15T04:09:53Z)
- Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection [58.789823426981044]
We propose a novel auxiliary loss formulation that aims to align the class confidence of bounding boxes with the accuracy of predictions.
Our results reveal that our train-time loss surpasses strong calibration baselines in reducing calibration error for both in-domain and out-of-domain scenarios.
arXiv Detail & Related papers (2023-03-25T08:56:21Z)
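The paper defines its own loss for object detection; a generic train-time calibration auxiliary in the same spirit (this formulation is an assumption, not the paper's) penalizes the gap between predicted confidence and empirical correctness:

```python
import torch

def calibration_aux_loss(logits, labels):
    """Penalize |confidence - correctness| per sample.

    A generic stand-in for a train-time calibration term, not the
    paper's exact detection-specific formulation.
    """
    probs = torch.softmax(logits, dim=1)
    conf, pred = probs.max(dim=1)
    correct = (pred == labels).float()
    return (conf - correct).abs().mean()

# Added to the task loss during training, e.g.:
# total = task_loss + lambda_cal * calibration_aux_loss(logits, labels)
```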
- Verification-Aided Deep Ensemble Selection [4.290931412096984]
Deep neural networks (DNNs) have become the technology of choice for realizing a variety of complex tasks.
Even an imperceptible perturbation to a correctly classified input can lead to misclassification by a DNN.
This paper devises a methodology for identifying ensemble compositions that are less prone to simultaneous errors.
arXiv Detail & Related papers (2022-02-08T14:36:29Z)
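The paper identifies such compositions with formal verification; the selection objective itself can be sketched as a brute-force search over a validation set (an assumption-laden stand-in for the verification-based procedure):

```python
from itertools import combinations
import numpy as np

def least_joint_error_ensemble(pred_matrix, labels, k=3):
    """Pick the k members whose errors overlap least on validation data.

    pred_matrix: (n_models, n_samples) predicted labels. Brute force
    over compositions; the paper instead uses verification to reason
    about simultaneous errors.
    """
    errors = pred_matrix != labels[None, :]   # per-model error masks
    best, best_joint = None, np.inf
    for combo in combinations(range(len(pred_matrix)), k):
        joint = np.logical_and.reduce(errors[list(combo)]).sum()
        if joint < best_joint:
            best, best_joint = combo, joint
    return best
```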
- Optimization Variance: Exploring Generalization Properties of DNNs [83.78477167211315]
The test error of a deep neural network (DNN) often demonstrates double descent.
We propose a novel metric, optimization variance (OV), to measure the diversity of model updates.
arXiv Detail & Related papers (2021-06-03T09:34:17Z)
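One loose reading of such a metric (the paper's exact definition and normalization differ) is the variance of a single run's predictions across training checkpoints:

```python
import numpy as np

def optimization_variance(pred_snapshots):
    """Mean variance of predictions across training-trajectory snapshots.

    pred_snapshots: (T, n_samples, n_classes) class probabilities from
    T checkpoints of one run. High variance suggests unstable updates;
    a rough proxy for the paper's OV metric.
    """
    return float(np.asarray(pred_snapshots).var(axis=0).mean())
```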
- A Biased Graph Neural Network Sampler with Near-Optimal Regret [57.70126763759996]
Graph neural networks (GNN) have emerged as a vehicle for applying deep network architectures to graph and relational data.
In this paper, we build upon existing work and treat GNN neighbor sampling as a multi-armed bandit problem.
We introduce a newly designed reward function that incorporates some degree of bias to reduce variance and avoid unstable, possibly-unbounded payouts.
arXiv Detail & Related papers (2021-03-01T15:55:58Z)
- Generalized Negative Correlation Learning for Deep Ensembling [7.569288952340753]
Ensemble algorithms offer state-of-the-art performance in many machine learning applications.
We formulate a generalized bias-variance decomposition for arbitrary twice differentiable loss functions.
We derive a Generalized Negative Correlation Learning algorithm which offers explicit control over the ensemble's diversity.
arXiv Detail & Related papers (2020-11-05T16:29:22Z)
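The classic negative-correlation form, shown here with squared error as one instance of a twice-differentiable loss (the lambda weighting and tensor shapes are my notation, not the paper's exact decomposition):

```python
import torch

def gncl_loss(member_preds, target, lam=0.5):
    """Negative correlation learning with squared error.

    member_preds: (M, batch) predictions of M ensemble members.
    Penalizes member error, rewards spread around the ensemble mean;
    lam controls the accuracy/diversity trade-off.
    """
    mean = member_preds.mean(dim=0, keepdim=True)
    accuracy_term = ((member_preds - target.unsqueeze(0)) ** 2).mean()
    diversity_term = ((member_preds - mean) ** 2).mean()
    return accuracy_term - lam * diversity_term
```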
- One Versus all for deep Neural Network Incertitude (OVNNI) quantification [12.734278426543332]
We propose a new technique to easily quantify the epistemic uncertainty of data.
This method consists of mixing the predictions of an ensemble of DNNs trained to classify One class vs All the other classes (OVA) with predictions from a standard DNN trained to perform All vs All (AVA) classification.
arXiv Detail & Related papers (2020-06-01T14:06:12Z)
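The combination step can be sketched directly from this description (the array shapes and the element-wise product are assumptions about the exact mixing rule):

```python
import numpy as np

def ovnni_scores(ava_probs, ova_scores):
    """Combine All-vs-All and One-vs-All predictions, OVNNI-style.

    ava_probs:  (batch, C) softmax output of a standard AVA classifier.
    ova_scores: (batch, C) sigmoid outputs, column c from the network
                trained to separate class c from all others.
    Low products flag epistemically uncertain inputs.
    """
    return ava_probs * ova_scores

ava = np.array([[0.7, 0.2, 0.1]])
ova = np.array([[0.9, 0.1, 0.3]])
print(ovnni_scores(ava, ova))  # [[0.63 0.02 0.03]]
```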
- DMT: Dynamic Mutual Training for Semi-Supervised Learning [69.17919491907296]
Self-training methods usually rely on single model prediction confidence to filter low-confidence pseudo labels.
We propose mutual training between two different models via a dynamically re-weighted loss function, called Dynamic Mutual Training (DMT).
Our experiments show that DMT achieves state-of-the-art performance in both image classification and semantic segmentation.
arXiv Detail & Related papers (2020-04-18T03:12:55Z)
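One way to read the dynamic re-weighting (a sketch under assumptions; the paper's weighting function differs in detail) is to scale each pseudo-label's loss by the peer model's confidence and zero it out on disagreement:

```python
import torch
import torch.nn.functional as F

def dmt_style_loss(student_logits, teacher_logits):
    """Cross-entropy on the peer model's pseudo labels, re-weighted.

    Sketch of dynamic re-weighting: samples where the peer is confident
    and the two models agree contribute more. The paper's actual
    weighting differs in detail.
    """
    with torch.no_grad():
        teacher_probs = F.softmax(teacher_logits, dim=1)
        conf, pseudo = teacher_probs.max(dim=1)
        agree = (student_logits.argmax(dim=1) == pseudo).float()
        weights = conf * agree  # dynamic per-sample weight
    ce = F.cross_entropy(student_logits, pseudo, reduction="none")
    return (weights * ce).mean()
```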