Utilizing Class Separation Distance for the Evaluation of Corruption
Robustness of Machine Learning Classifiers
- URL: http://arxiv.org/abs/2206.13405v1
- Date: Mon, 27 Jun 2022 15:56:16 GMT
- Title: Utilizing Class Separation Distance for the Evaluation of Corruption
Robustness of Machine Learning Classifiers
- Authors: Georg Siedel, Silvia Vock, Andrey Morozov, Stefan Voß
- Abstract summary: We propose a test data augmentation method that uses a robustness distance $\epsilon$ derived from the dataset's minimal class separation distance.
The resulting MSCR metric allows a dataset-specific comparison of different classifiers with respect to their corruption robustness.
Our results indicate that robustness training through simple data augmentation can already slightly improve accuracy.
- Score: 0.6882042556551611
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Robustness is a fundamental pillar of Machine Learning (ML) classifiers,
substantially determining their reliability. Methods for assessing classifier
robustness are therefore essential. In this work, we address the challenge of
evaluating corruption robustness in a way that allows comparability and
interpretability on a given dataset. We propose a test data augmentation method
that uses a robustness distance $\epsilon$ derived from the dataset's minimal
class separation distance. The resulting MSCR (mean statistical corruption
robustness) metric allows a dataset-specific comparison of different
classifiers with respect to their corruption robustness. The MSCR value is
interpretable, as it represents the classifier's avoidable loss of accuracy due
to statistical corruptions. On 2D and image data, we show that the metric
reflects different levels of classifier robustness. Furthermore, we observe
unexpected optima in robust accuracy when training and testing classifiers
with different levels of noise. While researchers have frequently reported a
significant accuracy tradeoff when training robust models, we strengthen the
view that a tradeoff between accuracy and corruption robustness
is not inherent. Our results indicate that robustness training through simple
data augmentation can already slightly improve accuracy.
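Below is a minimal, illustrative sketch of how an MSCR-style evaluation could be computed. The helper names, the choice of $\epsilon$ as half the minimal class separation distance, and the L2-sphere corruption model are assumptions made for illustration only, since the abstract does not fix these details.

```python
import numpy as np
from scipy.spatial.distance import cdist

def min_class_separation(X, y):
    """Smallest distance between any two samples carrying different labels
    (one plausible reading of the 'minimal class separation distance')."""
    d_min = np.inf
    labels = np.unique(y)
    for i, a in enumerate(labels):
        for b in labels[i + 1:]:
            d_min = min(d_min, cdist(X[y == a], X[y == b]).min())
    return d_min

def mscr(clf, X_test, y_test, eps, n_repeats=100, seed=0):
    """MSCR as the avoidable loss of accuracy under random corruptions of
    L2 magnitude eps (an illustrative instantiation, not the paper's exact recipe)."""
    rng = np.random.default_rng(seed)
    clean_acc = np.mean(clf.predict(X_test) == y_test)
    corrupted_accs = []
    for _ in range(n_repeats):
        # Random direction per sample, rescaled to L2 norm eps.
        noise = rng.standard_normal(X_test.shape)
        noise *= eps / np.linalg.norm(noise, axis=1, keepdims=True)
        corrupted_accs.append(np.mean(clf.predict(X_test + noise) == y_test))
    return clean_acc - float(np.mean(corrupted_accs))

# Hypothetical usage: eps derived once from the training data, then reused
# to compare different classifiers on the same dataset.
# eps = 0.5 * min_class_separation(X_train, y_train)
# print(mscr(clf_a, X_test, y_test, eps), mscr(clf_b, X_test, y_test, eps))
```

Under these assumptions, a smaller value indicates a classifier that loses less accuracy to corruptions confined within the class-separation margin, which is what makes the metric comparable across classifiers on the same dataset.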
Related papers
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious to collect in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- Investigating the Corruption Robustness of Image Classifiers with Random Lp-norm Corruptions [3.1337872355726084]
This study investigates the use of random p-norm corruptions to augment the training and test data of image classifiers.
We find that training data augmentation with a combination of p-norm corruptions significantly improves corruption robustness, even on top of state-of-the-art data augmentation schemes.
arXiv Detail & Related papers (2023-05-09T12:45:43Z)
- Characterizing the Optimal 0-1 Loss for Multi-class Classification with a Test-time Attacker [57.49330031751386]
We find achievable information-theoretic lower bounds on loss in the presence of a test-time attacker for multi-class classifiers on any discrete dataset.
We provide a general framework for finding the optimal 0-1 loss that revolves around the construction of a conflict hypergraph from the data and adversarial constraints.
arXiv Detail & Related papers (2023-02-21T15:17:13Z)
- Confidence-aware Training of Smoothed Classifiers for Certified Robustness [75.95332266383417]
We use "accuracy under Gaussian noise" as an easy-to-compute proxy of adversarial robustness for an input.
Our experiments show that the proposed method consistently exhibits improved certified robustness upon state-of-the-art training methods.
arXiv Detail & Related papers (2022-12-18T03:57:12Z)
- Robustness and Accuracy Could Be Reconcilable by (Proper) Definition [109.62614226793833]
The trade-off between robustness and accuracy has been widely studied in the adversarial literature.
We find that it may stem from the improperly defined robust error, which imposes an inductive bias of local invariance.
The proposed SCORE objective, by definition, facilitates the reconciliation between robustness and accuracy while still handling the worst-case uncertainty.
arXiv Detail & Related papers (2022-02-21T10:36:09Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Classification and Uncertainty Quantification of Corrupted Data using Semi-Supervised Autoencoders [11.300365160909879]
We present a probabilistic approach to classify strongly corrupted data and quantify uncertainty.
A semi-supervised autoencoder trained on uncorrupted data is the underlying architecture.
We show that the model uncertainty strongly depends on whether the classification is correct or wrong.
arXiv Detail & Related papers (2021-05-27T18:47:55Z)
- Adversarial Self-Supervised Contrastive Learning [62.17538130778111]
Existing adversarial learning approaches mostly use class labels to generate adversarial samples that lead to incorrect predictions.
We propose a novel adversarial attack for unlabeled data, which makes the model confuse the instance-level identities of the perturbed data samples.
We present a self-supervised contrastive learning framework to adversarially train a robust neural network without labeled data.
arXiv Detail & Related papers (2020-06-13T08:24:33Z)
- On the Role of Dataset Quality and Heterogeneity in Model Confidence [27.657631193015252]
Safety-critical applications require machine learning models that output accurate and calibrated probabilities.
Uncalibrated deep networks are known to make over-confident predictions.
We study the impact of dataset quality by examining how dataset size and label noise affect model confidence.
arXiv Detail & Related papers (2020-02-23T05:13:12Z)
- Variational Encoder-based Reliable Classification [5.161531917413708]
We propose an Epistemic Classifier (EC) that can provide justification for its belief using support from the training dataset as well as the quality of reconstruction.
Our approach is based on modified variational autoencoders that can identify a semantically meaningful low-dimensional space.
Our results demonstrate improved reliability of predictions and robust identification of samples with adversarial attacks.
arXiv Detail & Related papers (2020-02-19T17:05:32Z)
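As a minimal illustration of the ATC idea referenced above (Leveraging Unlabeled Data to Predict Out-of-Distribution Performance), the sketch below uses the maximum softmax probability as the confidence score. The function name and the quantile-based threshold fitting are assumptions for illustration, not the authors' exact implementation.

```python
import numpy as np

def atc_predict_accuracy(source_probs, source_labels, target_probs):
    """Average Thresholded Confidence (ATC), sketched from the summary above.
    Uses max softmax probability as the confidence score (an assumption;
    other scores such as negative entropy could be used instead)."""
    src_conf = source_probs.max(axis=1)
    src_acc = np.mean(source_probs.argmax(axis=1) == source_labels)
    # Fit the threshold so that the fraction of labeled *source* examples
    # whose confidence exceeds it matches the source accuracy.
    threshold = np.quantile(src_conf, 1.0 - src_acc)
    # Predicted target accuracy: fraction of unlabeled *target* examples
    # whose confidence exceeds that threshold.
    return np.mean(target_probs.max(axis=1) > threshold)
```

The same threshold learned on labeled source data is simply reused on the unlabeled target confidences, which is what makes the estimate cheap to compute at deployment time.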
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.