A Symmetric Loss Perspective of Reliable Machine Learning
- URL: http://arxiv.org/abs/2101.01366v2
- Date: Mon, 5 Jun 2023 23:49:35 GMT
- Title: A Symmetric Loss Perspective of Reliable Machine Learning
- Authors: Nontawat Charoenphakdee, Jongyeong Lee, Masashi Sugiyama
- Abstract summary: We review how a symmetric loss can yield robust classification from corrupted labels in balanced error rate (BER) minimization.
We demonstrate how the robust AUC maximization method can benefit natural language processing in a problem where we want to learn only from relevant keywords and unlabeled documents.
- Score: 87.68601212686086
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When minimizing the empirical risk in binary classification, it is a common
practice to replace the zero-one loss with a surrogate loss to make the
learning objective feasible to optimize. Examples of well-known surrogate
losses for binary classification include the logistic loss, hinge loss, and
sigmoid loss. It is known that the choice of a surrogate loss can highly
influence the performance of the trained classifier and therefore it should be
carefully chosen. Recently, surrogate losses that satisfy a certain symmetric
condition (also known as symmetric losses) have demonstrated their usefulness in
learning from corrupted labels. In this article, we provide an overview of
symmetric losses and their applications. First, we review how a symmetric loss
can yield robust classification from corrupted labels in balanced error rate
(BER) minimization and area under the receiver operating characteristic curve
(AUC) maximization. Then, we demonstrate how the robust AUC maximization method
can benefit natural language processing in the problem where we want to learn
only from relevant keywords and unlabeled documents. Finally, we conclude this
article by discussing future directions, including potential applications of
symmetric losses for reliable machine learning and the design of non-symmetric
losses that can benefit from the symmetric condition.
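To make the symmetric condition concrete: a margin loss l is called symmetric when l(z) + l(-z) is a constant for every margin z, which the sigmoid loss satisfies but the logistic and hinge losses do not. The following minimal Python sketch numerically checks this condition and illustrates a pairwise surrogate risk of the kind used for AUC maximization; the function names, the grid-based check, and the toy data are illustrative assumptions, not code from the paper.

import numpy as np

# Surrogate losses of the margin z = y * f(x). Implementations are an
# illustrative sketch, not the authors' code.
def sigmoid_loss(z):
    return 1.0 / (1.0 + np.exp(z))      # symmetric: l(z) + l(-z) = 1 for all z

def logistic_loss(z):
    return np.log1p(np.exp(-z))         # convex surrogate, not symmetric

def hinge_loss(z):
    return np.maximum(0.0, 1.0 - z)     # convex surrogate, not symmetric

def is_symmetric(loss, grid=np.linspace(-5.0, 5.0, 1001), tol=1e-9):
    """Numerically check the symmetric condition l(z) + l(-z) = const."""
    s = loss(grid) + loss(-grid)
    return bool(np.allclose(s, s[0], atol=tol))

def pairwise_auc_risk(loss, scores_pos, scores_neg):
    """Empirical pairwise surrogate risk for AUC maximization: the average
    loss over score differences f(x_pos) - f(x_neg)."""
    diffs = scores_pos[:, None] - scores_neg[None, :]
    return float(loss(diffs).mean())

if __name__ == "__main__":
    for name, loss in [("sigmoid", sigmoid_loss),
                       ("logistic", logistic_loss),
                       ("hinge", hinge_loss)]:
        print(f"{name:8s} symmetric: {is_symmetric(loss)}")
    # Expected: only the sigmoid loss satisfies the symmetric condition.

    rng = np.random.default_rng(0)
    scores_pos = rng.normal(1.0, 1.0, size=50)   # toy scores on positive examples
    scores_neg = rng.normal(-1.0, 1.0, size=50)  # toy scores on negative examples
    print("pairwise sigmoid-loss AUC risk:",
          round(pairwise_auc_risk(sigmoid_loss, scores_pos, scores_neg), 4))

As the abstract reviews, plugging a symmetric loss into BER minimization or into such a pairwise AUC objective is what yields robustness to corrupted labels: roughly speaking, under the corruption models considered, the corrupted surrogate risk becomes an affine transformation of the clean one, so the optimal solution is unchanged.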
Related papers
- EnsLoss: Stochastic Calibrated Loss Ensembles for Preventing Overfitting in Classification [1.3778851745408134]
We propose a novel ensemble method, namely EnsLoss, to combine loss functions within the empirical risk minimization framework.
We first transform the CC conditions of losses into loss-derivatives, thereby bypassing the need for explicit loss functions.
We theoretically establish the statistical consistency of our approach and provide insights into its benefits.
arXiv Detail & Related papers (2024-09-02T02:40:42Z) - Learning Layer-wise Equivariances Automatically using Gradients [66.81218780702125]
Convolutions encode equivariance symmetries into neural networks leading to better generalisation performance.
However, symmetries provide fixed hard constraints on the functions a network can represent; they need to be specified in advance and cannot be adapted.
Our goal is to allow flexible symmetry constraints that can automatically be learned from data using gradients.
arXiv Detail & Related papers (2023-10-09T20:22:43Z) - Symmetric Neural-Collapse Representations with Supervised Contrastive
Loss: The Impact of ReLU and Batching [26.994954303270575]
Supervised contrastive loss (SCL) is a competitive and often superior alternative to the cross-entropy loss for classification.
While prior studies have demonstrated that both losses yield symmetric training representations under balanced data, this symmetry breaks under class imbalances.
This paper presents an intriguing discovery: the introduction of a ReLU activation at the final layer effectively restores the symmetry in SCL-learned representations.
arXiv Detail & Related papers (2023-06-13T17:55:39Z) - Loss Minimization through the Lens of Outcome Indistinguishability [11.709566373491619]
We present a new perspective on convex loss and the recent notion of Omniprediction.
By design, Loss OI implies omniprediction in a direct and intuitive manner.
We show that Loss OI can be achieved for the important set of losses arising from Generalized Linear Models, without requiring full multicalibration.
arXiv Detail & Related papers (2022-10-16T22:25:27Z) - Constrained Classification and Policy Learning [0.0]
We study consistency of surrogate loss procedures under a constrained set of classifiers.
We show that hinge losses are the only surrogate losses that preserve consistency in second-best scenarios.
arXiv Detail & Related papers (2021-06-24T10:43:00Z) - Leveraged Weighted Loss for Partial Label Learning [64.85763991485652]
Partial label learning deals with data where each instance is assigned with a set of candidate labels, whereas only one of them is true.
Despite many methodological studies on learning from partial labels, there is still a lack of theoretical understanding of their risk-consistency properties.
We propose a family of loss functions named leveraged weighted (LW) loss, which for the first time introduces the leverage parameter $\beta$ to consider the trade-off between losses on partial labels and non-partial ones.
arXiv Detail & Related papers (2021-06-10T13:25:13Z) - Asymmetric Loss Functions for Learning with Noisy Labels [82.50250230688388]
We propose a new class of loss functions, namely asymmetric loss functions, which are robust to learning with noisy labels for various types of noise.
Experimental results on benchmark datasets demonstrate that asymmetric loss functions can outperform state-of-the-art methods.
arXiv Detail & Related papers (2021-06-06T12:52:48Z) - Lower-bounded proper losses for weakly supervised classification [73.974163801142]
We discuss the problem of weakly supervised learning of classification, in which instances are given weak labels.
We derive a representation theorem for proper losses in supervised learning, which dualizes the Savage representation.
We experimentally demonstrate the effectiveness of our proposed approach, as compared to improper or unbounded losses.
arXiv Detail & Related papers (2021-03-04T08:47:07Z) - Calibrated Surrogate Losses for Adversarially Robust Classification [92.37268323142307]
We show that no convex surrogate loss is calibrated with respect to the adversarial 0-1 loss when restricted to linear models.
We also show that if the underlying distribution satisfies Massart's noise condition, convex losses can be calibrated in the adversarial setting.
arXiv Detail & Related papers (2020-05-28T02:40:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.