$ε$-Softmax: Approximating One-Hot Vectors for Mitigating Label Noise
- URL: http://arxiv.org/abs/2508.02387v1
- Date: Mon, 04 Aug 2025 13:10:48 GMT
- Title: $ε$-Softmax: Approximating One-Hot Vectors for Mitigating Label Noise
- Authors: Jialiang Wang, Xiong Zhou, Deming Zhai, Junjun Jiang, Xiangyang Ji, Xianming Liu
- Abstract summary: Noisy labels pose a common challenge for training accurate deep neural networks. We propose $\epsilon$-softmax, which modifies the outputs of the softmax layer to approximate one-hot vectors with a controllable error. We prove theoretically that $\epsilon$-softmax can achieve noise-tolerant learning with controllable excess risk bound for almost any loss function.
- Score: 99.91399796174602
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Noisy labels pose a common challenge for training accurate deep neural networks. To mitigate label noise, prior studies have proposed various robust loss functions, particularly symmetric losses, to achieve noise tolerance. However, these losses usually suffer from underfitting because the symmetric condition is overly strict. In this work, we propose a simple yet effective approach for relaxing the symmetric condition, namely $\epsilon$-softmax, which simply modifies the outputs of the softmax layer to approximate one-hot vectors with a controllable error $\epsilon$. Essentially, $\epsilon$-softmax not only acts as an alternative to the softmax layer, but also implicitly plays a crucial role in modifying the loss function. We prove theoretically that $\epsilon$-softmax can achieve noise-tolerant learning with controllable excess risk bound for almost any loss function. Recognizing that $\epsilon$-softmax-enhanced losses may slightly reduce fitting ability on clean datasets, we further combine them with a symmetric loss, thereby achieving a better trade-off between robustness and effective learning. Extensive experiments demonstrate the superiority of our method in mitigating synthetic and real-world label noise. The code is available at https://github.com/cswjl/eps-softmax.
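The abstract does not give the formula for the modified softmax output, so the following is only a minimal NumPy sketch of one way to push a softmax vector toward a one-hot vector with a bounded error. The function name `eps_softmax_sketch`, the parameter `m`, and the mixing rule itself are assumptions made for illustration. They are not taken from the abstract and may differ from the authors' implementation in the linked repository.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def eps_softmax_sketch(logits, m):
    """Hypothetical one-hot approximation (an assumption, not the paper's code).

    Adds mass m >= 0 to the predicted class and renormalizes:
        q = (p + m * one_hot(argmax p)) / (1 + m)
    Since q - one_hot = (p - one_hot) / (1 + m), every entry of q is within
    1 / (1 + m) of the one-hot vector, so m controls the approximation error.
    """
    p = softmax(logits)
    num_classes = p.shape[-1]
    # One-hot vector(s) for the currently predicted class.
    one_hot = np.eye(num_classes)[p.argmax(axis=-1)]
    return (p + m * one_hot) / (1.0 + m)


if __name__ == "__main__":
    logits = np.array([[2.0, 1.0, 0.1]])
    for m in (0.0, 1.0, 9.0):
        print(m, eps_softmax_sketch(logits, m))
```

With `m = 0` the sketch reduces to the ordinary softmax, and larger `m` moves the output closer to a one-hot vector while keeping it a valid probability distribution. The modified probabilities could then be passed to a standard loss in place of the usual softmax output.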
Related papers
- Sign Operator for Coping with Heavy-Tailed Noise in Non-Convex Optimization: High Probability Bounds Under $(L_0, L_1)$-Smoothness [74.18546828528298]
We show that SignSGD with Majority Voting can work robustly over the whole range of $\kappa$ …
arXiv Detail & Related papers (2025-02-11T19:54:11Z) - Dynamics-Aware Loss for Learning with Label Noise [73.75129479936302]
Label noise poses a serious threat to deep neural networks (DNNs).
We propose a dynamics-aware loss (DAL) to solve this problem.
Both the detailed theoretical analyses and extensive experimental results demonstrate the superiority of our method.
arXiv Detail & Related papers (2023-03-21T03:05:21Z) - Spectral Aware Softmax for Visible-Infrared Person Re-Identification [123.69049942659285]
Visible-infrared person re-identification (VI-ReID) aims to match specific pedestrian images from different modalities.
Existing methods still follow the softmax loss training paradigm, which is widely used in single-modality classification tasks.
We propose the spectral-aware softmax (SA-Softmax) loss, which can fully explore the embedding space with the modality information.
arXiv Detail & Related papers (2023-02-03T02:57:18Z) - Distinction Maximization Loss: Efficiently Improving Classification Accuracy, Uncertainty Estimation, and Out-of-Distribution Detection Simply Replacing the Loss and Calibrating [2.262407399039118]
We propose training deterministic deep neural networks using our DisMax loss.
DisMax usually outperforms all current approaches simultaneously in classification accuracy, uncertainty estimation, inference efficiency, and out-of-distribution detection.
arXiv Detail & Related papers (2022-05-12T04:37:35Z) - Learning with Noisy Labels via Sparse Regularization [76.31104997491695]
Learning with noisy labels is an important task for training accurate deep neural networks.
Some commonly-used loss functions, such as Cross Entropy (CE), suffer from severe overfitting to noisy labels.
We introduce the sparse regularization strategy to approximate the one-hot constraint.
arXiv Detail & Related papers (2021-07-31T09:40:23Z) - Asymmetric Loss Functions for Learning with Noisy Labels [82.50250230688388]
We propose a new class of loss functions, namely asymmetric loss functions, which are robust to learning with noisy labels for various types of noise.
Experimental results on benchmark datasets demonstrate that asymmetric loss functions can outperform state-of-the-art methods.
arXiv Detail & Related papers (2021-06-06T12:52:48Z) - Dynamic Regret of Convex and Smooth Functions [93.71361250701075]
We investigate online convex optimization in non-stationary environments.
We choose the dynamic regret as the performance measure.
We show that it is possible to further enhance the dynamic regret by exploiting the smoothness condition.
arXiv Detail & Related papers (2020-07-07T14:10:57Z) - Normalized Loss Functions for Deep Learning with Noisy Labels [39.32101898670049]
We show that the commonly used Cross Entropy (CE) loss is not robust to noisy labels.
We propose a framework to build robust loss functions called Active Passive Loss (APL).
arXiv Detail & Related papers (2020-06-24T08:25:46Z) - Being Bayesian about Categorical Probability [6.875312133832079]
We consider a random variable of a categorical probability over class labels.
In this framework, the prior distribution explicitly models the presumed noise inherent in the observed label.
Our method can be implemented as a plug-and-play loss function with negligible computational overhead.
arXiv Detail & Related papers (2020-02-19T02:35:32Z)