Learning Adaptive Loss for Robust Learning with Noisy Labels
- URL: http://arxiv.org/abs/2002.06482v1
- Date: Sun, 16 Feb 2020 00:53:37 GMT
- Title: Learning Adaptive Loss for Robust Learning with Noisy Labels
- Authors: Jun Shu, Qian Zhao, Keyu Chen, Zongben Xu, Deyu Meng
- Abstract summary: Robust loss minimization is an important strategy for handling robust learning with noisy labels.
We propose a meta-learning method capable of adaptively learning the hyperparameters of robust loss functions.
Four kinds of SOTA robust loss functions are integrated into the algorithm, and experiments substantiate its general availability and effectiveness.
- Score: 59.06189240645958
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robust loss minimization is an important strategy for handling the
robust learning problem with noisy labels. Current robust loss functions, however,
inevitably involve hyperparameter(s) that must be tuned, manually or heuristically
through cross validation, which makes them fairly hard to apply generally
in practice. Besides, the non-convexity brought by the loss, together with the
complicated network architecture, makes training easily trapped in an unexpected
solution with poor generalization capability. To address the above issues, we
propose a meta-learning method capable of adaptively learning the hyperparameters of
robust loss functions. Specifically, through mutual amelioration between the robust
loss hyperparameters and the network parameters in our method, both can be
simultaneously and finely learned and coordinated to attain solutions with good
generalization capability. Four kinds of SOTA robust loss functions are
integrated into our algorithm, and comprehensive experiments
substantiate the general availability and effectiveness of the proposed method
in both its accuracy and generalization performance, as compared with the
conventional hyperparameter tuning strategy, even with carefully tuned
hyperparameters.
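
The alternating scheme described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the robust loss is the generalized cross entropy (GCE) loss $(1 - p^q)/q$ with hyperparameter $q$, the model is a 1-D logistic classifier, a small clean meta set is assumed to be available, and the meta-gradient with respect to $q$ is approximated by finite differences through a one-step look-ahead rather than by backpropagation. All function names are hypothetical.

```python
import math
import random


def gce_loss(p, q):
    """GCE loss on the probability p given to the observed label.
    As q -> 0 it approaches -log(p); at q = 1 it equals 1 - p."""
    return (1.0 - p ** q) / q


def prob_of_label(w, x, y):
    """Probability that a 1-D logistic model sigmoid(w * x) gives to label y."""
    p1 = 1.0 / (1.0 + math.exp(-w * x))
    return p1 if y == 1 else 1.0 - p1


def train_loss(w, q, data):
    """Mean GCE loss over the (possibly noisy) training set."""
    return sum(gce_loss(prob_of_label(w, x, y), q) for x, y in data) / len(data)


def train_grad(w, q, data):
    """Analytic gradient of train_loss with respect to w."""
    g = 0.0
    for x, y in data:
        p = prob_of_label(w, x, y)
        s = 1.0 if y == 1 else -1.0  # dp/dw = p * (1 - p) * s * x
        g += -(p ** q) * (1.0 - p) * s * x
    return g / len(data)


def meta_loss(w, data):
    """Plain cross entropy on the clean meta set (assumed to exist)."""
    return -sum(math.log(max(prob_of_label(w, x, y), 1e-12))
                for x, y in data) / len(data)


def meta_step(w, q, train, meta, lr_w=0.5, lr_q=0.1, h=1e-4):
    """One alternating update: first q (outer level), then w (inner level)."""
    def look_ahead(q_try):
        # Meta loss after one virtual gradient step on w taken with q_try.
        w_try = w - lr_w * train_grad(w, q_try, train)
        return meta_loss(w_try, meta)

    # Central finite difference stands in for the true meta-gradient.
    dq = (look_ahead(q + h) - look_ahead(q - h)) / (2.0 * h)
    q = min(1.0, max(0.05, q - lr_q * dq))   # keep q in a valid range
    w = w - lr_w * train_grad(w, q, train)   # real step with the updated q
    return w, q


# Toy data: label is 1 when x > 0; 30% of the training labels are flipped.
rng = random.Random(0)
train_set = []
for _ in range(200):
    x = rng.gauss(0.0, 1.0)
    y = 1 if x > 0 else 0
    if rng.random() < 0.3:
        y = 1 - y
    train_set.append((x, y))
meta_set = []
for _ in range(50):
    x = rng.gauss(0.0, 1.0)
    meta_set.append((x, 1 if x > 0 else 0))

w, q = 0.0, 0.5
for _ in range(200):
    w, q = meta_step(w, q, train_set, meta_set)
print("learned w:", w, "learned q:", q)
```

The finite-difference meta-gradient keeps the sketch dependency-free; with a real network one would more plausibly differentiate through the look-ahead step using an automatic differentiation library.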
Related papers
- A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning [74.80956524812714]
We tackle the general differentiable meta learning problem that is ubiquitous in modern deep learning.
These problems are often formalized as Bi-Level optimizations (BLO).
We introduce a novel perspective by turning a given BLO problem into a stochastic optimization, where the inner loss function becomes a smooth distribution, and the outer loss becomes an expected loss over the inner distribution.
arXiv Detail & Related papers (2024-10-14T12:10:06Z) - Noise-Robust Loss Functions: Enhancing Bounded Losses for Large-Scale Noisy Data Learning [0.0]
Large annotated datasets inevitably contain noisy labels, which poses a major challenge for training deep neural networks as they easily memorize the labels.
Noise-robust loss functions have emerged as a notable strategy to counteract this issue, but it remains challenging to create a robust loss function which is not susceptible to underfitting.
We propose a novel method denoted as logit bias, which adds a real number $\epsilon$ to the logit at the position of the correct class.
arXiv Detail & Related papers (2023-06-08T18:38:55Z) - Improve Noise Tolerance of Robust Loss via Noise-Awareness [60.34670515595074]
We propose a meta-learning method capable of adaptively learning a hyperparameter prediction function, called Noise-Aware-Robust-Loss-Adjuster (NARL-Adjuster for brevity).
Four SOTA robust loss functions are integrated with our algorithm, and comprehensive experiments substantiate the general availability and effectiveness of the proposed method in both its noise tolerance and performance.
arXiv Detail & Related papers (2023-01-18T04:54:58Z) - Efficient Hyperparameter Tuning for Large Scale Kernel Ridge Regression [19.401624974011746]
We propose a complexity regularization criterion based on a data dependent penalty, and discuss its efficient optimization.
Our analysis shows the benefit of the proposed approach, that we hence incorporate in a library for large scale kernel methods to derive adaptively tuned solutions.
arXiv Detail & Related papers (2022-01-17T09:57:32Z) - Adaptive Hierarchical Similarity Metric Learning with Noisy Labels [138.41576366096137]
We propose an Adaptive Hierarchical Similarity Metric Learning method.
It considers two kinds of noise-insensitive information, i.e., class-wise divergence and sample-wise consistency.
Our method achieves state-of-the-art performance compared with current deep metric learning approaches.
arXiv Detail & Related papers (2021-10-29T02:12:18Z) - Local AdaGrad-Type Algorithm for Stochastic Convex-Concave Minimax Problems [80.46370778277186]
Large scale convex-concave minimax problems arise in numerous applications, including game theory, robust training, and training of generative adversarial networks.
We develop a communication-efficient distributed extragradient algorithm, LocalAdaSEG, with an adaptive learning rate suitable for solving convex-concave minimax problems in the Parameter-Server model.
We demonstrate its efficacy through several experiments in both the homogeneous and heterogeneous settings.
arXiv Detail & Related papers (2021-06-18T09:42:05Z) - Robust Learning via Persistency of Excitation [4.674053902991301]
We show that network training using gradient descent is equivalent to a dynamical system parameter estimation problem.
We provide an efficient technique for estimating the corresponding Lipschitz constant using extreme value theory.
Our approach also universally increases the adversarial accuracy by 0.1 to 0.3 percentage points in various state-of-the-art adversarially trained models.
arXiv Detail & Related papers (2021-06-03T18:49:05Z) - Efficient Hyperparameter Tuning with Dynamic Accuracy Derivative-Free Optimization [0.27074235008521236]
We apply a recent dynamic accuracy derivative-free optimization method to hyperparameter tuning.
This method allows inexact evaluations of the learning problem while retaining convergence guarantees.
We demonstrate its robustness and efficiency compared to a fixed accuracy approach.
arXiv Detail & Related papers (2020-11-06T00:59:51Z) - PACOH: Bayes-Optimal Meta-Learning with PAC-Guarantees [77.67258935234403]
We provide a theoretical analysis using the PAC-Bayesian framework and derive novel generalization bounds for meta-learning.
We develop a class of PAC-optimal meta-learning algorithms with performance guarantees and a principled meta-level regularization.
arXiv Detail & Related papers (2020-02-13T15:01:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.