Deep Learning with Label Noise: A Hierarchical Approach
- URL: http://arxiv.org/abs/2205.14299v1
- Date: Sat, 28 May 2022 02:27:02 GMT
- Title: Deep Learning with Label Noise: A Hierarchical Approach
- Authors: Li Chen, Ningyuan Huang, Cong Mu, Hayden S. Helm, Kate Lytvynets,
Weiwei Yang, Carey E. Priebe
- Abstract summary: We propose a simple hierarchical approach that incorporates a label hierarchy when training the deep learning models.
Our approach requires no change of the network architecture or the optimization procedure.
Our hierarchical approach improves upon regular deep neural networks in learning with label noise.
- Score: 14.28389712842577
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks are susceptible to label noise. Existing methods to
improve robustness, such as meta-learning and regularization, usually require
significant change to the network architecture or careful tuning of the
optimization procedure. In this work, we propose a simple hierarchical approach
that incorporates a label hierarchy when training the deep learning models. Our
approach requires no change of the network architecture or the optimization
procedure. We investigate our hierarchical network through a wide range of
simulated and real datasets and various label noise types. Our hierarchical
approach improves upon regular deep neural networks in learning with label
noise. Combining our hierarchical approach with pre-trained models achieves
state-of-the-art performance in real-world noisy datasets.
Related papers
- CaAdam: Improving Adam optimizer using connection aware methods [0.0]
We introduce a new method inspired by Adam that enhances convergence speed and achieves better loss function minima.
Traditional proxies, including Adam, apply uniform or globally adjusted learning rates across neural networks without considering their architectural specifics.
Our algorithm, CaAdam, explores this overlooked area by introducing connection-aware optimization through carefully designed of architectural information.
arXiv Detail & Related papers (2024-10-31T17:59:46Z) - Informed deep hierarchical classification: a non-standard analysis inspired approach [0.0]
It consists in a multi-output deep neural network equipped with specific projection operators placed before each output layer.
The design of such an architecture, called lexicographic hybrid deep neural network (LH-DNN), has been possible by combining tools from different and quite distant research fields.
To assess the efficacy of the approach, the resulting network is compared against the B-CNN, a convolutional neural network tailored for hierarchical classification tasks.
arXiv Detail & Related papers (2024-09-25T14:12:50Z) - Principled Architecture-aware Scaling of Hyperparameters [69.98414153320894]
Training a high-quality deep neural network requires choosing suitable hyperparameters, which is a non-trivial and expensive process.
In this work, we precisely characterize the dependence of initializations and maximal learning rates on the network architecture.
We demonstrate that network rankings can be easily changed by better training networks in benchmarks.
arXiv Detail & Related papers (2024-02-27T11:52:49Z) - Learning Large-scale Neural Fields via Context Pruned Meta-Learning [60.93679437452872]
We introduce an efficient optimization-based meta-learning technique for large-scale neural field training.
We show how gradient re-scaling at meta-test time allows the learning of extremely high-quality neural fields.
Our framework is model-agnostic, intuitive, straightforward to implement, and shows significant reconstruction improvements for a wide range of signals.
arXiv Detail & Related papers (2023-02-01T17:32:16Z) - Adaptive Convolutional Dictionary Network for CT Metal Artifact
Reduction [62.691996239590125]
We propose an adaptive convolutional dictionary network (ACDNet) for metal artifact reduction.
Our ACDNet can automatically learn the prior for artifact-free CT images via training data and adaptively adjust the representation kernels for each input CT image.
Our method inherits the clear interpretability of model-based methods and maintains the powerful representation ability of learning-based methods.
arXiv Detail & Related papers (2022-05-16T06:49:36Z) - Recurrent Neural Networks with Mixed Hierarchical Structures and EM
Algorithm for Natural Language Processing [9.645196221785694]
We develop an approach called the latent indicator layer to identify and learn implicit hierarchical information.
We also develop an EM algorithm to handle the latent indicator layer in training.
We show that the EM-HRNN model with bootstrap training outperforms other RNN-based models in document classification tasks.
arXiv Detail & Related papers (2022-01-21T23:08:33Z) - SIRe-Networks: Skip Connections over Interlaced Multi-Task Learning and
Residual Connections for Structure Preserving Object Classification [28.02302915971059]
In this paper, we introduce an interlaced multi-task learning strategy, defined SIRe, to reduce the vanishing gradient in relation to the object classification task.
The presented methodology directly improves a convolutional neural network (CNN) by enforcing the input image structure preservation through auto-encoders.
To validate the presented methodology, a simple CNN and various implementations of famous networks are extended via the SIRe strategy and extensively tested on the CIFAR100 dataset.
arXiv Detail & Related papers (2021-10-06T13:54:49Z) - Firefly Neural Architecture Descent: a General Approach for Growing
Neural Networks [50.684661759340145]
Firefly neural architecture descent is a general framework for progressively and dynamically growing neural networks.
We show that firefly descent can flexibly grow networks both wider and deeper, and can be applied to learn accurate but resource-efficient neural architectures.
In particular, it learns networks that are smaller in size but have higher average accuracy than those learned by the state-of-the-art methods.
arXiv Detail & Related papers (2021-02-17T04:47:18Z) - Ensemble Wrapper Subsampling for Deep Modulation Classification [70.91089216571035]
Subsampling of received wireless signals is important for relaxing hardware requirements as well as the computational cost of signal processing algorithms.
We propose a subsampling technique to facilitate the use of deep learning for automatic modulation classification in wireless communication systems.
arXiv Detail & Related papers (2020-05-10T06:11:13Z) - Dynamic Hierarchical Mimicking Towards Consistent Optimization
Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
arXiv Detail & Related papers (2020-03-24T09:56:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.