Fighting over-fitting with quantization for learning deep neural
networks on noisy labels
- URL: http://arxiv.org/abs/2303.11803v1
- Date: Tue, 21 Mar 2023 12:36:58 GMT
- Title: Fighting over-fitting with quantization for learning deep neural
networks on noisy labels
- Authors: Gauthier Tallec, Edouard Yvinec, Arnaud Dapogny, Kevin Bailly
- Abstract summary: We study the ability of compression methods to tackle both of these problems (costly deployment and label noise) at once.
We hypothesize that quantization-aware training, by restricting the expressivity of neural networks, behaves as a regularizer.
- Score: 7.09232719022402
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rising performance of deep neural networks is often empirically
attributed to an increase in the available computational power, which allows
complex models to be trained upon large amounts of annotated data. However,
increased model complexity leads to costly deployment of modern neural
networks, while gathering such amounts of data without label noise is itself
very costly. In this work, we study the ability of compression methods to
tackle both of these problems at once. We hypothesize that quantization-aware
training, by restricting the expressivity of neural networks, behaves as a
regularizer. Thus, it may help fight overfitting on noisy data while also
allowing for the compression of the model at inference. We first validate this
claim on a controlled test with manually introduced label noise. Furthermore,
we also test the proposed method on Facial Action Unit detection, where labels
are typically noisy due to the subtlety of the task. In all cases, our results
suggest that quantization significantly improves performance compared with
existing baselines, including regularization as well as other compression methods.
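To make the hypothesis concrete, below is a minimal, illustrative sketch of quantization-aware training with uniform fake quantization and a straight-through estimator, together with the kind of synthetic label corruption used for a controlled test. The class and function names (`FakeQuantize`, `QuantLinear`, `inject_symmetric_label_noise`) and the bit-width are assumptions for this example, not the paper's exact setup.

```python
import torch
import torch.nn as nn


class FakeQuantize(nn.Module):
    """Uniform symmetric fake quantization with a straight-through estimator.

    Restricting weights to a small number of levels limits the expressivity
    of the network, which is the regularization effect hypothesized above.
    """

    def __init__(self, bits: int = 4):
        super().__init__()
        self.levels = 2 ** (bits - 1) - 1  # e.g. 7 positive levels at 4 bits

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scale = x.detach().abs().max().clamp(min=1e-8) / self.levels
        q = torch.round(x / scale).clamp(-self.levels, self.levels) * scale
        # Straight-through estimator: quantized values in the forward pass,
        # identity gradient in the backward pass.
        return x + (q - x).detach()


class QuantLinear(nn.Module):
    """Linear layer whose weights are fake-quantized during training."""

    def __init__(self, in_features: int, out_features: int, bits: int = 4):
        super().__init__()
        self.weight = nn.Parameter(0.01 * torch.randn(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))
        self.fq = FakeQuantize(bits)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return nn.functional.linear(x, self.fq(self.weight), self.bias)


def inject_symmetric_label_noise(labels: torch.Tensor, num_classes: int, p: float) -> torch.Tensor:
    """Controlled test: replace a fraction p of labels with uniformly random classes."""
    flip = torch.rand(labels.shape) < p
    random_labels = torch.randint(0, num_classes, labels.shape)
    return torch.where(flip, random_labels, labels)
```

Comparing such a low-bit model against a full-precision baseline on labels corrupted by `inject_symmetric_label_noise` reproduces the style of controlled comparison described in the abstract; the fewer representable weight values act as the hypothesized constraint on expressivity.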
Related papers
- Learning with Noisy labels via Self-supervised Adversarial Noisy Masking [33.87292143223425]
We propose a novel training approach termed adversarial noisy masking.
It adaptively modulates the input data and labels simultaneously, preventing the model from overfitting to noisy samples.
It is tested on both synthetic and real-world noisy datasets.
arXiv Detail & Related papers (2023-02-14T03:13:26Z)
- Weak-signal extraction enabled by deep-neural-network denoising of diffraction data [26.36525764239897]
We show how data can be denoised via a deep convolutional neural network.
We demonstrate that weak signals stemming from charge ordering, insignificant in the noisy data, become visible and accurate in the denoised data.
arXiv Detail & Related papers (2022-09-19T14:43:01Z)
- Dissecting U-net for Seismic Application: An In-Depth Study on Deep Learning Multiple Removal [3.058685580689605]
Seismic processing often requires suppressing multiples that appear when collecting data.
We present a deep learning-based alternative that provides competitive results while reducing the complexity of its use.
arXiv Detail & Related papers (2022-06-24T07:16:27Z)
- BatchFormer: Learning to Explore Sample Relationships for Robust Representation Learning [93.38239238988719]
We propose to equip deep neural networks with the ability to learn sample relationships from each mini-batch.
BatchFormer is applied along the batch dimension of each mini-batch to implicitly explore sample relationships during training.
We perform extensive experiments on over ten datasets, and the proposed method achieves significant improvements on different data-scarcity applications.
arXiv Detail & Related papers (2022-03-03T05:31:33Z)
- Robust Training under Label Noise by Over-parameterization [41.03008228953627]
We propose a principled approach for robust training of over-parameterized deep networks in classification tasks where a proportion of training labels are corrupted.
Yet the main idea is very simple: label noise is sparse and incoherent with the network learned from clean data, so we model the noise and learn to separate it from the data.
Remarkably, when trained using such a simple method in practice, we demonstrate state-of-the-art test accuracy against label noise on a variety of real datasets.
arXiv Detail & Related papers (2022-02-28T18:50:10Z)
- SignalNet: A Low Resolution Sinusoid Decomposition and Estimation Network [79.04274563889548]
We propose SignalNet, a neural network architecture that detects the number of sinusoids and estimates their parameters from quantized in-phase and quadrature samples.
We introduce a worst-case learning threshold for comparing the results of our network relative to the underlying data distributions.
In simulation, we find that our algorithm is always able to surpass the threshold for three-bit data but often cannot exceed the threshold for one-bit data.
arXiv Detail & Related papers (2021-06-10T04:21:20Z)
- Data-free mixed-precision quantization using novel sensitivity metric [6.031526641614695]
We propose a novel sensitivity metric that considers the effect of quantization error on task loss and interaction with other layers.
Our experiments show that the proposed metric better represents quantization sensitivity, and that the generated data are better suited to mixed-precision quantization.
arXiv Detail & Related papers (2021-03-18T07:23:21Z)
- Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model [80.91927573604438]
This paper proposes a simple yet universal probabilistic model, which explicitly relates noisy labels to their instances.
Experiments on datasets with both synthetic and real-world label noise verify that the proposed method yields significant improvements on robustness.
arXiv Detail & Related papers (2021-01-14T05:43:51Z)
- Temporal Calibrated Regularization for Robust Noisy Label Learning [60.90967240168525]
Deep neural networks (DNNs) achieve great success on many tasks with the help of large-scale, well-annotated datasets.
However, labeling large-scale data can be very costly and error-prone, making it difficult to guarantee annotation quality.
We propose a Temporal Calibrated Regularization (TCR), which combines the original labels with the model's predictions from the previous epoch as training targets (a generic sketch of this blending appears after this list).
arXiv Detail & Related papers (2020-07-01T04:48:49Z)
- Beyond Dropout: Feature Map Distortion to Regularize Deep Neural Networks [107.77595511218429]
In this paper, we investigate the empirical Rademacher complexity related to intermediate layers of deep neural networks.
We propose a feature distortion method (Disout) for addressing the aforementioned problem.
The superiority of the proposed feature map distortion for producing deep neural networks with higher test performance is analyzed and demonstrated.
arXiv Detail & Related papers (2020-02-23T13:59:13Z)
- Understanding Generalization in Deep Learning via Tensor Methods [53.808840694241]
We advance the understanding of the relations between the network's architecture and its generalizability from the compression perspective.
We propose a series of intuitive, data-dependent and easily-measurable properties that tightly characterize the compressibility and generalizability of neural networks.
arXiv Detail & Related papers (2020-01-14T22:26:57Z)
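As referenced in the Temporal Calibrated Regularization entry above, the general idea of combining the given labels with the model's predictions from the previous epoch can be sketched as follows. This is a generic illustration under assumed names (`blended_targets`, `soft_target_cross_entropy`, `alpha`), not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F


def blended_targets(onehot_labels: torch.Tensor,
                    prev_epoch_probs: torch.Tensor,
                    alpha: float = 0.7) -> torch.Tensor:
    """Blend the given (possibly noisy) labels with last epoch's predictions.

    alpha close to 1 trusts the provided labels; lower values lean on the
    model's own earlier predictions, which damps memorization of corrupted
    labels as training proceeds.
    """
    return alpha * onehot_labels + (1.0 - alpha) * prev_epoch_probs


def soft_target_cross_entropy(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """Cross-entropy against soft targets (rows of `targets` sum to 1)."""
    return -(targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
```

In a training loop, `prev_epoch_probs` would be the per-sample softmax outputs cached at the end of the previous epoch, typically after a warm-up period during which the raw labels are used directly.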
This list is automatically generated from the titles and abstracts of the papers on this site.