Batch Inverse-Variance Weighting: Deep Heteroscedastic Regression
- URL: http://arxiv.org/abs/2107.04497v1
- Date: Fri, 9 Jul 2021 15:39:31 GMT
- Title: Batch Inverse-Variance Weighting: Deep Heteroscedastic Regression
- Authors: Vincent Mai, Waleed Khamies, Liam Paull
- Abstract summary: We introduce Batch Inverse-Variance (BIV), a loss function that is robust to near-ground-truth samples and allows control of the effective learning rate.
Our experimental results show that BIV significantly improves the performance of the networks on two noisy datasets.
- Score: 12.415463205960156
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Heteroscedastic regression is the task of supervised learning where each
label is subject to noise from a different distribution. This noise can be
caused by the labelling process, and it negatively impacts the performance of
the learning algorithm because it violates the i.i.d. assumption. In many
situations, however, the labelling process is able to estimate the variance of
this distribution for each label, which can be used as additional information
to mitigate this impact. We adapt an inverse-variance weighted mean square
error, based on the Gauss-Markov theorem, for parameter optimization of neural
networks. We introduce Batch Inverse-Variance (BIV), a loss function that is
robust to near-ground-truth samples and allows control of the effective
learning rate. Our experimental results show that BIV significantly improves
the performance of the networks on two noisy datasets, compared to the L2
loss, inverse-variance weighting, and a filtering-based baseline.
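To make the idea concrete, here is a minimal PyTorch sketch of a batch inverse-variance weighted MSE based only on the description above: weights are the inverse of each label's noise variance, normalized within the mini-batch, with a constant `eps` (our name, not the paper's) guarding against near-ground-truth samples and controlling the effective learning rate. The exact formulation in the paper may differ.

```python
import torch

def biv_loss(pred, target, label_var, eps=1e-2):
    """Batch Inverse-Variance weighted MSE (sketch from the abstract).

    pred, target: (N,) tensors. label_var: (N,) per-label noise variance
    reported by the labelling process. eps keeps near-ground-truth labels
    (variance close to zero) from dominating the batch and controls the
    effective learning rate.
    """
    w = 1.0 / (label_var + eps)   # inverse-variance weights
    w = w / w.sum()               # normalize within the mini-batch
    return (w * (pred - target) ** 2).sum()
```

As `eps` grows, the weights flatten and the loss tends to the ordinary mean L2 loss; as `eps` shrinks, it approaches plain inverse-variance weighting.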
Related papers
- Improving Distribution Alignment with Diversity-based Sampling [0.0]
Domain shifts are ubiquitous in machine learning, and can substantially degrade a model's performance when deployed to real-world data.
This paper proposes to improve the estimates used for distribution alignment by inducing diversity in each sampled minibatch.
It simultaneously balances the data and reduces the variance of the gradients, thereby enhancing the model's generalisation ability.
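The summary does not specify the sampler, so as a generic illustration of diversity-inducing minibatch selection, the sketch below uses greedy farthest-point selection in feature space; this is our choice, not necessarily the paper's mechanism.

```python
import numpy as np

def diverse_minibatch(features, batch_size, rng=None):
    """Draw a diverse mini-batch by greedy farthest-point selection in
    feature space (illustrative only; the paper's sampler may differ)."""
    rng = rng or np.random.default_rng()
    chosen = [int(rng.integers(len(features)))]   # random seed point
    dist = np.linalg.norm(features - features[chosen[0]], axis=1)
    while len(chosen) < batch_size:
        nxt = int(dist.argmax())                  # farthest from chosen set
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(features - features[nxt], axis=1))
    return np.asarray(chosen)
```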
arXiv Detail & Related papers (2024-10-05T17:26:03Z)
- Doubly Stochastic Models: Learning with Unbiased Label Noises and Inference Stability [85.1044381834036]
We investigate the implicit regularization effects of label noise under the mini-batch sampling setting of gradient descent.
We find that this implicit regularizer favors convergence points that stabilize model outputs against perturbations of the parameters.
Our work does not assume SGD to behave as an Ornstein-Uhlenbeck-like process, and achieves a more general result with the convergence of the approximation proved.
arXiv Detail & Related papers (2023-04-01T14:09:07Z)
- Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework, ReScore, that boosts causal discovery performance by dynamically learning adaptive weights for the reweighted score function.
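A rough sketch of sample reweighting in a score-based objective: per-sample weights modulate a least-squares reconstruction score for a linear SEM. The procedure ReScore actually uses to learn the weights is not reproduced here, and the function names are ours.

```python
import torch

def reweighted_score(X, W, sample_w):
    """Sample-reweighted least-squares score for a linear SEM X ≈ X @ W
    (sketch; ReScore's actual score and weight-learning procedure are
    more involved).

    X: (n, d) data; W: (d, d) candidate weighted adjacency matrix;
    sample_w: (n,) non-negative per-sample weights.
    """
    resid = X - X @ W                     # per-sample reconstruction error
    per_sample = (resid ** 2).sum(dim=1)  # squared error of each sample
    return (sample_w * per_sample).sum() / X.shape[0]
```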
arXiv Detail & Related papers (2023-03-06T14:49:59Z)
- On Robust Learning from Noisy Labels: A Permutation Layer Approach [53.798757734297986]
This paper introduces a permutation layer learning approach, termed PermLL, to dynamically calibrate the training process of a deep neural network (DNN).
We provide two variants of PermLL in this paper: one applies the permutation layer to the model's prediction, while the other applies it directly to the given noisy label.
We validate PermLL experimentally and show that it achieves state-of-the-art performance on both real and synthetic datasets.
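As a sketch of the prediction-side variant, the layer below applies a learnable row-stochastic matrix (a soft permutation, initialized near the identity) to the model's class posterior; the parameterization is our assumption, not the paper's exact design.

```python
import torch
import torch.nn.functional as F

class PermutationLayer(torch.nn.Module):
    """Learnable soft permutation applied to the model's class posterior
    (sketch of the prediction-side variant; parameterization assumed)."""

    def __init__(self, num_classes):
        super().__init__()
        # initialize near the identity so predictions initially pass through
        self.logits = torch.nn.Parameter(5.0 * torch.eye(num_classes))

    def forward(self, log_probs):
        P = F.softmax(self.logits, dim=1)   # row-stochastic matrix
        probs = log_probs.exp() @ P         # permuted class posterior
        return probs.clamp_min(1e-12).log()

# usage: loss = F.nll_loss(perm(F.log_softmax(model(x), dim=1)), noisy_y)
```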
arXiv Detail & Related papers (2022-11-29T03:01:48Z)
- Noise-Robust Bidirectional Learning with Dynamic Sample Reweighting [28.493837430606117]
Deep neural networks trained with the standard cross-entropy loss are prone to memorizing noisy labels.
Negative learning with complementary labels is more robust when labels are noisy, but its model convergence is extremely slow.
In this paper, we first introduce a bidirectional learning scheme, where positive learning ensures convergence speed while negative learning robustly copes with label noise.
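A minimal sketch of combining the two directions: standard cross-entropy on the given (possibly noisy) label plus a negative-learning term on a uniformly sampled complementary label. The sampling scheme and the balance weight `alpha` are our assumptions.

```python
import torch
import torch.nn.functional as F

def bidirectional_loss(logits, noisy_y, num_classes, alpha=1.0):
    """Positive learning (cross-entropy on the given label) combined with
    negative learning on a random complementary label (sketch)."""
    pos = F.cross_entropy(logits, noisy_y)
    # sample a complementary label uniformly among the other classes
    offset = torch.randint(1, num_classes, noisy_y.shape, device=noisy_y.device)
    comp_y = (noisy_y + offset) % num_classes
    p_comp = F.softmax(logits, dim=1).gather(1, comp_y.unsqueeze(1)).squeeze(1)
    neg = -(1.0 - p_comp).clamp_min(1e-12).log().mean()  # "not this class"
    return pos + alpha * neg
```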
arXiv Detail & Related papers (2022-09-03T06:00:31Z)
- Do We Need to Penalize Variance of Losses for Learning with Label Noise? [91.38888889609002]
We find that the variance should be increased for the problem of learning with noisy labels.
By exploiting the label noise transition matrix, regularizers can be easily designed to increase the variance of losses.
Empirically, the proposed method, by increasing the variance of losses, significantly improves the generalization ability of baselines on both synthetic and real-world datasets.
arXiv Detail & Related papers (2022-01-30T06:19:08Z)
- Sample Efficient Deep Reinforcement Learning via Uncertainty Estimation [12.415463205960156]
In model-free deep reinforcement learning (RL) algorithms, using noisy value estimates to supervise policy evaluation and optimization is detrimental to sample efficiency.
We provide a systematic analysis of the sources of uncertainty in the noisy supervision that occurs in RL.
We propose a method whereby two complementary uncertainty estimation methods account for both the Q-value and the environment stochasticity, to better mitigate the negative impacts of noisy supervision.
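A rough sketch of one piece of the pattern: estimate target uncertainty from the disagreement of an ensemble of target networks and apply a BIV-style inverse-variance weight to the TD errors. The paper's two complementary estimators and their combination are not reproduced here, and the names are ours.

```python
import torch

def uncertainty_weighted_td_loss(q_pred, td_targets, eps=1e-2):
    """BIV-style weighting of TD errors by target uncertainty estimated
    from ensemble disagreement (sketch; names and details assumed).

    q_pred: (N,) current Q estimates; td_targets: (K, N) TD targets from
    K target-network ensemble members.
    """
    target = td_targets.mean(dim=0)
    var = td_targets.var(dim=0)   # disagreement as uncertainty estimate
    w = 1.0 / (var + eps)
    w = w / w.sum()               # normalize within the batch
    return (w * (q_pred - target) ** 2).sum()
```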
arXiv Detail & Related papers (2022-01-05T15:46:06Z)
- Learning Noise Transition Matrix from Only Noisy Labels via Total Variation Regularization [88.91872713134342]
We propose a theoretically grounded method that can estimate the noise transition matrix and learn a classifier simultaneously.
We show the effectiveness of the proposed method through experiments on benchmark and real-world datasets.
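One plausible reading of the recipe, sketched below: fit the noisy labels through the transition matrix (forward correction) while a pairwise total-variation term pushes the predicted clean posteriors apart. The parameterization and the sign of the regularizer are our assumptions and may differ from the paper's objective.

```python
import torch
import torch.nn.functional as F

def tv_regularized_loss(clean_log_probs, noisy_y, T, lam=0.1):
    """Forward-corrected NLL plus a pairwise total-variation term that
    pushes predicted clean posteriors apart (sketch; the paper's exact
    objective may differ).

    clean_log_probs: (N, C) predicted clean posterior (log); T: (C, C)
    row-stochastic transition matrix, T[i, j] = P(noisy = j | clean = i).
    """
    p_clean = clean_log_probs.exp()
    p_noisy = p_clean @ T                     # implied noisy-label posterior
    nll = F.nll_loss(p_noisy.clamp_min(1e-12).log(), noisy_y)
    diff = p_clean.unsqueeze(0) - p_clean.unsqueeze(1)
    tv = 0.5 * diff.abs().sum(dim=-1).mean()  # mean pairwise TV distance
    return nll - lam * tv                     # maximize TV: subtract it
```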
arXiv Detail & Related papers (2021-02-04T05:09:18Z)
- Attentional-Biased Stochastic Gradient Descent [74.49926199036481]
We present a provable method (named ABSGD) for addressing the data imbalance or label noise problem in deep learning.
Our method is a simple modification to momentum SGD where we assign an individual importance weight to each sample in the mini-batch.
ABSGD is flexible enough to combine with other robust losses without any additional cost.
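A sketch of the weighting step as described: each sample's weight is a softmax of its (detached) loss scaled by a temperature, so high-loss samples are emphasized for a positive temperature (class imbalance) and suppressed for a negative one (label noise). The name `lam` is ours.

```python
import torch

def absgd_weighted_loss(per_sample_losses, lam=1.0):
    """Attentional per-sample weighting in the spirit of ABSGD (sketch):
    a softmax over scaled losses within the mini-batch, with the weights
    detached so they act as constants in the gradient."""
    w = torch.softmax(per_sample_losses.detach() / lam, dim=0)
    return (w * per_sample_losses).sum()

# usage with momentum SGD:
#   losses = F.cross_entropy(model(x), y, reduction="none")
#   absgd_weighted_loss(losses, lam).backward()
#   optimizer.step()
```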
arXiv Detail & Related papers (2020-12-13T03:41:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.