A Framework of Learning Through Empirical Gain Maximization
- URL: http://arxiv.org/abs/2009.14250v2
- Date: Tue, 12 Jan 2021 03:07:01 GMT
- Title: A Framework of Learning Through Empirical Gain Maximization
- Authors: Yunlong Feng and Qiang Wu
- Abstract summary: We develop a framework of empirical gain maximization (EGM) to address the robust regression problem.
It is shown that Tukey's biweight loss can be derived from the triweight kernel, giving a unified analysis of several well-established but not fully understood bounded nonconvex loss functions.
- Score: 8.834480010537229
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop in this paper a framework of empirical gain maximization (EGM) to
address the robust regression problem where heavy-tailed noise or outliers may be
present in the response variable. The idea of EGM is to approximate the density
function of the noise distribution instead of approximating the truth function
directly as usual. Unlike the classical maximum likelihood estimation that
encourages equal importance of all observations and could be problematic in the
presence of abnormal observations, EGM schemes can be interpreted from a
minimum distance estimation viewpoint and allow such abnormal observations to be
ignored. Furthermore, it is shown that several well-known robust nonconvex
regression paradigms, such as Tukey regression and truncated least square
regression, can be reformulated into this new framework. We then develop a
learning theory for EGM, by means of which a unified analysis can be conducted
for these well-established but not fully-understood regression approaches.
The new framework also yields a novel interpretation of existing bounded
nonconvex loss functions. Within this framework, two seemingly unrelated
notions, the well-known Tukey's biweight loss for robust regression and the
triweight kernel for nonparametric smoothing, turn out to be closely related.
More precisely, it is shown that Tukey's biweight loss can
be derived from the triweight kernel. Similarly, other frequently employed
bounded nonconvex loss functions in machine learning such as the truncated
square loss, the Geman-McClure loss, and the exponential squared loss can also
be reformulated from certain smoothing kernels in statistics. In addition, the
new framework enables us to devise new bounded nonconvex loss functions for
robust learning.
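To make the kernel-to-loss correspondence concrete, the following Python sketch builds a bounded loss as a constant minus a rescaled smoothing kernel of the residual, so that minimizing the empirical loss amounts to maximizing an empirical gain. The normalizations and helper names (`kernel_loss`, `sigma`, `scale`) are illustrative assumptions, not the paper's exact definitions.

```python
import numpy as np

# Illustrative sketch (not the paper's exact normalizations): a bounded
# nonconvex loss can be built from a smoothing kernel K and a scale sigma as
#   loss_sigma(t) = scale * (1 - K(t / sigma) / K(0)),
# so minimizing the empirical loss over residuals t_i = y_i - f(x_i) is
# equivalent to maximizing the empirical gain sum_i K(t_i / sigma).

def triweight(u):
    """Triweight kernel shape (up to its normalizing constant)."""
    return np.clip(1.0 - u**2, 0.0, None) ** 3

def epanechnikov(u):
    """Epanechnikov kernel shape (up to its normalizing constant)."""
    return np.clip(1.0 - u**2, 0.0, None)

def cauchy(u):
    """Cauchy-type kernel shape."""
    return 1.0 / (1.0 + u**2)

def gaussian(u):
    """Gaussian kernel shape (up to its normalizing constant)."""
    return np.exp(-0.5 * u**2)

def kernel_loss(kernel, t, sigma=1.0, scale=1.0):
    """Bounded loss induced by a kernel: constant minus rescaled kernel."""
    return scale * (1.0 - kernel(t / sigma) / kernel(0.0))

t = np.linspace(-3, 3, 7)
sigma = 1.0

# Tukey's biweight loss (up to the factor sigma^2 / 6) from the triweight kernel.
tukey = kernel_loss(triweight, t, sigma)

# Truncated square loss min(t^2, sigma^2) / sigma^2 from the Epanechnikov kernel.
truncated_square = kernel_loss(epanechnikov, t, sigma)
assert np.allclose(truncated_square, np.minimum(t**2, sigma**2) / sigma**2)

# Geman-McClure loss t^2 / (t^2 + sigma^2) from the Cauchy-type kernel.
geman_mcclure = kernel_loss(cauchy, t, sigma)
assert np.allclose(geman_mcclure, t**2 / (t**2 + sigma**2))

# Exponential squared loss 1 - exp(-t^2 / (2 sigma^2)) from the Gaussian kernel.
exp_squared = kernel_loss(gaussian, t, sigma)
assert np.allclose(exp_squared, 1.0 - np.exp(-0.5 * (t / sigma)**2))
```

The passing assertions only illustrate the general pattern claimed in the abstract; any kernel that is bounded, peaked at zero, and decaying yields, via the same construction, a bounded nonconvex loss suitable for robust learning.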
Related papers
- LEARN: An Invex Loss for Outlier Oblivious Robust Online Optimization [56.67706781191521]
We present a robust online optimization framework in which an adversary can introduce outliers by corrupting the loss functions in an arbitrary number k of rounds, unknown to the learner.
arXiv Detail & Related papers (2024-08-12T17:08:31Z)
- High-probability minimax lower bounds [2.5993680263955947]
We introduce the notion of a minimax quantile, and seek to articulate its dependence on the quantile level.
We develop high-probability variants of the classical Le Cam and Fano methods, as well as a technique to convert local minimax risk lower bounds to lower bounds on minimax quantiles.
arXiv Detail & Related papers (2024-06-19T11:15:01Z)
- Robust deep learning from weakly dependent data [0.0]
This paper considers robust deep learning from weakly dependent observations, with unbounded loss function and unbounded input/output.
We derive a relationship between these bounds and $r$; when the data have moments of any order (that is, $r=\infty$), the convergence rate is close to some well-known results.
arXiv Detail & Related papers (2024-05-08T14:25:40Z)
- On the Dynamics Under the Unhinged Loss and Beyond [104.49565602940699]
We introduce the unhinged loss, a concise loss function that offers more mathematical opportunities to analyze closed-form dynamics.
The unhinged loss also allows for considering more practical techniques, such as time-varying learning rates and feature normalization.
arXiv Detail & Related papers (2023-12-13T02:11:07Z)
- Theoretical Characterization of the Generalization Performance of Overfitted Meta-Learning [70.52689048213398]
This paper studies the performance of overfitted meta-learning under a linear regression model with Gaussian features.
We find new and interesting properties that do not exist in single-task linear regression.
Our analysis suggests that benign overfitting is more significant and easier to observe when the noise and the diversity/fluctuation of the ground truth of each training task are large.
arXiv Detail & Related papers (2023-04-09T20:36:13Z)
- The Implicit Bias of Benign Overfitting [31.714928102950584]
Benign overfitting is the phenomenon where a predictor perfectly fits noisy training data while attaining near-optimal expected loss.
We show how this can be extended beyond standard linear regression.
We then turn to classification problems, and show that the situation there is much more favorable.
arXiv Detail & Related papers (2022-01-27T12:49:21Z)
- On Convergence of Training Loss Without Reaching Stationary Points [62.41370821014218]
We show that neural network weight variables do not converge to stationary points where the gradient of the loss function vanishes.
We propose a new perspective based on the ergodic theory of dynamical systems.
arXiv Detail & Related papers (2021-10-12T18:12:23Z)
- Interpolation can hurt robust generalization even when there is no noise [76.3492338989419]
We show that avoiding interpolation through ridge regularization can significantly improve generalization, even in the absence of noise.
We prove this phenomenon for the robust risk of both linear regression and classification and hence provide the first theoretical result on robust overfitting.
arXiv Detail & Related papers (2021-08-05T23:04:15Z)
- Bayesian Uncertainty Estimation of Learned Variational MRI Reconstruction [63.202627467245584]
We introduce a Bayesian variational framework to quantify the model-immanent (epistemic) uncertainty.
We demonstrate that our approach yields competitive results for undersampled MRI reconstruction.
arXiv Detail & Related papers (2021-02-12T18:08:14Z)
- Recovering Joint Probability of Discrete Random Variables from Pairwise Marginals [22.77704627076251]
Learning the joint probability of random variables (RVs) is the cornerstone of statistical signal processing and machine learning.
Recent work has proposed to recover the joint probability mass function (PMF) of an arbitrary number of RVs from three-dimensional marginals.
However, accurately estimating three-dimensional marginals can still be costly in terms of sample complexity.
This work puts forth a new framework for learning the joint PMF using only pairwise marginals.
arXiv Detail & Related papers (2020-06-30T15:43:44Z)
- New Insights into Learning with Correntropy Based Regression [3.066157114715031]
We show that correntropy-based regression robustly regresses toward the conditional mode function or the conditional mean function under certain conditions.
We also present some new results when it is utilized to learn the conditional mean function.
arXiv Detail & Related papers (2020-06-19T21:14:34Z)
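This last entry connects back to the EGM framework above: the correntropy-induced loss coincides, up to scaling, with the exponential squared loss built from a Gaussian kernel. The sketch below is illustrative only; the grid-search estimator and names such as `location_estimate` are assumptions, not the cited paper's method. It shows how the scale parameter moves the resulting location estimate between mode-like and mean-like behavior.

```python
import numpy as np

# Minimal sketch (assumed setup, not from the paper): the correntropy-induced
# loss is, up to scaling, 1 - exp(-t^2 / (2 sigma^2)).  For a location model,
# a small sigma makes the minimizer behave like a mode estimate, while a very
# large sigma recovers the ordinary (non-robust) mean.

rng = np.random.default_rng(0)
# Contaminated sample: most mass near 0, a few large outliers near 50.
y = np.concatenate([rng.normal(0.0, 1.0, 95), rng.normal(50.0, 1.0, 5)])

def correntropy_loss(residual, sigma):
    """Exponential squared (correntropy-induced) loss of a residual."""
    return 1.0 - np.exp(-0.5 * (residual / sigma) ** 2)

def location_estimate(y, sigma, grid=None):
    """Grid-search minimizer of the empirical correntropy loss."""
    if grid is None:
        grid = np.linspace(y.min(), y.max(), 2001)
    risks = [correntropy_loss(y - m, sigma).mean() for m in grid]
    return grid[int(np.argmin(risks))]

print(location_estimate(y, sigma=1.0))     # near 0: mode-like, ignores outliers
print(location_estimate(y, sigma=1000.0))  # near y.mean(): mean-like behavior
print(y.mean())                            # about 2.5 because of the outliers
```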
This list is automatically generated from the titles and abstracts of the papers in this site.