Equivalence of the Empirical Risk Minimization to Regularization on the Family of f-Divergences
- URL: http://arxiv.org/abs/2402.00501v2
- Date: Wed, 23 Oct 2024 20:55:31 GMT
- Title: Equivalence of the Empirical Risk Minimization to Regularization on the Family of f-Divergences
- Authors: Francisco Daunas, Iñaki Esnaola, Samir M. Perlaza, H. Vincent Poor
- Abstract summary: The solution to empirical risk minimization with $f$-divergence regularization (ERM-$f$DR) is presented.
Examples of the solution for particular choices of the function $f$ are presented.
- Score: 45.935798913942904
- Abstract: The solution to empirical risk minimization with $f$-divergence regularization (ERM-$f$DR) is presented under mild conditions on $f$. Under such conditions, the optimal measure is shown to be unique. Examples of the solution for particular choices of the function $f$ are presented. Previously known solutions to common regularization choices are obtained by leveraging the flexibility of the family of $f$-divergences. These include the unique solutions to empirical risk minimization with relative entropy regularization (Type-I and Type-II). The analysis of the solution unveils the following properties of $f$-divergences when used in the ERM-$f$DR problem: $(i)$ $f$-divergence regularization forces the support of the solution to coincide with the support of the reference measure, which introduces a strong inductive bias that dominates the evidence provided by the training data; and $(ii)$ any $f$-divergence regularization is equivalent to a different $f$-divergence regularization with an appropriate transformation of the empirical risk function.
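As a concrete illustration of the relative-entropy (Type-I) special case mentioned in the abstract, the ERM-RER solution on a finite model set is the Gibbs measure $dP/dQ(\theta) \propto \exp(-\mathsf{L}(\theta)/\lambda)$. The sketch below is illustrative and not taken from the paper; the risks, reference measure, and regularization factor are made-up values chosen to exhibit property $(i)$, the coincidence of the supports of the solution and the reference measure.

```python
import numpy as np

# Hypothetical finite model set: empirical risks L(theta) and a reference
# measure Q (the last model is given zero reference mass on purpose).
risks = np.array([0.9, 0.3, 0.5, 0.7])   # empirical risk of each model
Q = np.array([0.25, 0.25, 0.5, 0.0])     # reference measure
lam = 0.1                                # regularization factor

# For f(x) = x*log(x) (relative entropy, Type-I), the ERM-fDR solution is
# the Gibbs measure: dP/dQ(theta) proportional to exp(-L(theta)/lam).
unnormalized = Q * np.exp(-risks / lam)
P = unnormalized / unnormalized.sum()

print(P)
# Property (i): the support of P coincides with the support of Q; the
# model with Q = 0 keeps probability 0 regardless of its empirical risk.
print(np.isclose(P[Q == 0], 0.0).all())  # True
```

Note how the solution concentrates on the lowest-risk model inside the support of $Q$, while the excluded model can never be recovered, however well it fits the data: this is the strong inductive bias described in the abstract.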
Related papers
- Sparsity via Sparse Group $k$-max Regularization [22.05774771336432]
In this paper, we propose a novel and concise regularization, namely the sparse group $k$-max regularization.
We verify the effectiveness and flexibility of the proposed method through numerical experiments on both synthetic and real-world datasets.
arXiv Detail & Related papers (2024-02-13T14:41:28Z)
- Variance Reduced Halpern Iteration for Finite-Sum Monotone Inclusions [18.086061048484616]
We study finite-sum monotone inclusion problems, which model broad classes of equilibrium problems.
Our main contributions are variants of the classical Halpern iteration that employ variance reduction to obtain improved complexity guarantees.
We argue that, up to poly-logarithmic factors, this complexity is unimprovable in the monotone Lipschitz setting.
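The classical Halpern iteration that this work builds on anchors each step back to the starting point: $x_{k+1} = \beta_k x_0 + (1-\beta_k)\,T(x_k)$ with $\beta_k \to 0$. The sketch below is a minimal illustration (without the paper's variance reduction), using a plane rotation as the operator $T$; the rotation is an assumed toy example, nonexpansive with unique fixed point $0$, not an example from the paper.

```python
import numpy as np

# T: a plane rotation, nonexpansive with unique fixed point 0.
theta = 0.5
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
T = lambda x: R @ x

# Halpern iteration: x_{k+1} = b_k*x0 + (1 - b_k)*T(x_k), with the
# standard anchoring weights b_k = 1/(k+2) shrinking toward 0.
x0 = np.array([1.0, 0.0])
x = x0.copy()
for k in range(2000):
    b = 1.0 / (k + 2)
    x = b * x0 + (1 - b) * T(x)

# Plain Picard iteration x <- T(x) would circle forever on the unit
# circle; the anchor drives the fixed-point residual ||x - T(x)|| to 0.
print(np.linalg.norm(x - T(x)))
```

For this choice of weights, the residual $\|x_k - T(x_k)\|$ is known to decay at rate $O(1/k)$, which is the baseline the variance-reduced variants improve upon in the finite-sum setting.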
arXiv Detail & Related papers (2023-10-04T17:24:45Z)
- Analysis of the Relative Entropy Asymmetry in the Regularization of Empirical Risk Minimization [70.540936204654]
The effect of the relative entropy asymmetry is analyzed in the empirical risk minimization with relative entropy regularization (ERM-RER) problem.
A novel regularization is introduced, coined Type-II regularization, that allows for solutions to the ERM-RER problem with a support that extends outside the support of the reference measure.
arXiv Detail & Related papers (2023-06-12T13:56:28Z)
- A Robustness Analysis of Blind Source Separation [91.3755431537592]
Blind source separation (BSS) aims to recover an unobserved signal from its mixture $X=f(S)$ under the condition that the transformation $f$ is invertible but unknown.
We present a general framework for analysing such violations and quantifying their impact on the blind recovery of $S$ from $X$.
We show that a generic BSS-solution in response to general deviations from its defining structural assumptions can be profitably analysed in the form of explicit continuity guarantees.
arXiv Detail & Related papers (2023-03-17T16:30:51Z)
- Empirical Risk Minimization with Relative Entropy Regularization [6.815730801645783]
The empirical risk minimization (ERM) problem with relative entropy regularization (ERM-RER) is investigated.
The solution to this problem, if it exists, is shown to be a unique probability measure, mutually absolutely continuous with the reference measure.
For a fixed dataset and under a specific condition, the empirical risk is shown to be a sub-Gaussian random variable.
arXiv Detail & Related papers (2022-11-12T09:41:02Z)
- A Primal-Dual Approach to Solving Variational Inequalities with General Constraints [54.62996442406718]
Yang et al. (2023) recently showed how to use first-order gradient methods to solve general variational inequalities.
We prove the convergence of this method and show that the gap function of its last iterate decreases at a rate of $O(\frac{1}{\sqrt{K}})$ when the operator is $L$-Lipschitz and monotone.
arXiv Detail & Related papers (2022-10-27T17:59:09Z)
- Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are empirically central in preventing overfitting.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z)
- The Risks of Invariant Risk Minimization [52.7137956951533]
Invariant Risk Minimization (IRM) is an objective based on the idea of learning deep, invariant features of data.
We present the first analysis of classification under the IRM objective, as well as these recently proposed alternatives, under a fairly natural and general model.
We show that IRM can fail catastrophically unless the test data are sufficiently similar to the training distribution; this is precisely the issue it was intended to solve.
arXiv Detail & Related papers (2020-10-12T14:54:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.