Equivalence of the Empirical Risk Minimization to Regularization on the
Family of f-Divergences
- URL: http://arxiv.org/abs/2402.00501v1
- Date: Thu, 1 Feb 2024 11:12:00 GMT
- Title: Equivalence of the Empirical Risk Minimization to Regularization on the
Family of f-Divergences
- Authors: Francisco Daunas, Iñaki Esnaola, Samir M. Perlaza, H. Vincent Poor
- Abstract summary: The solution to empirical risk minimization with $f$-divergence regularization (ERM-$f$DR) is presented.
Examples of the solution for particular choices of the function $f$ are presented.
- Score: 49.853843995972085
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The solution to empirical risk minimization with $f$-divergence
regularization (ERM-$f$DR) is presented under mild conditions on $f$. Under
such conditions, the optimal measure is shown to be unique. Examples of the
solution for particular choices of the function $f$ are presented. Previously
known solutions to common regularization choices are obtained by leveraging the
flexibility of the family of $f$-divergences. These include the unique
solutions to empirical risk minimization with relative entropy regularization
(Type-I and Type-II). The analysis of the solution unveils the following
properties of $f$-divergences when used in the ERM-$f$DR problem: $i)$
$f$-divergence regularization forces the support of the solution to coincide
with the support of the reference measure, which introduces a strong inductive
bias that dominates the evidence provided by the training data; and $ii)$
any $f$-divergence regularization is equivalent to a different $f$-divergence
regularization with an appropriate transformation of the empirical risk
function.
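To make the structure of the solution concrete, here is a minimal discrete sketch in Python. It assumes the optimal measure takes the inverse-derivative form $\frac{dP}{dQ}(\theta) = \dot{f}^{-1}\bigl(-\tfrac{L(\theta)+\beta}{\lambda}\bigr)$, with the normalizer $\beta$ found numerically; the sign conventions and parameterization are illustrative and may differ from the paper. For $f(x) = x\log x$ this recovers the Gibbs measure of Type-I relative entropy regularization, and the zero-mass point illustrates property $i)$.

```python
import numpy as np

# Minimal sketch of the ERM-fDR solution on a discrete model set, assuming
# the inverse-derivative form dP/dQ = (f')^{-1}( -(L + beta)/lam ) with the
# normalizer beta found by bisection. Illustrative only; signs and
# parameterization may differ from the paper.
L = np.array([0.3, 1.2, 0.7, 2.0])        # empirical risks of four models
Q = np.array([0.4, 0.3, 0.3, 0.0])        # reference measure, zero mass on theta_4
lam = 0.5                                  # regularization weight

fprime_inv = lambda y: np.exp(y - 1.0)     # inverse of f'(x) = log(x) + 1 (KL case)

def mass(beta):
    return np.sum(Q * fprime_inv(-(L + beta) / lam))

lo, hi = -100.0, 100.0                     # mass(beta) is decreasing in beta
for _ in range(100):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if mass(mid) > 1.0 else (lo, mid)

P = Q * fprime_inv(-(L + 0.5 * (lo + hi)) / lam)
print("P =", np.round(P, 4), "sum =", round(P.sum(), 6))
# supp(P) = supp(Q): theta_4 keeps zero mass no matter how small its risk
# (property i); here the KL case reproduces a Q-weighted softmax(-L/lam).
```

Swapping `fprime_inv` for the inverse derivative of another convex $f$ changes only the normalization step, which is the flexibility across the family of $f$-divergences that the abstract refers to.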
Related papers
- Sparsity via Sparse Group $k$-max Regularization [22.05774771336432]
In this paper, we propose a novel and concise regularization, namely the sparse group $k$-max regularization.
We verify the effectiveness and flexibility of the proposed method through numerical experiments on both synthetic and real-world datasets.
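As a toy sketch under a loudly flagged assumption: one plausible reading of a group $k$-max penalty charges all magnitudes beyond the $k$ largest within each group, acting as a surrogate for allowing at most $k$ nonzeros per group. The paper's exact definition may differ.

```python
import numpy as np

# Hypothetical reading of a sparse group k-max penalty: within each group,
# penalize every magnitude except the k largest, a surrogate for permitting
# at most k nonzero entries per group. The paper's definition may differ.
def group_kmax_penalty(w, groups, k):
    total = 0.0
    for idx in groups:
        mags = np.sort(np.abs(w[idx]))[::-1]  # magnitudes, largest first
        total += mags[k:].sum()               # everything beyond the top k
    return total

w = np.array([3.0, 0.1, -2.5, 0.0, 0.02, 1.5])
groups = [np.array([0, 1, 2]), np.array([3, 4, 5])]
print(group_kmax_penalty(w, groups, k=2))     # 0.1 + 0.02 = 0.12
```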
arXiv Detail & Related papers (2024-02-13T14:41:28Z)
- Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization [63.32053223422317]
We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning.
In particular, we focus on characterizing the variance over values induced by a distribution over MDPs.
We propose a new uncertainty Bellman equation (UBE) whose solution converges to the true posterior variance over values.
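For orientation, the target quantity can be estimated by brute force. The sketch below assumes a toy Dirichlet posterior over transition rows with known rewards and a fixed policy; it is a Monte-Carlo baseline for the posterior variance of values, not the paper's UBE.

```python
import numpy as np

# Monte-Carlo estimate of the posterior variance of state values V under a
# distribution over MDPs. Assumed toy setup: Dirichlet posterior over each
# transition row, known rewards, policy already folded into the transitions.
rng = np.random.default_rng(0)
S, gamma = 4, 0.9
alpha = np.ones((S, S))                      # Dirichlet parameters per state
r = rng.uniform(size=S)                      # known rewards

def value(P):
    # Policy evaluation: solve (I - gamma * P) V = r.
    return np.linalg.solve(np.eye(S) - gamma * P, r)

Vs = np.array([
    value(np.vstack([rng.dirichlet(alpha[s]) for s in range(S)]))
    for _ in range(5000)
])
print("posterior variance of V per state:", np.round(Vs.var(axis=0), 4))
# A UBE-style recursion aims to characterize exactly this variance
# without sampling and solving whole MDPs.
```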
arXiv Detail & Related papers (2023-12-07T15:55:58Z)
- Variance Reduced Halpern Iteration for Finite-Sum Monotone Inclusions [18.086061048484616]
We study finite-sum monotone inclusion problems, which model broad classes of equilibrium problems.
Our main contributions are variants of the classical Halpern iteration that employ variance reduction to obtain improved complexity guarantees.
We argue that, up to poly-logarithmic factors, this complexity is unimprovable in the monotone Lipschitz setting.
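For reference, the classical (non-variance-reduced) Halpern iteration anchors every step to the initial point with weights $\beta_k = 1/(k+2)$; the paper's variants add variance reduction on top of this template. A toy strongly monotone instance:

```python
import numpy as np

# Classical Halpern iteration x_{k+1} = b_k * x0 + (1 - b_k) * T(x_k) with
# b_k = 1/(k+2), on a toy instance: T(x) = x - eta*(A x - c) for A symmetric
# positive definite, so the fixed point solves the inclusion 0 = A x - c.
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + 0.5 * np.eye(5)     # strongly monotone, well-conditioned demo
c = rng.standard_normal(5)
eta = 1.0 / np.linalg.norm(A, 2)  # step size keeping T nonexpansive

T = lambda x: x - eta * (A @ x - c)
x0 = np.zeros(5)
x = x0.copy()
for k in range(2000):
    beta = 1.0 / (k + 2)          # anchoring weight toward x0
    x = beta * x0 + (1 - beta) * T(x)
print("residual ||Ax - c|| =", np.linalg.norm(A @ x - c))
# The residual shrinks at the canonical O(1/k) last-iterate rate.
```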
arXiv Detail & Related papers (2023-10-04T17:24:45Z)
- Analysis of the Relative Entropy Asymmetry in the Regularization of Empirical Risk Minimization [70.540936204654]
The effect of the relative entropy asymmetry is analyzed in the empirical risk minimization with relative entropy regularization (ERM-RER) problem.
A novel regularization, coined Type-II regularization, is introduced; it allows for solutions to the ERM-RER problem whose support extends beyond the support of the reference measure.
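A small numerical experiment makes the asymmetry visible; the losses, reference measure, and softmax parameterization below are illustrative choices, not the paper's construction. Minimizing both objectives directly shows the Type-I solution confining its support to that of $Q$ while the Type-II solution places mass outside it:

```python
import numpy as np
from scipy.optimize import minimize

L = np.array([1.0, 2.0, 0.1])      # losses; the best model sits outside supp(Q)
Q = np.array([0.5, 0.5, 0.0])      # reference measure, zero mass on point 3
lam, eps = 1.0, 1e-12

def p_of(z):                        # softmax keeps P on the probability simplex
    e = np.exp(z - z.max())
    return e / e.sum()

def type1(z):                       # E_P[L] + lam * KL(P || Q)   (Type-I)
    p = p_of(z)
    return p @ L + lam * np.sum(p * np.log((p + eps) / (Q + eps)))

def type2(z):                       # E_P[L] + lam * KL(Q || P)   (Type-II)
    p = p_of(z)
    return p @ L + lam * np.sum(Q * np.log((Q + eps) / (p + eps)))

z1 = minimize(type1, np.zeros(3), method="Nelder-Mead").x
z2 = minimize(type2, np.zeros(3), method="Nelder-Mead").x
print("Type-I  P:", np.round(p_of(z1), 3))  # mass on point 3 stays near zero
print("Type-II P:", np.round(p_of(z2), 3))  # mass appears on point 3
```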
arXiv Detail & Related papers (2023-06-12T13:56:28Z)
- A Robustness Analysis of Blind Source Separation [91.3755431537592]
Blind source separation (BSS) aims to recover an unobserved signal from its mixture $X=f(S)$ under the condition that the transformation $f$ is invertible but unknown.
We present a general framework for analysing violations of these structural assumptions and for quantifying their impact on the blind recovery of $S$ from $X$.
We show that the response of a generic BSS solution to general deviations from its defining structural assumptions can be profitably analysed in the form of explicit continuity guarantees.
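As a concrete instance of the idealized setting (linear, noiseless, invertible mixing), the sketch below recovers two sources with FastICA; it illustrates the baseline BSS problem rather than the paper's robustness framework, and recovery holds only up to permutation and scaling:

```python
import numpy as np
from sklearn.decomposition import FastICA

# Linear, noiseless BSS instance: X = S A^T with unknown invertible mixing A.
# FastICA is a standard estimator for this idealized setting; the paper asks
# what happens when such defining assumptions are violated.
rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)
S = np.c_[np.sin(2 * t), np.sign(np.sin(3 * t))]   # two independent sources
A = np.array([[1.0, 0.5], [0.4, 1.0]])             # unknown mixing matrix
X = S @ A.T                                        # observed mixtures

S_hat = FastICA(n_components=2, random_state=0).fit_transform(X)
# Recovery is only up to permutation and scaling, so check |correlations|.
corr = np.corrcoef(S.T, S_hat.T)[:2, 2:]
print(np.round(np.abs(corr), 2))
```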
arXiv Detail & Related papers (2023-03-17T16:30:51Z)
- Empirical Risk Minimization with Relative Entropy Regularization [6.815730801645783]
The empirical risk minimization (ERM) problem with relative entropy regularization (ERM-RER) is investigated.
The solution to this problem, if it exists, is shown to be a unique probability measure, mutually absolutely continuous with the reference measure.
For a fixed dataset and under a specific condition, the empirical risk is shown to be a sub-Gaussian random variable.
arXiv Detail & Related papers (2022-11-12T09:41:02Z)
- Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are central to preventing overfitting in practice.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD and that afforded by ordinary least squares.
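A minimal sketch of the setting, assuming a well-specified Gaussian design: constant-stepsize SGD with tail averaging on a least-squares problem, compared against the ordinary least squares solution:

```python
import numpy as np

# Constant-stepsize SGD for least squares versus the OLS solution.
# Toy well-specified Gaussian design; the tail-averaged iterate is one
# standard way to stabilize constant-stepsize SGD in this setting.
rng = np.random.default_rng(0)
n, d = 500, 5
X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d)
y = X @ w_star + 0.1 * rng.standard_normal(n)

w = np.zeros(d)
step = 0.01                        # constant stepsize, no decay
iterates = []
for t in range(20000):
    i = rng.integers(n)            # single-sample stochastic gradient
    w -= step * (X[i] @ w - y[i]) * X[i]
    iterates.append(w.copy())

w_sgd = np.mean(iterates[len(iterates) // 2:], axis=0)  # tail average
w_ols = np.linalg.lstsq(X, y, rcond=None)[0]
print("||w_sgd - w_ols|| =", np.linalg.norm(w_sgd - w_ols))
```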
arXiv Detail & Related papers (2021-03-23T17:15:53Z)
- The Risks of Invariant Risk Minimization [52.7137956951533]
Invariant Risk Minimization (IRM) is an objective based on the idea of learning deep, invariant features of data.
We present the first analysis of classification under the IRM objective, as well as its recently proposed alternatives, under a fairly natural and general model.
We show that IRM can fail catastrophically unless the test data are sufficiently similar to the training distribution; this is precisely the issue it was intended to solve.
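For concreteness, here is a toy version of the IRMv1-style penalty of Arjovsky et al., one common instantiation of the IRM idea and not necessarily the exact objective analyzed here: the gradient of each environment's risk with respect to a frozen scalar multiplier should vanish for a truly invariant predictor. The data and features below are hypothetical:

```python
import numpy as np

# IRMv1-style penalty: for each environment e, take the gradient of the risk
# with respect to a scalar multiplier s (frozen at 1) on top of a feature phi.
# A truly invariant predictor makes this gradient vanish in every environment.
def irm_penalty(phi_out, y):
    # d/ds mean((s * phi - y)^2) evaluated at s = 1, squared.
    grad = np.mean(2.0 * (phi_out - y) * phi_out)
    return grad ** 2

rng = np.random.default_rng(0)
envs = []
for rho in (0.9, 0.7):             # environments differ in a spurious correlation
    x = rng.standard_normal(200)
    y = x + 0.1 * rng.standard_normal(200)
    spurious = np.where(rng.random(200) < rho, y, -y)
    envs.append((x, spurious, y))

# The invariant feature gets a near-zero penalty in every environment;
# the spurious feature does not.
for name, pick in (("invariant", 0), ("spurious", 1)):
    pens = [irm_penalty(e[pick], e[2]) for e in envs]
    print(name, [round(p, 3) for p in pens])
```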
arXiv Detail & Related papers (2020-10-12T14:54:32Z)