A Dual Optimization View to Empirical Risk Minimization with   f-Divergence Regularization
 - URL: http://arxiv.org/abs/2508.03314v1
 - Date: Tue, 05 Aug 2025 10:48:40 GMT
 - Title: A Dual Optimization View to Empirical Risk Minimization with   f-Divergence Regularization
 - Authors: Francisco Daunas, Iñaki Esnaola, Samir M. Perlaza
 - Abstract summary: The solution of the dual optimization problem to the ERM-fDR is connected to the notion of the normalization function, introduced as an implicit function. The Legendre-Fenchel transform and the implicit function theorem provide a nonlinear ODE expression for the normalization function.
 - Score: 1.024113475677323
 - License: http://creativecommons.org/licenses/by/4.0/
 - Abstract: The dual formulation of empirical risk minimization with f-divergence regularization (ERM-fDR) is introduced. The solution of the dual optimization problem to the ERM-fDR is connected to the notion of the normalization function, introduced as an implicit function. This dual approach leverages the Legendre-Fenchel transform and the implicit function theorem to provide a nonlinear ODE expression for the normalization function. Furthermore, the nonlinear ODE expression and its properties provide a computationally efficient method to calculate the normalization function of the ERM-fDR solution under a mild condition.
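To make the normalization function concrete, here is a minimal numerical sketch (hypothetical names, a finite model set, and a uniform reference measure; this is a root-finding illustration of the normalization condition, not the authors' ODE-based method). It uses the standard first-order condition dP/dQ(theta) = (f')^{-1}((beta - L(theta))/lambda) and recovers the normalization constant beta in one dimension, checking the KL case against its Gibbs closed form.

```python
import numpy as np
from scipy.optimize import brentq

# Hypothetical finite setup: m models, reference measure Q, empirical risks L.
rng = np.random.default_rng(0)
m = 50
q = np.full(m, 1.0 / m)           # reference measure Q (uniform here)
L = rng.uniform(0.0, 2.0, m)      # empirical risks L(theta)
lam = 0.5                         # regularization parameter lambda > 0

# For f(x) = x log x (KL divergence): f'(x) = 1 + log x, so (f')^{-1}(y) = exp(y - 1).
finv = lambda y: np.exp(y - 1.0)

# Normalization: find beta such that p_i = q_i * (f')^{-1}((beta - L_i)/lam)
# sums to one; this is the implicit equation defining the normalization function.
def mass(beta):
    return np.sum(q * finv((beta - L) / lam)) - 1.0

beta_star = brentq(mass, -50.0, 50.0)   # mass(beta) is strictly increasing
p = q * finv((beta_star - L) / lam)

# Sanity check against the closed-form Gibbs solution for the KL case.
gibbs = q * np.exp(-L / lam)
gibbs /= gibbs.sum()
assert np.allclose(p, gibbs)
print(f"beta* = {beta_star:.4f}, total mass = {p.sum():.6f}")
```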
 
       
      
        Related papers
        - Generalization Error of $f$-Divergence Stabilized Algorithms via Duality [2.6024036282674587]
The solution to empirical risk minimization with $f$-divergence regularization (ERM-$f$DR) is extended to constrained optimization problems. A dual formulation of ERM-$f$DR is introduced, providing a computationally efficient method to derive the normalization function of the ERM-$f$DR solution.
arXiv  Detail & Related papers  (2025-02-20T13:21:01Z)
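For reference, the dual formulation invoked in the entry above can be written in a standard form (a sketch under common assumptions: lambda > 0 and f convex with Legendre-Fenchel conjugate f*; the paper's exact notation and conditions may differ):

```latex
% Primal-dual pair for ERM-fDR (sketch; f^{*} is the Legendre-Fenchel conjugate of f).
\begin{equation*}
  \min_{P \ll Q}\; \mathbb{E}_{P}[\mathsf{L}(\theta)] + \lambda\, D_{f}(P \,\|\, Q)
  \;=\;
  \max_{\beta \in \mathbb{R}}\; \beta - \lambda\, \mathbb{E}_{Q}\!\left[
    f^{*}\!\left(\frac{\beta - \mathsf{L}(\theta)}{\lambda}\right)\right].
\end{equation*}
% The maximizer beta* is pinned down by the normalization condition
\begin{equation*}
  \mathbb{E}_{Q}\!\left[(f')^{-1}\!\left(\frac{\beta^{\star} - \mathsf{L}(\theta)}{\lambda}\right)\right] = 1,
\end{equation*}
% which is the implicit function that defines the normalization function.
```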
- Alternating Minimization Schemes for Computing Rate-Distortion-Perception Functions with $f$-Divergence Perception Constraints [10.564071872770146]
We study the computation of the rate-distortion-perception function (RDPF) for discrete memoryless sources.
We characterize the optimal parametric solutions.
We provide sufficient conditions on the distortion and the perception constraints.
arXiv  Detail & Related papers  (2024-08-27T12:50:12Z)
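The paper above adds $f$-divergence perception constraints to this computation; as a hedged illustration of the alternating-minimization template it builds on, here is the classical Blahut-Arimoto iteration for the plain rate-distortion function (the source, distortion, and slope parameter below are all illustrative, and the perception constraint is not implemented):

```python
import numpy as np

# Classical Blahut-Arimoto for the rate-distortion function of a discrete
# memoryless source: alternate between the optimal conditional and the
# optimal output marginal.
rng = np.random.default_rng(0)
px = rng.dirichlet(np.ones(4))            # source distribution p(x)
d = 1.0 - np.eye(4)                       # Hamming distortion d(x, y)
beta = 5.0                                # slope parameter (trades R vs. D)

qy = np.full(4, 0.25)                     # initial output marginal q(y)
for _ in range(200):
    # Step 1: optimal conditional Q(y|x) given the marginal q(y).
    Qyx = qy * np.exp(-beta * d)          # shape (x, y)
    Qyx /= Qyx.sum(axis=1, keepdims=True)
    # Step 2: optimal marginal q(y) given the conditional Q(y|x).
    qy = px @ Qyx

D = np.sum(px[:, None] * Qyx * d)                 # expected distortion
R = np.sum(px[:, None] * Qyx * np.log(Qyx / qy))  # rate in nats
print(f"R = {R:.4f} nats at D = {D:.4f}")
```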
- Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning [132.7040981721302]
We study the constrained convex Markov decision process (MDP), where the goal is to minimize a convex functional of the visitation measure.
Designing algorithms for a constrained convex MDP faces several challenges, including handling the large state space.
arXiv  Detail & Related papers  (2024-02-16T16:35:18Z)
- Equivalence of the Empirical Risk Minimization to Regularization on the Family of f-Divergences [45.935798913942904]
The solution to empirical risk minimization with $f$-divergence regularization (ERM-$f$DR) is presented.
Examples of the solution for particular choices of the function $f$ are presented; two such instances are written out below.
arXiv  Detail & Related papers  (2024-02-01T11:12:00Z)
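Two standard instances of the ERM-$f$DR solution, consistent with the first-order condition $\mathrm{d}P^{\star}/\mathrm{d}Q = (f')^{-1}((\beta - \mathsf{L})/\lambda)$ sketched earlier (the paper's own examples may differ in parameterization):

```latex
% f(x) = x \log x (KL divergence): (f')^{-1}(y) = e^{y-1}, giving the Gibbs measure
\begin{equation*}
  \frac{\mathrm{d}P^{\star}}{\mathrm{d}Q}(\theta)
  = \frac{\exp\!\bigl(-\mathsf{L}(\theta)/\lambda\bigr)}
         {\mathbb{E}_{Q}\!\left[\exp\!\bigl(-\mathsf{L}(\theta)/\lambda\bigr)\right]} .
\end{equation*}
% f(x) = \tfrac{1}{2}(x-1)^2 (chi-squared-type divergence): (f')^{-1}(y) = 1 + y, giving
\begin{equation*}
  \frac{\mathrm{d}P^{\star}}{\mathrm{d}Q}(\theta)
  = \Bigl(1 + \tfrac{\beta^{\star} - \mathsf{L}(\theta)}{\lambda}\Bigr)_{+},
  \qquad \beta^{\star} \text{ chosen so the right-hand side integrates to one.}
\end{equation*}
```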
- Analysis of the Relative Entropy Asymmetry in the Regularization of Empirical Risk Minimization [70.540936204654]
The effect of the relative entropy asymmetry is analyzed in the empirical risk minimization with relative entropy regularization (ERM-RER) problem.
A novel regularization, coined Type-II regularization, is introduced; it allows for solutions to the ERM-RER problem with a support that extends outside the support of the reference measure.
arXiv  Detail & Related papers  (2023-06-12T13:56:28Z)
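A compact way to state the asymmetry behind the two regularizations in the entry above (a sketch of the distinction; it rests only on the fact that relative entropy is finite only when its first argument is absolutely continuous with respect to its second):

```latex
\underbrace{\,D(P \,\|\, Q)\,}_{\text{Type-I: finite only if } P \ll Q}
\qquad\text{vs.}\qquad
\underbrace{\,D(Q \,\|\, P)\,}_{\text{Type-II: finite only if } Q \ll P}
```

Under Type-II only $Q \ll P$ is required, so the support of the solution may strictly contain the support of the reference measure, as the summary above notes.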
- Variational Laplace Autoencoders [53.08170674326728]
Variational autoencoders employ an amortized inference model to approximate the posterior of latent variables.
We present a novel approach that addresses the limited posterior expressiveness of the fully-factorized Gaussian assumption.
We also present a general framework named Variational Laplace Autoencoders (VLAEs) for training deep generative models.
arXiv  Detail & Related papers  (2022-11-30T18:59:27Z)
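VLAEs build on the Laplace approximation of the latent posterior. The toy sketch below shows the basic mechanic on a scalar latent (all names are hypothetical and a hand-written log-posterior stands in for the decoder; this is not the authors' training procedure): locate the posterior mode by gradient ascent, then use the negative inverse Hessian there as the Gaussian variance.

```python
import numpy as np

# Toy unnormalized log-posterior over a scalar latent z (stand-in for
# log p(x, z) under a decoder).
def log_post(z):
    return -0.5 * (z - 1.0) ** 2 - 0.1 * z ** 4

def grad(z, eps=1e-5):   # central finite-difference gradient
    return (log_post(z + eps) - log_post(z - eps)) / (2 * eps)

def hess(z, eps=1e-4):   # central finite-difference second derivative
    return (log_post(z + eps) - 2 * log_post(z) + log_post(z - eps)) / eps ** 2

# Step 1: find the mode by gradient ascent.
z = 0.0
for _ in range(500):
    z += 0.1 * grad(z)

# Step 2: Gaussian with variance = inverse of the negative Hessian at the mode.
var = -1.0 / hess(z)
print(f"Laplace approximation: N(mean={z:.4f}, var={var:.4f})")
```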
- Log-based Sparse Nonnegative Matrix Factorization for Data Representation [55.72494900138061]
Nonnegative matrix factorization (NMF) has been widely studied in recent years due to its effectiveness in representing nonnegative data with parts-based representations.
We propose a new NMF method with log-norm imposed on the factor matrices to enhance the sparseness.
A novel column-wise sparse norm, named the $\ell_{2,\log}$-(pseudo) norm, is proposed to enhance the robustness of the proposed method.
arXiv  Detail & Related papers  (2022-04-22T11:38:10Z)
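As background for the sparse variant above, here is a sketch of the classical Lee-Seung multiplicative updates for Frobenius-norm NMF; the paper's contribution is the additional log-norm and $\ell_{2,\log}$ penalties on the factors, which are not implemented here, and all sizes below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
V = rng.random((60, 40))          # nonnegative data matrix (n x m)
k = 8                             # rank of the factorization
W = rng.random((60, k)) + 1e-3
H = rng.random((k, 40)) + 1e-3

eps = 1e-10                       # avoids division by zero
for _ in range(300):
    # Lee-Seung multiplicative updates for min ||V - WH||_F^2 with W, H >= 0;
    # they preserve nonnegativity and monotonically decrease the objective.
    H *= (W.T @ V) / (W.T @ W @ H + eps)
    W *= (V @ H.T) / (W @ H @ H.T + eps)

print("relative error:", np.linalg.norm(V - W @ H) / np.linalg.norm(V))
```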
- A Dual Approach to Constrained Markov Decision Processes with Entropy Regularization [7.483040617090451]
We study entropy-regularized constrained Markov decision processes (CMDPs) under the soft-max parameterization.
Our theoretical analysis shows that its Lagrangian dual function is smooth and the Lagrangian duality gap can be decomposed into the primality gap and the constraint violation.
arXiv  Detail & Related papers  (2021-10-17T21:26:40Z)
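To make the dual picture concrete, the sketch below runs plain dual ascent on a random tabular CMDP: the inner problem is an entropy-regularized MDP solved by soft value iteration with reward r - lambda*c, and the multiplier lambda moves along the constraint violation, which is the gradient of the dual function. All sizes and constants are illustrative; this is not the paper's analysis or algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)
S, A, gamma, tau = 4, 3, 0.9, 0.1          # states, actions, discount, entropy temp.
P = rng.dirichlet(np.ones(S), size=(S, A)) # P[s, a] = next-state distribution
r = rng.uniform(size=(S, A))               # reward
c = rng.uniform(size=(S, A))               # constraint cost; want E[c] <= b
b, mu0 = 0.5, np.full(S, 1.0 / S)          # budget and initial distribution

def soft_policy(lmbda):
    """Soft value iteration for the entropy-regularized MDP with reward r - lmbda*c."""
    V = np.zeros(S)
    for _ in range(500):
        Q = (r - lmbda * c) + gamma * P @ V
        V = tau * np.log(np.exp(Q / tau).sum(axis=1))
    pi = np.exp((Q - V[:, None]) / tau)    # soft-max (Boltzmann) policy
    return pi / pi.sum(axis=1, keepdims=True)

def avg_cost(pi):
    """Normalized discounted expected constraint cost under pi."""
    Ppi = np.einsum('sa,sat->st', pi, P)   # state-to-state kernel under pi
    d = np.linalg.solve(np.eye(S) - gamma * Ppi.T, (1 - gamma) * mu0)
    return d @ (pi * c).sum(axis=1)

lmbda = 0.0
for _ in range(200):                       # projected dual ascent on the multiplier
    pi = soft_policy(lmbda)
    lmbda = max(0.0, lmbda + 0.5 * (avg_cost(pi) - b))

print(f"lambda = {lmbda:.3f}, E[c] = {avg_cost(pi):.3f}, budget b = {b}")
```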
- On the Double Descent of Random Features Models Trained with SGD [78.0918823643911]
We study properties of random features (RF) regression in high dimensions optimized by stochastic gradient descent (SGD).
We derive precise non-asymptotic error bounds of RF regression under both constant and adaptive step-size SGD settings.
We observe the double descent phenomenon both theoretically and empirically.
arXiv  Detail & Related papers  (2021-10-13T17:47:39Z)
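A minimal numerical illustration of the double descent curve for random features (hypothetical sizes; for brevity, the minimum-norm least-squares solution stands in for the SGD training analyzed in the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 20                                 # training samples, input dimension
w = rng.normal(size=d)                         # ground-truth linear teacher

def sample(n):
    X = rng.normal(size=(n, d))
    return X, X @ w + 0.5 * rng.normal(size=n) # noisy targets

Xtr, ytr = sample(n)
Xte, yte = sample(2000)

for m in [10, 50, 90, 100, 110, 200, 800]:     # number of random features
    W = rng.normal(size=(d, m)) / np.sqrt(d)   # fixed random first layer
    Ftr, Fte = np.maximum(Xtr @ W, 0.0), np.maximum(Xte @ W, 0.0)  # ReLU features
    a, *_ = np.linalg.lstsq(Ftr, ytr, rcond=None)   # min-norm solution
    mse = np.mean((Fte @ a - yte) ** 2)
    # Test error typically peaks near the interpolation threshold m ~ n
    # and descends again as m grows: double descent.
    print(f"m = {m:4d}   test MSE = {mse:.3f}")
```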
- Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv  Detail & Related papers  (2020-07-16T13:27:47Z)
- Solving high-dimensional eigenvalue problems using deep neural networks: A diffusion Monte Carlo like approach [14.558626910178127]
The eigenvalue problem is reformulated as a fixed point problem of the semigroup flow induced by the operator.
The method shares a similar spirit with diffusion Monte Carlo but augments the direct approximation of the eigenfunction with a neural-network ansatz.
Our approach is able to provide accurate eigenvalue and eigenfunction approximations in several numerical examples.
arXiv  Detail & Related papers  (2020-02-07T03:08:31Z) 
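The fixed-point idea is already visible in the linear-algebra analogue: iterating a first-order approximation of the semigroup e^{-tau*A} converges to the ground state of A, whose eigenvalue can then be read off from the Rayleigh quotient. In the paper a neural-network ansatz replaces the explicit vector in high dimensions; the toy below (illustrative sizes and step) covers only the matrix case.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
A = rng.normal(size=(n, n))
A = (A + A.T) / 2.0                      # symmetric operator

tau = 1e-2                               # small time step
v = rng.normal(size=n)
# Fixed-point iteration on the semigroup flow: v <- e^{-tau A} v, here applied
# via the first-order approximation (I - tau A) and renormalized each step;
# the fixed point is the eigenvector of the smallest eigenvalue.
for _ in range(5000):
    v -= tau * (A @ v)
    v /= np.linalg.norm(v)

lam = v @ A @ v                          # Rayleigh quotient = eigenvalue estimate
print(f"estimate: {lam:.4f}, exact: {np.linalg.eigvalsh(A).min():.4f}")
```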
        This list is automatically generated from the titles and abstracts of the papers on this site.
       
     
           This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.