Related papers: Learning from Samples: Inverse Problems over measures via Sharpened Fenchel-Young Losses

Learning from Samples: Inverse Problems over measures via Sharpened Fenchel-Young Losses

URL: http://arxiv.org/abs/2505.07124v2
Date: Sun, 17 Aug 2025 16:28:29 GMT
Title: Learning from Samples: Inverse Problems over measures via Sharpened Fenchel-Young Losses
Authors: Francisco Andrade, Gabriel Peyré, Clarice Poon,
Abstract summary: We introduce a new class of loss functions, called sharpened Fenchel-Young losses, which measure the sub-optimality gap of the optimization problem over the space of probability measures.<n>We provide explicit stability guarantees for two relevant settings in the context of optimal transport.<n>We present optimization algorithms specifically tailored to efficiently solve iUOT and iJKO problems.
Score: 20.246040671823557
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Estimating parameters from samples of an optimal probability distribution is essential in applications ranging from socio-economic modeling to biological system analysis. In these settings, the probability distribution arises as the solution to an optimization problem that captures either static interactions among agents or the dynamic evolution of a system over time. We introduce a general methodology based on a new class of loss functions, called sharpened Fenchel-Young losses, which measure the sub-optimality gap of the optimization problem over the space of probability measures. We provide explicit stability guarantees for two relevant settings in the context of optimal transport: The first is inverse unbalanced optimal transport (iUOT) with entropic regularization, where the parameters to estimate are cost functions that govern transport computations; this method has applications such as link prediction in machine learning. The second is inverse gradient flow (iJKO), where the objective is to recover a potential function that drives the evolution of a probability distribution via the Jordan-Kinderlehrer-Otto (JKO) time-discretization scheme; this is particularly relevant for understanding cell population dynamics in single-cell genomics. We also establish source conditions to ensure stability of our method under mirror stratifiable regularizers (such as l1 or nuclear norm) that promote structure. Finally, we present optimization algorithms specifically tailored to efficiently solve iUOT and iJKO problems. We validate our approach through numerical experiments on Gaussian distributions, where closed-form solutions are available, to demonstrate the practical performance of our methods.

Related papers

Variational Entropic Optimal Transport [67.76725267984578]
We propose Variational Entropic Optimal Transport (VarEOT) for domain translation problems.<n>VarEOT is based on an exact variational reformulation of the log-partition $log mathbbE[exp(cdot)$ as a tractable generalization over an auxiliary positive normalizer.<n> Experiments on synthetic data and unpaired image-to-image translation demonstrate competitive or improved translation quality.
arXiv Detail & Related papers (2026-02-02T15:48:44Z)
Stochastic Control Methods for Optimization [0.0]
In the Euclidean setting, we analyze the problem of regularized control problems.<n>For global measures, we formulate a regularized mean-field problem characterized by a master-field problem.
arXiv Detail & Related papers (2026-01-03T17:55:26Z)
Online Inference of Constrained Optimization: Primal-Dual Optimality and Sequential Quadratic Programming [55.848340925419286]
We study online statistical inference for the solutions of quadratic optimization problems with equality and inequality constraints.<n>We develop a sequential programming (SSQP) method to solve these problems, where the step direction is computed by sequentially performing an approximation of the objective and a linear approximation of the constraints.<n>We show that our method global almost moving-average convergence and exhibits local normality with an optimal primal-dual limiting matrix in the sense of Hjek and Le Cam.
arXiv Detail & Related papers (2025-11-27T06:16:17Z)
Stochastic Optimization with Optimal Importance Sampling [49.484190237840714]
We propose an iterative-based algorithm that jointly updates the decision and the IS distribution without requiring time-scale separation between the two.<n>Our method achieves the lowest possible variable variance and guarantees global convergence under convexity of the objective and mild assumptions on the IS distribution family.
arXiv Detail & Related papers (2025-04-04T16:10:18Z)
Semiparametric Counterfactual Regression [2.356908851188234]
We propose a doubly robust-style estimator for counterfactual regression within a generalizable framework.<n>Our approach uses incremental interventions to enhance adaptability while maintaining with standard methods.<n>Our analysis shows that the proposed estimators can achieve $sqrn$-consistency and normality for a broad class of problems.
arXiv Detail & Related papers (2025-04-03T15:32:26Z)
Steering Large Agent Populations using Mean-Field Schrodinger Bridges with Gaussian Mixture Models [13.03355083378673]
Mean-Field Schrodinger Bridge (MFSB) problem is an optimization problem aiming to find the minimum effort control policy.<n>In the context of multi-agent control, the objective is to control the configuration of a swarm of identical, interacting cooperative agents.
arXiv Detail & Related papers (2025-03-31T04:01:04Z)
On improving generalization in a class of learning problems with the method of small parameters for weakly-controlled optimal gradient systems [0.0]
We consider a variational problem for a weakly-controlled gradient system, whose control input enters into the system dynamics as a coefficient to a nonlinear term.<n>Using the perturbation theory, we provide results that will allow us to solve a sequence of optimization problems.<n>We also provide an estimate for the rate of convergence for such approximate optimal solutions.
arXiv Detail & Related papers (2024-12-11T20:50:29Z)
A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning [74.80956524812714]
We tackle the general differentiable meta learning problem that is ubiquitous in modern deep learning. These problems are often formalized as Bi-Level optimizations (BLO) We introduce a novel perspective by turning a given BLO problem into a ii optimization, where the inner loss function becomes a smooth distribution, and the outer loss becomes an expected loss over the inner distribution.
arXiv Detail & Related papers (2024-10-14T12:10:06Z)
Trust-Region Sequential Quadratic Programming for Stochastic Optimization with Random Models [57.52124921268249]
We propose a Trust Sequential Quadratic Programming method to find both first and second-order stationary points. To converge to first-order stationary points, our method computes a gradient step in each iteration defined by minimizing a approximation of the objective subject. To converge to second-order stationary points, our method additionally computes an eigen step to explore the negative curvature the reduced Hessian matrix.
arXiv Detail & Related papers (2024-09-24T04:39:47Z)
Generalization Bounds of Surrogate Policies for Combinatorial Optimization Problems [61.580419063416734]
A recent stream of structured learning approaches has improved the practical state of the art for a range of optimization problems. The key idea is to exploit the statistical distribution over instances instead of dealing with instances separately. In this article, we investigate methods that smooth the risk by perturbing the policy, which eases optimization and improves the generalization error.
arXiv Detail & Related papers (2024-07-24T12:00:30Z)
Data-Driven Influence Functions for Optimization-Based Causal Inference [105.5385525290466]
We study a constructive algorithm that approximates Gateaux derivatives for statistical functionals by finite differencing. We study the case where probability distributions are not known a priori but need to be estimated from data.
arXiv Detail & Related papers (2022-08-29T16:16:22Z)
A framework for bilevel optimization that enables stochastic and global variance reduction algorithms [21.67411847762289]
Bilevel optimization is a problem of minimizing a value function which involves the arg-minimum of another function.<n>We introduce a novel framework, in which the solution of the inner problem, the solution of the linear system, and the main variable evolve at the same time.<n>We demonstrate that SABA, an adaptation of the celebrated SAGA algorithm in our framework, has $O(frac1T)$ convergence rate.
arXiv Detail & Related papers (2022-01-31T18:17:25Z)
Integrated Conditional Estimation-Optimization [6.037383467521294]
Many real-world optimization problems uncertain parameters with probability can be estimated using contextual feature information. In contrast to the standard approach of estimating the distribution of uncertain parameters, we propose an integrated conditional estimation approach. We show that our ICEO approach is theally consistent under moderate conditions.
arXiv Detail & Related papers (2021-10-24T04:49:35Z)
Semi-Discrete Optimal Transport: Hardness, Regularization and Numerical Solution [8.465228064780748]
We prove that computing the Wasserstein distance between a discrete probability measure supported on two points is already #P-hard. We introduce a distributionally robust dual optimal transport problem whose objective function is smoothed with the most adverse disturbance distributions. We show that smoothing the dual objective function is equivalent to regularizing the primal objective function.
arXiv Detail & Related papers (2021-03-10T18:53:59Z)
Variational Transport: A Convergent Particle-BasedAlgorithm for Distributional Optimization [106.70006655990176]
A distributional optimization problem arises widely in machine learning and statistics. We propose a novel particle-based algorithm, dubbed as variational transport, which approximately performs Wasserstein gradient descent. We prove that when the objective function satisfies a functional version of the Polyak-Lojasiewicz (PL) (Polyak, 1963) and smoothness conditions, variational transport converges linearly.
arXiv Detail & Related papers (2020-12-21T18:33:13Z)
Distributed Averaging Methods for Randomized Second Order Optimization [54.51566432934556]
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a bottleneck. We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian. We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems.
arXiv Detail & Related papers (2020-02-16T09:01:18Z)

This list is automatically generated from the titles and abstracts of the papers in this site.