Iterative Sampling Methods for Sinkhorn Distributionally Robust Optimization
- URL: http://arxiv.org/abs/2512.12550v1
- Date: Sun, 14 Dec 2025 04:42:51 GMT
- Title: Iterative Sampling Methods for Sinkhorn Distributionally Robust Optimization
- Authors: Jie Wang
- Abstract summary: Distributionally robust optimization (DRO) has emerged as a powerful paradigm for reliable decision-making under uncertainty. This paper focuses on DRO with ambiguity sets defined via the Sinkhorn discrepancy, an entropy-regularized Wasserstein distance. We study Sinkhorn DRO from the primal perspective, reformulating it as a bilevel program with several infinite-dimensional lower-level subproblems over the space of probability measures.
- Score: 2.345146665577353
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Distributionally robust optimization (DRO) has emerged as a powerful paradigm for reliable decision-making under uncertainty. This paper focuses on DRO with ambiguity sets defined via the Sinkhorn discrepancy, an entropy-regularized Wasserstein distance, referred to as Sinkhorn DRO. Existing work primarily addresses Sinkhorn DRO from a dual perspective, leveraging its formulation as a conditional stochastic optimization problem, to which many stochastic gradient methods apply. However, the theoretical analyses of such methods often rely on boundedness of the loss function, and recovering the worst-case distribution associated with Sinkhorn DRO from the dual solution is indirect. In contrast, we study Sinkhorn DRO from the primal perspective, reformulating it as a bilevel program with several infinite-dimensional lower-level subproblems over the space of probability measures. This formulation enables us to simultaneously obtain the optimal robust decision and the worst-case distribution, which is valuable in practical settings such as generating stress-test scenarios or designing robust learning algorithms. We propose both double-loop and single-loop sampling-based algorithms with theoretical guarantees for solving this bilevel program. Finally, we demonstrate the effectiveness of our approach through a numerical study on adversarial classification.
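For concreteness, here is a minimal sketch of the Sinkhorn DRO problem in its min-sup form; the notation ($f_\theta$ for the loss, $\widehat{P}$ for the nominal distribution, $\rho$ for the radius, $W_\epsilon$ for the Sinkhorn discrepancy) is assumed for illustration rather than taken verbatim from the paper. A runnable sampling sketch appears after the related-papers list below.

```latex
% Minimal sketch of Sinkhorn DRO (notation assumed, not verbatim from the paper).
% W_eps is the entropy-regularized (Sinkhorn) discrepancy to the nominal P-hat.
\min_{\theta}\;
\sup_{P \,:\, W_{\epsilon}(P,\widehat{P}) \le \rho}\;
\mathbb{E}_{z \sim P}\!\left[ f_{\theta}(z) \right]
% Primal (bilevel) view: the outer level chooses the decision theta, while the
% inner, infinite-dimensional level maximizes over probability measures P.
```

The bilevel reading makes the worst-case distribution an explicit object of the inner problem, which is what allows sampling-based algorithms to return it alongside the robust decision.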
Related papers
- Contextual Distributionally Robust Optimization with Causal and Continuous Structure: An Interpretable and Tractable Approach [2.8445258546547625]
We introduce a framework for contextual distributionally robust optimization (DRO). We first define the causal Sinkhorn discrepancy (CSD), an entropy-regularized causal Wasserstein distance, and derive a contextual DRO model with a CSD-based ambiguity set, termed Causal Sinkhorn DRO (Causal-SDRO). We also propose the Soft Regression Forest (SRF) decision rule, which approximates optimal policies within arbitrary measurable function spaces.
arXiv Detail & Related papers (2026-01-16T06:18:22Z) - Nested Stochastic Algorithm for Generalized Sinkhorn distance-Regularized Distributionally Robust Optimization [4.989068568135242]
Distributionally robust optimization (DRO) is a powerful technique for robustifying models against data distribution shift. This paper aims to solve regularized nonconvex DRO problems, where the ambiguity is modeled by a so-called generalized Sinkhorn distance.
arXiv Detail & Related papers (2025-03-29T01:01:02Z) - Learning from Noisy Labels via Conditional Distributionally Robust Optimization [5.85767711644773]
Crowdsourcing has emerged as a practical solution for labeling large datasets, but it presents a significant challenge for learning accurate models due to noisy labels from annotators with varying levels of expertise.
arXiv Detail & Related papers (2024-11-26T05:03:26Z) - Drago: Primal-Dual Coupled Variance Reduction for Faster Distributionally Robust Optimization [12.311794669976047]
We present Drago, a primal-dual algorithm that combines cyclic and randomized components with a carefully regularized primal update to achieve dual variance reduction. Owing to its design, Drago enjoys a state-of-the-art linear convergence rate on strongly convex-strongly concave DRO problems, with a fine-grained dependency on primal and dual condition numbers.
arXiv Detail & Related papers (2024-03-16T02:06:14Z) - A Near-Optimal Single-Loop Stochastic Algorithm for Convex Finite-Sum Coupled Compositional Optimization [53.14532968909759]
We introduce an efficient single-loop primal-dual block-coordinate algorithm called ALEXR. We establish the convergence rates of ALEXR in both convex and strongly convex cases, under both smoothness and non-smoothness conditions. Experimental results on GDRO and partial Area Under the ROC Curve maximization, two instances of cFCCO, demonstrate the promising performance of our algorithm.
arXiv Detail & Related papers (2023-12-04T19:00:07Z) - Stochastic Constrained DRO with a Complexity Independent of Sample Size [38.56406595022129]
We propose and analyze algorithms that apply to both nonconvex and convex losses for solving the Kullback-Leibler divergence constrained DRO problem.
We establish a nearly optimal complexity for finding an $\epsilon$-stationary solution for nonconvex losses, and an optimal batch complexity for finding an $\epsilon$-optimal solution for convex losses.
arXiv Detail & Related papers (2022-10-11T19:11:19Z) - Exploring the Algorithm-Dependent Generalization of AUPRC Optimization with List Stability [107.65337427333064]
Optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem for machine learning.
In this work, we present the first trial in the algorithm-dependent generalization analysis of AUPRC optimization.
Experiments on three image retrieval datasets speak to the effectiveness and soundness of our framework.
arXiv Detail & Related papers (2022-09-27T09:06:37Z) - Distributionally Robust Bayesian Optimization with $\varphi$-divergences [45.48814080654241]
We consider robustness against data shift in the sense of $\varphi$-divergences, which subsume many popular choices, such as the total variation and the Kullback-Leibler divergence.
We show that the DRO-BO problem in this setting is equivalent to a finite-dimensional optimization problem which, even in the continuous context setting, can be easily implemented with provable sublinear regret bounds.
arXiv Detail & Related papers (2022-03-04T04:34:52Z) - When AUC meets DRO: Optimizing Partial AUC for Deep Learning with Non-Convex Convergence Guarantee [51.527543027813344]
We propose systematic and efficient gradient-based methods for both one-way and two-way partial AUC (pAUC) maximization.
For both one-way and two-way pAUC, we propose two algorithms and prove their convergence for optimizing their respective formulations.
arXiv Detail & Related papers (2022-03-01T01:59:53Z) - Non-convex Distributionally Robust Optimization: Non-asymptotic Analysis [16.499651513178012]
Distributionally robust optimization (DRO) is a widely-used approach to learn models that are robust against distribution shift.
We provide non-asymptotic convergence guarantees, even though the objective function is possibly nonconvex and nonsmooth, via an analysis of normalized gradient descent with momentum.
arXiv Detail & Related papers (2021-10-24T14:56:38Z) - Sinkhorn Distributionally Robust Optimization [18.46110328123008]
Sinkhorn distance is a variant of the Wasserstein distance based on entropic regularization. We derive a convex programming dual reformulation for general nominal distributions, transport costs, and loss functions.
arXiv Detail & Related papers (2021-09-24T12:40:48Z) - High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide a small objective residual with high probability.
Existing methods for non-smooth stochastic convex optimization have complexity bounds with an undesirable dependence on the confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - Robust, Accurate Stochastic Optimization for Variational Inference [68.83746081733464]
We show that common stochastic optimization methods lead to poor variational approximations if the problem is moderately large.
Motivated by these findings, we develop a more robust and accurate optimization framework by viewing the underlying algorithm as producing a Markov chain.
arXiv Detail & Related papers (2020-09-01T19:12:11Z) - Distributionally Robust Bayesian Optimization [121.71766171427433]
We present a novel distributionally robust Bayesian optimization algorithm (DRBO) for zeroth-order, noisy optimization.
Our algorithm provably obtains sub-linear robust regret in various settings.
We demonstrate the robust performance of our method on both synthetic and real-world benchmarks.
arXiv Detail & Related papers (2020-02-20T22:04:30Z)
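As a companion to the formulation sketched after the abstract above, here is a minimal, self-contained sketch of the sampling idea behind an entropic (Sinkhorn) ambiguity set: under entropic regularization, the worst-case distribution has a Gibbs-type density with respect to a reference kernel, so it can be approximated by drawing candidate scenarios near each data point and reweighting them by an exponentiated loss. All names here (`loss`, `eta`, `sigma`, the Gaussian reference kernel) are illustrative assumptions, not the paper's actual algorithm or API.

```python
import numpy as np

def worst_case_weights(x, loss, eta=1.0, sigma=1.0, m=256, rng=None):
    """Approximate the Gibbs-type worst-case distribution around one point x.

    Candidate scenarios z are drawn from a Gaussian reference kernel centered
    at x and reweighted in proportion to exp(loss(z) / eta), mimicking the
    entropic-regularization structure of a Sinkhorn ambiguity set.
    """
    rng = np.random.default_rng() if rng is None else rng
    z = x + sigma * rng.standard_normal((m, x.shape[-1]))  # candidate scenarios
    logw = np.array([loss(zi) for zi in z]) / eta          # exponent of Gibbs weights
    logw -= logw.max()                                     # stabilize before exponentiating
    w = np.exp(logw)
    return z, w / w.sum()                                  # self-normalized weights

def robust_loss_estimate(xs, loss, **kwargs):
    """Monte Carlo estimate of the robust objective: the average, over data
    points, of the reweighted loss under the approximate worst-case samples."""
    vals = []
    for x in xs:
        z, w = worst_case_weights(x, loss, **kwargs)
        vals.append(float(w @ np.array([loss(zi) for zi in z])))
    return float(np.mean(vals))

if __name__ == "__main__":
    # Toy usage: quadratic loss; the robust estimate exceeds the nominal loss
    # because weight shifts toward high-loss scenarios.
    rng = np.random.default_rng(0)
    data = rng.standard_normal((32, 2))
    quad = lambda z: float(z @ z)
    print("robust loss estimate:", robust_loss_estimate(data, quad, eta=0.5, rng=rng))
```

The self-normalized reweighting keeps the sketch simple; the paper's double-loop and single-loop algorithms would instead interleave such inner sampling steps with updates of the outer decision variable.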