Distributionally Robust Learning
- URL: http://arxiv.org/abs/2108.08993v1
- Date: Fri, 20 Aug 2021 04:14:18 GMT
- Title: Distributionally Robust Learning
- Authors: Ruidi Chen, Ioannis Ch. Paschalidis
- Abstract summary: This book develops a comprehensive statistical learning framework that is robust to (distributional) perturbations in the data.
A tractable DRO relaxation is derived for each problem, establishing a connection between robustness and regularization.
Beyond theory, we include numerical experiments and case studies using synthetic and real data.
- Score: 11.916893752969429
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This monograph develops a comprehensive statistical learning framework that
is robust to (distributional) perturbations in the data using Distributionally
Robust Optimization (DRO) under the Wasserstein metric. Beginning with
fundamental properties of the Wasserstein metric and the DRO formulation, we
explore duality to arrive at tractable formulations and develop finite-sample,
as well as asymptotic, performance guarantees. We consider a series of learning
problems, including (i) distributionally robust linear regression; (ii)
distributionally robust regression with group structure in the predictors;
(iii) distributionally robust multi-output regression and multiclass
classification, (iv) optimal decision making that combines distributionally
robust regression with nearest-neighbor estimation; (v) distributionally robust
semi-supervised learning, and (vi) distributionally robust reinforcement
learning. For each problem, a tractable DRO relaxation is derived, establishing
a connection between robustness and regularization and yielding bounds on the
prediction and estimation errors of the solution. Beyond theory,
we include numerical experiments and case studies using synthetic and real
data. The real data experiments are all associated with various health
informatics problems, an application area which provided the initial impetus
for this work.
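The connection between robustness and regularization that runs through the monograph can be made concrete for linear regression with absolute loss: the worst-case expected loss over a Wasserstein ball around the empirical distribution reduces to the empirical loss plus a norm penalty on the coefficient vector. The sketch below solves only that reduced form; it assumes an ell-infinity transport cost (so the penalty uses the ell-1 norm), a hand-picked radius epsilon, and synthetic data, and it is not code from the monograph.

```python
# Minimal sketch of Wasserstein-DRO linear regression via its regularized
# reformulation: min_beta (1/N) sum_i |y_i - x_i' beta| + epsilon * ||(-beta, 1)||_1.
# Assumptions (illustrative, not from the monograph): ell_inf transport cost,
# epsilon = 0.1, synthetic Gaussian data.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
beta_true = rng.normal(size=d)
y = X @ beta_true + 0.1 * rng.normal(size=n)

epsilon = 0.1  # radius of the Wasserstein ambiguity ball (hypothetical choice)

def dro_objective(beta):
    empirical_loss = np.mean(np.abs(y - X @ beta))   # (1/N) sum_i |y_i - x_i' beta|
    penalty = epsilon * (np.abs(beta).sum() + 1.0)   # epsilon * ||(-beta, 1)||_1
    return empirical_loss + penalty

beta_dro = minimize(dro_objective, x0=np.zeros(d), method="Nelder-Mead",
                    options={"maxiter": 20000, "xatol": 1e-6, "fatol": 1e-6}).x
print("DRO coefficient estimate:", np.round(beta_dro, 3))
```

Increasing epsilon enlarges the ambiguity ball and shrinks the coefficients more aggressively, which is the regularization effect that the robustness-regularization connection formalizes.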
Related papers
- Distributionally Robust Optimization with Adversarial Data Contamination [36.409282287280185]
We focus on optimizing Wasserstein-1 DRO objectives for generalized linear models with convex Lipschitz loss functions.
Our primary contribution lies in a novel modeling framework that integrates robustness against training data contamination with robustness against distributional shifts.
This work establishes the first rigorous guarantees, supported by efficient computation, for learning under the dual challenges of data contamination and distributional shifts.
arXiv Detail & Related papers (2025-07-14T18:34:10Z) - Statistical Inference for Conditional Group Distributionally Robust Optimization with Cross-Entropy Loss [9.054486124506521]
We study multi-source unsupervised domain adaptation, where labeled data are drawn from multiple source domains and only unlabeled data are available from the target domain.
We propose a novel Conditional Group Distributionally Robust Optimization (CG-DRO) framework that learns a classifier by minimizing the worst-case cross-entropy loss over convex combinations of the conditional outcome distributions from the sources.
We establish fast statistical convergence rates for the estimator by constructing two surrogate minimax optimization problems that serve as theoretical bridges.
arXiv Detail & Related papers (2025-07-14T04:21:23Z) - Distributionally and Adversarially Robust Logistic Regression via Intersecting Wasserstein Balls [8.720733751119994]
Adversarially robust optimization (ARO) has become the de facto standard for training models to defend against adversarial attacks during testing.
Despite their robustness, these models often suffer from severe overfitting.
We propose two approaches to replace the empirical distribution in training with: (i) a worst-case distribution within an ambiguity set; or (ii) a mixture of the empirical distribution with one derived from an auxiliary dataset.
arXiv Detail & Related papers (2024-07-18T15:59:37Z) - The Risk of Federated Learning to Skew Fine-Tuning Features and
Underperform Out-of-Distribution Robustness [50.52507648690234]
Federated learning has the risk of skewing fine-tuning features and compromising the robustness of the model.
We introduce three robustness indicators and conduct experiments across diverse robust datasets.
Our approach markedly enhances the robustness across diverse scenarios, encompassing various parameter-efficient fine-tuning methods.
arXiv Detail & Related papers (2024-01-25T09:18:51Z) - Distributionally Robust Model-based Reinforcement Learning with Large
State Spaces [55.14361269378122]
Three major challenges in reinforcement learning are complex dynamical systems with large state spaces, costly data acquisition, and the deviation of real-world dynamics from the training environment at deployment.
We study distributionally robust Markov decision processes with continuous state spaces under the widely used Kullback-Leibler, chi-square, and total variation uncertainty sets.
We propose a model-based approach that utilizes Gaussian Processes and the maximum variance reduction algorithm to efficiently learn multi-output nominal transition dynamics.
arXiv Detail & Related papers (2023-09-05T13:42:11Z) - Structured Radial Basis Function Network: Modelling Diversity for
Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important when forecasting nonstationary processes or complex mixtures of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate the resulting tessellation and approximate the multiple-hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z) - Learning Against Distributional Uncertainty: On the Trade-off Between
Robustness and Specificity [24.874664446700272]
This paper studies a new framework that unifies the three approaches and that addresses the two challenges mentioned above.
The asymptotic properties (e.g., consistency and asymptotic normality), non-asymptotic properties (e.g., unbiasedness and error bounds), and a Monte Carlo-based solution method of the proposed model are studied.
arXiv Detail & Related papers (2023-01-31T11:33:18Z) - Robust Direct Learning for Causal Data Fusion [14.462235940634969]
We provide a framework for integrating multi-source data that separates the treatment effect from other nuisance functions.
We also propose a causal information-aware weighting function motivated by theoretical insights from the semiparametric efficiency theory.
arXiv Detail & Related papers (2022-11-01T03:33:22Z) - RIGID: Robust Linear Regression with Missing Data [7.638042073679073]
We present a robust framework to perform linear regression with missing entries in the features.
We show that the proposed formulation, which naturally takes into account the dependency between different variables, reduces to a convex program.
Beyond a detailed analysis of the formulation, we examine the behavior of the proposed framework and present technical discussions.
arXiv Detail & Related papers (2022-05-26T21:10:17Z) - Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma
Distributions [91.63716984911278]
We introduce a novel Mixture of Normal-Inverse Gamma distributions (MoNIG) algorithm, which efficiently estimates uncertainty in principle for adaptive integration of different modalities and produces a trustworthy regression result.
Experimental results on both synthetic and different real-world data demonstrate the effectiveness and trustworthiness of our method on various multimodal regression tasks.
arXiv Detail & Related papers (2021-11-11T14:28:12Z) - Complexity-Free Generalization via Distributionally Robust Optimization [4.313143197674466]
We present an alternate route to obtaining generalization bounds on the solution from distributionally robust optimization (DRO).
Our DRO bounds depend on the ambiguity set geometry and its compatibility with the true loss function.
Notably, when using maximum mean discrepancy as a DRO distance metric, our analysis implies, to the best of our knowledge, the first generalization bound in the literature that depends solely on the true loss function (a short MMD sketch appears after this list).
arXiv Detail & Related papers (2021-06-21T15:19:52Z) - Residuals-based distributionally robust optimization with covariate
information [0.0]
We consider data-driven approaches that integrate a machine learning prediction model within distributionally robust optimization (DRO).
Our framework is flexible in the sense that it can accommodate a variety of learning setups and DRO ambiguity sets.
arXiv Detail & Related papers (2020-12-02T11:21:34Z) - Learning while Respecting Privacy and Robustness to Distributional
Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
The proposed algorithms offer robustness with little additional computational overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z) - Distributional Robustness and Regularization in Reinforcement Learning [62.23012916708608]
We introduce a new regularizer for empirical value functions and show that it lower bounds the Wasserstein distributionally robust value function.
It suggests using regularization as a practical tool for dealing with external uncertainty in reinforcement learning.
arXiv Detail & Related papers (2020-03-05T19:56:23Z)
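For the "Complexity-Free Generalization via Distributionally Robust Optimization" entry above, the distance that defines the ambiguity set is the maximum mean discrepancy (MMD). Below is a minimal, self-contained sketch of the biased empirical MMD^2 with a Gaussian kernel; the kernel choice, bandwidth, and synthetic samples are illustrative assumptions, not details taken from that paper.

```python
# Biased (V-statistic) estimate of MMD^2 between samples X ~ P and Y ~ Q,
# using a Gaussian kernel. The bandwidth value is an illustrative assumption.
import numpy as np

def gaussian_kernel(A, B, bandwidth=1.0):
    # k(a, b) = exp(-||a - b||^2 / (2 * bandwidth^2)) for every pair (a, b)
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def mmd_squared(X, Y, bandwidth=1.0):
    # MMD^2(P, Q) ~= mean k(X, X) - 2 * mean k(X, Y) + mean k(Y, Y)
    return (gaussian_kernel(X, X, bandwidth).mean()
            - 2.0 * gaussian_kernel(X, Y, bandwidth).mean()
            + gaussian_kernel(Y, Y, bandwidth).mean())

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(500, 2))   # sample from a reference distribution P
Y = rng.normal(0.5, 1.0, size=(500, 2))   # sample from a shifted distribution Q
print("empirical MMD^2:", mmd_squared(X, Y))
```

In an MMD-based DRO formulation, a ball of this distance around the empirical distribution plays the role that the Wasserstein ball plays in the monograph above.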
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.