Topology-aware Robust Optimization for Out-of-distribution
Generalization
- URL: http://arxiv.org/abs/2307.13943v1
- Date: Wed, 26 Jul 2023 03:48:37 GMT
- Title: Topology-aware Robust Optimization for Out-of-distribution
Generalization
- Authors: Fengchun Qiao, Xi Peng
- Abstract summary: Out-of-distribution (OOD) generalization is a challenging machine learning problem yet highly desirable in many high-stakes applications.
We propose topology-aware robust optimization (TRO) that seamlessly integrates distributional topology in a principled optimization framework.
We theoretically demonstrate the effectiveness of our approach and empirically show that it significantly outperforms the state of the art in a wide range of tasks, including classification, regression, and semantic segmentation.
- Score: 18.436575017126323
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Out-of-distribution (OOD) generalization is a challenging machine learning
problem yet highly desirable in many high-stakes applications. Existing methods
suffer from overly pessimistic modeling with low generalization confidence. As
generalizing to arbitrary test distributions is impossible, we hypothesize that
further structure on the topology of distributions is crucial in developing
strong OOD resilience. To this end, we propose topology-aware robust
optimization (TRO) that seamlessly integrates distributional topology in a
principled optimization framework. More specifically, TRO solves two
optimization objectives: (1) Topology Learning which explores data manifold to
uncover the distributional topology; (2) Learning on Topology which exploits
the topology to constrain robust optimization for tightly-bounded
generalization risks. We theoretically demonstrate the effectiveness of our
approach and empirically show that it significantly outperforms the state of
the art in a wide range of tasks, including classification, regression, and
semantic segmentation. Moreover, we empirically find that the data-driven
distributional topology is consistent with domain knowledge, enhancing the
explainability of our approach.
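As a rough illustration of the two objectives described in the abstract, the sketch below pairs a data-driven topology over training domains with a topology-constrained re-weighting of per-domain losses. It is only a minimal reading of the abstract, not the paper's algorithm: the RBF affinity graph, the group-DRO-style weights, and names such as `topology_learning` and `learning_on_topology` are illustrative assumptions.

```python
import numpy as np

def topology_learning(domain_features, sigma=1.0):
    """Illustrative stand-in for objective (1): estimate a topology over training
    domains from their feature statistics (here, an RBF affinity between domain
    centroids; the paper's manifold-based construction differs)."""
    means = np.stack([f.mean(axis=0) for f in domain_features])   # one centroid per domain
    d2 = ((means[:, None, :] - means[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    adjacency = np.exp(-d2 / (2 * sigma ** 2))                    # soft affinity graph
    np.fill_diagonal(adjacency, 0.0)
    return adjacency

def learning_on_topology(group_losses, adjacency, eta=1.0):
    """Illustrative stand-in for objective (2): up-weight high-loss domains as in
    group DRO, but smooth the weights over the learned topology so structurally
    close domains receive similar emphasis."""
    w = np.exp(eta * group_losses)                       # worst-case-style up-weighting
    w = w / w.sum()
    neighbor_avg = adjacency @ w / (adjacency.sum(1) + 1e-12)
    w = 0.5 * w + 0.5 * neighbor_avg                     # topology constraint (illustrative mixing)
    return w / w.sum()

# Toy usage: three training domains with synthetic features and per-domain losses.
rng = np.random.default_rng(0)
domains = [rng.normal(loc=i, size=(100, 8)) for i in range(3)]
A = topology_learning(domains)
weights = learning_on_topology(np.array([0.8, 1.2, 0.5]), A)
print(weights)  # domain weights used to form the robust training objective
```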
Related papers
- A Practical Theory of Generalization in Selectivity Learning [8.268822578361824]
Query-driven machine learning models have emerged as a promising estimation technique for query selectivities.
We bridge gaps in state-of-the-art (SOTA) theory based on the Probably Approximately Correct (PAC) learning framework.
We show that selectivity predictors induced by signed measures are learnable, which relaxes the reliance on probability measures in SOTA theory.
arXiv Detail & Related papers (2024-09-11T05:10:32Z)
- Topological Generalization Bounds for Discrete-Time Stochastic Optimization Algorithms [15.473123662393169]
Deep neural networks (DNNs) show remarkable generalization properties.
The source of these capabilities remains elusive, defying the established statistical learning theory.
Recent studies have revealed that properties of training trajectories can be indicative of generalization.
arXiv Detail & Related papers (2024-07-11T17:56:03Z)
- The Price of Implicit Bias in Adversarially Robust Generalization [25.944485657150146]
We study the implicit bias of optimization in robust empirical risk minimization (robust ERM).
We show that the implicit bias of optimization in robust ERM can significantly affect the robustness of the model.
arXiv Detail & Related papers (2024-06-07T14:44:37Z)
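For readers unfamiliar with the robust ERM objective studied in the entry above, here is a minimal, generic adversarial-training step in which the inner maximization is approximated with projected gradient descent. It only illustrates what robust ERM optimizes; it is not the paper's analysis, and the hyperparameters (`eps`, `alpha`, `steps`) are placeholder values.

```python
import torch

def robust_erm_step(model, loss_fn, x, y, optimizer, eps=0.1, alpha=0.02, steps=5):
    """One step of robust empirical risk minimization: the inner maximization over
    an l_inf ball of radius eps is approximated with projected gradient descent (PGD).
    Generic sketch of the robust ERM objective the paper analyzes, not its method."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):                               # inner max over perturbations
        loss = loss_fn(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()
            delta.clamp_(-eps, eps)                      # project back onto the l_inf ball
        delta.grad.zero_()
    optimizer.zero_grad()
    loss_fn(model(x + delta.detach()), y).backward()     # outer min over model parameters
    optimizer.step()
```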
- Consciousness-Inspired Spatio-Temporal Abstractions for Better Generalization in Reinforcement Learning [83.41487567765871]
Skipper is a model-based reinforcement learning framework.
It automatically decomposes the given task into smaller, more manageable subtasks.
It enables sparse decision-making and focused abstractions on the relevant parts of the environment.
arXiv Detail & Related papers (2023-09-30T02:25:18Z)
- Federated Learning as Variational Inference: A Scalable Expectation Propagation Approach [66.9033666087719]
This paper extends the inference view and describes a variational inference formulation of federated learning.
We apply FedEP on standard federated learning benchmarks and find that it outperforms strong baselines in terms of both convergence speed and accuracy.
arXiv Detail & Related papers (2023-02-08T17:58:11Z)
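To make the expectation-propagation view in the entry above concrete, here is a heavily simplified sketch of one EP-style server round with Gaussian factors in natural-parameter form. The cavity/site bookkeeping follows standard EP; the `client_refine` callback, which stands in for each client's local approximate inference, is an assumption and not FedEP's actual procedure.

```python
import numpy as np

def fedep_round(global_nat, client_nats, client_refine):
    """One synchronous EP-style round: each client removes its own factor from the
    global approximation (cavity), refines it locally, and returns an updated site
    factor; the server re-aggregates all sites with the prior."""
    new_client_nats = []
    for k, site in enumerate(client_nats):
        cavity = global_nat - site                # remove client k's contribution
        refined = client_refine(k, cavity)        # local approximate posterior given the cavity
        new_client_nats.append(refined - cavity)  # updated site factor
    prior_nat = global_nat - sum(client_nats)     # whatever is not attributed to clients (prior)
    return prior_nat + sum(new_client_nats), new_client_nats

# Toy usage: 1-D Gaussian factors; nat = np.array([precision, precision * mean]).
data = {0: 0.2, 1: 1.0, 2: 1.8}                   # each client's observed value (illustrative)
def client_refine(k, cavity):
    # Illustrative local step: combine the cavity with a unit-precision observation.
    return cavity + np.array([1.0, 1.0 * data[k]])

prior = np.array([1.0, 0.0])                      # N(0, 1) prior in natural parameters
sites = [np.zeros(2) for _ in data]               # client factors start flat
global_nat = prior + sum(sites)
for _ in range(3):                                # a few server rounds
    global_nat, sites = fedep_round(global_nat, sites, client_refine)
print(global_nat)                                 # [precision, precision * mean] of the global approximation
```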
- Distributionally Robust Fair Principal Components via Geodesic Descents [16.440434996206623]
In consequential domains such as college admission, healthcare and credit approval, it is imperative to take into account emerging criteria such as the fairness and the robustness of the learned projection.
We propose a distributionally robust optimization problem for principal component analysis which internalizes a fairness criterion in the objective function.
Our experimental results on real-world datasets show the merits of our proposed method over state-of-the-art baselines.
arXiv Detail & Related papers (2022-02-07T11:08:13Z)
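The entry above motivates folding fairness into the PCA objective. The toy sketch below only exposes the problem it addresses: it fits ordinary PCA and reports per-group reconstruction error, the quantity a fair or distributionally robust variant would control (for instance by optimizing against the worst group). Function and variable names are illustrative; nothing here reproduces the paper's geodesic-descent method.

```python
import numpy as np

def group_reconstruction_errors(X, groups, k=2):
    """Fit plain PCA on pooled data, then report mean reconstruction error per
    subgroup. A fairness-aware projection would shrink the gap between these
    numbers; this sketch only measures the disparity."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k].T                                   # top-k principal directions
    residual = Xc - Xc @ V @ V.T
    return {int(g): float((residual[groups == g] ** 2).sum(axis=1).mean())
            for g in np.unique(groups)}

# Toy usage: two groups with different covariance structure.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(size=(200, 5)),
               rng.normal(scale=[3, 1, 1, 1, 1], size=(200, 5))])
groups = np.array([0] * 200 + [1] * 200)
print(group_reconstruction_errors(X, groups))      # group 1 is reconstructed worse
```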
- Towards Principled Disentanglement for Domain Generalization [90.9891372499545]
A fundamental challenge for machine learning models is generalizing to out-of-distribution (OOD) data.
We first formalize the OOD generalization problem as constrained optimization, called Disentanglement-constrained Domain Generalization (DDG).
Based on this formulation, we propose a primal-dual algorithm for joint representation disentanglement and domain generalization.
arXiv Detail & Related papers (2021-11-27T07:36:32Z)
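Below is a generic primal-dual loop of the kind referenced in the entry above, written for a constrained formulation "minimize task risk subject to a disentanglement constraint": gradient descent on the model, projected gradient ascent on the multiplier. The `constraint_fn` placeholder and the specific update rule are assumptions for illustration, not DDG's published algorithm.

```python
import torch

def primal_dual_step(model, opt, lam, task_loss_fn, constraint_fn, batch,
                     eps=0.1, dual_lr=0.01):
    """One generic primal-dual update for: min task risk  s.t.  constraint <= eps.
    `constraint_fn` is a placeholder for a differentiable disentanglement measure."""
    task_loss = task_loss_fn(model, batch)
    constraint = constraint_fn(model, batch)
    lagrangian = task_loss + lam * (constraint - eps)    # Lagrangian of the constrained problem
    opt.zero_grad()
    lagrangian.backward()
    opt.step()                                           # primal step: descend in model parameters
    lam = max(0.0, lam + dual_lr * (float(constraint.detach()) - eps))  # dual step: ascend, keep lam >= 0
    return lam
```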
- Adversarial Robustness with Semi-Infinite Constrained Learning [177.42714838799924]
The fragility of deep learning to input perturbations has raised serious questions about its use in safety-critical domains.
We propose a hybrid Langevin Monte Carlo training approach to mitigate this issue.
We show that our approach can mitigate the trade-off between state-of-the-art performance and robustness.
arXiv Detail & Related papers (2021-10-29T13:30:42Z)
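Where standard adversarial training finds perturbations with deterministic PGD (as in the robust ERM sketch earlier), a Langevin-style sampler adds noise to the inner updates so perturbations are drawn from a distribution rather than a single worst case. The snippet below captures only that generic idea; the paper's hybrid Langevin Monte Carlo procedure and its semi-infinite constrained formulation are more involved, and the step sizes are placeholder values.

```python
import torch

def langevin_perturbations(model, loss_fn, x, y, eps=0.1, step=0.01, noise=0.01, iters=10):
    """Sample adversarial perturbations with Langevin-style dynamics: gradient ascent
    on the loss plus Gaussian noise, projected onto an l_inf ball of radius eps."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(iters):
        loss = loss_fn(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta += step * grad + noise * torch.randn_like(delta)  # ascent + exploration noise
            delta.clamp_(-eps, eps)                                 # keep the perturbation feasible
    return delta.detach()
```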
- Complexity-Free Generalization via Distributionally Robust Optimization [4.313143197674466]
We present an alternate route to obtain generalization bounds on the solution from distributionally robust optimization (DRO).
Our DRO bounds depend on the ambiguity set geometry and its compatibility with the true loss function.
Notably, when using maximum mean discrepancy as a DRO distance metric, our analysis implies, to the best of our knowledge, the first generalization bound in the literature that depends solely on the true loss function.
arXiv Detail & Related papers (2021-06-21T15:19:52Z)
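Since the entry above singles out maximum mean discrepancy (MMD) as the DRO distance, here is a small, self-contained plug-in estimate of squared MMD with an RBF kernel, just to make the quantity concrete. The bandwidth choice and the biased (V-statistic) estimator are illustrative simplifications; the paper's bound is analytical rather than computed this way.

```python
import torch

def mmd_rbf(x, y, sigma=1.0):
    """Plug-in (biased) estimate of squared MMD between two samples with an RBF kernel."""
    def k(a, b):
        d2 = torch.cdist(a, b) ** 2               # pairwise squared Euclidean distances
        return torch.exp(-d2 / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

# Toy usage: MMD between two Gaussian samples with shifted means.
x = torch.randn(256, 4)
y = torch.randn(256, 4) + 0.5
print(mmd_rbf(x, y).item())
```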
- Counterfactual Maximum Likelihood Estimation for Training Deep Networks [83.44219640437657]
Deep learning models are prone to learning spurious correlations that should not be learned as predictive clues.
We propose a causality-based training framework to reduce the spurious correlations caused by observable confounders.
We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning.
arXiv Detail & Related papers (2021-06-07T17:47:16Z)
- Generalization Properties of Optimal Transport GANs with Latent Distribution Learning [52.25145141639159]
We study how the interplay between the latent distribution and the complexity of the pushforward map affects performance.
Motivated by our analysis, we advocate learning the latent distribution as well as the pushforward map within the GAN paradigm.
arXiv Detail & Related papers (2020-07-29T07:31:33Z)
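The last entry advocates learning the latent distribution together with the pushforward (generator) map. A minimal way to express that idea is to give the latent prior trainable parameters and optimize them with the generator under the same loss, as sketched below; the class name, the diagonal-Gaussian choice, and the single shared optimizer are assumptions, and the optimal-transport objective itself is not shown.

```python
import torch
import torch.nn as nn

class LearnedLatent(nn.Module):
    """Latent prior as a diagonal Gaussian with trainable mean and log-std, so the
    latent distribution is optimized jointly with the pushforward (generator) map."""
    def __init__(self, dim):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(dim))
        self.log_std = nn.Parameter(torch.zeros(dim))

    def sample(self, n):
        eps = torch.randn(n, self.mu.shape[0])
        return self.mu + eps * self.log_std.exp()   # reparameterized: gradients flow to mu, log_std

# Usage sketch: latent samples feed the generator; one optimizer covers both modules,
# so the latent parameters are updated by the same training loss as the pushforward map.
latent, generator = LearnedLatent(16), nn.Linear(16, 32)
opt = torch.optim.Adam(list(latent.parameters()) + list(generator.parameters()), lr=1e-3)
fake = generator(latent.sample(8))
```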
This list is automatically generated from the titles and abstracts of the papers on this site.