Optimal Transport Model Distributional Robustness
- URL: http://arxiv.org/abs/2306.04178v2
- Date: Wed, 1 Nov 2023 05:55:33 GMT
- Title: Optimal Transport Model Distributional Robustness
- Authors: Van-Anh Nguyen, Trung Le, Anh Tuan Bui, Thanh-Toan Do, and Dinh Phung
- Abstract summary: Previous works have mainly focused on exploiting distributional robustness in the data space.
We develop theories that enable us to learn the optimal robust center model distribution.
Our framework can be seen as a probabilistic extension of Sharpness-Aware Minimization.
- Score: 33.24747882707421
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Distributional robustness is a promising framework for training deep learning
models that are less vulnerable to adversarial examples and data distribution
shifts. Previous works have mainly focused on exploiting distributional
robustness in the data space. In this work, we explore an optimal
transport-based distributional robustness framework in model spaces.
Specifically, we search for the worst-case model distribution within a
Wasserstein ball centered at a given model distribution, i.e., the
distribution that maximizes the loss. We have
developed theories that enable us to learn the optimal robust center model
distribution. Interestingly, our developed theories allow us to flexibly
incorporate the concept of sharpness awareness into training, whether it's a
single model, ensemble models, or Bayesian Neural Networks, by considering
specific forms of the center model distribution. These forms include a Dirac
delta distribution over a single model, a uniform distribution over several
models, and a general Bayesian Neural Network. Furthermore, we demonstrate that
Sharpness-Aware Minimization (SAM) is a specific case of our framework when
using a Dirac delta distribution over a single model, while our framework can
be seen as a probabilistic extension of SAM. To validate the effectiveness of
our framework in the aforementioned settings, we conducted extensive
experiments, and the results reveal remarkable improvements compared to the
baselines.
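To make the setup above concrete, here is a schematic form of the objective in our own notation (the paper's exact formulation may differ): the outer minimization is over the center model distribution, and the inner maximization is over model distributions within a Wasserstein ball of radius rho around it.

```latex
% Schematic min-max objective (illustrative notation, not taken verbatim from the paper)
\min_{\mathbb{Q}} \;\; \max_{\widetilde{\mathbb{Q}} \,:\, \mathcal{W}(\widetilde{\mathbb{Q}}, \mathbb{Q}) \le \rho} \;\; \mathbb{E}_{\theta \sim \widetilde{\mathbb{Q}}}\big[\mathcal{L}(\theta)\big]
```

When the center distribution is a Dirac delta over a single weight vector, the inner problem collapses to finding a worst-case perturbation of the weights, which is how SAM emerges as a special case. The sketch below shows a SAM-style training step in PyTorch; it is a minimal illustration under our own assumptions (the helper name sam_step, the radius rho, and the single linearized ascent step are not taken from the paper).

```python
# Minimal SAM-style step: the Dirac-delta special case of model-space
# distributional robustness, where the inner maximization reduces to a
# worst-case weight perturbation inside a ball of radius rho.
# Illustrative sketch only, not the authors' implementation.
import torch

def sam_step(model, loss_fn, x, y, optimizer, rho=0.05):
    # 1) Gradients at the current weights.
    loss = loss_fn(model(x), y)
    loss.backward()

    # 2) Ascend to (approximately) worst-case weights within the rho-ball.
    grads = [p.grad for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([g.norm(p=2) for g in grads]), p=2)
    perturbations = {}
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)                    # w -> w + e (worst-case perturbation)
            perturbations[p] = e
    model.zero_grad()

    # 3) Gradients at the perturbed weights, then restore and update.
    loss_fn(model(x), y).backward()
    with torch.no_grad():
        for p, e in perturbations.items():
            p.sub_(e)                    # restore the original weights
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

A training loop would call sam_step(model, loss_fn, x, y, optimizer) once per mini-batch in place of the usual backward/step pair.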
Related papers
- Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers [49.97755400231656]
We present the first performance guarantees, with explicit dimensional dependencies, for general score-mismatched diffusion samplers.
We show that score mismatches result in a distributional bias between the target and sampling distributions, proportional to the accumulated mismatch between the target and training distributions.
This result can be directly applied to zero-shot conditional samplers for any conditional model, irrespective of measurement noise.
arXiv Detail & Related papers (2024-10-17T16:42:12Z)
- Constrained Diffusion Models via Dual Training [80.03953599062365]
Diffusion processes are prone to generating samples that reflect biases in a training dataset.
We develop constrained diffusion models by imposing diffusion constraints based on desired distributions.
We show that our constrained diffusion models generate new data from a mixture data distribution that achieves the optimal trade-off among objective and constraints.
arXiv Detail & Related papers (2024-08-27T14:25:42Z)
- Bridging Model-Based Optimization and Generative Modeling via Conservative Fine-Tuning of Diffusion Models [54.132297393662654]
We introduce a hybrid method that fine-tunes cutting-edge diffusion models by optimizing reward models through RL.
We demonstrate the capability of our approach to outperform the best designs in offline data, leveraging the extrapolation capabilities of reward models.
arXiv Detail & Related papers (2024-05-30T03:57:29Z)
- Latent Schrödinger Bridge Diffusion Model for Generative Learning [7.13080924844185]
We introduce a novel generative learning methodology utilizing the Schrödinger bridge diffusion model in latent space.
We develop a diffusion model within the latent space utilizing the Schrödinger bridge framework.
arXiv Detail & Related papers (2024-04-20T07:38:48Z)
- Guided Diffusion from Self-Supervised Diffusion Features [49.78673164423208]
Guidance serves as a key concept in diffusion models, yet its effectiveness is often limited by the need for extra data annotation or pretraining.
We propose a framework to extract guidance from, and specifically for, diffusion models.
arXiv Detail & Related papers (2023-12-14T11:19:11Z)
- Enhancing Robustness of Foundation Model Representations under Provenance-related Distribution Shifts [8.298173603769063]
We examine the stability of models based on foundation models under distribution shift.
We focus on confounding by provenance, a form of distribution shift that emerges in the context of multi-institutional datasets.
Results indicate that while foundation models do show some out-of-the-box robustness to confounding-by-provenance related distribution shifts, this can be improved through adjustment.
arXiv Detail & Related papers (2023-12-09T02:02:45Z)
- Distributionally Robust Post-hoc Classifiers under Prior Shifts [31.237674771958165]
We investigate the problem of training models that are robust to shifts caused by changes in the distribution of class-priors or group-priors.
We present an extremely lightweight post-hoc approach that performs scaling adjustments to predictions from a pre-trained model.
arXiv Detail & Related papers (2023-09-16T00:54:57Z)
- Siamese Neural Network with Joint Bayesian Model Structure for Speaker Verification [54.96267179988487]
We propose a novel Siamese neural network (SiamNN) for speaker verification.
The joint distribution of samples is first formulated based on a joint Bayesian (JB) generative model.
We further train the model parameters on pair-wise samples as a binary discrimination task for speaker verification.
arXiv Detail & Related papers (2021-04-07T09:17:29Z)
- Achieving Efficiency in Black Box Simulation of Distribution Tails with Self-structuring Importance Samplers [1.6114012813668934]
The paper presents a novel Importance Sampling (IS) scheme for estimating the distribution of performance measures modeled with a rich set of tools such as linear programs, integer linear programs, piecewise linear/quadratic objectives, feature maps specified with deep neural networks, etc.
arXiv Detail & Related papers (2021-02-14T03:37:22Z)
- Generalization Properties of Optimal Transport GANs with Latent Distribution Learning [52.25145141639159]
We study how the interplay between the latent distribution and the complexity of the pushforward map affects performance.
Motivated by our analysis, we advocate learning the latent distribution as well as the pushforward map within the GAN paradigm.
arXiv Detail & Related papers (2020-07-29T07:31:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.