Deep Stable Learning for Out-Of-Distribution Generalization
- URL: http://arxiv.org/abs/2104.07876v1
- Date: Fri, 16 Apr 2021 03:54:21 GMT
- Title: Deep Stable Learning for Out-Of-Distribution Generalization
- Authors: Xingxuan Zhang, Peng Cui, Renzhe Xu, Linjun Zhou, Yue He, Zheyan Shen
- Abstract summary: Approaches based on deep neural networks have achieved striking performance when testing data and training data share similar distribution.
Eliminating the impact of distribution shifts between training and testing data is crucial for building performance-promising deep models.
We propose to address this problem by removing the dependencies between features via learning weights for training samples.
- Score: 27.437046504902938
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Approaches based on deep neural networks have achieved striking performance
when testing data and training data share a similar distribution, but can
significantly fail otherwise. Therefore, eliminating the impact of distribution
shifts between training and testing data is crucial for building
performance-promising deep models. Conventional methods assume either the known
heterogeneity of training data (e.g. domain labels) or the approximately equal
capacities of different domains. In this paper, we consider a more challenging
case where neither of the above assumptions holds. We propose to address this
problem by removing the dependencies between features via learning weights for
training samples, which helps deep models get rid of spurious correlations and,
in turn, concentrate more on the true connection between discriminative
features and labels. Extensive experiments on distribution generalization
benchmarks, including PACS, VLCS, MNIST-M, and NICO, clearly demonstrate the
effectiveness of our method compared with state-of-the-art counterparts.
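The re-weighting idea in the abstract can be illustrated with a minimal sketch: learn one weight per training sample so that, under the weighted distribution, features become decorrelated, then train the classifier with a weighted loss. The sketch below penalizes only linear correlation (the off-diagonal entries of a weighted covariance matrix), a simplification of the more general feature dependence the paper aims to remove; all names are illustrative and not taken from the authors' code.

```python
import torch

def learn_decorrelation_weights(features: torch.Tensor,
                                n_iters: int = 500, lr: float = 0.05) -> torch.Tensor:
    """Learn one positive weight per sample so that the weighted feature
    covariance has small off-diagonal entries (linear decorrelation only)."""
    n, _ = features.shape
    log_w = torch.zeros(n, requires_grad=True)        # unconstrained parameters
    opt = torch.optim.Adam([log_w], lr=lr)
    for _ in range(n_iters):
        w = torch.softmax(log_w, dim=0) * n           # positive weights, mean 1
        mu = (w[:, None] * features).sum(dim=0) / n   # weighted feature mean
        centered = features - mu
        cov = (w[:, None] * centered).T @ centered / n  # weighted covariance
        off_diag = cov - torch.diag(torch.diag(cov))
        loss = (off_diag ** 2).sum()                  # dependence penalty
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (torch.softmax(log_w, dim=0) * n).detach()

def weighted_classification_loss(logits: torch.Tensor, labels: torch.Tensor,
                                 sample_weights: torch.Tensor) -> torch.Tensor:
    """Cross-entropy weighted by the learned sample weights."""
    per_sample = torch.nn.functional.cross_entropy(logits, labels, reduction="none")
    return (sample_weights * per_sample).mean()
```

In a full pipeline the weights would typically be re-estimated as the learned features evolve during training, since decorrelating fixed features once is only a one-shot approximation.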
Related papers
- Diffusion Attribution Score: Evaluating Training Data Influence in Diffusion Model [22.39558434131574]
Existing data attribution methods for diffusion models typically quantify the contribution of a training sample.
We argue that directly using the diffusion loss cannot represent this contribution accurately because of how the diffusion loss is computed.
We instead measure an attribution score based on a direct comparison between predicted distributions to analyse the importance of each training sample.
arXiv Detail & Related papers (2024-10-24T10:58:17Z)
- Simple Ingredients for Offline Reinforcement Learning [86.1988266277766]
Offline reinforcement learning algorithms have proven effective on datasets highly connected to the target downstream task.
We show that existing methods struggle with diverse data: their performance considerably deteriorates as data collected for related but different tasks is simply added to the offline buffer.
We show that scale, more than algorithmic considerations, is the key factor influencing performance.
arXiv Detail & Related papers (2024-03-19T18:57:53Z)
- A Comparative Evaluation of FedAvg and Per-FedAvg Algorithms for Dirichlet Distributed Heterogeneous Data [2.5507252967536522]
We investigate Federated Learning (FL), a paradigm of machine learning that allows for decentralized model training on devices without sharing raw data.
We compare two strategies within this paradigm: Federated Averaging (FedAvg) and Personalized Federated Averaging (Per-FedAvg).
Our results provide insights into the development of more effective and efficient machine learning strategies in a decentralized setting.
arXiv Detail & Related papers (2023-09-03T21:33:15Z)
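For context on the FedAvg entry above: the FedAvg side of the comparison is a data-size-weighted average of client parameters at the server, while Per-FedAvg additionally personalizes the model per client (omitted here). The sketch below is a generic illustration, not code from that paper.

```python
from typing import Dict, List
import torch

def fedavg_aggregate(client_states: List[Dict[str, torch.Tensor]],
                     client_sizes: List[int]) -> Dict[str, torch.Tensor]:
    """FedAvg server step: average client state_dicts weighted by local data size."""
    total = float(sum(client_sizes))
    aggregated = {}
    for name in client_states[0]:
        aggregated[name] = sum(
            (n_k / total) * state[name].float()
            for state, n_k in zip(client_states, client_sizes)
        )
    return aggregated
```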
- Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers.
We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes.
We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
arXiv Detail & Related papers (2023-08-28T18:48:34Z)
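As a rough illustration of the mixing idea in the entry above (not the authors' exact procedure), synthetic minority samples can be drawn as convex combinations of a minority and a majority example, biased toward the minority side:

```python
import numpy as np

def mix_minority_majority(X_min: np.ndarray, X_maj: np.ndarray,
                          n_new: int, rng=None) -> np.ndarray:
    """Generate synthetic samples by mixing minority/majority pairs."""
    rng = np.random.default_rng() if rng is None else rng
    i = rng.integers(0, len(X_min), size=n_new)
    j = rng.integers(0, len(X_maj), size=n_new)
    # Mixing coefficient kept >= 0.5 so the synthetic point stays on the
    # minority side; in oversampling schemes like this sketch the generated
    # samples would typically be labeled as the minority class.
    lam = rng.uniform(0.5, 1.0, size=(n_new, 1))
    return lam * X_min[i] + (1.0 - lam) * X_maj[j]
```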
- Federated XGBoost on Sample-Wise Non-IID Data [8.49189353769386]
Decision tree-based models, in particular XGBoost, can handle non-IID data.
This paper investigates the effects of how Federated XGBoost is impacted by non-IID distributions.
arXiv Detail & Related papers (2022-09-03T06:14:20Z)
- Is it all a cluster game? -- Exploring Out-of-Distribution Detection based on Clustering in the Embedding Space [7.856998585396422]
It is essential for safety-critical applications of deep neural networks to determine when new inputs are significantly different from the training distribution.
We study the structure and separation of clusters in the embedding space and find that supervised contrastive learning leads to well-separated clusters.
In our analysis of different training methods, clustering strategies, distance metrics, and thresholding approaches, we observe that there is no clear winner.
arXiv Detail & Related papers (2022-03-16T11:22:23Z)
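A concrete instance of the clustering-based detection studied in the entry above: fit one centroid per class (or per k-means cluster) in the embedding space and score a new input by its distance to the nearest centroid. The Euclidean metric and percentile threshold below are only one of the combinations that paper compares; names are illustrative.

```python
import numpy as np

def fit_centroids(embeddings: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """One centroid per class in embedding space (clusters could also come from k-means)."""
    return np.stack([embeddings[labels == c].mean(axis=0) for c in np.unique(labels)])

def ood_scores(query_emb: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """Distance to the nearest centroid; larger means more likely out-of-distribution."""
    dists = np.linalg.norm(query_emb[:, None, :] - centroids[None, :, :], axis=-1)
    return dists.min(axis=1)

# Threshold chosen on held-out in-distribution data, e.g. its 95th percentile:
# is_ood = ood_scores(test_emb, centroids) > np.percentile(ood_scores(val_emb, centroids), 95)
```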
- On Modality Bias Recognition and Reduction [70.69194431713825]
We study the modality bias problem in the context of multi-modal classification.
We propose a plug-and-play loss function method, whereby the feature space for each label is adaptively learned.
Our method yields remarkable performance improvements compared with the baselines.
arXiv Detail & Related papers (2022-02-25T13:47:09Z)
- Can Active Learning Preemptively Mitigate Fairness Issues? [66.84854430781097]
Dataset bias is one of the prevailing causes of unfairness in machine learning.
We study whether models trained with uncertainty-based active learning (AL) are fairer in their decisions with respect to a protected class.
We also explore the interaction of algorithmic fairness methods such as gradient reversal (GRAD) and BALD.
arXiv Detail & Related papers (2021-04-14T14:20:22Z)
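BALD, referenced in the entry above, ranks unlabeled samples by the mutual information between their predicted label and the model parameters; with Monte Carlo dropout this reduces to the entropy of the mean prediction minus the mean predictive entropy. A generic sketch, not tied to that paper's setup:

```python
import torch

def bald_scores(model: torch.nn.Module, x: torch.Tensor, n_samples: int = 20) -> torch.Tensor:
    """BALD acquisition via MC dropout.

    score = H[ mean_t p_t(y|x) ] - mean_t H[ p_t(y|x) ]; higher = more informative.
    """
    model.train()  # keep dropout stochastic during the forward passes
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=-1) for _ in range(n_samples)])
    mean_p = probs.mean(dim=0)
    entropy_of_mean = -(mean_p * torch.log(mean_p + 1e-12)).sum(dim=-1)
    mean_entropy = -(probs * torch.log(probs + 1e-12)).sum(dim=-1).mean(dim=0)
    return entropy_of_mean - mean_entropy
```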
- On the Benefits of Invariance in Neural Networks [56.362579457990094]
We show that training with data augmentation leads to better estimates of the risk and of its gradients, and we provide a PAC-Bayes generalization bound for models trained with data augmentation.
We also show that compared to data augmentation, feature averaging reduces generalization error when used with convex losses, and tightens PAC-Bayes bounds.
arXiv Detail & Related papers (2020-05-01T02:08:58Z)
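Feature (or output) averaging, as contrasted with data augmentation in the entry above, replaces a single forward pass with an average over transformed copies of the input at prediction time. A minimal sketch, with the augmentation set left as an assumption:

```python
import torch

def averaged_prediction(model, x: torch.Tensor, augmentations) -> torch.Tensor:
    """Average the model output over a list of augmentation callables
    (e.g. identity, horizontal flip, small rotations)."""
    with torch.no_grad():
        outputs = [model(aug(x)) for aug in augmentations]
    return torch.stack(outputs).mean(dim=0)
```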
- When Relation Networks meet GANs: Relation GANs with Triplet Loss [110.7572918636599]
Training stability remains a lingering concern for generative adversarial networks (GANs).
In this paper, we explore a relation network architecture for the discriminator and design a triplet loss that yields better generalization and stability.
Experiments on benchmark datasets show that the proposed relation discriminator and new loss provide significant improvements on various vision tasks.
arXiv Detail & Related papers (2020-02-24T11:35:28Z)
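The triplet loss mentioned in the last entry enforces that an anchor embedding lies closer to a positive than to a negative by a margin; below is the standard margin-based form as a generic sketch, not the authors' exact discriminator objective.

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor: torch.Tensor, positive: torch.Tensor,
                 negative: torch.Tensor, margin: float = 1.0) -> torch.Tensor:
    """Standard margin-based triplet loss over embedding vectors."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()
```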
This list is automatically generated from the titles and abstracts of the papers on this site.