Removing Undesirable Feature Contributions Using Out-of-Distribution
Data
- URL: http://arxiv.org/abs/2101.06639v2
- Date: Wed, 3 Mar 2021 05:40:51 GMT
- Title: Removing Undesirable Feature Contributions Using Out-of-Distribution
Data
- Authors: Saehyung Lee, Changhwa Park, Hyungyu Lee, Jihun Yi, Jonghyun Lee,
Sungroh Yoon
- Abstract summary: We propose a data augmentation method to improve generalization in both adversarial and standard learning.
The proposed method can further improve the existing state-of-the-art adversarial training.
- Score: 20.437871747430826
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Several data augmentation methods deploy unlabeled-in-distribution (UID) data
to bridge the gap between the training and inference of neural networks.
However, these methods have clear limitations in terms of availability of UID
data and dependence of algorithms on pseudo-labels. Herein, we propose a data
augmentation method to improve generalization in both adversarial and standard
learning by using out-of-distribution (OOD) data that are devoid of the
abovementioned issues. We show how to improve generalization theoretically
using OOD data in each learning scenario and complement our theoretical
analysis with experiments on CIFAR-10, CIFAR-100, and a subset of ImageNet. The
results indicate that undesirable features are shared even among image data
that seem to have little correlation from a human point of view. We also
present the advantages of the proposed method through comparison with other
data augmentation methods, which can be used in the absence of UID data.
Furthermore, we demonstrate that the proposed method can further improve the
existing state-of-the-art adversarial training.
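The abstract does not state the training objective, so the following is only a minimal sketch of how OOD data can be used without pseudo-labels. It assumes a hypothetical objective in which each OOD sample is paired with a uniform target over the classes, so that features shared between OOD and in-distribution images are discouraged from contributing to any class. The names `ood_augmented_loss` and `ood_weight` are illustrative and do not come from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


def ood_augmented_loss(id_logits, id_label, ood_logits, ood_weight=1.0):
    """Hypothetical combined loss for one labeled sample and one OOD sample.

    Assumption (not confirmed by the abstract): the OOD sample is trained
    toward a uniform distribution over classes, so no pseudo-label is needed.
    """
    num_classes = len(id_logits)
    one_hot = [1.0 if k == id_label else 0.0 for k in range(num_classes)]
    uniform = [1.0 / num_classes] * num_classes
    id_term = cross_entropy(one_hot, softmax(id_logits))
    ood_term = cross_entropy(uniform, softmax(ood_logits))
    return id_term + ood_weight * ood_term


# A confident prediction on an OOD input is penalized more than a flat one.
flat = ood_augmented_loss([0.0, 0.0, 0.0], 0, [0.0, 0.0, 0.0])
peaked = ood_augmented_loss([0.0, 0.0, 0.0], 0, [5.0, 0.0, 0.0])
print(flat < peaked)
```

Under this assumption the OOD term is smallest when the network outputs equal probabilities for an OOD input, which is one way to read "removing undesirable feature contributions".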
Related papers
- Safe Semi-Supervised Contrastive Learning Using In-Distribution Data as Positive Examples [3.4546761246181696]
We propose a self-supervised contrastive learning approach to fully exploit a large amount of unlabeled data.
The results show that self-supervised contrastive learning significantly improves classification accuracy.
arXiv Detail & Related papers (2024-08-03T22:33:13Z)
- Cross-feature Contrastive Loss for Decentralized Deep Learning on Heterogeneous Data [8.946847190099206]
We present a novel approach for decentralized learning on heterogeneous data.
Cross-features for a pair of neighboring agents are the features obtained from the data of an agent with respect to the model parameters of the other agent.
Our experiments show that the proposed method achieves superior performance (0.2-4% improvement in test accuracy) compared to other existing techniques for decentralized learning on heterogeneous data.
arXiv Detail & Related papers (2023-10-24T14:48:23Z)
- Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
- Leachable Component Clustering [10.377914682543903]
In this work, a novel approach to clustering of incomplete data, termed leachable component clustering, is proposed.
The proposed method handles data imputation with Bayes alignment, and collects the lost patterns in theory.
Experiments on several artificial incomplete data sets demonstrate that the proposed method outperforms other state-of-the-art algorithms.
arXiv Detail & Related papers (2022-08-28T13:13:17Z)
- Incorporating Semi-Supervised and Positive-Unlabeled Learning for Boosting Full Reference Image Quality Assessment [73.61888777504377]
Full-reference (FR) image quality assessment (IQA) evaluates the visual quality of a distorted image by measuring its perceptual difference from a pristine-quality reference.
Unlabeled data can be easily collected from an image degradation or restoration process, making it encouraging to exploit unlabeled training data to boost FR-IQA performance.
In this paper, we propose incorporating semi-supervised and positive-unlabeled (PU) learning to exploit unlabeled data while mitigating the adverse effect of outliers.
arXiv Detail & Related papers (2022-04-19T09:10:06Z)
- Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T21:17:02Z)
- Deep Stable Learning for Out-Of-Distribution Generalization [27.437046504902938]
Approaches based on deep neural networks have achieved striking performance when testing data and training data share a similar distribution.
Eliminating the impact of distribution shifts between training and testing data is crucial for building performance-promising deep models.
We propose to address this problem by removing the dependencies between features via learning weights for training samples.
arXiv Detail & Related papers (2021-04-16T03:54:21Z)
- DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation [57.358212277226315]
In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior, without access to the control signals generated by the demonstrator.
Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms.
This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk.
We propose a more data-efficient IfO algorithm.
arXiv Detail & Related papers (2021-03-31T23:46:32Z)
- Negative Data Augmentation [127.28042046152954]
We show that negative data augmentation samples provide information on the support of the data distribution.
We introduce a new GAN training objective where we use NDA as an additional source of synthetic data for the discriminator.
Empirically, models trained with our method achieve improved conditional/unconditional image generation along with improved anomaly detection capabilities.
arXiv Detail & Related papers (2021-02-09T20:28:35Z)
- On the Benefits of Invariance in Neural Networks [56.362579457990094]
We show that training with data augmentation leads to better estimates of the risk and its gradients, and we provide a PAC-Bayes generalization bound for models trained with data augmentation.
We also show that compared to data augmentation, feature averaging reduces generalization error when used with convex losses, and tightens PAC-Bayes bounds.
arXiv Detail & Related papers (2020-05-01T02:08:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.