Rethinking Importance Weighting for Deep Learning under Distribution Shift
- URL: http://arxiv.org/abs/2006.04662v2
- Date: Thu, 5 Nov 2020 09:40:18 GMT
- Title: Rethinking Importance Weighting for Deep Learning under Distribution Shift
- Authors: Tongtong Fang, Nan Lu, Gang Niu, Masashi Sugiyama
- Abstract summary: Under distribution shift (DS), where the training data distribution differs from the test one, a powerful technique is importance weighting (IW), which handles DS in two separate steps: weight estimation (WE) and weighted classification (WC).
In this paper, we rethink IW and theoretically show that it suffers from a circular dependency.
We propose an end-to-end solution, dynamic IW, that iterates between WE and WC and combines them in a seamless manner.
- Score: 86.52964129830706
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Under distribution shift (DS) where the training data distribution differs
from the test one, a powerful technique is importance weighting (IW), which
handles DS in two separate steps: weight estimation (WE) estimates the
test-over-training density ratio and weighted classification (WC) trains the
classifier from weighted training data. However, IW cannot work well on complex
data, since WE is incompatible with deep learning. In this paper, we rethink IW
and theoretically show it suffers from a circular dependency: we need not only
WE for WC, but also WC for WE where a trained deep classifier is used as the
feature extractor (FE). To cut off the dependency, we try to pretrain FE from
unweighted training data, which leads to biased FE. To overcome the bias, we
propose an end-to-end solution, dynamic IW, that iterates between WE and WC and
combines them in a seamless manner, and hence our WE can also enjoy deep
networks and stochastic optimizers indirectly. Experiments with two
representative types of DS on three popular datasets show that our dynamic IW
compares favorably with state-of-the-art methods.
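The two-step structure above invites a concrete picture of the WE-WC iteration. Below is a minimal PyTorch sketch of the idea, not the paper's exact algorithm: `model.features` (a penultimate-layer feature hook) is an assumption, and WE is done here with a simple discriminative density-ratio estimate (a train-vs-test linear probe whose odds give the ratio); the paper's kernel-based estimator would slot into `estimate_weights` unchanged.
```python
import torch
import torch.nn.functional as F

def estimate_weights(feat_tr, feat_te, steps=200, lr=1e-2):
    """WE step: estimate w(x) = p_test(x) / p_train(x) on training points.

    Uses the standard discriminative reduction: fit a linear probe to
    separate training from test features, then convert its odds into a
    density ratio.
    """
    X = torch.cat([feat_tr, feat_te])
    y = torch.cat([torch.zeros(len(feat_tr)), torch.ones(len(feat_te))])
    probe = torch.nn.Linear(X.size(1), 1)
    opt = torch.optim.Adam(probe.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        F.binary_cross_entropy_with_logits(probe(X).squeeze(1), y).backward()
        opt.step()
    with torch.no_grad():
        p = torch.sigmoid(probe(feat_tr)).squeeze(1)
        return p / (1.0 - p) * (len(feat_tr) / len(feat_te))

def dynamic_iw_step(model, x_tr, y_tr, x_te, opt):
    """One WE -> WC round: weights come from the *current* classifier's
    features, so WE and WC feed each other as training proceeds."""
    with torch.no_grad():
        feat_tr = model.features(x_tr)  # assumed penultimate-layer hook
        feat_te = model.features(x_te)
    w = estimate_weights(feat_tr, feat_te)           # WE
    opt.zero_grad()
    losses = F.cross_entropy(model(x_tr), y_tr, reduction="none")
    (w * losses).mean().backward()                   # WC
    opt.step()
```
Each round re-estimates the weights from the current features, which is how the circular dependency is turned into an iteration rather than a fixed one-shot pipeline.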
Related papers
- DeNetDM: Debiasing by Network Depth Modulation [6.550893772143]
We present DeNetDM, a novel debiasing method that uses network depth modulation as a way of developing robustness to spurious correlations.
Our method requires no bias annotations or explicit data augmentation while performing on par with approaches that require either or both.
We demonstrate that DeNetDM outperforms existing debiasing techniques on both synthetic and real-world datasets by 5%.
arXiv Detail & Related papers (2024-03-28T22:17:19Z)
- FedUV: Uniformity and Variance for Heterogeneous Federated Learning [5.9330433627374815]
Federated learning is a promising framework to train neural networks with widely distributed data; however, performance degrades when the client data are heterogeneous.
Recent work has shown this is due to the final layer of the network being most prone to local bias.
We investigate the training dynamics of the classifier by applying SVD to its weights, motivated by the observation that freezing the weights results in constant singular values (a minimal example of this probe is sketched below).
arXiv Detail & Related papers (2024-02-27T15:53:15Z)
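As a concrete illustration of the SVD probe mentioned above, here is a minimal sketch; the 512-d feature size and 10 classes are assumptions, not FedUV's setup.
```python
import torch

# Hypothetical 10-class linear classifier head on 512-d features.
head = torch.nn.Linear(512, 10)

# Singular values of the final-layer weight matrix; per the observation
# above, freezing this layer keeps these values constant during training.
print(torch.linalg.svdvals(head.weight.detach()))
```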
- Out-of-Distribution Detection with Hilbert-Schmidt Independence Optimization [114.43504951058796]
Outlier detection tasks play a critical role in AI safety.
Deep neural network classifiers tend to incorrectly classify out-of-distribution (OOD) inputs into in-distribution classes with high confidence.
We propose an alternative probabilistic paradigm that is both practically useful and theoretically viable for OOD detection (the HSIC statistic the title refers to is sketched below).
arXiv Detail & Related papers (2022-09-26T15:59:55Z)
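The Hilbert-Schmidt independence criterion named in the title is a standard dependence measure; the sketch below gives its biased empirical estimator with Gaussian kernels, as a generic building block rather than the paper's full objective (the bandwidth `sigma` is an assumption).
```python
import torch

def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC between paired samples x (n, dx) and y (n, dy)."""
    n = x.size(0)

    def gram(z):  # Gaussian-kernel Gram matrix
        return torch.exp(-torch.cdist(z, z) ** 2 / (2 * sigma ** 2))

    H = torch.eye(n) - torch.full((n, n), 1.0 / n)  # centering matrix
    return torch.trace(gram(x) @ H @ gram(y) @ H) / (n - 1) ** 2
```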
- Learning to Re-weight Examples with Optimal Transport for Imbalanced Classification [74.62203971625173]
Imbalanced data pose challenges for deep learning based classification models.
One of the most widely used approaches for tackling imbalanced data is re-weighting.
We propose a novel re-weighting method based on optimal transport (OT) from a distributional point of view (the OT primitive involved is sketched below).
arXiv Detail & Related papers (2022-08-05T01:23:54Z)
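The paper learns the sample weights by minimizing an OT distance, which this sketch does not reproduce; it only shows the entropic OT primitive such a method builds on (a minimal NumPy Sinkhorn solver; `eps` and the iteration count are assumptions).
```python
import numpy as np

def sinkhorn_plan(a, b, C, eps=0.1, iters=200):
    """Entropic-regularized transport plan between histograms a (n,) and
    b (m,) under cost matrix C (n, m); row sums approach a, column sums b."""
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)  # alternately rescale to match the marginals
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]
```
A re-weighting method in this spirit treats the source histogram `a` as learnable sample weights; here it is a fixed input.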
- CAFA: Class-Aware Feature Alignment for Test-Time Adaptation [50.26963784271912]
Test-time adaptation (TTA) aims to address distribution shift by adapting a model to unlabeled data at test time.
We propose a simple yet effective feature alignment loss, termed Class-Aware Feature Alignment (CAFA), which encourages a model to learn target representations in a class-discriminative manner (a simplified version is sketched below).
arXiv Detail & Related papers (2022-06-01T03:02:07Z)
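A simplified version of such a class-discriminative alignment term, assuming precomputed per-class source means and pseudo-labels for the test batch (CAFA's actual loss also matches covariance statistics):
```python
import torch

def class_aware_alignment_loss(feats, pseudo_labels, class_means):
    """Pull each test feature toward the source mean of its pseudo-labeled
    class; feats (N, D), pseudo_labels (N,) long, class_means (C, D)."""
    targets = class_means[pseudo_labels]
    return ((feats - targets) ** 2).sum(dim=1).mean()
```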
- CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance.
Sample re-weighting methods are widely used to alleviate this data bias.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data (a stripped-down version is sketched below).
arXiv Detail & Related papers (2022-02-11T13:49:51Z)
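A stripped-down sketch of the meta-learned weighting idea: a tiny network maps each sample's loss to a weight. The class-aware structure and the meta-update on clean validation data that make this CMW-Net are omitted; all names are illustrative.
```python
import torch

class WeightingNet(torch.nn.Module):
    """Tiny MLP mapping a per-sample loss value to a weight in (0, 1)."""
    def __init__(self, hidden=64):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(1, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, 1), torch.nn.Sigmoid())

    def forward(self, losses):  # losses: (N,)
        return self.net(losses.unsqueeze(1)).squeeze(1)
```
In use, the weighted loss `(WeightingNet()(per_sample_losses.detach()) * per_sample_losses).mean()` replaces the plain mean.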
- Deep Stable Learning for Out-Of-Distribution Generalization [27.437046504902938]
Approaches based on deep neural networks have achieved striking performance when the test and training data share a similar distribution.
Eliminating the impact of distribution shifts between training and testing data is crucial for building performance-promising deep models.
We propose to address this problem by removing the dependencies between features via learning weights for training samples.
arXiv Detail & Related papers (2021-04-16T03:54:21Z)
- Generalized ODIN: Detecting Out-of-distribution Image without Learning from Out-of-distribution Data [87.61504710345528]
We propose two strategies for freeing a neural network from tuning with OoD data, while improving its OoD detection performance.
Specifically, we propose a decomposed confidence scoring function as well as a modified input pre-processing method (the decomposed head is sketched below).
Our further analysis on a larger scale image dataset shows that the two types of distribution shifts, specifically semantic shift and non-semantic shift, present a significant difference.
arXiv Detail & Related papers (2020-02-26T04:18:25Z)
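The decomposed confidence can be pictured as a two-branch head producing logits f_i(x) = h_i(x) / g(x); the decomposition follows the summary above, while the feature dimension, class count, and the sigmoid-batch-norm form of g are assumptions of this sketch.
```python
import torch

class DecomposedConfidenceHead(torch.nn.Module):
    """Two-branch head: class scores h_i(x) divided by a scalar g(x) in (0, 1)."""
    def __init__(self, dim=512, num_classes=10):
        super().__init__()
        self.h = torch.nn.Linear(dim, num_classes)
        self.g = torch.nn.Sequential(
            torch.nn.Linear(dim, 1),
            torch.nn.BatchNorm1d(1),
            torch.nn.Sigmoid())

    def forward(self, feats):  # feats: (N, dim)
        g = self.g(feats)
        return self.h(feats) / g, g  # OOD score: max_i h_i(x) or g(x)
```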
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.