Learning Stable Classifiers by Transferring Unstable Features
- URL: http://arxiv.org/abs/2106.07847v1
- Date: Tue, 15 Jun 2021 02:41:12 GMT
- Title: Learning Stable Classifiers by Transferring Unstable Features
- Authors: Yujia Bao, Shiyu Chang, Regina Barzilay
- Abstract summary: We study transfer learning in the presence of spurious correlations.
We experimentally demonstrate that directly transferring the stable feature extractor learned on the source task may not eliminate these biases for the target task.
However, we hypothesize that the unstable features in the source task and those in the target task are directly related.
- Score: 59.06169363181417
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study transfer learning in the presence of spurious correlations. We
experimentally demonstrate that directly transferring the stable feature
extractor learned on the source task may not eliminate these biases for the
target task. However, we hypothesize that the unstable features in the source
task and those in the target task are directly related. By explicitly informing
the target classifier of the source task's unstable features, we can regularize
the biases in the target task. Specifically, we derive a representation that
encodes the unstable features by contrasting different data environments in the
source task. On the target task, we cluster data from this representation, and
achieve robustness by minimizing the worst-case risk across all clusters. We
evaluate our method on both text and image classifications. Empirical results
demonstrate that our algorithm is able to maintain robustness on the target
task, outperforming the best baseline by 22.9% in absolute accuracy across 12
transfer settings. Our code is available at https://github.com/YujiaBao/Tofu.
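A minimal sketch of the target-side recipe the abstract describes: cluster target data using a representation of the source task's unstable features, then train the target classifier against the worst-performing cluster. The `unstable_encoder` (assumed to come from contrasting source environments), the cluster count, and the update loop below are illustrative assumptions, not the authors' Tofu implementation; see the repository above for the real code.

```python
# Minimal sketch, assuming a PyTorch setup; names and hyper-parameters are illustrative.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans


def cluster_by_unstable_features(unstable_encoder, target_x, n_clusters=2):
    """Partition target examples using the unstable-feature representation
    derived by contrasting data environments on the source task."""
    with torch.no_grad():
        z = unstable_encoder(target_x)
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(z.cpu().numpy())
    return torch.as_tensor(labels)


def worst_case_step(classifier, optimizer, x, y, groups, n_groups):
    """One update that minimizes the largest per-cluster loss (worst-case risk)."""
    criterion = nn.CrossEntropyLoss()
    per_group = []
    for g in range(n_groups):
        mask = groups == g
        if mask.any():
            per_group.append(criterion(classifier(x[mask]), y[mask]))
    worst = torch.stack(per_group).max()   # risk of the worst cluster
    optimizer.zero_grad()
    worst.backward()
    optimizer.step()
    return worst.item()
```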
Related papers
- Task-recency bias strikes back: Adapting covariances in Exemplar-Free Class Incremental Learning [0.3281128493853064]
We tackle the problem of training a model on a sequence of tasks without access to past data.
Existing methods represent classes as Gaussian distributions in the feature extractor's latent space.
We propose AdaGauss -- a novel method that adapts covariance matrices from task to task.
arXiv Detail & Related papers (2024-09-26T20:18:14Z)
- Leveraging sparse and shared feature activations for disentangled representation learning [112.22699167017471]
We propose to leverage knowledge extracted from a diversified set of supervised tasks to learn a common disentangled representation.
We validate our approach on six real-world distribution shift benchmarks and different data modalities.
arXiv Detail & Related papers (2023-04-17T01:33:24Z)
- Pixel is All You Need: Adversarial Trajectory-Ensemble Active Learning for Salient Object Detection [40.97103355628434]
It is unclear whether a saliency model trained with weakly-supervised data can achieve performance equivalent to that of its fully-supervised version.
We propose a novel yet effective adversarial trajectory-ensemble active learning (ATAL) method.
Experimental results show that ATAL can find such a point-labeled dataset, where a saliency model trained on it achieves 97%-99% of the performance of its fully-supervised version with only ten annotated points per image.
arXiv Detail & Related papers (2022-12-13T11:18:08Z) - Transferability Estimation Based On Principal Gradient Expectation [68.97403769157117]
Cross-task transferability should be consistent with the results of actual transfer while remaining self-consistent.
Existing transferability metrics are estimated for a particular model by jointly considering the source and target tasks.
We propose Principal Gradient Expectation (PGE), a simple yet effective method for assessing transferability across tasks.
arXiv Detail & Related papers (2022-11-29T15:33:02Z) - Alleviating the Sample Selection Bias in Few-shot Learning by Removing
Projection to the Centroid [22.918659185060523]
Task Centroid Projection Removing (TCPR) is applied directly to all image features in a given task.
Our method effectively prevents features from being too close to the task centroid.
It can reliably improve classification accuracy across various feature extractors, training algorithms and datasets.
arXiv Detail & Related papers (2022-10-30T13:03:13Z)
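A rough sketch of the centroid-projection removal summarized in the entry above, under the simplifying assumption that the task centroid is just the mean of the task's features; the paper's actual centroid estimation may differ.

```python
# Hedged sketch: strip each feature's component along the task-centroid direction.
import numpy as np

def remove_centroid_projection(features):
    """features: (n, d) array of image features from one few-shot task."""
    centroid = features.mean(axis=0)
    direction = centroid / (np.linalg.norm(centroid) + 1e-12)   # unit centroid direction
    # subtract the projection onto the centroid direction from every feature
    return features - np.outer(features @ direction, direction)
```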
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
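A small sketch of the ATC idea summarized in the entry above; using max-softmax confidence as the score and a quantile-based threshold fit on labeled source data are assumptions, not the paper's exact procedure.

```python
# Hedged sketch: fit a confidence threshold on labeled source data, then predict
# target accuracy as the fraction of unlabeled target examples that clear it.
import numpy as np

def fit_threshold(source_conf, source_correct):
    """Pick t so that the fraction of source examples with confidence > t
    matches the source accuracy (quantile-based choice is an assumption)."""
    accuracy = source_correct.mean()
    return np.quantile(source_conf, 1.0 - accuracy)

def predict_target_accuracy(target_conf, threshold):
    return float((target_conf > threshold).mean())
```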
- Unsupervised Robust Domain Adaptation without Source Data [75.85602424699447]
We study the problem of robust domain adaptation in the context of unavailable target labels and source data.
We show a consistent performance improvement of over $10%$ in accuracy against the tested baselines on four benchmark datasets.
arXiv Detail & Related papers (2021-03-26T16:42:28Z)
- Exploring and Predicting Transferability across NLP Tasks [115.6278033699853]
We study the transferability between 33 NLP tasks across three broad classes of problems.
Our results show that transfer learning is more beneficial than previously thought.
We also develop task embeddings that can be used to predict the most transferable source tasks for a given target task.
arXiv Detail & Related papers (2020-05-02T09:39:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented here and is not responsible for any consequences arising from its use.