Deep Active Learning for Biased Datasets via Fisher Kernel
Self-Supervision
- URL: http://arxiv.org/abs/2003.00393v1
- Date: Sun, 1 Mar 2020 03:56:32 GMT
- Title: Deep Active Learning for Biased Datasets via Fisher Kernel
Self-Supervision
- Authors: Denis Gudovskiy, Alec Hodgkinson, Takuya Yamaguchi, Sotaro Tsukizawa
- Abstract summary: Active learning (AL) aims to minimize labeling efforts for data-demanding deep neural networks (DNNs).
We propose a low-complexity method for feature density matching using a self-supervised Fisher kernel (FK).
Our method outperforms state-of-the-art methods on MNIST, SVHN, and ImageNet classification while requiring only 1/10th of the processing.
- Score: 5.352699766206807
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Active learning (AL) aims to minimize labeling efforts for data-demanding
deep neural networks (DNNs) by selecting the most representative data points
for annotation. However, currently used methods are ill-equipped to deal with
biased data. The main motivation of this paper is to consider a realistic
setting for pool-based semi-supervised AL, where the unlabeled collection of
train data is biased. We theoretically derive an optimal acquisition function
for AL in this setting. It can be formulated as distribution shift minimization
between the unlabeled train data and a weakly-labeled validation dataset. To
implement such an acquisition function, we propose a low-complexity method for
feature density matching using self-supervised Fisher kernel (FK) as well as
several novel pseudo-label estimators. Our FK-based method outperforms
state-of-the-art methods on MNIST, SVHN, and ImageNet classification while
requiring only 1/10th of the processing. The conducted experiments show at
least a 40% drop in labeling effort on biased, class-imbalanced data compared to
existing methods.
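The abstract describes the method only at a high level, so the following is a minimal sketch of the two ingredients it names: a Fisher-kernel feature map built from the model's own pseudo-labels (the self-supervision), and an acquisition rule that matches feature densities between the unlabeled pool and the validation set. All names here (`fisher_embedding`, `select_batch`, `model.penultimate`, `model.head`) are hypothetical, and the greedy mean-matching loop is an MMD-style stand-in for the paper's low-complexity density-matching procedure, not the authors' exact algorithm.

```python
import torch
import torch.nn.functional as F

def fisher_embedding(model, x):
    """Per-example gradient of the log-likelihood w.r.t. the last
    linear layer -- a standard Fisher-kernel feature map. Labels come
    from the model's own predictions (self-supervised pseudo-labels)."""
    feats = model.penultimate(x)        # (B, D); hypothetical feature hook
    logits = model.head(feats)          # (B, C); hypothetical linear head
    probs = F.softmax(logits, dim=1)
    y_hat = probs.argmax(dim=1)         # pseudo-labels
    # For a softmax head, d log p(y|x) / dW = (onehot(y) - p) outer feats.
    err = F.one_hot(y_hat, probs.size(1)).float() - probs
    return torch.einsum('bc,bd->bcd', err, feats).flatten(1)  # (B, C*D)

def select_batch(pool_emb, val_emb, k):
    """Greedy density matching: pick pool points whose running mean
    embedding best matches the validation mean, a simple proxy for
    minimizing the shift between pool and validation distributions."""
    target = val_emb.mean(dim=0)
    chosen, running = [], torch.zeros_like(target)
    for step in range(k):
        cand = (running + pool_emb) / (step + 1)  # mean if each point were added
        scores = (cand - target).norm(dim=1)
        if chosen:
            scores[torch.tensor(chosen)] = float('inf')  # skip already selected
        i = int(scores.argmin())
        chosen.append(i)
        running = running + pool_emb[i]
    return chosen
```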
Related papers
- Deep Active Learning with Manifold-preserving Trajectory Sampling [2.0717982775472206]
Active learning (AL) optimizes the selection of unlabeled data for annotation (labeling).
Existing deep AL methods arguably suffer from bias incurred by labeled data, which accounts for a much smaller share than unlabeled data in the AL context.
We propose a novel method, namely Manifold-Preserving Trajectory Sampling (MPTS), which encourages the feature space learned from labeled data to represent a more accurate manifold.
arXiv Detail & Related papers (2024-10-21T03:04:09Z) - Co-training for Low Resource Scientific Natural Language Inference [65.37685198688538]
We propose a novel co-training method that assigns weights based on the training dynamics of the classifiers to the distantly supervised labels.
By assigning importance weights instead of filtering out examples based on an arbitrary threshold on the predicted confidence, we maximize the usage of automatically labeled data.
The proposed method obtains an improvement of 1.5% in Macro F1 over the distant supervision baseline, and substantial improvements over several other strong SSL baselines.
arXiv Detail & Related papers (2024-06-20T18:35:47Z) - Group Distributionally Robust Dataset Distillation with Risk
Minimization [18.07189444450016]
We introduce an algorithm that combines clustering with the minimization of a risk measure on the loss to conduct DD.
We demonstrate its effective generalization and robustness across subgroups through numerical experiments.
arXiv Detail & Related papers (2024-02-07T09:03:04Z) - Self-Supervised Dataset Distillation for Transfer Learning [77.4714995131992]
We propose a novel problem of distilling an unlabeled dataset into a set of small synthetic samples for efficient self-supervised learning (SSL).
We first prove that a gradient of synthetic samples with respect to an SSL objective in naive bilevel optimization is biased due to randomness originating from data augmentations or masking.
We empirically validate the effectiveness of our method on various applications involving transfer learning.
arXiv Detail & Related papers (2023-10-10T10:48:52Z) - CAFA: Class-Aware Feature Alignment for Test-Time Adaptation [50.26963784271912]
Test-time adaptation (TTA) addresses distribution shift between training and test data by adapting a model to unlabeled data at test time.
We propose a simple yet effective feature alignment loss, termed Class-Aware Feature Alignment (CAFA), which encourages a model to learn target representations in a class-discriminative manner.
arXiv Detail & Related papers (2022-06-01T03:02:07Z) - Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold (a minimal sketch appears after this list).
arXiv Detail & Related papers (2022-01-11T23:01:12Z) - Consistent Relative Confidence and Label-Free Model Selection for
Convolutional Neural Networks [4.497097230665825]
This paper presents an approach to CNN model selection using only unlabeled data.
The effectiveness and efficiency of the presented method are demonstrated by extensive experimental studies on the MNIST and FashionMNIST datasets.
arXiv Detail & Related papers (2021-08-26T15:14:38Z) - Active Learning under Pool Set Distribution Shift and Noisy Data [41.69385715445311]
We show that BALD gets stuck on out-of-distribution or junk data that is not relevant for the task.
We examine a novel *Expected Predictive Information Gain (EPIG)* criterion to deal with distribution shift in the pool set (a minimal estimator is sketched after this list).
EPIG reduces the uncertainty of *predictions* on an unlabelled *evaluation set* sampled from the test data distribution, which may differ from the pool set distribution.
arXiv Detail & Related papers (2021-06-22T12:39:30Z) - Scalable Marginal Likelihood Estimation for Model Selection in Deep
Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z) - Learning to Count in the Crowd from Limited Labeled Data [109.2954525909007]
We focus on reducing the annotation effort by learning to count in the crowd from a limited number of labeled samples.
Specifically, we propose a Gaussian Process-based iterative learning mechanism that involves estimation of pseudo-ground truth for the unlabeled data.
arXiv Detail & Related papers (2020-07-07T04:17:01Z) - Regularization via Structural Label Smoothing [22.74769739125912]
Regularization is an effective way to promote the generalization performance of machine learning models.
In this paper, we focus on label smoothing, a form of output distribution regularization that prevents overfitting of a neural network (the smoothing rule is sketched after this list).
We show that such label smoothing imposes a quantifiable bias in the Bayes error rate of the training data.
arXiv Detail & Related papers (2020-01-07T05:45:18Z)
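The ATC entry above reduces to a two-line rule, sketched here under the assumption that confidence is the maximum softmax probability (the paper also considers other scores); `learn_atc_threshold` and `predict_target_accuracy` are illustrative names, not the authors' API.

```python
import numpy as np

def learn_atc_threshold(val_conf, val_correct):
    """Pick the threshold t on labeled source-validation data so that the
    fraction of examples with confidence above t equals the observed
    validation accuracy."""
    acc = val_correct.mean()
    return np.quantile(val_conf, 1.0 - acc)

def predict_target_accuracy(target_conf, t):
    """ATC estimate: target accuracy is the fraction of unlabeled target
    examples whose confidence exceeds the learned threshold."""
    return float((target_conf > t).mean())
```

Here `val_conf` would be, e.g., `probs.max(axis=1)` on held-out source data, and `val_correct` a 0/1 array marking which of those predictions were right.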
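The EPIG criterion is also concrete enough to sketch: score each pool candidate by the mutual information between its label and the label of an evaluation point, averaged over the evaluation set. This assumes K posterior samples (e.g., MC-dropout forward passes) and is written for clarity rather than memory efficiency; it is one estimator of the quantity the summary describes, not the authors' implementation.

```python
import torch

def epig_scores(probs_pool, probs_eval):
    """probs_pool: (K, B, C) posterior samples of predictions on pool
    candidates; probs_eval: (K, M, C) on evaluation points. Returns a
    (B,) EPIG score per candidate."""
    K = probs_pool.shape[0]
    # Joint predictive p(y_b, y_m) = E_k[p_k(y_b) p_k(y_m)] -> (B, M, C, C)
    joint = torch.einsum('kbc,kmd->bmcd', probs_pool, probs_eval) / K
    # Product of marginal predictives, for the independence baseline.
    indep = torch.einsum('bc,md->bmcd', probs_pool.mean(0), probs_eval.mean(0))
    # Mutual information I(y_b; y_m), then average over evaluation points.
    mi = (joint * (joint.clamp_min(1e-12).log()
                   - indep.clamp_min(1e-12).log())).sum(dim=(2, 3))
    return mi.mean(dim=1)
```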
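Finally, the label-smoothing entry rests on a simple rule. The sketch below shows only standard uniform smoothing, y' = (1 - eps) * onehot(y) + eps / K; the *structural* variant the paper proposes varies the smoothing across the training data rather than keeping it uniform, which this sketch does not implement.

```python
import torch
import torch.nn.functional as F

def smooth_labels(y, num_classes, eps=0.1):
    """Uniform label smoothing: mix the one-hot target with the uniform
    distribution over classes."""
    onehot = F.one_hot(y, num_classes).float()
    return (1.0 - eps) * onehot + eps / num_classes
```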