ST-CoNAL: Consistency-Based Acquisition Criterion Using Temporal
Self-Ensemble for Active Learning
- URL: http://arxiv.org/abs/2207.02182v1
- Date: Tue, 5 Jul 2022 17:25:59 GMT
- Title: ST-CoNAL: Consistency-Based Acquisition Criterion Using Temporal
Self-Ensemble for Active Learning
- Authors: Jae Soon Baik, In Young Yoon, Jun Won Choi
- Abstract summary: Active learning (AL) is becoming increasingly important to maximize the efficiency of the training process.
We present an AL algorithm, namely student-teacher consistency-based AL (ST-CoNAL).
Experiments conducted for image classification tasks on CIFAR-10, CIFAR-100, Caltech-256, and Tiny ImageNet datasets demonstrate that the proposed ST-CoNAL achieves significantly better performance than existing acquisition methods.
- Score: 7.94190631530826
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern deep learning has achieved great success in various fields. However,
it requires the labeling of huge amounts of data, which is expensive and
labor-intensive. Active learning (AL), which identifies the most informative
samples to be labeled, is becoming increasingly important to maximize the
efficiency of the training process. Most existing AL methods use only a
single, final fixed model to acquire the samples to be labeled. This strategy
can be suboptimal, since the structural uncertainty of the model given the
training data is not taken into account when acquiring samples. In this study, we
propose a novel acquisition criterion based on temporal self-ensemble generated
by conventional stochastic gradient descent (SGD) optimization. These
self-ensemble models are obtained by capturing the intermediate network weights
along the SGD trajectory. Our acquisition function relies on a consistency
measure between the student and teacher models: a fixed number of temporal
self-ensemble models serve as the students, and the teacher model is
constructed by averaging the weights of the student models. Using the
proposed acquisition criterion, we present an AL algorithm, namely
student-teacher consistency-based AL (ST-CoNAL). Experiments conducted for
image classification tasks on CIFAR-10, CIFAR-100, Caltech-256, and Tiny
ImageNet datasets demonstrate that the proposed ST-CoNAL achieves significantly
better performance than the existing acquisition methods. Furthermore,
extensive experiments show the robustness and effectiveness of our methods.
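To make the acquisition criterion concrete, here is a minimal PyTorch sketch of the procedure the abstract describes: K student snapshots are captured along the SGD trajectory, the teacher is built by averaging their weights, and each unlabeled sample is scored by the student-teacher inconsistency. The use of plain KL divergence and the omission of the paper's sharpening of the teacher output are simplifications; all names are illustrative.

```python
import copy
import torch
import torch.nn.functional as F

def average_weights(students):
    """Build the teacher by averaging the weights of the student snapshots."""
    teacher = copy.deepcopy(students[0])
    avg = {k: torch.stack([s.state_dict()[k].float() for s in students]).mean(0)
           for k in teacher.state_dict()}
    teacher.load_state_dict(avg)
    return teacher

@torch.no_grad()
def st_conal_scores(students, teacher, unlabeled_loader, device="cpu"):
    """Score unlabeled samples by student-teacher inconsistency.

    Higher score = the students disagree more with the teacher, i.e. the
    sample is treated as more informative and is queried for labeling.
    """
    teacher.eval()
    for s in students:
        s.eval()
    scores = []
    for x, _ in unlabeled_loader:
        x = x.to(device)
        p_teacher = F.softmax(teacher(x), dim=1)
        kl = torch.zeros(x.size(0), device=device)
        for s in students:
            log_p_student = F.log_softmax(s(x), dim=1)
            # KL(teacher || student), accumulated over the K students
            kl += F.kl_div(log_p_student, p_teacher, reduction="none").sum(1)
        scores.append(kl.cpu())
    return torch.cat(scores)

# Usage sketch: snapshots collected at the ends of the last K training epochs
# students = [snapshot_1, ..., snapshot_K]
# teacher = average_weights(students)
# query_idx = st_conal_scores(students, teacher, pool_loader).topk(budget).indices
```

The weight-averaged teacher mirrors stochastic weight averaging, which is why no extra training run is needed to form the ensemble.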
Related papers
- LPLgrad: Optimizing Active Learning Through Gradient Norm Sample Selection and Auxiliary Model Training [2.762397703396293]
Loss Prediction Loss with Gradient Norm (LPLgrad) is designed to quantify model uncertainty effectively and improve the accuracy of image classification tasks.
LPLgrad operates in two distinct phases: (i) the Training Phase aims to predict the loss for input features by jointly training a main model and an auxiliary model.
This dual-model approach enhances the ability to extract complex input features and learn intrinsic patterns from the data effectively.
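As a rough sketch of what such a training phase could look like, in the spirit of loss-prediction methods: the auxiliary model regresses the main model's per-sample loss from intermediate features. The `backbone`/`head` split, the MSE regression target, and the weighting `lam` are assumptions here, not LPLgrad's exact formulation, and the gradient-norm querying phase is not shown.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LossPredictor(nn.Module):
    """Auxiliary model: predicts the main model's per-sample loss from features."""
    def __init__(self, feature_dim=512):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(feature_dim, 128), nn.ReLU(),
                                 nn.Linear(128, 1))

    def forward(self, feats):
        return self.mlp(feats).squeeze(1)

def joint_step(main_model, aux_model, x, y, optimizer, lam=1.0):
    """One joint step; `optimizer` covers the parameters of both models."""
    optimizer.zero_grad()
    feats = main_model.backbone(x)          # assumed feature extractor
    logits = main_model.head(feats)         # assumed classification head
    task_loss = F.cross_entropy(logits, y, reduction="none")
    predicted = aux_model(feats.detach())   # regress the true per-sample loss
    aux_loss = F.mse_loss(predicted, task_loss.detach())
    (task_loss.mean() + lam * aux_loss).backward()
    optimizer.step()
```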
arXiv Detail & Related papers (2024-11-20T18:12:59Z)
- Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
We propose uncertainty-aware learning (UAL) to improve model alignment across different task scenarios.
We implement UAL in a simple fashion: the label-smoothing value is set adaptively during training according to the uncertainty of individual samples.
Experiments on widely used benchmarks demonstrate that our UAL significantly and consistently outperforms standard supervised fine-tuning.
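A minimal sketch of per-sample, uncertainty-adaptive label smoothing as described; the linear mapping from uncertainty to the smoothing value and the `max_smooth` cap are assumptions rather than the paper's exact schedule.

```python
import torch
import torch.nn.functional as F

def ual_loss(logits, targets, uncertainty, max_smooth=0.2):
    """Cross-entropy with a per-sample label-smoothing value.

    `uncertainty` is a tensor in [0, 1] (e.g. normalized predictive entropy);
    more uncertain samples receive softer, more smoothed targets.
    """
    n_classes = logits.size(1)
    eps = max_smooth * uncertainty                         # (batch,)
    one_hot = F.one_hot(targets, n_classes).float()
    soft = one_hot * (1 - eps).unsqueeze(1) + (eps / n_classes).unsqueeze(1)
    return -(soft * F.log_softmax(logits, dim=1)).sum(1).mean()
```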
arXiv Detail & Related papers (2024-06-07T11:37:45Z)
- Learning Objective-Specific Active Learning Strategies with Attentive Neural Processes [72.75421975804132]
Learning Active Learning (LAL) proposes learning the active learning strategy itself, allowing it to adapt to the given setting.
We propose a novel LAL method for classification that exploits symmetry and independence properties of the active learning problem.
Our approach is based on learning from a myopic oracle, which gives our model the ability to adapt to non-standard objectives.
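As a toy illustration of what a myopic oracle means here: it peeks at the true pool labels and scores each candidate by the immediate one-step gain in the objective, producing supervision that the learned strategy can imitate. This sketch assumes scikit-learn and a plain validation-accuracy objective; the attentive-neural-process model itself is not shown.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def myopic_oracle_scores(X_lab, y_lab, X_pool, y_pool, X_val, y_val):
    """Score each pool point by the one-step validation gain from labeling it."""
    base = LogisticRegression(max_iter=200).fit(X_lab, y_lab).score(X_val, y_val)
    scores = np.empty(len(X_pool))
    for i in range(len(X_pool)):
        X_aug = np.vstack([X_lab, X_pool[i:i + 1]])
        y_aug = np.append(y_lab, y_pool[i])
        acc = LogisticRegression(max_iter=200).fit(X_aug, y_aug).score(X_val, y_val)
        scores[i] = acc - base  # myopic (one-step) improvement
    return scores
```

Retraining once per candidate is what makes the oracle too expensive to deploy directly, and why a model is trained to imitate it instead.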
arXiv Detail & Related papers (2023-09-11T14:16:37Z)
- Universal Domain Adaptation from Foundation Models: A Baseline Study [58.51162198585434]
We conduct empirical studies of state-of-the-art UniDA methods using foundation models.
We introduce CLIP distillation, a parameter-free method specifically designed to distill target knowledge from CLIP models.
Although simple, our method outperforms previous approaches in most benchmark tasks.
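One plausible reading of "parameter-free" distillation, sketched below: the frozen CLIP model's zero-shot logits act as soft targets for the target model, so the distillation itself adds no trainable parameters. The temperature and loss form are assumptions.

```python
import torch
import torch.nn.functional as F

def clip_distill_loss(student_logits, clip_logits, T=2.0):
    """KL distillation from frozen CLIP zero-shot predictions.

    `clip_logits` would be the frozen CLIP image-text similarity scores over
    the class-name prompts; only the student has trainable parameters.
    """
    p_teacher = F.softmax(clip_logits / T, dim=1)
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T * T
```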
arXiv Detail & Related papers (2023-05-18T16:28:29Z)
- Effective Robustness against Natural Distribution Shifts for Models with Different Training Data [113.21868839569]
"Effective robustness" measures the extra out-of-distribution robustness beyond what can be predicted from the in-distribution (ID) performance.
We propose a new evaluation metric to evaluate and compare the effective robustness of models trained on different data.
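A small sketch of how effective robustness is typically computed in this line of work: fit a baseline trend of out-of-distribution versus ID accuracy across a set of models (conventionally after a logit transform of accuracy) and report each model's residual above the trend. The exact axis scaling is an assumption here.

```python
import numpy as np

def effective_robustness(id_acc, ood_acc):
    """OOD accuracy above the trend predicted from ID accuracy.

    Both inputs are arrays of per-model accuracies in (0, 1).
    """
    logit = lambda a: np.log(a / (1 - a))
    id_l, ood_l = logit(np.asarray(id_acc)), logit(np.asarray(ood_acc))
    slope, intercept = np.polyfit(id_l, ood_l, 1)        # baseline trend
    predicted = 1 / (1 + np.exp(-(slope * id_l + intercept)))
    return np.asarray(ood_acc) - predicted               # residual per model

# e.g. effective_robustness([0.70, 0.75, 0.80], [0.45, 0.50, 0.62])
```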
arXiv Detail & Related papers (2023-02-02T19:28:41Z)
- Frugal Reinforcement-based Active Learning [12.18340575383456]
We propose a novel active learning approach for label-efficient training.
The proposed method is iterative and aims at minimizing a constrained objective function that mixes diversity, representativity and uncertainty criteria.
We also introduce a novel weighting mechanism based on reinforcement learning, which adaptively balances these criteria at each training iteration.
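A simplified sketch of such a mixed acquisition score, with the three criteria computed in standard ways and the reinforcement-learned weights replaced by plain inputs `w`.

```python
import numpy as np
from scipy.spatial.distance import cdist

def acquisition_scores(probs, X_pool, X_labeled, w=(1/3, 1/3, 1/3)):
    """Mix uncertainty, diversity, and representativity with weights `w`.

    probs: (n_pool, n_classes) softmax outputs of the current model.
    """
    uncertainty = -(probs * np.log(probs + 1e-12)).sum(1)      # entropy
    diversity = cdist(X_pool, X_labeled).min(1)                # far from labeled set
    representativity = -cdist(X_pool, X_pool).mean(1)          # close to the pool
    norm = lambda s: (s - s.min()) / (s.ptp() + 1e-12)
    return (w[0] * norm(uncertainty) + w[1] * norm(diversity)
            + w[2] * norm(representativity))
```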
arXiv Detail & Related papers (2022-12-09T14:17:45Z)
- Non-iterative optimization of pseudo-labeling thresholds for training object detection models from multiple datasets [2.1485350418225244]
We propose a non-iterative method to optimize pseudo-labeling thresholds for learning object detection from a collection of low-cost datasets.
We experimentally demonstrate that our proposed method achieves an mAP comparable to that of grid search on the COCO and VOC datasets.
arXiv Detail & Related papers (2022-10-19T00:31:34Z)
- Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies with One Objective [142.36200080384145]
We propose a single objective which jointly optimizes a latent-space model and policy to achieve high returns while remaining self-consistent.
We demonstrate that the resulting algorithm matches or improves the sample-efficiency of the best prior model-based and model-free RL methods.
arXiv Detail & Related papers (2022-09-18T03:51:58Z)
- Model-Agnostic Multitask Fine-tuning for Few-shot Vision-Language Transfer Learning [59.38343286807997]
We propose Model-Agnostic Multitask Fine-tuning (MAMF) for vision-language models on unseen tasks.
Compared with model-agnostic meta-learning (MAML), MAMF discards the bi-level optimization and uses only first-order gradients.
We show that MAMF consistently outperforms the classical fine-tuning method for few-shot transfer learning on five benchmark datasets.
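A hedged sketch of what discarding the bi-level optimization amounts to: task losses are averaged and a single first-order gradient step is taken, with no inner-loop adaptation to differentiate through. The multitask sampling and vision-language specifics are omitted.

```python
import torch

def mamf_style_step(model, task_batches, loss_fn, lr=1e-3):
    """One first-order multitask fine-tuning step (no bi-level optimization)."""
    model.train()
    total = 0.0
    for x, y in task_batches:                 # one (x, y) batch per sampled task
        total = total + loss_fn(model(x), y)
    model.zero_grad()
    (total / len(task_batches)).backward()    # plain first-order gradients
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is not None:
                p -= lr * p.grad
```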
arXiv Detail & Related papers (2022-03-09T17:26:53Z)
- Beyond Simple Meta-Learning: Multi-Purpose Models for Multi-Domain, Active and Continual Few-Shot Learning [41.07029317930986]
We propose a variance-sensitive class of models that operates in a low-label regime.
The first method, Simple CNAPS, employs a hierarchically regularized Mahalanobis-distance based classifier.
We further extend this approach to a transductive learning setting, proposing Transductive CNAPS.
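A simplified NumPy sketch of the Mahalanobis-distance classifier at the core of Simple CNAPS: class means and covariances are estimated from the support set and each query is assigned to the nearest class. The shrinkage toward a task-level covariance stands in for the paper's hierarchical regularization.

```python
import numpy as np

def mahalanobis_classify(support_x, support_y, query_x, lam=0.5):
    """Nearest-class-mean classification under a regularized Mahalanobis metric.

    Assumes more than one support example per class so covariances exist.
    """
    classes = np.unique(support_y)
    d = support_x.shape[1]
    task_cov = np.cov(support_x, rowvar=False) + np.eye(d)   # task-level estimate
    dists = np.empty((len(query_x), len(classes)))
    for j, c in enumerate(classes):
        xc = support_x[support_y == c]
        cov = lam * np.cov(xc, rowvar=False) + (1 - lam) * task_cov
        inv = np.linalg.inv(cov + 1e-6 * np.eye(d))
        diff = query_x - xc.mean(0)
        dists[:, j] = np.einsum("ij,jk,ik->i", diff, inv, diff)
    return classes[dists.argmin(1)]
```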
arXiv Detail & Related papers (2022-01-13T18:59:02Z)