Benchmarking of Query Strategies: Towards Future Deep Active Learning
- URL: http://arxiv.org/abs/2312.05751v1
- Date: Sun, 10 Dec 2023 04:17:16 GMT
- Title: Benchmarking of Query Strategies: Towards Future Deep Active Learning
- Authors: Shiryu Ueno, Yusei Yamada, Shunsuke Nakatsuka, and Kunihito Kato
- Abstract summary: We benchmark query strategies for deep active learning (DAL).
DAL reduces annotation costs by annotating only high-quality samples selected by query strategies.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this study, we benchmark query strategies for deep active learning (DAL).
DAL reduces annotation costs by annotating only high-quality samples selected
by query strategies. Existing research has two main problems: the
experimental settings are not standardized, which makes evaluating existing
methods difficult, and most experiments were conducted on the CIFAR
or MNIST datasets. Therefore, we develop standardized experimental settings for
DAL and investigate the effectiveness of various query strategies using six
datasets, including those that contain medical and visual inspection images. In
addition, since most current DAL approaches are model-based, we perform
verification experiments using fully-trained models for querying to investigate
the effectiveness of these approaches for the six datasets. Our code is
available at
https://github.com/ia-gu/Benchmarking-of-Query-Strategies-Towards-Future-Deep-Active-Learning
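To make the querying step concrete, below is a minimal sketch of one entropy-based query round, a classical baseline of the kind such benchmarks typically include. It is illustrative only, not the authors' benchmarked implementation; the function name and arguments are hypothetical.

```python
# Minimal sketch of one deep-active-learning query round using an
# entropy-based strategy. Illustrative only; not the benchmark's code.
import numpy as np

def entropy_query(probs: np.ndarray, budget: int) -> np.ndarray:
    """Return indices of the `budget` most uncertain unlabeled samples.

    probs: (n_unlabeled, n_classes) softmax outputs of the current model.
    """
    eps = 1e-12
    entropy = -(probs * np.log(probs + eps)).sum(axis=1)
    return np.argsort(-entropy)[:budget]  # highest entropy first

# Usage: pick samples to send to the annotator, then retrain.
probs = np.array([[0.9, 0.1], [0.5, 0.5], [0.6, 0.4]])
picked = entropy_query(probs, budget=1)
print(picked)  # -> [1]: the most ambiguous sample is queried for annotation
```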
Related papers
- Realistic Evaluation of Test-Time Adaptation Algorithms: Unsupervised Hyperparameter Selection [1.4530711901349282]
Test-Time Adaptation (TTA) has emerged as a promising strategy for tackling the problem of machine learning model robustness under distribution shifts.
We evaluate existing TTA methods using surrogate-based hp-selection strategies to obtain a more realistic evaluation of their performance.
arXiv Detail & Related papers (2024-07-19T11:58:30Z)
- Querying Easily Flip-flopped Samples for Deep Active Learning [63.62397322172216]
Active learning is a machine learning paradigm that aims to improve the performance of a model by strategically selecting and querying unlabeled data.
One effective selection strategy is to base it on the model's predictive uncertainty, which can be interpreted as a measure of how informative a sample is.
This paper proposes the least disagree metric (LDM), defined as the smallest probability of disagreement of the predicted label.
arXiv Detail & Related papers (2024-01-18T08:12:23Z)
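The paper defines the LDM over hypotheses near the trained model; the sketch below is only a rough proxy for "how easily a predicted label flips", approximated with dropout-perturbed forward passes. It is an assumption-laden illustration with hypothetical names, not the paper's estimator.

```python
# Rough proxy for disagreement-based querying: estimate how easily a
# predicted label flips by running several stochastic forward passes
# (dropout active) and counting disagreements with the deterministic
# prediction. Not the paper's LDM estimator.
import torch

def disagreement_scores(model, x: torch.Tensor, passes: int = 10) -> torch.Tensor:
    model.eval()
    with torch.no_grad():
        base = model(x).argmax(dim=1)               # deterministic predictions
        model.train()                               # re-enable dropout for MC passes
        flips = torch.zeros(x.size(0), device=x.device)
        for _ in range(passes):
            flips += (model(x).argmax(dim=1) != base).float()
        model.eval()
    return flips / passes  # fraction of passes that disagree, per sample

# Toy usage: rank unlabeled samples, most flip-prone (boundary-adjacent) first.
net = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.ReLU(),
                          torch.nn.Dropout(0.5), torch.nn.Linear(8, 3))
scores = disagreement_scores(net, torch.randn(5, 4))
print(scores.argsort(descending=True))  # candidate query order
```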
- Spanning Training Progress: Temporal Dual-Depth Scoring (TDDS) for Enhanced Dataset Pruning [50.809769498312434]
We propose a novel dataset pruning method termed Temporal Dual-Depth Scoring (TDDS).
Our method achieves 54.51% accuracy with only 10% training data, surpassing random selection by 7.83% and other comparison methods by at least 12.69%.
arXiv Detail & Related papers (2023-11-22T03:45:30Z)
- DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection [72.25697820290502]
This work introduces a straightforward and efficient strategy to identify potential novel classes through zero-shot classification.
We refer to this approach as the self-training strategy, which enhances recall and accuracy for novel classes without requiring extra annotations, datasets, or re-training.
Empirical evaluations on three datasets, including LVIS, V3Det, and COCO, demonstrate significant improvements over the baseline performance.
arXiv Detail & Related papers (2023-10-02T17:52:24Z) - Optimal Sample Selection Through Uncertainty Estimation and Its
Application in Deep Learning [22.410220040736235]
We present a theoretically optimal solution for addressing both coreset selection and active learning.
Our proposed method, COPS, is designed to minimize the expected loss of a model trained on subsampled data.
arXiv Detail & Related papers (2023-09-05T14:06:33Z) - ALE: A Simulation-Based Active Learning Evaluation Framework for the
Parameter-Driven Comparison of Query Strategies for NLP [3.024761040393842]
Active Learning (AL) proposes promising data points for annotators to annotate next, instead of a sequential or random sample.
This method is supposed to save annotation effort while maintaining model performance.
We introduce a reproducible active learning evaluation framework for the comparative evaluation of AL strategies in NLP.
arXiv Detail & Related papers (2023-08-01T10:42:11Z) - ActiveGLAE: A Benchmark for Deep Active Learning with Transformers [5.326702806697265]
Deep active learning (DAL) seeks to reduce annotation costs by enabling the model to actively query instance annotations from which it expects to learn the most.
There is currently no standardized evaluation protocol for transformer-based language models in the field of DAL.
We propose the ActiveGLAE benchmark, a comprehensive collection of data sets and evaluation guidelines for assessing DAL.
arXiv Detail & Related papers (2023-06-16T13:07:29Z) - Temporal Output Discrepancy for Loss Estimation-based Active Learning [65.93767110342502]
We present a novel deep active learning approach that queries the oracle for annotation when an unlabeled sample is believed to incur a high loss.
Our approach outperforms state-of-the-art active learning methods on image classification and semantic segmentation tasks.
arXiv Detail & Related papers (2022-12-20T19:29:37Z) - Is margin all you need? An extensive empirical study of active learning
on tabular data [66.18464006872345]
We analyze the performance of a variety of active learning algorithms on 69 real-world datasets from the OpenML-CC18 benchmark.
Surprisingly, we find that the classical margin sampling technique matches or outperforms all others, including the current state of the art.
arXiv Detail & Related papers (2022-10-07T21:18:24Z) - A Comparative Survey of Deep Active Learning [76.04825433362709]
- A Comparative Survey of Deep Active Learning [76.04825433362709]
Active Learning (AL) is a set of techniques for reducing labeling cost by sequentially selecting data samples from a large unlabeled data pool for labeling.
Deep Learning (DL) is data-hungry, and the performance of DL models scales monotonically with more training data.
In recent years, Deep Active Learning (DAL) has risen as a feasible solution for maximizing model performance while minimizing the expensive labeling cost.
arXiv Detail & Related papers (2022-03-25T05:17:24Z) - A Closer Look at Advantage-Filtered Behavioral Cloning in High-Noise
Datasets [15.206465106699293]
Recent Offline Reinforcement Learning methods have succeeded in learning high-performance policies from fixed datasets of experience.
Our work evaluates this method's ability to scale to vast datasets consisting almost entirely of sub-optimal noise.
This modification enables offline agents to learn state-of-the-art policies in benchmark tasks using datasets where expert actions are outnumbered nearly 65:1.
arXiv Detail & Related papers (2021-10-10T03:55:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.