A Comparative Survey of Deep Active Learning
- URL: http://arxiv.org/abs/2203.13450v1
- Date: Fri, 25 Mar 2022 05:17:24 GMT
- Title: A Comparative Survey of Deep Active Learning
- Authors: Xueying Zhan, Qingzhong Wang, Kuan-hao Huang, Haoyi Xiong, Dejing Dou,
Antoni B. Chan
- Abstract summary: Active Learning (AL) is a set of techniques for reducing labeling cost by sequentially selecting data samples from a large unlabeled data pool for labeling.
Deep Learning (DL) is data-hungry, and the performance of DL models scales monotonically with more training data.
In recent years, Deep Active Learning (DAL) has risen as a feasible solution for maximizing model performance while minimizing the expensive labeling cost.
- Score: 76.04825433362709
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Active Learning (AL) is a set of techniques for reducing labeling cost by
sequentially selecting data samples from a large unlabeled data pool for
labeling. Meanwhile, Deep Learning (DL) is data-hungry, and the performance of
DL models scales monotonically with more training data. Therefore, in recent
years, Deep Active Learning (DAL) has risen as a feasible solution for
maximizing model performance while minimizing the expensive labeling cost.
Abundant methods have been proposed, and literature reviews of DAL have been
presented before. However, the performance comparison of different branches of
DAL methods under various tasks is still insufficient, and our work fills this
gap. In this paper, we survey and categorize DAL-related work and construct
comparative experiments across frequently used datasets and DAL algorithms.
Additionally, we explore some factors (e.g., batch size, number of epochs in
the training process) that influence the efficacy of DAL, which provides useful
references for researchers to design their own DAL experiments or carry out
DAL-related applications. We construct a DAL toolkit, DeepAL+, by
re-implementing many highly-cited DAL-related methods, and it will be released
to the public.
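As a concrete illustration of the pool-based selection loop described in this abstract, the sketch below shows a minimal active learning loop with entropy-based uncertainty sampling, one common query strategy. The dataset, model, batch size, and number of rounds are placeholder assumptions for illustration only, not the survey's experimental setup or the DeepAL+ implementation.

    # Minimal pool-based active learning loop (illustrative sketch only).
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

    # Small labeled seed set; everything else forms the unlabeled pool.
    labeled = list(rng.choice(len(X), size=20, replace=False))
    pool = [i for i in range(len(X)) if i not in set(labeled)]

    BATCH_SIZE, N_ROUNDS = 10, 5          # placeholder labeling budget
    model = LogisticRegression(max_iter=1000)

    for _ in range(N_ROUNDS):
        model.fit(X[labeled], y[labeled])
        probs = model.predict_proba(X[pool])
        # Predictive entropy: higher entropy = more uncertain sample.
        entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
        query = np.argsort(entropy)[-BATCH_SIZE:]
        # Move queried samples to the labeled set (the oracle is simulated
        # here, since ground-truth labels are already available).
        for q in sorted(query.tolist(), reverse=True):
            labeled.append(pool.pop(q))

    model.fit(X[labeled], y[labeled])     # final refit on all labels so far
    print(f"final labeled set size: {len(labeled)}")

In a real deployment, the simulated oracle would be replaced by human annotators, and the entropy rule by any of the DAL query strategies compared in the paper.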
Related papers
- LLMaAA: Making Large Language Models as Active Annotators [32.57011151031332]
We propose LLMaAA, which takes large language models as annotators and puts them into an active learning loop to determine what to annotate efficiently.
We conduct experiments and analysis on two classic NLP tasks, named entity recognition and relation extraction.
With LLMaAA, task-specific models trained from LLM-generated labels can outperform the teacher within only hundreds of annotated examples.
arXiv Detail & Related papers (2023-10-30T14:54:15Z)
- Learning Objective-Specific Active Learning Strategies with Attentive Neural Processes [72.75421975804132]
Learning Active Learning (LAL) proposes learning the active learning strategy itself, allowing it to adapt to the given setting.
We propose a novel LAL method for classification that exploits symmetry and independence properties of the active learning problem.
Our approach is based on learning from a myopic oracle, which gives our model the ability to adapt to non-standard objectives.
arXiv Detail & Related papers (2023-09-11T14:16:37Z)
- An Empirical Study on the Efficacy of Deep Active Learning for Image Classification [11.398892277968427]
Deep Active Learning (DAL) has been advocated as a promising method to reduce labeling costs in supervised learning.
Existing evaluations of DAL methods are based on different settings, and their results are contradictory.
This paper comprehensively evaluates 19 existing DAL methods in a uniform setting.
arXiv Detail & Related papers (2022-11-30T17:44:59Z)
- TiDAL: Learning Training Dynamics for Active Learning [10.832194534164142]
We propose Training Dynamics for Active Learning (TiDAL) to quantify the uncertainties of unlabeled data.
Since tracking the training dynamics (TD) of all the large-scale unlabeled data is impractical, TiDAL utilizes an additional prediction module that learns the TD of labeled data.
Our TiDAL achieves better or comparable performance on both balanced and imbalanced benchmark datasets compared to state-of-the-art AL methods.
arXiv Detail & Related papers (2022-10-13T06:54:50Z)
- Pareto Optimization for Active Learning under Out-of-Distribution Data Scenarios [79.02009938011447]
We propose a sampling scheme that selects optimal subsets of unlabeled samples, with a fixed batch size, from the unlabeled data pool.
Experimental results show its effectiveness on both classical Machine Learning (ML) and Deep Learning (DL) tasks.
arXiv Detail & Related papers (2022-07-04T04:11:44Z)
- Collaborative Intelligence Orchestration: Inconsistency-Based Fusion of Semi-Supervised Learning and Active Learning [60.26659373318915]
Active learning (AL) and semi-supervised learning (SSL) are two effective, but often isolated, means to alleviate the data-hungry problem.
We propose an innovative inconsistency-based virtual aDvErsarial algorithm to further investigate SSL-AL's potential superiority.
Two real-world case studies demonstrate the practical industrial value of applying and deploying the proposed data sampling algorithm.
arXiv Detail & Related papers (2022-06-07T13:28:43Z)
- A Lagrangian Duality Approach to Active Learning [119.36233726867992]
We consider the batch active learning problem, where only a subset of the training data is labeled.
We formulate the learning problem using constrained optimization, where each constraint bounds the performance of the model on labeled samples.
We show, via numerical experiments, that our proposed approach performs similarly to or better than state-of-the-art active learning methods.
arXiv Detail & Related papers (2022-02-08T19:18:49Z)
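As a rough, generic illustration of the constrained formulation described in the entry above (the notation, the objective R, and the threshold epsilon are our illustrative assumptions, not necessarily the paper's exact formulation), such a problem might be written as:

    \min_{\theta} \; R(\theta)
    \quad \text{s.t.} \quad \ell\big(f_\theta(x_i), y_i\big) \le \epsilon,
    \qquad i = 1, \dots, n,

with Lagrangian relaxation

    L(\theta, \lambda) = R(\theta)
      + \sum_{i=1}^{n} \lambda_i \big( \ell(f_\theta(x_i), y_i) - \epsilon \big),
    \qquad \lambda_i \ge 0,

where \ell is the per-sample loss on the n labeled examples, \epsilon bounds the allowed loss per sample, and R(\theta) is an auxiliary training objective. Samples whose constraints are hard to satisfy receive large optimal multipliers \lambda_i, which is one plausible way a duality-based method can score sample informativeness.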
- ALdataset: a benchmark for pool-based active learning [1.9308522511657449]
Active learning (AL) is a subfield of machine learning (ML) in which a learning algorithm can achieve good accuracy with fewer training samples by interactively querying a user/oracle to label new data points.
Pool-based AL is well-motivated in many ML tasks where unlabeled data is abundant but labels are hard to obtain.
We present experimental results for various active learning strategies, both recently proposed and classic highly-cited methods, and draw insights from the results.
arXiv Detail & Related papers (2020-10-16T04:37:29Z)
- A Survey of Deep Active Learning [54.376820959917005]
Active learning (AL) attempts to maximize a model's performance gain while labeling the fewest samples possible.
Deep learning (DL) is greedy for data and requires large amounts of data to optimize its massive number of parameters.
Deep active learning (DAL), the combination of the two, has emerged.
arXiv Detail & Related papers (2020-08-30T04:28:31Z)