Related papers: Algorithm Selection for Deep Active Learning with Imbalanced Datasets

URL: http://arxiv.org/abs/2302.07317v3
Date: Thu, 2 Nov 2023 20:55:23 GMT
Title: Algorithm Selection for Deep Active Learning with Imbalanced Datasets
Authors: Jifan Zhang, Shuai Shao, Saurabh Verma, Robert Nowak
Abstract summary: Active learning aims to reduce the number of labeled examples needed to train deep networks. It is difficult to know in advance which active learning strategy will perform well or best in a given application. We propose the first adaptive algorithm selection strategy for deep active learning.
Score: 11.902019233549474
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Label efficiency has become an increasingly important objective in deep learning applications. Active learning aims to reduce the number of labeled examples needed to train deep networks, but the empirical performance of active learning algorithms can vary dramatically across datasets and applications. It is difficult to know in advance which active learning strategy will perform well or best in a given application. To address this, we propose the first adaptive algorithm selection strategy for deep active learning. For any unlabeled dataset, our (meta) algorithm TAILOR (Thompson ActIve Learning algORithm selection) iteratively and adaptively chooses among a set of candidate active learning algorithms. TAILOR uses novel reward functions aimed at gathering class-balanced examples. Extensive experiments in multi-class and multi-label applications demonstrate TAILOR's effectiveness in achieving accuracy comparable to or better than that of the best of the candidate algorithms. Our implementation of TAILOR is open-sourced at https://github.com/jifanz/TAILOR.
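
Illustrative sketch: the abstract describes choosing among candidate active learning algorithms with Thompson sampling, guided by rewards that favor class-balanced labels, but it does not give the reward functions or the update rule. The Python below is therefore only a generic Beta-Bernoulli Thompson sampling loop under stated assumptions, not the authors' implementation. The names `thompson_select` and `run_selection`, the simulated `minority_rates`, and the binary reward (1 when a queried example turns out to be from a minority class) are all hypothetical.

```python
import random


def thompson_select(successes, failures, rng):
    """Draw one sample from each candidate's Beta posterior; return the index of the largest draw."""
    draws = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=draws.__getitem__)


def run_selection(minority_rates, rounds, seed=0):
    """Simulate selection among candidates.

    minority_rates[i] is an assumed probability that candidate i's query
    returns a minority-class example (a stand-in for a class-balance reward).
    Returns how many times each candidate was chosen.
    """
    rng = random.Random(seed)
    k = len(minority_rates)
    successes = [0] * k
    failures = [0] * k
    picks = [0] * k
    for _ in range(rounds):
        arm = thompson_select(successes, failures, rng)
        picks[arm] += 1
        # Hypothetical binary reward: did the queried label fall in a minority class?
        if rng.random() < minority_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return picks


if __name__ == "__main__":
    # Three simulated candidates; the third finds minority examples most often.
    print(run_selection([0.05, 0.10, 0.60], rounds=2000))
```

In this simulation the candidate with the highest assumed reward rate ends up chosen most often, while the others still receive occasional exploratory picks. A real system would replace the simulated reward with one computed from the labels actually returned by the annotator.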

Related papers

AutoAL: Automated Active Learning with Differentiable Query Strategy Search [18.23964720426325]
This work presents the first differentiable active learning strategy search method, named AutoAL. For any given task, SearchNet and FitNet are iteratively co-optimized using the labeled data, learning how well a set of candidate AL algorithms perform on that task. AutoAL consistently achieves superior accuracy compared to all candidate AL algorithms and other selective AL approaches.
arXiv Detail & Related papers (2024-10-17T17:59:09Z)
Learning from the Best: Active Learning for Wireless Communications [9.523381807291049]
Active learning algorithms identify the most critical and informative samples in an unlabeled dataset and label only those samples, instead of the complete set. We present a case study of deep learning-based mmWave beam selection, where labeling is performed by a compute-intensive algorithm based on exhaustive search. Our results show that using an active learning algorithm for class-imbalanced datasets can reduce labeling overhead by up to 50% for this dataset.
arXiv Detail & Related papers (2024-01-23T12:21:57Z)
BAL: Balancing Diversity and Novelty for Active Learning [53.289700543331925]
We introduce a novel framework, Balancing Active Learning (BAL), which constructs adaptive sub-pools to balance diverse and uncertain data. Our approach outperforms all established active learning methods on widely recognized benchmarks by 1.20%.
arXiv Detail & Related papers (2023-12-26T08:14:46Z)
ALBench: A Framework for Evaluating Active Learning in Object Detection [102.81795062493536]
This paper contributes an active learning benchmark framework named ALBench for evaluating active learning in object detection. Developed on an automatic deep model training system, the ALBench framework is easy to use, compatible with different active learning algorithms, and ensures the same training and testing protocols.
arXiv Detail & Related papers (2022-07-27T07:46:23Z)
Exploiting Diversity of Unlabeled Data for Label-Efficient Semi-Supervised Active Learning [57.436224561482966]
Active learning is a research area that addresses the issues of expensive labeling by selecting the most important samples for labeling. We introduce a new diversity-based initial dataset selection algorithm to select the most informative set of samples for initial labeling in the active learning setting. Also, we propose a novel active learning query strategy, which uses diversity-based sampling on consistency-based embeddings.
arXiv Detail & Related papers (2022-07-25T16:11:55Z)
Meta Navigator: Search for a Good Adaptation Policy for Few-shot Learning [113.05118113697111]
Few-shot learning aims to adapt knowledge learned from previous tasks to novel tasks with only a limited amount of labeled data. Research literature on few-shot learning exhibits great diversity, while different algorithms often excel at different few-shot learning scenarios. We present Meta Navigator, a framework that attempts to solve the limitation in few-shot learning by seeking a higher-level strategy.
arXiv Detail & Related papers (2021-09-13T07:20:01Z)
Probabilistic Active Learning for Active Class Selection [3.6471065658293043]
In machine learning, active class selection (ACS) algorithms aim to actively select a class and ask the oracle to provide an instance for that class. We propose a new algorithm (PAL-ACS) that transforms the ACS problem into an active learning task by introducing pseudo instances.
arXiv Detail & Related papers (2021-08-09T09:20:19Z)
Towards Understanding the Behaviors of Optimal Deep Active Learning Algorithms [19.65665942630067]
Active learning (AL) algorithms may achieve better performance with less data because the model guides the data selection process. There has been little study of what the optimal AL algorithm looks like, which would help researchers understand where their models fall short. We present a simulated annealing algorithm to search for this optimal oracle and analyze it for several tasks.
arXiv Detail & Related papers (2020-12-29T22:56:42Z)
Rebuilding Trust in Active Learning with Actionable Metrics [77.99796068970569]
Active Learning (AL) is an active domain of research, but is seldom used in industry despite the pressing needs. This is in part due to a misalignment of objectives, as research strives to get the best results on selected datasets. We present various actionable metrics to help rebuild the trust of industrial practitioners in Active Learning.
arXiv Detail & Related papers (2020-12-18T09:34:59Z)
Learning active learning at the crossroads? evaluation and discussion [0.03807314298073299]
Active learning aims to reduce annotation cost by predicting which samples are useful for a human expert to label. There is no best active learning strategy that consistently outperforms all others in all applications. We present the results of a benchmark performed on 20 datasets that compares a strategy learned using a recent meta-learning algorithm with margin sampling.
arXiv Detail & Related papers (2020-12-16T10:35:43Z)
Semi-supervised Batch Active Learning via Bilevel Optimization [89.37476066973336]
We formulate our approach as a data summarization problem via bilevel optimization. We show that our method is highly effective in keyword detection tasks in the regime where only a few labeled samples are available.
arXiv Detail & Related papers (2020-10-19T16:53:24Z)
