Semi-supervised Batch Active Learning via Bilevel Optimization
- URL: http://arxiv.org/abs/2010.09654v1
- Date: Mon, 19 Oct 2020 16:53:24 GMT
- Title: Semi-supervised Batch Active Learning via Bilevel Optimization
- Authors: Zalán Borsos, Marco Tagliasacchi, Andreas Krause
- Abstract summary: We formulate our approach as a data summarization problem via bilevel optimization.
We show that our method is highly effective in keyword detection tasks in the regime where only a few labeled samples are available.
- Score: 89.37476066973336
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Active learning is an effective technique for reducing the labeling cost by
improving data efficiency. In this work, we propose a novel batch acquisition
strategy for active learning in the setting where the model training is
performed in a semi-supervised manner. We formulate our approach as a data
summarization problem via bilevel optimization, where the queried batch
consists of the points that best summarize the unlabeled data pool. We show
that our method is highly effective in keyword detection tasks in the regime
where only a few labeled samples are available.
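To make the bilevel formulation concrete, here is a minimal sketch, assuming a greedy approximation with a scikit-learn logistic-regression inner solver and model-generated pseudo-labels (all illustrative choices; the paper's actual solver, objective, and semi-supervised training differ). The inner problem fits a model on the labeled set plus a candidate batch; the outer problem scores how well that model explains the rest of the unlabeled pool. The sketch assumes class labels 0..C-1 all appear in the labeled set.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def outer_loss(model, X_pool, pseudo_y):
    # Outer objective: cross-entropy of the inner model on the whole pool,
    # against fixed pseudo-labels, as a proxy for "summarizes the pool well".
    proba = model.predict_proba(X_pool)
    return -np.mean(np.log(proba[np.arange(len(pseudo_y)), pseudo_y] + 1e-12))

def select_batch(X_lab, y_lab, X_pool, pseudo_y, batch_size):
    # Greedy outer loop: grow the batch one point at a time, re-solving the
    # inner problem (a full model fit) for every remaining candidate.
    batch = []
    for _ in range(batch_size):
        best_i, best_loss = None, np.inf
        for i in range(len(X_pool)):
            if i in batch:
                continue
            idx = batch + [i]
            # Inner problem: train on labeled data plus the candidate batch,
            # with pseudo-labels standing in for the candidates' unknown labels.
            X_in = np.vstack([X_lab, X_pool[idx]])
            y_in = np.concatenate([y_lab, pseudo_y[idx]])
            model = LogisticRegression(max_iter=200).fit(X_in, y_in)
            loss = outer_loss(model, X_pool, pseudo_y)
            if loss < best_loss:
                best_i, best_loss = i, loss
        batch.append(best_i)
    return batch  # pool indices to send to the labeling oracle
```

Even this toy version shows the key property of the bilevel view: a batch is scored by what a model trained on it does to the entire pool, not by per-point heuristics alone.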
Related papers
- Language Model-Driven Data Pruning Enables Efficient Active Learning [6.816044132563518]
We introduce ActivePrune, a plug-and-play strategy for pruning the unlabeled pool.
To enhance the diversity in the unlabeled pool, we propose a novel perplexity reweighting method.
Experiments on translation, sentiment analysis, topic classification, and summarization tasks demonstrate that ActivePrune outperforms existing data pruning methods.
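As a rough illustration of perplexity-based pruning with a diversity-preserving reweighting, here is a hypothetical sketch; `perplexity_fn` stands in for a small language-model scorer, and the quantile binning and inverse-frequency weights are guesses at the spirit of the method, not ActivePrune's actual procedure:

```python
import numpy as np

def prune_pool(texts, perplexity_fn, keep_frac=0.25, n_bins=10, seed=0):
    rng = np.random.default_rng(seed)
    ppl = np.array([perplexity_fn(t) for t in texts])  # LM perplexity per example
    # Bucket examples by perplexity quantile, then upweight rare buckets so
    # pruning does not collapse the pool onto one perplexity range.
    edges = np.quantile(ppl, np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.digitize(ppl, edges)
    counts = np.bincount(bins, minlength=n_bins)
    weights = 1.0 / counts[bins]          # rarer perplexity bins get higher weight
    weights /= weights.sum()
    n_keep = int(keep_frac * len(texts))
    keep = rng.choice(len(texts), size=n_keep, replace=False, p=weights)
    return keep                           # indices of examples kept in the pool
```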
arXiv Detail & Related papers (2024-10-05T19:46:11Z)
- Compute-Efficient Active Learning [0.0]
Active learning aims at reducing labeling costs by selecting the most informative samples from an unlabeled dataset.
The traditional active learning process often demands extensive computational resources, hindering scalability and efficiency.
We present a novel method designed to alleviate the computational burden associated with active learning on massive datasets.
arXiv Detail & Related papers (2024-01-15T12:32:07Z)
- BAL: Balancing Diversity and Novelty for Active Learning [53.289700543331925]
We introduce a novel framework, Balancing Active Learning (BAL), which constructs adaptive sub-pools to balance diverse and uncertain data.
Our approach outperforms all established active learning methods on widely recognized benchmarks by 1.20%.
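A loose sketch of balancing the two criteria when forming a batch, assuming an entropy-based uncertainty half and a k-means-based diversity half (the even split and the clustering choice are illustrative, not BAL's adaptive sub-pool construction):

```python
import numpy as np
from sklearn.cluster import KMeans

def balanced_batch(embeddings, probs, batch_size, seed=0):
    # Uncertainty sub-pool: points with the highest predictive entropy.
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    n_unc = batch_size // 2
    uncertain = np.argsort(-entropy)[:n_unc]
    # Diversity sub-pool: the point nearest each k-means centroid,
    # chosen from the remaining candidates.
    rest = np.setdiff1d(np.arange(len(embeddings)), uncertain)
    km = KMeans(n_clusters=batch_size - n_unc, n_init=10,
                random_state=seed).fit(embeddings[rest])
    diverse = [rest[np.argmin(np.linalg.norm(embeddings[rest] - c, axis=1))]
               for c in km.cluster_centers_]
    return np.concatenate([uncertain, np.array(diverse)])
```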
arXiv Detail & Related papers (2023-12-26T08:14:46Z)
- Learning to Rank for Active Learning via Multi-Task Bilevel Optimization [29.207101107965563]
We propose a novel approach for active learning, which aims to select batches of unlabeled instances through a learned surrogate model for data acquisition.
A key challenge in this approach is developing an acquisition function that generalizes well, as the history of data, which forms part of the utility function's input, grows over time.
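A plausible miniature of the learned-surrogate idea, with an assumed feature set and an off-the-shelf regressor standing in for the paper's multi-task bilevel training:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def rank_pool(history_feats, history_gain, pool_feats, batch_size):
    # history_feats: features of previously queried points (e.g., entropy, margin)
    # history_gain:  observed utility of labeling them (e.g., validation-loss drop)
    surrogate = GradientBoostingRegressor().fit(history_feats, history_gain)
    scores = surrogate.predict(pool_feats)     # predicted utility per pool point
    return np.argsort(-scores)[:batch_size]    # top-ranked batch to query
```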
arXiv Detail & Related papers (2023-10-25T22:50:09Z)
- DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection [72.25697820290502]
This work introduces a straightforward and efficient strategy to identify potential novel classes through zero-shot classification.
We refer to this approach as the self-training strategy, which enhances recall and accuracy for novel classes without requiring extra annotations, datasets, or re-training.
Empirical evaluations on three datasets, including LVIS, V3Det, and COCO, demonstrate significant improvements over the baseline performance.
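A hypothetical sketch of mining novel-class pseudo-labels via zero-shot classification; `embed_image_region` and `embed_text` stand in for a CLIP-style encoder pair and are assumptions, not DST-Det components:

```python
import numpy as np

def mine_novel_pseudo_labels(proposals, novel_classes,
                             embed_image_region, embed_text, thresh=0.9):
    # Text embeddings of novel class names, assumed L2-normalized: (C, d).
    text_emb = np.stack([embed_text(c) for c in novel_classes])
    pseudo = []
    for box in proposals:
        v = embed_image_region(box)      # (d,), assumed L2-normalized
        sims = text_emb @ v              # cosine similarity to each class name
        c = int(np.argmax(sims))
        if sims[c] > thresh:             # keep only confident zero-shot hits
            pseudo.append((box, novel_classes[c]))
    return pseudo  # used as extra targets in the next self-training round
```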
arXiv Detail & Related papers (2023-10-02T17:52:24Z)
- Temporal Output Discrepancy for Loss Estimation-based Active Learning [65.93767110342502]
We present a novel deep active learning approach that queries the oracle for annotation when an unlabeled sample is believed to incur a high loss.
Our approach outperforms state-of-the-art active learning methods on image classification and semantic segmentation tasks.
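One simple reading of loss-estimation-based querying is to rank samples by how much the model's output changes between two training checkpoints, as a proxy for their loss (a sketch of the idea, not the paper's exact estimator):

```python
import numpy as np

def query_by_output_discrepancy(outputs_t, outputs_t_plus, batch_size):
    # outputs_t, outputs_t_plus: (N, C) model outputs at two checkpoints.
    # A large output change between steps suggests the sample's loss is high.
    discrepancy = np.linalg.norm(outputs_t_plus - outputs_t, axis=1)
    return np.argsort(-discrepancy)[:batch_size]  # indices to send to the oracle
```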
arXiv Detail & Related papers (2022-12-20T19:29:37Z)
- An Efficient Active Learning Pipeline for Legal Text Classification [2.462514989381979]
We propose a pipeline for effectively using active learning with pre-trained language models in the legal domain.
We use knowledge distillation to guide the model's embeddings to a semantically meaningful space.
Our experiments on Contract-NLI, adapted to the classification task, and LEDGAR benchmarks show that our approach outperforms standard AL strategies.
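A minimal sketch of embedding distillation, assuming a standard cosine-alignment loss between a trainable student and a frozen teacher (the paper's pipeline and loss may differ):

```python
import torch
import torch.nn.functional as F

def distill_embedding_loss(student_emb, teacher_emb):
    # Pull each student embedding toward the frozen teacher's embedding,
    # keeping the space semantically meaningful during adaptation.
    s = F.normalize(student_emb, dim=-1)
    t = F.normalize(teacher_emb.detach(), dim=-1)  # teacher is not updated
    return (1.0 - (s * t).sum(dim=-1)).mean()      # 1 - cosine similarity
```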
arXiv Detail & Related papers (2022-11-15T13:07:02Z)
- An Embarrassingly Simple Approach to Semi-Supervised Few-Shot Learning [58.59343434538218]
We propose a simple but quite effective approach to predict accurate negative pseudo-labels of unlabeled data from an indirect learning perspective.
Our approach can be implemented in just a few lines of code using only off-the-shelf operations.
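Since the abstract emphasizes a few-lines implementation from off-the-shelf operations, here is one plausible reading of negative pseudo-labeling: the k least-likely classes per sample become "not this class" targets, trained with a complementary log-loss (k and the loss form are assumptions, not the paper's exact recipe):

```python
import torch

def negative_pseudo_label_loss(logits, k=3):
    # Assumes k is smaller than the number of classes.
    probs = torch.softmax(logits, dim=-1)
    # Negative pseudo-labels: the k lowest-probability classes per sample.
    neg_idx = probs.topk(k, dim=-1, largest=False).indices
    p_neg = probs.gather(-1, neg_idx)
    # Push the probability of the negative classes toward zero: -log(1 - p).
    return -torch.log(1.0 - p_neg + 1e-12).mean()
```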
arXiv Detail & Related papers (2022-09-28T02:11:34Z)
- Improving Robustness and Efficiency in Active Learning with Contrastive Loss [13.994967246046008]
This paper introduces supervised contrastive active learning (SCAL) by leveraging the contrastive loss for active learning in a supervised setting.
We propose efficient query strategies in active learning to select unbiased and informative data samples of diverse feature representations.
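SCAL builds on the supervised contrastive loss; below is a compact PyTorch sketch of that underlying loss, following the common SupCon formulation with an illustrative temperature (not SCAL's full query strategy):

```python
import torch
import torch.nn.functional as F

def supcon_loss(features, labels, temperature=0.1):
    # features: (N, d) embeddings; labels: (N,) integer class labels.
    z = F.normalize(features, dim=-1)
    sim = z @ z.T / temperature                        # pairwise similarities
    mask_self = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask_self, float('-inf'))    # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # Positives: pairs sharing a label (excluding each point with itself).
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~mask_self
    per_anchor = -(log_prob.masked_fill(~pos, 0.0)).sum(1) / pos.sum(1).clamp(min=1)
    return per_anchor.mean()
```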
arXiv Detail & Related papers (2021-09-13T21:09:21Z)
- Boosting Weakly Supervised Object Detection via Learning Bounding Box Adjusters [76.36104006511684]
Weakly-supervised object detection (WSOD) has emerged as an inspiring recent topic to avoid expensive instance-level object annotations.
We address the problem setting of improving localization performance by leveraging bounding box regression knowledge from a well-annotated auxiliary dataset.
Our method performs favorably against state-of-the-art WSOD methods and knowledge-transfer models with a similar problem setting.
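An illustrative sketch of a class-agnostic box adjuster that could be trained on the well-annotated auxiliary dataset and then applied to refine coarse WSOD boxes; the MLP head and offset parameterization are assumptions, not the paper's design:

```python
import torch
import torch.nn as nn

class BoxAdjuster(nn.Module):
    def __init__(self, feat_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(),
                                 nn.Linear(256, 4))

    def forward(self, roi_feats):
        # Predicts (dx, dy, dw, dh) offsets shifting a coarse box onto the object.
        return self.mlp(roi_feats)

def refine(boxes, offsets):
    # boxes: (N, 4) as (x, y, w, h); standard box-delta decoding.
    x, y, w, h = boxes.unbind(-1)
    dx, dy, dw, dh = offsets.unbind(-1)
    return torch.stack([x + dx * w, y + dy * h,
                        w * dw.exp(), h * dh.exp()], dim=-1)
```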
arXiv Detail & Related papers (2021-08-03T13:38:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.