Bayesian active learning for production, a systematic study and a
reusable library
- URL: http://arxiv.org/abs/2006.09916v1
- Date: Wed, 17 Jun 2020 14:51:11 GMT
- Title: Bayesian active learning for production, a systematic study and a
reusable library
- Authors: Parmida Atighehchian, Frédéric Branchaud-Charron, Alexandre Lacoste
- Abstract summary: In this paper, we analyse the main drawbacks of current active learning techniques.
We do a systematic study on the effects of the most common issues of real-world datasets on the deep active learning process.
We derive two techniques that can speed up the active learning loop: partial uncertainty sampling and a larger query size.
- Score: 85.32971950095742
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Active learning is able to reduce the amount of labelling effort by using a
machine learning model to query the user for specific inputs.
While there are many papers on new active learning techniques, these
techniques rarely satisfy the constraints of a real-world project. In this
paper, we analyse the main drawbacks of current active learning techniques and
we present approaches to alleviate them. We do a systematic study on the
effects of the most common issues of real-world datasets on the deep active
learning process: model convergence, annotation error, and dataset imbalance.
We derive two techniques that can speed up the active learning loop: partial
uncertainty sampling and a larger query size. Finally, we present our
open-source Bayesian active learning library, BaaL.
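To make the kind of Bayesian acquisition step the abstract describes concrete, here is a minimal NumPy sketch of BALD-style uncertainty scoring over Monte Carlo dropout predictions. This is an illustration only, not the BaaL API: the function names and the `(n_samples, n_mc, n_classes)` array layout are assumptions made for the example.

```python
import numpy as np

def bald_scores(probs):
    """BALD mutual information from MC-dropout predictions.

    probs: array of shape (n_samples, n_mc, n_classes) holding the
    class probabilities from n_mc stochastic forward passes.
    """
    eps = 1e-12
    # Entropy of the mean prediction (total uncertainty).
    mean_p = probs.mean(axis=1)
    entropy_of_mean = -(mean_p * np.log(mean_p + eps)).sum(axis=1)
    # Mean entropy of each stochastic prediction (aleatoric part).
    mean_entropy = -(probs * np.log(probs + eps)).sum(axis=2).mean(axis=1)
    # Their difference is the epistemic (model) uncertainty.
    return entropy_of_mean - mean_entropy

def select_queries(probs, query_size):
    """Indices of the query_size most informative pool examples."""
    scores = bald_scores(probs)
    return np.argsort(-scores)[:query_size]
```

An example whose MC passes disagree (the model is uncertain about its own parameters) gets a high score, while a confidently and consistently classified example scores near zero.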
Related papers
- regAL: Python Package for Active Learning of Regression Problems [0.0]
We present our Python package regAL, which allows users to evaluate different active learning strategies for regression problems.
arXiv Detail & Related papers (2024-10-23T14:34:36Z)
- Compute-Efficient Active Learning [0.0]
Active learning aims at reducing labeling costs by selecting the most informative samples from an unlabeled dataset.
Traditional active learning process often demands extensive computational resources, hindering scalability and efficiency.
We present a novel method designed to alleviate the computational burden associated with active learning on massive datasets.
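One common way to reduce this computational burden, in the same spirit as the sub-pool sampling ideas above, is to score only a random subset of the unlabelled pool each round instead of every example. A minimal, library-agnostic sketch (all names here are illustrative and not taken from any of the cited packages):

```python
import random

def subset_query(pool, score_fn, subset_size, query_size, seed=0):
    """Score a random subset of the pool rather than all of it,
    then return the query_size highest-scoring candidates."""
    rng = random.Random(seed)
    candidates = rng.sample(pool, min(subset_size, len(pool)))
    ranked = sorted(candidates, key=score_fn, reverse=True)
    return ranked[:query_size]
```

Scoring cost drops from O(|pool|) model evaluations per round to O(subset_size), at the price of possibly missing the globally most informative example in a given round.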
arXiv Detail & Related papers (2024-01-15T12:32:07Z)
- Model Uncertainty based Active Learning on Tabular Data using Boosted Trees [0.4667030429896303]
Supervised machine learning relies on the availability of good labelled data for model training.
Active learning is a sub-field of machine learning which helps in obtaining the labelled data efficiently.
arXiv Detail & Related papers (2023-10-30T14:29:53Z)
- Deep Active Learning for Computer Vision: Past and Future [50.19394935978135]
Despite its indispensable role in developing AI models, research on active learning is less intensive than in other research directions.
By addressing data automation challenges and coping with automated machine learning systems, active learning will facilitate democratization of AI technologies.
arXiv Detail & Related papers (2022-11-27T13:07:14Z)
- Responsible Active Learning via Human-in-the-loop Peer Study [88.01358655203441]
We propose a responsible active learning method, namely Peer Study Learning (PSL), to simultaneously preserve data privacy and improve model stability.
We first introduce a human-in-the-loop teacher-student architecture that isolates unlabelled data from the task learner (teacher) on the cloud side.
During training, the task learner instructs the lightweight active learner, which then provides feedback on the active sampling criterion.
arXiv Detail & Related papers (2022-11-24T13:18:27Z)
- What Makes Good Contrastive Learning on Small-Scale Wearable-based Tasks? [59.51457877578138]
We study contrastive learning on the wearable-based activity recognition task.
This paper presents an open-source PyTorch library, CL-HAR, which can serve as a practical tool for researchers.
arXiv Detail & Related papers (2022-02-12T06:10:15Z)
- Mind Your Outliers! Investigating the Negative Impact of Outliers on Active Learning for Visual Question Answering [71.15403434929915]
We show that across 5 models and 4 datasets on the task of visual question answering, a wide variety of active learning approaches fail to outperform random selection.
We identify the problem as collective outliers -- groups of examples that active learning methods prefer to acquire but models fail to learn.
We show that active learning sample efficiency increases significantly as the number of collective outliers in the active learning pool decreases.
arXiv Detail & Related papers (2021-07-06T00:52:11Z)
- Efficacy of Bayesian Neural Networks in Active Learning [11.609770399591516]
We show that Bayesian neural networks are more efficient than ensemble based techniques in capturing uncertainty.
Our findings also reveal some key drawbacks of ensemble techniques, which were recently shown to be more effective than Monte Carlo dropout.
arXiv Detail & Related papers (2021-04-02T06:02:11Z)
- Active Learning for Sequence Tagging with Deep Pre-trained Models and Bayesian Uncertainty Estimates [52.164757178369804]
Recent advances in transfer learning for natural language processing in conjunction with active learning open the possibility to significantly reduce the necessary annotation budget.
We conduct an empirical study of various Bayesian uncertainty estimation methods and Monte Carlo dropout options for deep pre-trained models in the active learning framework.
We also demonstrate that to acquire instances during active learning, a full-size Transformer can be substituted with a distilled version, which yields better computational performance.
arXiv Detail & Related papers (2021-01-20T13:59:25Z)
- Deep Bayesian Active Learning, A Brief Survey on Recent Advances [6.345523830122166]
Active learning starts by training the model on a small amount of labelled data.
Standard deep learning methods are not capable of representing or manipulating model uncertainty.
Deep Bayesian active learning frameworks address this by incorporating model uncertainty into the model in a practical way.
arXiv Detail & Related papers (2020-12-15T02:06:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.