Bayesian active learning for production, a systematic study and a
reusable library
- URL: http://arxiv.org/abs/2006.09916v1
- Date: Wed, 17 Jun 2020 14:51:11 GMT
- Title: Bayesian active learning for production, a systematic study and a
reusable library
- Authors: Parmida Atighehchian, Frédéric Branchaud-Charron, Alexandre Lacoste
- Abstract summary: In this paper, we analyse the main drawbacks of current active learning techniques.
We do a systematic study on the effects of the most common issues of real-world datasets on the deep active learning process.
We derive two techniques that can speed up the active learning loop: partial uncertainty sampling and a larger query size.
- Score: 85.32971950095742
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Active learning is able to reduce the amount of labelling effort by using a
machine learning model to query the user for specific inputs.
While there are many papers on new active learning techniques, these
techniques rarely satisfy the constraints of a real-world project. In this
paper, we analyse the main drawbacks of current active learning techniques and
we present approaches to alleviate them. We do a systematic study on the
effects of the most common issues of real-world datasets on the deep active
learning process: model convergence, annotation error, and dataset imbalance.
We derive two techniques that can speed up the active learning loop: partial
uncertainty sampling and a larger query size. Finally, we present our
open-source Bayesian active learning library, BaaL.
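To make the kind of Bayesian acquisition step the abstract describes concrete, here is a minimal NumPy sketch of BALD-style uncertainty scoring over Monte Carlo dropout predictions. This is an illustration only, not the BaaL API: the function names and the `(n_samples, n_mc, n_classes)` array layout are assumptions made for the example.

```python
import numpy as np

def bald_scores(probs):
    """BALD mutual information from MC-dropout predictions.

    probs: array of shape (n_samples, n_mc, n_classes) holding the
    class probabilities from n_mc stochastic forward passes.
    """
    eps = 1e-12
    # Entropy of the mean prediction (total uncertainty).
    mean_p = probs.mean(axis=1)
    entropy_of_mean = -(mean_p * np.log(mean_p + eps)).sum(axis=1)
    # Mean entropy of each stochastic prediction (aleatoric part).
    mean_entropy = -(probs * np.log(probs + eps)).sum(axis=2).mean(axis=1)
    # Their difference is the epistemic (model) uncertainty.
    return entropy_of_mean - mean_entropy

def select_queries(probs, query_size):
    """Indices of the query_size most informative pool examples."""
    scores = bald_scores(probs)
    return np.argsort(-scores)[:query_size]
```

An example whose MC passes disagree (the model is uncertain about its own parameters) gets a high score, while a confidently and consistently classified example scores near zero.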
Related papers
- regAL: Python Package for Active Learning of Regression Problems [0.0]
We present our Python package regAL, which allows users to evaluate different active learning strategies for regression problems.
arXiv Detail & Related papers (2024-10-23T14:34:36Z)
- Compute-Efficient Active Learning [0.0]
Active learning aims at reducing labeling costs by selecting the most informative samples from an unlabeled dataset.
Traditional active learning process often demands extensive computational resources, hindering scalability and efficiency.
We present a novel method designed to alleviate the computational burden associated with active learning on massive datasets.
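One common way to reduce this computational burden, in the same spirit as the sub-pool sampling ideas above, is to score only a random subset of the unlabelled pool each round instead of every example. A minimal, library-agnostic sketch (all names here are illustrative and not taken from any of the cited packages):

```python
import random

def subset_query(pool, score_fn, subset_size, query_size, seed=0):
    """Score a random subset of the pool rather than all of it,
    then return the query_size highest-scoring candidates."""
    rng = random.Random(seed)
    candidates = rng.sample(pool, min(subset_size, len(pool)))
    ranked = sorted(candidates, key=score_fn, reverse=True)
    return ranked[:query_size]
```

Scoring cost drops from O(|pool|) model evaluations per round to O(subset_size), at the price of possibly missing the globally most informative example in a given round.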
arXiv Detail & Related papers (2024-01-15T12:32:07Z)
- Model Uncertainty based Active Learning on Tabular Data using Boosted Trees [0.4667030429896303]
Supervised machine learning relies on the availability of good labelled data for model training.
Active learning is a sub-field of machine learning which helps in obtaining the labelled data efficiently.
arXiv Detail & Related papers (2023-10-30T14:29:53Z)
- Deep Active Learning for Computer Vision: Past and Future [50.19394935978135]
Despite its indispensable role in developing AI models, research on active learning is less intensive than in other research directions.
By addressing data automation challenges and coping with automated machine learning systems, active learning will facilitate democratization of AI technologies.
arXiv Detail & Related papers (2022-11-27T13:07:14Z)
- Responsible Active Learning via Human-in-the-loop Peer Study [88.01358655203441]
We propose a responsible active learning method, namely Peer Study Learning (PSL), to simultaneously preserve data privacy and improve model stability.
We first introduce a human-in-the-loop teacher-student architecture that isolates unlabelled data from the task learner (teacher) on the cloud side.
During training, the task learner instructs the lightweight active learner, which then provides feedback on the active sampling criterion.
arXiv Detail & Related papers (2022-11-24T13:18:27Z)
- What Makes Good Contrastive Learning on Small-Scale Wearable-based Tasks? [59.51457877578138]
We study contrastive learning on the wearable-based activity recognition task.
This paper presents an open-source PyTorch library, CL-HAR, which can serve as a practical tool for researchers.
arXiv Detail & Related papers (2022-02-12T06:10:15Z)
- Mind Your Outliers! Investigating the Negative Impact of Outliers on Active Learning for Visual Question Answering [71.15403434929915]
We show that across 5 models and 4 datasets on the task of visual question answering, a wide variety of active learning approaches fail to outperform random selection.
We identify the problem as collective outliers -- groups of examples that active learning methods prefer to acquire but models fail to learn.
We show that active learning sample efficiency increases significantly as the number of collective outliers in the active learning pool decreases.
arXiv Detail & Related papers (2021-07-06T00:52:11Z)
- Efficacy of Bayesian Neural Networks in Active Learning [11.609770399591516]
We show that Bayesian neural networks are more efficient than ensemble based techniques in capturing uncertainty.
Our findings also reveal some key drawbacks of ensemble techniques, which were recently shown to be more effective than Monte Carlo dropout.
arXiv Detail & Related papers (2021-04-02T06:02:11Z)
- Active Learning for Sequence Tagging with Deep Pre-trained Models and Bayesian Uncertainty Estimates [52.164757178369804]
Recent advances in transfer learning for natural language processing in conjunction with active learning open the possibility to significantly reduce the necessary annotation budget.
We conduct an empirical study of various Bayesian uncertainty estimation methods and Monte Carlo dropout options for deep pre-trained models in the active learning framework.
We also demonstrate that to acquire instances during active learning, a full-size Transformer can be substituted with a distilled version, which yields better computational performance.
arXiv Detail & Related papers (2021-01-20T13:59:25Z)
- Deep Bayesian Active Learning, A Brief Survey on Recent Advances [6.345523830122166]
Active learning starts by training the model on a small amount of labelled data.
Standard deep learning methods are not capable of representing or manipulating model uncertainty.
Deep Bayesian active learning frameworks address this by incorporating model uncertainty into the model in a practical way.
arXiv Detail & Related papers (2020-12-15T02:06:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.