Active Learning with Expected Error Reduction
- URL: http://arxiv.org/abs/2211.09283v1
- Date: Thu, 17 Nov 2022 01:02:12 GMT
- Title: Active Learning with Expected Error Reduction
- Authors: Stephen Mussmann, Julia Reisler, Daniel Tsai, Ehsan Mousavi, Shayne
O'Brien, Moises Goldszmidt
- Abstract summary: Expected Error Reduction (EER) has been shown to be an effective method for active learning.
EER requires the model to be retrained for every candidate sample.
In this paper we reformulate EER under the lens of Bayesian active learning.
- Score: 4.506537904404427
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Active learning has been studied extensively as a method for efficient data
collection. Among the many approaches in literature, Expected Error Reduction
(EER) (Roy and McCallum) has been shown to be an effective method for active
learning: select the candidate sample that, in expectation, maximally decreases
the error on an unlabeled set. However, EER requires the model to be retrained
for every candidate sample and thus has not been widely used for modern deep
neural networks due to this large computational cost. In this paper we
reformulate EER under the lens of Bayesian active learning and derive a
computationally efficient version that can use any Bayesian parameter sampling
method (such as arXiv:1506.02142). We then compare the empirical performance of
our method, using Monte Carlo dropout for parameter sampling, against
state-of-the-art methods in the deep active learning literature. Experiments are
performed on four standard benchmark datasets and three WILDS datasets
(arXiv:2012.07421). The results indicate that our method outperforms all other
methods except one in the data shift scenario: a model-dependent,
non-information-theoretic method that requires an order of magnitude higher
computational cost (arXiv:1906.03671).
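The classic EER loop described in the abstract can be sketched as follows. This is an illustrative implementation of the Roy and McCallum selection rule only, not the paper's Bayesian reformulation; the scikit-learn logistic regression model and the 1 − max-probability error proxy are assumptions made for the sketch.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def eer_select(X_lab, y_lab, X_pool, labels):
    """Classic Expected Error Reduction (Roy & McCallum): pick the pool
    point whose labeling is expected to most reduce error on the pool.
    Note the cost: the model is retrained once per (candidate, label) pair,
    which is the expense the paper's Bayesian reformulation avoids."""
    base = LogisticRegression().fit(X_lab, y_lab)
    # p(y | x) for each candidate; `labels` must match base.classes_ order
    probs = base.predict_proba(X_pool)
    best_idx, best_err = None, np.inf
    for i in range(len(X_pool)):
        exp_err = 0.0
        for j, y in enumerate(labels):
            # Hypothetically add candidate i with label y, then retrain
            X_aug = np.vstack([X_lab, X_pool[i:i + 1]])
            y_aug = np.append(y_lab, y)
            m = LogisticRegression().fit(X_aug, y_aug)
            # Error proxy on the unlabeled set: mean(1 - max predicted prob),
            # weighted by the current model's belief that y is the true label
            p = m.predict_proba(X_pool)
            exp_err += probs[i, j] * np.mean(1.0 - p.max(axis=1))
        if exp_err < best_err:
            best_idx, best_err = i, exp_err
    return best_idx
```

With a pool of N candidates and K classes this performs N·K retrainings per acquisition, which is why EER in this form is impractical for deep networks; the paper replaces the retraining with Bayesian parameter samples (e.g. Monte Carlo dropout).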
Related papers
- Querying Easily Flip-flopped Samples for Deep Active Learning [63.62397322172216]
Active learning is a machine learning paradigm that aims to improve the performance of a model by strategically selecting and querying unlabeled data.
One effective selection strategy is to base it on the model's predictive uncertainty, which can be interpreted as a measure of how informative a sample is.
This paper proposes the least disagreement metric (LDM), defined as the smallest probability of disagreement with the predicted label.
arXiv Detail & Related papers (2024-01-18T08:12:23Z)
- Towards Free Data Selection with General-Purpose Models [71.92151210413374]
A desirable data selection algorithm can efficiently choose the most informative samples to maximize the utility of limited annotation budgets.
Current approaches, represented by active learning methods, typically follow a cumbersome pipeline that iterates the time-consuming model training and batch data selection repeatedly.
The proposed method, FreeSel, bypasses the heavy batch selection process, achieving a significant improvement in efficiency: it is 530x faster than existing active learning methods.
arXiv Detail & Related papers (2023-09-29T15:50:14Z)
- Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning [22.410220040736235]
We present a theoretically optimal solution for addressing both coreset selection and active learning.
Our proposed method, COPS, is designed to minimize the expected loss of a model trained on subsampled data.
arXiv Detail & Related papers (2023-09-05T14:06:33Z)
- Temporal Output Discrepancy for Loss Estimation-based Active Learning [65.93767110342502]
We present a novel deep active learning approach that queries the oracle for data annotation when the unlabeled sample is believed to incorporate high loss.
Our approach achieves superior performance compared to state-of-the-art active learning methods on image classification and semantic segmentation tasks.
arXiv Detail & Related papers (2022-12-20T19:29:37Z)
- A Lagrangian Duality Approach to Active Learning [119.36233726867992]
We consider the batch active learning problem, where only a subset of the training data is labeled.
We formulate the learning problem using constrained optimization, where each constraint bounds the performance of the model on labeled samples.
We show, via numerical experiments, that our proposed approach performs similarly to or better than state-of-the-art active learning methods.
arXiv Detail & Related papers (2022-02-08T19:18:49Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation [57.358212277226315]
In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior, without access to the control signals generated by the demonstrator.
Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms.
This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk.
We propose a more data-efficient IfO algorithm.
arXiv Detail & Related papers (2021-03-31T23:46:32Z)
- EPEM: Efficient Parameter Estimation for Multiple Class Monotone Missing Data [3.801859210248944]
We propose a novel algorithm to compute the maximum likelihood estimators (MLEs) of a multiple class, monotone missing dataset.
Because the computation is exact, our EPEM algorithm does not require multiple iterations through the data, unlike other imputation approaches.
arXiv Detail & Related papers (2020-09-23T20:07:53Z)
- Active Learning for Gaussian Process Considering Uncertainties with Application to Shape Control of Composite Fuselage [7.358477502214471]
We propose two new active learning algorithms for the Gaussian process with uncertainties.
We show that the proposed approach can incorporate the impact from uncertainties, and realize better prediction performance.
This approach has been applied to improving the predictive modeling for automatic shape control of composite fuselage.
arXiv Detail & Related papers (2020-04-23T02:04:53Z)
- Monotonic Cardinality Estimation of Similarity Selection: A Deep Learning Approach [22.958342743597044]
We investigate the possibilities of utilizing deep learning for cardinality estimation of similarity selection.
We propose a novel and generic method that can be applied to any data type and distance function.
arXiv Detail & Related papers (2020-02-15T20:22:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.