Active Learning with Combinatorial Coverage
- URL: http://arxiv.org/abs/2302.14567v1
- Date: Tue, 28 Feb 2023 13:43:23 GMT
- Title: Active Learning with Combinatorial Coverage
- Authors: Sai Prathyush Katragadda, Tyler Cody, Peter Beling, Laura Freeman
- Abstract summary: Active learning is a practical field of machine learning that automates the process of selecting which data to label.
Current methods are effective in reducing the burden of data labeling but are heavily model-reliant.
This has led to the inability of sampled data to be transferred to new models as well as issues with sampling bias.
We propose active learning methods utilizing coverage to overcome these issues.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Active learning is a practical field of machine learning that automates the
process of selecting which data to label. Current methods are effective in
reducing the burden of data labeling but are heavily model-reliant. This has
led to the inability of sampled data to be transferred to new models as well as
issues with sampling bias. Both issues are of crucial concern in machine
learning deployment. We propose active learning methods utilizing combinatorial
coverage to overcome these issues. The proposed methods are data-centric, as
opposed to model-centric, and through our experiments we show that the
inclusion of coverage in active learning leads to sampling data that tends to
be the best in transferring to better performing models and has a competitive
sampling bias compared to benchmark methods.
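As a rough illustration of what a data-centric, coverage-driven query might look like, the sketch below measures t-way combinatorial coverage in the sense used in combinatorial testing: the fraction of possible value combinations over every set of t features that appear in the data. It then greedily picks the pool point that adds the most unseen combinations. This is not the authors' algorithm. The function names (`interactions`, `coverage`, `select_next`), the assumption of already-discretized features, and the greedy rule are all hypothetical choices made for illustration.

```python
from itertools import combinations


def interactions(row, t=2):
    """Return the set of t-way interactions (feature indices, values) in one row."""
    idx = range(len(row))
    return {(cols, tuple(row[c] for c in cols)) for cols in combinations(idx, t)}


def coverage(rows, domains, t=2):
    """Fraction of all possible t-way value combinations that appear in rows.

    domains[i] lists the discrete values feature i can take (an assumption:
    continuous features would need to be binned first).
    """
    seen = set()
    for row in rows:
        seen |= interactions(row, t)
    total = 0
    for cols in combinations(range(len(domains)), t):
        n = 1
        for c in cols:
            n *= len(domains[c])
        total += n
    return len(seen) / total


def select_next(labeled, pool, t=2):
    """Model-free greedy query: index of the pool row adding the most unseen interactions."""
    seen = set()
    for row in labeled:
        seen |= interactions(row, t)
    gains = [len(interactions(row, t) - seen) for row in pool]
    return max(range(len(pool)), key=gains.__getitem__)


# Three binary features, one labeled point, three candidates in the pool.
domains = [(0, 1), (0, 1), (0, 1)]
labeled = [(0, 0, 0)]
pool = [(0, 0, 1), (1, 1, 1), (0, 0, 0)]
best = select_next(labeled, pool)
print(best, coverage(labeled + [pool[best]], domains))
```

Because the selection depends only on the data and never on a model's predictions, the same labeled set can be reused with a different model, which is the kind of transferability the abstract refers to.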
Related papers
- Responsible Active Learning via Human-in-the-loop Peer Study [88.01358655203441]
We propose a responsible active learning method, namely Peer Study Learning (PSL), to simultaneously preserve data privacy and improve model stability.
We first introduce a human-in-the-loop teacher-student architecture to isolate unlabelled data from the task learner (teacher) on the cloud-side.
During training, the task learner instructs the light-weight active learner which then provides feedback on the active sampling criterion.
arXiv Detail & Related papers (2022-11-24T13:18:27Z)
- CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance.
Sample re-weighting methods are popularly used to alleviate this data bias issue.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z)
- A Lagrangian Duality Approach to Active Learning [119.36233726867992]
We consider the batch active learning problem, where only a subset of the training data is labeled.
We formulate the learning problem using constrained optimization, where each constraint bounds the performance of the model on labeled samples.
We show, via numerical experiments, that our proposed approach performs similarly to or better than state-of-the-art active learning methods.
arXiv Detail & Related papers (2022-02-08T19:18:49Z)
- Minority Class Oriented Active Learning for Imbalanced Datasets [6.009262446889319]
We introduce a new active learning method which is designed for imbalanced datasets.
It favors samples likely to be in minority classes so as to reduce the imbalance of the labeled subset.
We also compare two training schemes for active learning.
arXiv Detail & Related papers (2022-02-01T13:13:41Z)
- Practical Active Learning with Model Selection for Small Data [13.128648437690224]
We develop a simple and fast method for practical active learning with model selection.
Our method is based on an underlying pool-based active learner for binary classification using support vector classification with a radial basis function kernel.
arXiv Detail & Related papers (2021-12-21T23:11:27Z)
- Mitigating Sampling Bias and Improving Robustness in Active Learning [13.994967246046008]
We introduce supervised contrastive active learning by leveraging the contrastive loss for active learning under a supervised setting.
We propose an unbiased query strategy that selects informative data samples of diverse feature representations.
We empirically demonstrate that our proposed methods reduce sampling bias and achieve state-of-the-art accuracy and model calibration in an active learning setup.
arXiv Detail & Related papers (2021-09-13T20:58:40Z)
- Mind Your Outliers! Investigating the Negative Impact of Outliers on Active Learning for Visual Question Answering [71.15403434929915]
We show that across 5 models and 4 datasets on the task of visual question answering, a wide variety of active learning approaches fail to outperform random selection.
We identify the problem as collective outliers -- groups of examples that active learning methods prefer to acquire but models fail to learn.
We show that active learning sample efficiency increases significantly as the number of collective outliers in the active learning pool decreases.
arXiv Detail & Related papers (2021-07-06T00:52:11Z)
- Can Active Learning Preemptively Mitigate Fairness Issues? [66.84854430781097]
Dataset bias is one of the prevailing causes of unfairness in machine learning.
We study whether models trained with uncertainty-based active learning (AL) methods are fairer in their decisions with respect to a protected class.
We also explore the interaction of algorithmic fairness methods such as gradient reversal (GRAD) and BALD.
arXiv Detail & Related papers (2021-04-14T14:20:22Z)
- Message Passing Adaptive Resonance Theory for Online Active Semi-supervised Learning [30.19936050747407]
We propose Message Passing Adaptive Resonance Theory (MPART) for online active semi-supervised learning.
MPART infers the class of unlabeled data and selects informative and representative samples through message passing between nodes on the topological graph.
We evaluate our model with comparable query selection strategies and frequencies, showing that MPART significantly outperforms the competitive models in online active learning environments.
arXiv Detail & Related papers (2020-12-02T14:14:42Z)
- On the Robustness of Active Learning [0.7340017786387767]
Active Learning is concerned with how to identify the most useful samples for a Machine Learning algorithm to be trained with.
We find that it is often applied without enough care and domain knowledge.
We propose the new "Sum of Squared Logits" method based on the Simpson diversity index and investigate the effect of using the confusion matrix for balancing in sample selection.
arXiv Detail & Related papers (2020-06-18T09:07:23Z)
- Adversarial Self-Supervised Contrastive Learning [62.17538130778111]
Existing adversarial learning approaches mostly use class labels to generate adversarial samples that lead to incorrect predictions.
We propose a novel adversarial attack for unlabeled data, which makes the model confuse the instance-level identities of the perturbed data samples.
We present a self-supervised contrastive learning framework to adversarially train a robust neural network without labeled data.
arXiv Detail & Related papers (2020-06-13T08:24:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.