How to Allocate your Label Budget? Choosing between Active Learning and
Learning to Reject in Anomaly Detection
- URL: http://arxiv.org/abs/2301.02909v1
- Date: Sat, 7 Jan 2023 18:02:43 GMT
- Title: How to Allocate your Label Budget? Choosing between Active Learning and
Learning to Reject in Anomaly Detection
- Authors: Lorenzo Perini, Daniele Giannuzzi, Jesse Davis
- Abstract summary: Anomaly detection aims to find examples that deviate from expected behaviour.
The lack of labels leaves the anomaly detector highly uncertain in some regions.
We propose a mixed strategy that decides, over multiple rounds, whether to collect Active Learning (AL) labels or Learning to Reject (LR) labels.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Anomaly detection aims to find examples that deviate from the expected
behaviour. Usually, anomaly detection is tackled from an unsupervised
perspective because anomalous labels are rare and difficult to acquire.
However, the lack of labels leaves the anomaly detector highly uncertain in
some regions, which usually results in poor predictive performance or low user
trust in the predictions. One can reduce this uncertainty by collecting
targeted labels through Active Learning (AL), which queries examples close to
the detector's decision boundary. Alternatively, one can increase user trust by
allowing the detector to abstain from making highly uncertain predictions,
which is called Learning to Reject (LR). One way to do this is to threshold the
detector's uncertainty where its performance is low, which requires labels for
the evaluation. Although both AL and LR need labels, they work with different
types of labels: AL seeks strategically chosen labels, which are biased by
design, while LR requires i.i.d. labels to evaluate the detector's performance
and set the rejection threshold. Because one usually has a single label budget,
deciding how to allocate it optimally is challenging. In this paper, we propose
a mixed strategy that, given a budget of labels, decides in multiple rounds
whether to use the budget to collect AL labels or LR labels. The strategy is
based on a reward function that measures the expected gain from allocating the
budget to either side. We evaluate our strategy on 18 benchmark datasets and
compare it to several baselines.
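The multi-round allocation procedure described in the abstract can be made concrete with a small sketch. The Python snippet below is a minimal, hypothetical rendering: an IsolationForest stands in for the unsupervised detector, AL queries the examples nearest the decision boundary, LR draws i.i.d. examples, and the two reward proxies (al_reward, lr_reward) are illustrative assumptions, since the abstract does not spell out the paper's reward function.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Toy data: mostly normal points plus a few anomalies (a hypothetical
# stand-in for the paper's 18 benchmark datasets).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (480, 2)), rng.normal(4, 1, (20, 2))])
y = np.concatenate([np.zeros(480), np.ones(20)])   # hidden ground truth
oracle = lambda idx: y[np.asarray(idx)]            # returns labels on request

detector = IsolationForest(random_state=0).fit(X)
margin = np.abs(detector.decision_function(X))     # 0 = on the boundary
pred = (detector.predict(X) == -1).astype(int)     # 1 = predicted anomaly

def al_reward(queried):
    # Hypothetical proxy: strategic labels are worth more when they
    # contradict the current predictions, i.e. they would correct the model.
    return float(np.mean(oracle(queried) != pred[queried]))

def lr_reward(n_lr, k):
    # Hypothetical proxy: i.i.d. labels shrink the estimation error of the
    # rejection threshold at roughly a 1/sqrt(n) rate.
    return 1.0 / np.sqrt(max(n_lr, 1)) - 1.0 / np.sqrt(n_lr + k)

budget, batch, n_lr = 60, 10, 0
reward = {"AL": np.inf, "LR": np.inf}              # optimistic: try both sides
al_idx, lr_idx = [], []
while budget > 0:
    k = min(batch, budget)
    if reward["AL"] >= reward["LR"]:               # spend the round on AL
        new = np.argsort(margin)[len(al_idx):len(al_idx) + k]  # near boundary
        al_idx += list(new)
        reward["AL"] = al_reward(np.array(al_idx))
    else:                                          # spend the round on LR
        new = rng.choice(len(X), size=k, replace=False)        # i.i.d. sample
        lr_idx += list(new)
        reward["LR"] = lr_reward(n_lr, k)
        n_lr += k
    budget -= k

print(f"AL labels: {len(al_idx)}, LR labels: {len(lr_idx)}")
```

In the paper's setting the detector would also be retrained on the AL labels and the rejection threshold re-fit on the LR labels each round; both steps are elided here.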
Related papers
- Unsupervised Learning of Distributional Properties can Supplement Human
Labeling and Increase Active Learning Efficiency in Anomaly Detection
Exfiltration of data via email is a serious cybersecurity threat for many organizations.
Active Learning is a promising approach for labeling data efficiently.
We propose an adaptive AL sampling strategy to produce batches of cases to be labeled that contain instances of rare anomalies.
arXiv Detail & Related papers (2023-07-13T22:14:30Z)
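As a rough illustration of such a batch sampler (not the paper's exact strategy), one can mix exploitation of the highest anomaly scores with score-weighted exploration; the scores and parameters below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
scores = rng.random(1000)          # stand-in anomaly scores from any detector

def adaptive_batch(scores, k=20, explore=0.5):
    """Illustrative batch sampler: part exploitation (highest anomaly scores,
    where rare anomalies concentrate) and part exploration (sampling
    proportional to score, to cover the distribution's tail)."""
    n_exploit = int(k * (1 - explore))
    exploit = np.argsort(scores)[-n_exploit:]          # top-scored cases
    p = scores / scores.sum()
    explore_idx = rng.choice(len(scores), size=k - n_exploit,
                             replace=False, p=p)       # score-weighted draw
    # Overlaps are deduplicated, so the batch may be slightly smaller than k.
    return np.unique(np.concatenate([exploit, explore_idx]))

batch = adaptive_batch(scores)
```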
- Partial-Label Regression
Partial-label learning is a weakly supervised learning setting that allows each training example to be annotated with a set of candidate labels.
Previous studies on partial-label learning only focused on the classification setting where candidate labels are all discrete.
In this paper, we provide the first attempt to investigate partial-label regression, where each training example is annotated with a set of real-valued candidate labels.
arXiv Detail & Related papers (2023-06-15T09:02:24Z)
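One way such a setting can be instantiated is with an "identification"-style loss that scores a prediction against its closest real-valued candidate; this is an illustrative surrogate, not necessarily the estimator studied in the paper.

```python
import numpy as np

def pl_regression_loss(pred, candidates):
    """Illustrative partial-label regression surrogate: each example carries
    a set of real-valued candidate labels, and the prediction is scored
    against its closest candidate."""
    # pred: (n,) predictions; candidates: (n, m) candidate labels per example
    sq_err = (candidates - pred[:, None]) ** 2
    return sq_err.min(axis=1).mean()

pred = np.array([1.0, 2.0])
cands = np.array([[0.9, 5.0], [1.5, 2.2]])   # two candidates per example
print(pl_regression_loss(pred, cands))       # scores against 0.9 and 2.2
```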
- Ambiguity-Resistant Semi-Supervised Learning for Dense Object Detection
We propose Ambiguity-Resistant Semi-Supervised Learning (ARSL) for one-stage detectors.
Joint-Confidence Estimation (JCE) is proposed to quantify the classification and localization quality of pseudo labels.
ARSL effectively mitigates the ambiguities and achieves state-of-the-art SSOD performance on MS COCO and PASCAL VOC.
arXiv Detail & Related papers (2023-03-27T07:46:58Z)
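The filtering idea behind a joint confidence measure can be sketched as follows; the combination rule here (a simple product) is an assumption, as the paper's JCE is more involved.

```python
import numpy as np

def joint_confidence(cls_prob, loc_iou):
    """Illustrative joint confidence for pseudo boxes: combine the
    classification probability with a localization-quality estimate
    (e.g. predicted IoU), so a box is kept only when both are trustworthy."""
    return cls_prob * loc_iou

cls_prob = np.array([0.95, 0.90, 0.40])   # classification confidence
loc_iou = np.array([0.85, 0.30, 0.90])    # predicted localization quality
keep = joint_confidence(cls_prob, loc_iou) > 0.5
print(keep)                                # only the first pseudo box passes
```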
- Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection
Weakly supervised video anomaly detection aims to identify abnormal events in videos using only video-level labels.
Two-stage self-training methods have achieved significant improvements by self-generating pseudo labels.
We propose an enhancement framework by exploiting completeness and uncertainty properties for effective self-training.
arXiv Detail & Related papers (2022-12-08T05:53:53Z)
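A generic illustration of how the two named properties might gate pseudo labels (not necessarily the paper's mechanism): uncertainty is estimated from stochastic forward passes, and completeness keeps several snippets per video rather than only the top one.

```python
import numpy as np

rng = np.random.default_rng(2)
passes = rng.random((5, 30))   # anomaly scores for 30 snippets of one video,
                               # from 5 stochastic forward passes

# Uncertainty: variance across passes gates which snippets may be labeled.
# Completeness: a quota keeps more than the single top-scoring snippet.
mean, var = passes.mean(axis=0), passes.var(axis=0)
low_uncertainty = var < np.quantile(var, 0.5)        # confident snippets
quota = max(1, int(0.2 * mean.size))                 # top 20% by mean score
top = np.argsort(mean)[-quota:]
pseudo_anomalous = np.intersect1d(top, np.flatnonzero(low_uncertainty))
print(pseudo_anomalous)
```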
- Dist-PU: Positive-Unlabeled Learning from a Label Distribution Perspective
We propose a label distribution perspective for PU learning in this paper.
Motivated by this view, we pursue consistency between the predicted and ground-truth label distributions.
Experiments on three benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-12-06T07:38:29Z)
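The core consistency idea can be sketched as a penalty that pushes the predicted positive proportion on unlabeled data toward the known class prior; Dist-PU's actual objective contains further terms.

```python
import numpy as np

def label_dist_loss(pred_pos_prob, prior):
    """Illustrative label-distribution consistency term for PU learning:
    align the predicted positive proportion with the class prior (the
    ground-truth label distribution)."""
    return (pred_pos_prob.mean() - prior) ** 2

preds = np.array([0.9, 0.2, 0.1, 0.7, 0.05])   # P(positive) on unlabeled data
print(label_dist_loss(preds, prior=0.3))        # mean 0.39 vs prior 0.3
```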
- Automated Detection of Label Errors in Semantic Segmentation Datasets via Deep Learning and Uncertainty Quantification
We present the first method for detecting label errors in semantic segmentation datasets with pixel-wise labels.
Our approach is able to detect the vast majority of label errors while controlling the number of false label error detections.
arXiv Detail & Related papers (2022-07-13T10:25:23Z)
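A generic version of this idea flags samples where the model confidently disagrees with the given label, with the confidence threshold controlling the number of false detections; the paper's method works on pixel-wise segmentation labels and differs in detail.

```python
import numpy as np

def flag_label_errors(probs, given, tau=0.9):
    """Illustrative label-error detector: flag a sample when the model
    confidently (max prob >= tau) predicts a class that differs from the
    given label. Raising tau trades recall for fewer false detections."""
    pred = probs.argmax(axis=1)
    confident = probs.max(axis=1) >= tau
    return np.flatnonzero(confident & (pred != given))

probs = np.array([[0.97, 0.03], [0.55, 0.45], [0.05, 0.95]])
given = np.array([1, 1, 1])
print(flag_label_errors(probs, given))   # flags only sample 0
```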
- Learning with Proper Partial Labels
Partial-label learning is a kind of weakly supervised learning with inexact labels.
We show that this proper partial-label learning framework includes many previous partial-label learning settings.
We then derive a unified unbiased estimator of the classification risk.
arXiv Detail & Related papers (2021-12-23T01:37:03Z)
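For contrast, a naive candidate-averaged surrogate risk looks as follows; the paper's contribution is an unbiased estimator, which additionally requires knowledge of how the candidate sets are generated.

```python
import numpy as np

def naive_partial_risk(probs, candidate_mask):
    """Naive candidate-averaged surrogate risk for partial labels: average
    the negative log-likelihood uniformly over each example's candidate
    set. Only an illustration; it is not unbiased in general."""
    nll = -np.log(np.clip(probs, 1e-12, 1.0))
    per_example = (nll * candidate_mask).sum(axis=1) / candidate_mask.sum(axis=1)
    return per_example.mean()

probs = np.array([[0.7, 0.2, 0.1], [0.1, 0.6, 0.3]])  # class probabilities
mask = np.array([[1, 1, 0], [0, 1, 1]])               # candidate label sets
print(naive_partial_risk(probs, mask))
```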
- Learning with Noisy Labels by Targeted Relabeling
Crowdsourcing platforms are often used to collect datasets for training deep neural networks.
We propose an approach which reserves a fraction of annotations to explicitly relabel highly probable labeling errors.
arXiv Detail & Related papers (2021-10-15T20:37:29Z)
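The budget-splitting idea can be sketched in a few lines; the disagreement measure below is an illustrative choice, not the paper's error model.

```python
import numpy as np

rng = np.random.default_rng(3)
noisy = rng.integers(0, 2, 100)              # first-pass crowdsourced labels
model_prob = rng.random(100)                 # model's P(y=1) per example

# Illustrative targeted relabeling: reserve a fraction of the annotation
# budget, then spend it where the model most strongly disagrees with the
# collected label (the most probable labeling errors).
reserve = 20
disagreement = np.abs(model_prob - noisy)    # 1.0 = maximal disagreement
relabel_idx = np.argsort(disagreement)[-reserve:]
print(len(relabel_idx), "examples sent back for relabeling")
```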
- Active WeaSuL: Improving Weak Supervision with Active Learning
We propose Active WeaSuL: an approach that incorporates active learning into weak supervision.
We make two contributions: 1) a modification of the weak supervision loss function, such that the expert-labelled data inform and improve the combination of weak labels; and 2) the maxKL divergence sampling strategy, which determines for which data points expert labelling is most beneficial.
arXiv Detail & Related papers (2021-04-30T08:58:26Z)
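A per-point sketch of KL-based disagreement sampling in the spirit of maxKL (Active WeaSuL's version operates on the label model's output buckets; the exact construction differs).

```python
import numpy as np

def max_kl_query(p_weak, p_model, eps=1e-12):
    """Illustrative disagreement-based query rule: for each point, compute
    the KL divergence between the label distribution implied by the weak
    labels and the downstream model's prediction, and ask the expert about
    the point where they diverge most."""
    kl = (p_weak * (np.log(p_weak + eps) - np.log(p_model + eps))).sum(axis=1)
    return int(np.argmax(kl))

p_weak = np.array([[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]])
p_model = np.array([[0.8, 0.2], [0.5, 0.5], [0.9, 0.1]])
print(max_kl_query(p_weak, p_model))   # index 2: largest disagreement
```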
- Active Learning for Noisy Data Streams Using Weak and Strong Labelers
We consider a novel weak and strong labeler problem inspired by humans' natural ability for labeling.
We propose an online active learning algorithm that consists of four steps: filtering, adding diversity, informative sample selection, and labeler selection.
We derive a decision function that measures the information gain by combining the informativeness of individual samples and model confidence.
arXiv Detail & Related papers (2020-10-27T09:18:35Z)
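One plausible form of such a decision function multiplies predictive entropy (informativeness) by one minus the model's confidence and routes high-gain samples to the strong labeler; the threshold and the combination rule are illustrative assumptions.

```python
import numpy as np

def pick_labeler(probs, tau=0.3):
    """Illustrative decision function combining a sample's informativeness
    (predictive entropy) with the model's confidence: hard, informative
    samples go to the strong (expensive) labeler, the rest to the weak one."""
    entropy = -(probs * np.log(np.clip(probs, 1e-12, 1.0))).sum(axis=1)
    confidence = probs.max(axis=1)
    gain = entropy * (1.0 - confidence)   # high when uncertain AND informative
    return np.where(gain > tau, "strong", "weak")

probs = np.array([[0.5, 0.5], [0.95, 0.05]])
print(pick_labeler(probs))                # ['strong' 'weak']
```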
This list is automatically generated from the titles and abstracts of the papers on this site.