CAMELL: Confidence-based Acquisition Model for Efficient Self-supervised
Active Learning with Label Validation
- URL: http://arxiv.org/abs/2310.08944v1
- Date: Fri, 13 Oct 2023 08:19:31 GMT
- Title: CAMELL: Confidence-based Acquisition Model for Efficient Self-supervised
Active Learning with Label Validation
- Authors: Carel van Niekerk, Christian Geishauser, Michael Heck, Shutong Feng,
Hsien-chin Lin, Nurul Lubis, Benjamin Ruppik, Renato Vukovic, and Milica
Gašić
- Abstract summary: Supervised neural approaches are hindered by their dependence on large, meticulously annotated datasets.
We present CAMELL, a pool-based active learning framework tailored for sequential multi-output problems.
- Score: 6.918298428336528
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Supervised neural approaches are hindered by their dependence on large,
meticulously annotated datasets, a requirement that is particularly cumbersome
for sequential tasks. The quality of annotations tends to deteriorate with the
transition from expert-based to crowd-sourced labelling. To address these
challenges, we present CAMELL (Confidence-based Acquisition Model for
Efficient self-supervised active Learning with Label validation), a pool-based
active learning framework tailored for sequential multi-output problems. CAMELL
possesses three core features: (1) it requires expert annotators to label only
a fraction of a chosen sequence, (2) it facilitates self-supervision for the
remainder of the sequence, and (3) it employs a label validation mechanism to
prevent erroneous labels from contaminating the dataset and harming model
performance. We evaluate CAMELL on sequential tasks, with a special emphasis on
dialogue belief tracking, a task plagued by the constraints of limited and
noisy datasets. Our experiments demonstrate that CAMELL outperforms the
baselines in terms of efficiency. Furthermore, the data corrections suggested
by our method contribute to an overall improvement in the quality of the
resulting datasets.
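The three features listed in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the thresholds, and the use of a single per-step confidence score are assumptions made only for illustration. Steps where the model is confident are self-labelled, the remaining steps are sent to an annotator, and an annotator label that contradicts a sufficiently confident prediction is flagged for review instead of being added to the dataset.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StepDecision:
    index: int    # position of the step within the sequence
    action: str   # "self_label" or "ask_annotator"
    label: str    # the model's prediction for this step


def plan_sequence(predictions: List[str], confidences: List[float],
                  ask_threshold: float = 0.9) -> List[StepDecision]:
    """Decide, per step, whether to self-label or query an annotator.

    ask_threshold is a hypothetical hyperparameter: steps with confidence
    at or above it keep the model's own prediction as the label.
    """
    decisions = []
    for i, (pred, conf) in enumerate(zip(predictions, confidences)):
        action = "self_label" if conf >= ask_threshold else "ask_annotator"
        decisions.append(StepDecision(i, action, pred))
    return decisions


def validate_label(annotator_label: str, prediction: str, confidence: float,
                   reject_threshold: float = 0.6) -> bool:
    """Return True if the annotator label is accepted.

    A label is rejected (flagged for review) only when it disagrees with a
    prediction whose confidence reaches reject_threshold, another
    hypothetical hyperparameter.
    """
    disagrees = annotator_label != prediction
    return not (disagrees and confidence >= reject_threshold)


# Toy sequence of three steps with made-up predictions and confidences.
plan = plan_sequence(["hotel", "cheap", "north"], [0.95, 0.4, 0.7])
actions = [d.action for d in plan]
```

In this toy run only the second and third steps would be sent to an annotator. Keeping the two thresholds separate lets a moderately confident prediction both trigger a query and serve as a check on the answer that comes back.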
Related papers
- TrajSSL: Trajectory-Enhanced Semi-Supervised 3D Object Detection [59.498894868956306]
Pseudo-labeling approaches to semi-supervised learning adopt a teacher-student framework.
We leverage pre-trained motion-forecasting models to generate object trajectories on pseudo-labeled data.
Our approach improves pseudo-label quality in two distinct manners.
arXiv Detail & Related papers (2024-09-17T05:35:00Z)
- Label Convergence: Defining an Upper Performance Bound in Object Recognition through Contradictory Annotations [0.0]
We introduce the notion of "label convergence" to describe the highest achievable performance under the constraint of contradictory test annotations.
We estimate that label convergence lies between 62.63 and 67.52 mAP@[0.5:0.95:0.05] for LVIS with 95% confidence, attributing these bounds to the presence of real annotation errors.
With current state-of-the-art (SOTA) models at the upper end of the label convergence interval for the well-studied LVIS dataset, we conclude that model capacity is sufficient to solve current object detection problems.
arXiv Detail & Related papers (2024-09-14T10:59:25Z)
- ACTRESS: Active Retraining for Semi-supervised Visual Grounding [52.08834188447851]
A previous study, RefTeacher, makes the first attempt to tackle this task by adopting the teacher-student framework to provide pseudo confidence supervision and attention-based supervision.
This approach is incompatible with current state-of-the-art visual grounding models, which follow the Transformer-based pipeline.
Our paper proposes the ACTive REtraining approach for Semi-Supervised Visual Grounding, abbreviated as ACTRESS.
arXiv Detail & Related papers (2024-07-03T16:33:31Z)
- Incremental Self-training for Semi-supervised Learning [56.57057576885672]
IST is simple yet effective and fits existing self-training-based semi-supervised learning methods.
We verify the proposed IST on five datasets and two types of backbone, effectively improving the recognition accuracy and learning speed.
arXiv Detail & Related papers (2024-04-14T05:02:00Z)
- SoftMatch: Addressing the Quantity-Quality Trade-off in Semi-supervised Learning [101.86916775218403]
This paper revisits the popular pseudo-labeling methods via a unified sample weighting formulation.
We propose SoftMatch to overcome the trade-off by maintaining both high quantity and high quality of pseudo-labels during training.
In experiments, SoftMatch shows substantial improvements across a wide variety of benchmarks, including image, text, and imbalanced classification.
arXiv Detail & Related papers (2023-01-26T03:53:25Z)
- Active Self-Semi-Supervised Learning for Few Labeled Samples [4.713652957384158]
Training deep models with limited annotations poses a significant challenge when applied to diverse practical domains.
We propose a simple yet effective framework, active self-semi-supervised learning (AS3L).
AS3L bootstraps semi-supervised models with prior pseudo-labels (PPL).
We develop active learning and label propagation strategies to obtain accurate PPL.
arXiv Detail & Related papers (2022-03-09T07:45:05Z)
- WSSOD: A New Pipeline for Weakly- and Semi-Supervised Object Detection [75.80075054706079]
We propose a weakly- and semi-supervised object detection framework (WSSOD).
An agent detector is first trained on a joint dataset and then used to predict pseudo bounding boxes on weakly-annotated images.
The proposed framework performs strongly on the PASCAL-VOC and MSCOCO benchmarks, achieving results comparable to those obtained in fully-supervised settings.
arXiv Detail & Related papers (2021-05-21T11:58:50Z)
- Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)
- Active and Incremental Learning with Weak Supervision [7.2288756536476635]
In this work, we describe combinations of an incremental learning scheme and methods of active learning.
An object detection task is evaluated in a continuous exploration context on the PASCAL VOC dataset.
We also validate a weakly supervised system based on active and incremental learning in a real-world biodiversity application.
arXiv Detail & Related papers (2020-01-20T13:21:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.