CAMELL: Confidence-based Acquisition Model for Efficient Self-supervised
  Active Learning with Label Validation
- URL: http://arxiv.org/abs/2310.08944v1
- Date: Fri, 13 Oct 2023 08:19:31 GMT
- Title: CAMELL: Confidence-based Acquisition Model for Efficient Self-supervised
  Active Learning with Label Validation
- Authors: Carel van Niekerk, Christian Geishauser, Michael Heck, Shutong Feng,
  Hsien-chin Lin, Nurul Lubis, Benjamin Ruppik, Renato Vukovic and Milica
  Gašić
- Abstract summary: Supervised neural approaches are hindered by their dependence on large, meticulously annotated datasets.
We present CAMELL, a pool-based active learning framework tailored for sequential multi-output problems.
- Score: 6.918298428336528
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract:   Supervised neural approaches are hindered by their dependence on large,
meticulously annotated datasets, a requirement that is particularly cumbersome
for sequential tasks. The quality of annotations tends to deteriorate with the
transition from expert-based to crowd-sourced labelling. To address these
challenges, we present CAMELL (Confidence-based Acquisition Model for
Efficient self-supervised active Learning with Label validation), a pool-based
active learning framework tailored for sequential multi-output problems. CAMELL
possesses three core features: (1) it requires expert annotators to label only
a fraction of a chosen sequence, (2) it facilitates self-supervision for the
remainder of the sequence, and (3) it employs a label validation mechanism to
prevent erroneous labels from contaminating the dataset and harming model
performance. We evaluate CAMELL on sequential tasks, with a special emphasis on
dialogue belief tracking, a task plagued by the constraints of limited and
noisy datasets. Our experiments demonstrate that CAMELL outperforms the
baselines in terms of efficiency. Furthermore, the data corrections suggested
by our method contribute to an overall improvement in the quality of the
resulting datasets.
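
The three features listed in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function `acquire_sequence_labels`, the callables `annotate` and `label_confidence`, and both thresholds are hypothetical names and values chosen only for illustration. The sketch assumes the model provides a per-step prediction with a confidence score, and that some estimator can score how plausible an annotator's label is.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class StepDecision:
    index: int
    label: Optional[str]  # None when the annotator's label was rejected
    source: str           # "self", "expert", or "rejected"


def acquire_sequence_labels(
    predictions: Sequence[str],
    confidences: Sequence[float],
    annotate: Callable[[int], str],
    label_confidence: Callable[[int, str], float],
    query_threshold: float = 0.7,   # hypothetical value
    accept_threshold: float = 0.3,  # hypothetical value
) -> List[StepDecision]:
    """Label one sequence, querying the annotator only where needed."""
    decisions: List[StepDecision] = []
    for i, (pred, conf) in enumerate(zip(predictions, confidences)):
        if conf >= query_threshold:
            # Confident step: keep the model's own prediction (self-supervision).
            decisions.append(StepDecision(i, pred, "self"))
            continue
        # Uncertain step: ask the annotator for this step only.
        label = annotate(i)
        if label_confidence(i, label) >= accept_threshold:
            decisions.append(StepDecision(i, label, "expert"))
        else:
            # Label validation: an implausible label is kept out of the dataset.
            decisions.append(StepDecision(i, None, "rejected"))
    return decisions


# Toy usage with made-up predictions, confidences, and annotator labels.
annotator_labels = ["a", "x", "y"]
decisions = acquire_sequence_labels(
    predictions=["a", "b", "c"],
    confidences=[0.9, 0.4, 0.2],
    annotate=lambda i: annotator_labels[i],
    label_confidence=lambda i, label: 0.8 if label == "x" else 0.1,
)
print([d.source for d in decisions])  # ['self', 'expert', 'rejected']
```

In this toy run only two of the three steps are sent to the annotator, and one of the two returned labels is discarded by the validation check.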

Related papers
        - Stochastic Encodings for Active Feature Acquisition [100.47043816019888]
 Active Feature Acquisition is an instance-wise, sequential decision making problem. The aim is to dynamically select which feature to measure based on current observations, independently for each test instance. Common approaches either use Reinforcement Learning, which experiences training difficulties, or greedily maximize the conditional mutual information of the label and unobserved features, which leads to myopic acquisitions. We introduce a latent variable model, trained in a supervised manner. Acquisitions are made by reasoning about the features across many possible unobserved realizations in a latent space.
 arXiv  Detail & Related papers  (2025-08-03T23:48:46Z)
- Segment Concealed Objects with Incomplete Supervision [63.637733655439334]
 Incompletely-Supervised Concealed Object Segmentation (ISCOS) involves segmenting objects that seamlessly blend into their surrounding environments. This task remains highly challenging due to the limited supervision provided by the incompletely annotated training data. In this paper, we introduce the first unified method for ISCOS to address these challenges.
 arXiv  Detail & Related papers  (2025-06-10T16:25:15Z)
- Feedback-Driven Pseudo-Label Reliability Assessment: Redefining Thresholding for Semi-Supervised Semantic Segmentation [5.7977777220041204]
 A common practice in pseudo-supervision is filtering pseudo-labels based on pre-defined confidence thresholds or entropy. We propose Ensemble-of-Confidence Reinforcement (ENCORE), a dynamic feedback-driven thresholding strategy for pseudo-label selection. Our method seamlessly integrates into existing pseudo-supervision frameworks and significantly improves segmentation performance.
 arXiv  Detail & Related papers  (2025-05-12T15:58:08Z)
- Privacy-Preserving Model and Preprocessing Verification for Machine Learning [9.4033740844828]
 This paper presents a framework for privacy-preserving verification of machine learning models, focusing on models trained on sensitive data.
It addresses two key tasks: binary classification, to verify if a target model was trained correctly by applying the appropriate preprocessing steps, and multi-class classification, to identify specific preprocessing errors.
Results indicate that although verification accuracy varies across datasets and noise levels, the framework provides effective detection of preprocessing errors, strong privacy guarantees, and practical applicability for safeguarding sensitive data.
 arXiv  Detail & Related papers  (2025-01-14T16:21:54Z)
- Neural Machine Unranking [3.2340528215722553]
 We introduce a novel task termed Neural Machine UnRanking (NuMuR). Existing task- or model-agnostic unlearning approaches are suboptimal for NuMuR due to two core challenges. CoCoL comprises (1) a contrastive loss that reduces relevance scores on forget sets while maintaining performance on entangled samples, and (2) a consistent loss that preserves accuracy on the retain set.
 arXiv  Detail & Related papers  (2024-08-09T20:36:40Z)
- Self-Training with Pseudo-Label Scorer for Aspect Sentiment Quad Prediction [54.23208041792073]
 Aspect Sentiment Quad Prediction (ASQP) aims to predict all quads (aspect term, aspect category, opinion term, sentiment polarity) for a given review.
A key challenge in the ASQP task is the scarcity of labeled data, which limits the performance of existing methods.
We propose a self-training framework with a pseudo-label scorer, wherein a scorer assesses the match between reviews and their pseudo-labels.
 arXiv  Detail & Related papers  (2024-06-26T05:30:21Z)
- Incremental Self-training for Semi-supervised Learning [56.57057576885672]
 IST is simple yet effective and fits existing self-training-based semi-supervised learning methods.
We verify the proposed IST on five datasets and two types of backbone, effectively improving the recognition accuracy and learning speed.
 arXiv  Detail & Related papers  (2024-04-14T05:02:00Z)
- Evaluating Generative Language Models in Information Extraction as Subjective Question Correction [49.729908337372436]
 Inspired by the principles in subjective question correction, we propose a new evaluation method, SQC-Score.
Results on three information extraction tasks show that SQC-Score is more preferred by human annotators than the baseline metrics.
 arXiv  Detail & Related papers  (2024-04-04T15:36:53Z)
- Active Label Correction for Semantic Segmentation with Foundation Models [34.0733215363568]
 We propose an effective framework of active label correction (ALC) based on a correction query design that rectifies the pseudo labels of pixels.
Our method comprises two key techniques: (i) an annotator-friendly design of the correction query with the pseudo labels, and (ii) an acquisition function that looks ahead to label expansions based on superpixels.
 Experimental results on PASCAL, Cityscapes, and Kvasir-SEG datasets demonstrate the effectiveness of our ALC framework.
 arXiv  Detail & Related papers  (2024-03-16T06:10:22Z)
- DUEL: Duplicate Elimination on Active Memory for Self-Supervised
  Class-Imbalanced Learning [19.717868805172323]
 We propose an active data filtering process during self-supervised pre-training in our novel framework, Duplicate Elimination (DUEL).
This framework integrates an active memory inspired by human working memory and introduces distinctiveness information, which measures the diversity of the data in the memory.
The DUEL policy, which replaces the most duplicated data with new samples, aims to enhance the distinctiveness information in the memory and thereby mitigate class imbalances.
 arXiv  Detail & Related papers  (2024-02-14T06:09:36Z)
- Uncertainty-aware Self-training for Low-resource Neural Sequence
  Labeling [29.744621356187764]
 This paper presents SeqUST, a novel uncertainty-aware self-training framework for neural sequence labeling (NSL).
We incorporate Monte Carlo (MC) dropout in a Bayesian neural network (BNN) to perform uncertainty estimation at the token level and then select reliable language tokens from unlabeled data.
A well-designed masked sequence labeling task with a noise-robust loss supports robust training, which aims to suppress the problem of noisy pseudo labels.
 arXiv  Detail & Related papers  (2023-02-17T02:40:04Z)
- Adversarial Dual-Student with Differentiable Spatial Warping for
  Semi-Supervised Semantic Segmentation [70.2166826794421]
 We propose a differentiable geometric warping to conduct unsupervised data augmentation.
We also propose a novel adversarial dual-student framework to improve the Mean-Teacher.
Our solution significantly improves performance and achieves state-of-the-art results on both datasets.
 arXiv  Detail & Related papers  (2022-03-05T17:36:17Z)
- WSSOD: A New Pipeline for Weakly- and Semi-Supervised Object Detection [75.80075054706079]
 We propose a weakly- and semi-supervised object detection framework (WSSOD).
An agent detector is first trained on a joint dataset and then used to predict pseudo bounding boxes on weakly-annotated images.
The proposed framework demonstrates remarkable performance on the PASCAL-VOC and MSCOCO benchmarks, achieving performance comparable to that obtained in fully-supervised settings.
 arXiv  Detail & Related papers  (2021-05-21T11:58:50Z)
- Social Adaptive Module for Weakly-supervised Group Activity Recognition [143.68241396839062]
 This paper presents a new task named weakly-supervised group activity recognition (GAR).
It differs from conventional GAR tasks in that only video-level labels are available, and the important persons within each frame are not provided even in the training data.
This makes it easier to collect and annotate a large-scale NBA dataset, which in turn raises new challenges for GAR.
 arXiv  Detail & Related papers  (2020-07-18T16:40:55Z)
- Active and Incremental Learning with Weak Supervision [7.2288756536476635]
 In this work, we describe combinations of an incremental learning scheme and methods of active learning.
An object detection task is evaluated in a continuous exploration context on the PASCAL VOC dataset.
We also validate a weakly supervised system based on active and incremental learning in a real-world biodiversity application.
 arXiv  Detail & Related papers  (2020-01-20T13:21:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.