Label Selection Approach to Learning from Crowds
- URL: http://arxiv.org/abs/2308.10396v1
- Date: Mon, 21 Aug 2023 00:22:32 GMT
- Title: Label Selection Approach to Learning from Crowds
- Authors: Kosuke Yoshimura and Hisashi Kashima
- Abstract summary: Learning from Crowds is a framework that trains models directly on noisy labels from crowd workers.
We propose a novel Learning from Crowds model inspired by SelectiveNet, which was proposed for the selective prediction problem.
A major advantage of the proposed method is that it can be applied to almost all variants of supervised learning problems.
- Score: 25.894399244406287
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Supervised learning, especially supervised deep learning, requires large
amounts of labeled data. One approach to collect large amounts of labeled data
is by using a crowdsourcing platform where numerous workers perform the
annotation tasks. However, the annotation results often contain label noise,
because annotation skill varies across crowd workers, as does their ability to
complete each task correctly. Learning from Crowds is a framework that trains
models directly on the noisy labels provided by crowd workers. In this study,
we propose a novel Learning from Crowds model inspired by SelectiveNet, which
was originally proposed for the selective prediction problem. The proposed
method, called the Label Selection Layer, trains a prediction model by using a
selector network to automatically determine whether each worker's label should
be used for training. A major
advantage of the proposed method is that it can be applied to almost all
variants of supervised learning problems by simply adding a selector network
and changing the objective function for existing models, without explicitly
assuming a model of the noise in crowd annotations. The experimental results
show that the proposed method performs comparably to or better than the Crowd
Layer, one of the state-of-the-art methods for Deep Learning from Crowds, in
all cases except the regression problem.
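To make the idea concrete, here is a minimal, hypothetical PyTorch sketch of a Label Selection Layer-style setup: a shared backbone feeds a prediction head and a selector head that scores each worker's label, and a SelectiveNet-style objective averages the cross-entropy over selected labels while a coverage penalty keeps the selector from rejecting everything. The layer sizes, the per-worker selector layout, and the hyperparameters (`target_coverage`, `lam`) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LabelSelectionModel(nn.Module):
    """Prediction head plus a selector head scoring each worker's label.

    Hypothetical layout: the selector is conditioned on the instance only,
    with one output per worker, squashed to (0, 1) by a sigmoid.
    """
    def __init__(self, in_dim, num_classes, num_workers, hidden=128):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.predictor = nn.Linear(hidden, num_classes)
        self.selector = nn.Linear(hidden, num_workers)

    def forward(self, x):
        h = self.backbone(x)
        return self.predictor(h), torch.sigmoid(self.selector(h))

def selective_crowd_loss(logits, select, worker_labels,
                         target_coverage=0.7, lam=32.0):
    """SelectiveNet-style objective adapted to crowd labels.

    worker_labels: (batch, num_workers) LongTensor with -1 where a worker
    did not annotate the example. The cross-entropy of selected labels is
    averaged, and the empirical coverage is pushed toward `target_coverage`
    by a quadratic penalty.
    """
    batch, num_workers = worker_labels.shape
    mask = (worker_labels >= 0).float()          # which labels actually exist
    safe = worker_labels.clamp(min=0)            # placeholder for missing ones
    # per-(example, worker) cross-entropy against the shared prediction
    ce = F.cross_entropy(
        logits.unsqueeze(1).expand(-1, num_workers, -1).reshape(-1, logits.size(-1)),
        safe.reshape(-1),
        reduction="none",
    ).view(batch, num_workers)
    sel = select * mask                          # never select a missing label
    coverage = sel.sum() / mask.sum().clamp(min=1.0)
    risk = (ce * sel).sum() / sel.sum().clamp(min=1e-8)
    return risk + lam * F.relu(target_coverage - coverage) ** 2
```

In a training step one would compute `logits, select = model(x)` and then `selective_crowd_loss(logits, select, worker_labels)`. The coverage penalty matters: without it, the degenerate selector that rejects every label trivially minimizes the selective risk.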
Related papers
- Pre-Trained Vision-Language Models as Partial Annotators [40.89255396643592]
Pre-trained vision-language models learn from massive data to model unified representations of images and natural language.
In this paper, we investigate a novel "pre-trained annotating - weakly-supervised learning" paradigm for applying pre-trained models, with experiments on image classification tasks.
arXiv Detail & Related papers (2024-05-23T17:17:27Z)
- Combating Label Noise With A General Surrogate Model For Sample Selection [84.61367781175984]
We propose to leverage the vision-language surrogate model CLIP to filter noisy samples automatically (see the sketch after this list).
We validate the effectiveness of our proposed method on both real-world and synthetic noisy datasets.
arXiv Detail & Related papers (2023-10-16T14:43:27Z)
- Learning to Detect Noisy Labels Using Model-Based Features [16.681748918518075]
We propose Selection-Enhanced Noisy label Training (SENT).
SENT does not rely on meta learning while having the flexibility of being data-driven.
It improves performance over strong baselines under the settings of self-training and label corruption.
arXiv Detail & Related papers (2022-12-28T10:12:13Z)
- UNICON: Combating Label Noise Through Uniform Selection and Contrastive Learning [89.56465237941013]
We propose UNICON, a simple yet effective sample selection method which is robust to high label noise.
We obtain an 11.4% improvement over the current state of the art on the CIFAR100 dataset with a 90% noise rate.
arXiv Detail & Related papers (2022-03-28T07:36:36Z)
- Towards General and Efficient Active Learning [20.888364610175987]
Active learning aims to select the most informative samples to exploit limited annotation budgets.
We propose a novel general and efficient active learning (GEAL) method in this paper.
Our method can conduct data selection processes on different datasets with a single-pass inference of the same model.
arXiv Detail & Related papers (2021-12-15T08:35:28Z)
- Robust Deep Learning from Crowds with Belief Propagation [6.643082745560235]
A graphical model representing local dependencies between workers and tasks provides a principled way of reasoning over the true labels from the noisy answers.
In many cases, however, one needs a predictive model that works on unseen data, trained directly from crowdsourced datasets rather than from the true labels.
We propose a new data-generating process, where a neural network generates the true labels from task features.
arXiv Detail & Related papers (2021-11-01T07:20:16Z)
- Learning Debiased and Disentangled Representations for Semantic Segmentation [52.35766945827972]
We propose a model-agnostic training scheme for semantic segmentation.
By randomly eliminating certain class information in each training iteration, we effectively reduce feature dependencies among classes.
Models trained with our approach demonstrate strong results on multiple semantic segmentation benchmarks.
arXiv Detail & Related papers (2021-10-31T16:15:09Z)
- Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets [90.61266099147053]
We investigate efficient annotation strategies for collecting multi-class classification labels for a large collection of images.
We propose modifications and best practices aimed at minimizing human labeling effort.
Simulated experiments on a 125k-image subset of ImageNet100 show that it can be annotated to 80% top-1 accuracy with 0.35 annotations per image on average.
arXiv Detail & Related papers (2021-04-26T16:29:32Z)
- Unsupervised Noisy Tracklet Person Re-identification [100.85530419892333]
We present a novel selective tracklet learning (STL) approach that can train discriminative person re-id models from unlabelled tracklet data.
This avoids the tedious and costly process of exhaustively labelling person image/tracklet true matching pairs across camera views.
Our method is more robust to arbitrary noise in raw tracklet data and is therefore scalable to learning discriminative models from unconstrained tracking data.
arXiv Detail & Related papers (2021-01-16T07:31:00Z)
- Noisy Labels Can Induce Good Representations [53.47668632785373]
We study how architecture affects learning with noisy labels.
We show that training with noisy labels can induce useful hidden representations, even when the model generalizes poorly.
This finding leads to a simple method to improve models trained on noisy labels.
arXiv Detail & Related papers (2020-12-23T18:58:05Z)
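The CLIP-based sample selection mentioned in the "Combating Label Noise With A General Surrogate Model For Sample Selection" entry above can be illustrated with a generic sketch. This is not that paper's actual procedure: agreement between a sample's noisy label and CLIP's zero-shot prediction is just one plausible filtering rule, and the model checkpoint, prompt template, and `margin` parameter are assumptions.

```python
import torch
from transformers import CLIPModel, CLIPProcessor

# Zero-shot CLIP as a surrogate model for flagging likely-noisy labels.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

@torch.no_grad()
def keep_mask(images, labels, class_names, margin=0.0):
    """Boolean mask over samples whose (possibly noisy) label agrees with
    CLIP's zero-shot prediction to within `margin` logits."""
    prompts = [f"a photo of a {c}" for c in class_names]
    inputs = processor(text=prompts, images=images,
                       return_tensors="pt", padding=True)
    logits = model(**inputs).logits_per_image        # (batch, num_classes)
    assigned = logits.gather(1, torch.tensor(labels).unsqueeze(1)).squeeze(1)
    return assigned >= logits.max(dim=1).values - margin
```

Samples failing the mask would be dropped or down-weighted before training the target model on the remaining, cleaner subset.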