Semi-Automatic Data Annotation guided by Feature Space Projection
- URL: http://arxiv.org/abs/2007.13689v1
- Date: Mon, 27 Jul 2020 17:03:50 GMT
- Title: Semi-Automatic Data Annotation guided by Feature Space Projection
- Authors: Barbara Caroline Benato and Jancarlo Ferreira Gomes and Alexandru
Cristian Telea and Alexandre Xavier Falc\~ao
- Abstract summary: We present a semi-automatic data annotation approach based on suitable feature space projection and semi-supervised label estimation.
We validate our method on the popular MNIST dataset and on images of human intestinal parasites with and without fecal impurities.
Our results demonstrate the added-value of visual analytics tools that combine complementary abilities of humans and machines for more effective machine learning.
- Score: 117.9296191012968
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data annotation using visual inspection (supervision) of each training sample
can be laborious. Interactive solutions alleviate this by helping experts
propagate labels from a few supervised samples to unlabeled ones based solely
on the visual analysis of their feature space projection (with no further
sample supervision). We present a semi-automatic data annotation approach based
on suitable feature space projection and semi-supervised label estimation. We
validate our method on the popular MNIST dataset and on images of human
intestinal parasites with and without fecal impurities, a large and diverse
dataset that makes classification very hard. We evaluate two approaches for
semi-supervised learning from the latent and projection spaces, to choose the
one that best reduces user annotation effort and also increases classification
accuracy on unseen data. Our results demonstrate the added-value of visual
analytics tools that combine complementary abilities of humans and machines for
more effective machine learning.
Related papers
- TrajSSL: Trajectory-Enhanced Semi-Supervised 3D Object Detection [59.498894868956306]
Pseudo-labeling approaches to semi-supervised learning adopt a teacher-student framework.
We leverage pre-trained motion-forecasting models to generate object trajectories on pseudo-labeled data.
Our approach improves pseudo-label quality in two distinct manners.
arXiv Detail & Related papers (2024-09-17T05:35:00Z) - Downstream-Pretext Domain Knowledge Traceback for Active Learning [138.02530777915362]
We propose a downstream-pretext domain knowledge traceback (DOKT) method that traces the data interactions of downstream knowledge and pre-training guidance.
DOKT consists of a traceback diversity indicator and a domain-based uncertainty estimator.
Experiments conducted on ten datasets show that our model outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-20T01:34:13Z) - Debiased Learning for Remote Sensing Data [29.794246747637104]
We propose a highly effective semi-supervised approach tailored specifically to remote sensing data.
First, we adapt the FixMatch framework to remote sensing data by designing robust strong and weak augmentations suitable for this domain.
Second, we develop an effective semi-supervised learning method by removing bias in imbalanced training data resulting from both actual labels and pseudo-labels predicted by the model.
arXiv Detail & Related papers (2023-12-24T03:33:30Z) - Semi-Supervised Learning for hyperspectral images by non parametrically
predicting view assignment [25.198550162904713]
Hyperspectral image (HSI) classification is gaining a lot of momentum in present time because of high inherent spectral information within the images.
Recently, to effectively train the deep learning models with minimal labelled samples, the unlabeled samples are also being leveraged in self-supervised and semi-supervised setting.
In this work, we leverage the idea of semi-supervised learning to assist the discriminative self-supervised pretraining of the models.
arXiv Detail & Related papers (2023-06-19T14:13:56Z) - Linking data separation, visual separation, and classifier performance
using pseudo-labeling by contrastive learning [125.99533416395765]
We argue that the performance of the final classifier depends on the data separation present in the latent space and visual separation present in the projection.
We demonstrate our results by the classification of five real-world challenging image datasets of human intestinal parasites with only 1% supervised samples.
arXiv Detail & Related papers (2023-02-06T10:01:38Z) - A semi-supervised Teacher-Student framework for surgical tool detection
and localization [2.41710192205034]
We introduce a semi-supervised learning (SSL) framework in surgical tool detection paradigm.
In the proposed work, we train a model with labeled data which initialises the Teacher-Student joint learning.
Our results on m2cai16-tool-locations dataset indicate the superiority of our approach on different supervised data settings.
arXiv Detail & Related papers (2022-08-21T17:21:31Z) - Clustering augmented Self-Supervised Learning: Anapplication to Land
Cover Mapping [10.720852987343896]
We introduce a new method for land cover mapping by using a clustering based pretext task for self-supervised learning.
We demonstrate the effectiveness of the method on two societally relevant applications.
arXiv Detail & Related papers (2021-08-16T19:35:43Z) - Deep Semi-supervised Knowledge Distillation for Overlapping Cervical
Cell Instance Segmentation [54.49894381464853]
We propose to leverage both labeled and unlabeled data for instance segmentation with improved accuracy by knowledge distillation.
We propose a novel Mask-guided Mean Teacher framework with Perturbation-sensitive Sample Mining.
Experiments show that the proposed method improves the performance significantly compared with the supervised method learned from labeled data only.
arXiv Detail & Related papers (2020-07-21T13:27:09Z) - Learning to Count in the Crowd from Limited Labeled Data [109.2954525909007]
We focus on reducing the annotation efforts by learning to count in the crowd from limited number of labeled samples.
Specifically, we propose a Gaussian Process-based iterative learning mechanism that involves estimation of pseudo-ground truth for the unlabeled data.
arXiv Detail & Related papers (2020-07-07T04:17:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.