Handling Missing Annotations in Supervised Learning Data
- URL: http://arxiv.org/abs/2002.07113v1
- Date: Mon, 17 Feb 2020 18:23:56 GMT
- Title: Handling Missing Annotations in Supervised Learning Data
- Authors: Alaa E. Abdel-Hakim and Wael Deabes
- Abstract summary: Activities of Daily Living (ADL) recognition is an example of a system that consumes very large volumes of raw sensor readings.
The size of the generated dataset is so huge that it is almost impossible for a human annotator to give a certain label to every single instance in the dataset.
In this work, we propose and investigate three different paradigms to handle these gaps.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data annotation is an essential stage in supervised learning.
However, the annotation process is exhausting and time consuming, especially
for large datasets. Activities of Daily Living (ADL) recognition is an example
of a system that consumes very large volumes of raw sensor readings. In such
systems, sensor readings are collected from activity-monitoring sensors around
the clock (24/7). The generated dataset is so large that it is almost
impossible for a human annotator to assign a definite label to every single
instance. This leaves annotation gaps in the input data of the downstream
supervised learning system, and these gaps degrade the performance of the
recognition system. In this work, we propose and investigate three different
paradigms to handle these gaps. In the first paradigm, the gaps are removed by
dropping all unlabeled readings. In the second paradigm, a single "Unknown" or
"Do-Nothing" label is assigned to all unlabeled readings. The third paradigm
handles the gaps by giving each one a unique label derived from the
deterministic labels that enclose it. We also propose a semantic preprocessing
method for annotation gaps that constructs a hybrid combination of some of
these paradigms for further performance improvement. The performance of the
three proposed paradigms and their hybrid combination is evaluated on an ADL
benchmark dataset containing more than $2.5\times 10^6$ sensor readings
collected over more than nine months. The evaluation results highlight the
performance contrast among the paradigms and support a specific gap-handling
approach for better performance.
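The three paradigms amount to simple transformations of the label column of the sensor stream. Below is a minimal sketch in Python/pandas, assuming the readings arrive as a DataFrame whose unlabeled rows carry NaN in a "label" column; the function names, column names, and label format are illustrative assumptions rather than the paper's implementation, and the hybrid semantic preprocessing step is omitted because the abstract does not detail it.

```python
# Illustrative sketch only (names and data layout are assumptions, not the
# paper's code): three paradigms for handling annotation gaps in a stream of
# sensor readings where unlabeled rows have a NaN label.
import pandas as pd


def drop_gaps(df: pd.DataFrame) -> pd.DataFrame:
    """Paradigm 1: take the gaps out by dropping all unlabeled readings."""
    return df.dropna(subset=["label"]).reset_index(drop=True)


def label_gaps_unknown(df: pd.DataFrame, fill: str = "Unknown") -> pd.DataFrame:
    """Paradigm 2: give every unlabeled reading a single 'Unknown'
    (or 'Do-Nothing') label."""
    out = df.copy()
    out["label"] = out["label"].fillna(fill)
    return out


def label_gaps_by_context(df: pd.DataFrame) -> pd.DataFrame:
    """Paradigm 3: give each gap a unique label built from the deterministic
    labels that enclose it (the last label before the gap and the first label
    after it), plus a run index so distinct gaps never collide."""
    out = df.copy()
    is_gap = out["label"].isna()
    prev_label = out["label"].ffill().fillna("start")
    next_label = out["label"].bfill().fillna("end")
    run_id = (is_gap != is_gap.shift()).cumsum()  # distinct id per contiguous run
    out.loc[is_gap, "label"] = (
        prev_label[is_gap] + "->" + next_label[is_gap]
        + "#" + run_id[is_gap].astype(str)
    )
    return out


if __name__ == "__main__":
    readings = pd.DataFrame({
        "sensor_value": [3, 7, 7, 2, 5, 5, 1],
        "label": ["Sleep", None, None, "Cook", None, None, "Eat"],
    })
    print(drop_gaps(readings)["label"].tolist())           # ['Sleep', 'Cook', 'Eat']
    print(label_gaps_unknown(readings)["label"].tolist())  # gaps become 'Unknown'
    print(label_gaps_by_context(readings)["label"].tolist())
    # gaps become e.g. 'Sleep->Cook#2' and 'Cook->Eat#4'
```

Under these assumptions, the variants differ only in whether unlabeled readings are discarded and in how much context the substitute labels carry, which is the contrast the paper's evaluation measures.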
Related papers
- Enhancing Hyperspectral Image Prediction with Contrastive Learning in Low-Label Regime [0.810304644344495]
Self-supervised contrastive learning is an effective approach for addressing the challenge of limited labelled data.
We evaluate the method's performance for both the single-label and multi-label classification tasks.
arXiv Detail & Related papers (2024-10-10T10:20:16Z) - Virtual Category Learning: A Semi-Supervised Learning Method for Dense
Prediction with Extremely Limited Labels [63.16824565919966]
This paper proposes to use confusing samples proactively without label correction.
A Virtual Category (VC) is assigned to each confusing sample in such a way that it can safely contribute to the model optimisation.
Our intriguing findings highlight the usage of VC learning in dense vision tasks.
arXiv Detail & Related papers (2023-12-02T16:23:52Z) - Semi-Supervised End-To-End Contrastive Learning For Time Series
Classification [10.635321868623883]
Time series classification is a critical task in various domains, such as finance, healthcare, and sensor data analysis.
We propose an end-to-end model called SLOTS (Semi-supervised Learning fOr Time clasSification)
arXiv Detail & Related papers (2023-10-13T04:22:21Z) - Drawing the Same Bounding Box Twice? Coping Noisy Annotations in Object
Detection with Repeated Labels [6.872072177648135]
We propose a novel localization algorithm that adapts well-established ground truth estimation methods.
Our algorithm also shows superior performance during training on the TexBiG dataset.
arXiv Detail & Related papers (2023-09-18T13:08:44Z) - Exploring Structured Semantic Prior for Multi Label Recognition with
Incomplete Labels [60.675714333081466]
Multi-label recognition (MLR) with incomplete labels is very challenging.
Recent works strive to explore the image-to-label correspondence in the vision-language model, ie, CLIP, to compensate for insufficient annotations.
We advocate remedying the deficiency of label supervision for the MLR with incomplete labels by deriving a structured semantic prior.
arXiv Detail & Related papers (2023-03-23T12:39:20Z) - LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds [62.49198183539889]
We propose a label-efficient semantic segmentation pipeline for outdoor scenes with LiDAR point clouds.
Our method co-designs an efficient labeling process with semi/weakly supervised learning.
Our proposed method is even highly competitive compared to the fully supervised counterpart with 100% labels.
arXiv Detail & Related papers (2022-10-14T19:13:36Z) - SparseDet: Improving Sparsely Annotated Object Detection with
Pseudo-positive Mining [76.95808270536318]
We propose an end-to-end system that learns to separate proposals into labeled and unlabeled regions using Pseudo-positive mining.
While the labeled regions are processed as usual, self-supervised learning is used to process the unlabeled regions.
We conduct exhaustive experiments on five splits on the PASCAL-VOC and COCO datasets achieving state-of-the-art performance.
arXiv Detail & Related papers (2022-01-12T18:57:04Z) - Semi-Automatic Data Annotation guided by Feature Space Projection [117.9296191012968]
We present a semi-automatic data annotation approach based on suitable feature space projection and semi-supervised label estimation.
We validate our method on the popular MNIST dataset and on images of human intestinal parasites with and without fecal impurities.
Our results demonstrate the added-value of visual analytics tools that combine complementary abilities of humans and machines for more effective machine learning.
arXiv Detail & Related papers (2020-07-27T17:03:50Z) - EHSOD: CAM-Guided End-to-end Hybrid-Supervised Object Detection with
Cascade Refinement [53.69674636044927]
We present EHSOD, an end-to-end hybrid-supervised object detection system.
It can be trained in one shot on both fully and weakly-annotated data.
It achieves comparable results on multiple object detection benchmarks with only 30% fully-annotated data.
arXiv Detail & Related papers (2020-02-18T08:04:58Z)
This list is automatically generated from the titles and abstracts of the papers listed on this site.