Semi-supervised binary classification with latent distance learning
- URL: http://arxiv.org/abs/2211.15153v1
- Date: Mon, 28 Nov 2022 09:05:26 GMT
- Title: Semi-supervised binary classification with latent distance learning
- Authors: Imam Mustafa Kamal and Hyerim Bae
- Abstract summary: We propose a new learning representation to solve the binary classification problem using a few labels with a random k-pair cross-distance learning mechanism.
With few labels and without any data augmentation techniques, the proposed method outperformed state-of-the-art semi-supervised and self-supervised learning methods.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Binary classification (BC) is a practical task that is ubiquitous in
real-world problems, such as distinguishing healthy and unhealthy objects in
biomedical diagnostics and defective and non-defective products in
manufacturing inspections. Nonetheless, fully annotated data are commonly
required to effectively solve this problem, and their collection by domain
experts is a tedious and expensive procedure. In contrast to BC, several
significant semi-supervised learning techniques that rely heavily on stochastic
data augmentation have been devised for solving multi-class
classification. In this study, we demonstrate that the stochastic data
augmentation technique is less suitable for solving typical BC problems because
it can omit crucial features that strictly distinguish between positive and
negative samples. To address this issue, we propose a new learning
representation to solve the BC problem using a few labels with a random k-pair
cross-distance learning mechanism. First, by harnessing a few labeled samples,
the encoder network learns the projection of positive and negative samples in
angular spaces to maximize and minimize their inter-class and intra-class
distances, respectively. Second, the classifier learns to discriminate between
positive and negative samples using on-the-fly labels generated based on the
angular space and labeled samples to solve BC tasks. Extensive experiments were
conducted using four real-world publicly available BC datasets. With few labels
and without any data augmentation techniques, the proposed method outperformed
state-of-the-art semi-supervised and self-supervised learning methods.
Moreover, with 10% labeling, our semi-supervised classifier could obtain
competitive accuracy compared with a fully supervised setting.
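The two-stage mechanism described above can be illustrated with a small sketch. The code below is a minimal approximation, not the authors' implementation: it assumes the "angular space" means L2-normalized embeddings compared by cosine similarity, and it samples random pairs from a handful of labeled examples to pull same-class pairs together and push cross-class pairs apart. The encoder architecture, loss form, and names (Encoder, k_pair_cosine_loss, k) are illustrative assumptions.

```python
# Minimal sketch, assuming "angular space" = L2-normalized embeddings compared
# by cosine similarity. Names and architecture are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Small MLP that maps inputs onto the unit hypersphere."""
    def __init__(self, in_dim: int, emb_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Linear(128, emb_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.normalize(self.net(x), dim=-1)  # unit norm -> angular space

def k_pair_cosine_loss(z: torch.Tensor, y: torch.Tensor, k: int = 64) -> torch.Tensor:
    """Sample k random index pairs; pull same-class pairs together and push
    cross-class pairs apart in cosine similarity."""
    n = z.size(0)
    i = torch.randint(0, n, (k,))
    j = torch.randint(0, n, (k,))
    cos = (z[i] * z[j]).sum(dim=-1)            # cosine similarity in [-1, 1]
    same = (y[i] == y[j]).float()
    # same-class pairs: drive cos toward 1; cross-class pairs: hinge cos below 0
    loss = same * (1.0 - cos) + (1.0 - same) * F.relu(cos)
    return loss.mean()

if __name__ == "__main__":
    torch.manual_seed(0)
    x = torch.randn(100, 20)                    # a few labeled samples
    y = torch.randint(0, 2, (100,))             # binary labels
    enc = Encoder(in_dim=20)
    opt = torch.optim.Adam(enc.parameters(), lr=1e-3)
    for _ in range(200):
        opt.zero_grad()
        loss = k_pair_cosine_loss(enc(x), y)
        loss.backward()
        opt.step()
```

In the paper's full pipeline, the angular distances between unlabeled samples and the labeled anchors would then generate on-the-fly labels for training the downstream classifier; that second stage is omitted from this sketch.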
Related papers
- Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition [50.61991746981703]
Current state-of-the-art LTSSL approaches rely on high-quality pseudo-labels for large-scale unlabeled data.
This paper introduces a novel probabilistic framework that unifies various recent proposals in long-tail learning.
We introduce a continuous contrastive learning method, CCL, extending our framework to unlabeled data using reliable and smoothed pseudo-labels.
arXiv Detail & Related papers (2024-10-08T15:06:10Z)
- DALSA: Domain Adaptation for Supervised Learning From Sparsely Annotated MR Images [2.352695945685781]
We propose a new method that employs transfer learning techniques to correct sampling selection errors introduced by sparse annotations during supervised learning for automated tumor segmentation.
The proposed method derives high-quality classifiers for the different tissue classes from sparse and unambiguous annotations.
Compared to training on fully labeled data, we reduced the time for labeling and training by factors greater than 70 and 180, respectively, without sacrificing accuracy.
arXiv Detail & Related papers (2024-03-12T09:17:21Z)
- Performance Evaluation of Semi-supervised Learning Frameworks for Multi-Class Weed Detection [15.828967396019143]
Effective weed control plays a crucial role in optimizing crop yield and enhancing agricultural product quality.
Recent advances in precision weed management enabled by ML and DL provide a sustainable alternative.
Semi-supervised learning methods have gained increased attention in the broader domain of computer vision.
arXiv Detail & Related papers (2024-03-06T00:59:51Z)
- Virtual Category Learning: A Semi-Supervised Learning Method for Dense Prediction with Extremely Limited Labels [63.16824565919966]
This paper proposes to use confusing samples proactively without label correction.
A Virtual Category (VC) is assigned to each confusing sample in such a way that it can safely contribute to the model optimisation.
Our intriguing findings highlight the usage of VC learning in dense vision tasks.
arXiv Detail & Related papers (2023-12-02T16:23:52Z)
- Revisiting Class Imbalance for End-to-end Semi-Supervised Object Detection [1.6249267147413524]
Semi-supervised object detection (SSOD) has made significant progress with the development of pseudo-label-based end-to-end methods.
Many methods face challenges due to class imbalance, which hinders the effectiveness of the pseudo-label generator.
In this paper, we examine the root causes of low-quality pseudo-labels and present novel learning mechanisms to improve the label generation quality.
arXiv Detail & Related papers (2023-06-04T06:01:53Z)
- Class-Distribution-Aware Pseudo Labeling for Semi-Supervised Multi-Label Learning [97.88458953075205]
Pseudo-labeling has emerged as a popular and effective approach for utilizing unlabeled data.
This paper proposes a novel solution called Class-Aware Pseudo-Labeling (CAP) that performs pseudo-labeling in a class-aware manner.
arXiv Detail & Related papers (2023-05-04T12:52:18Z)
- Hierarchical Semi-Supervised Contrastive Learning for Contamination-Resistant Anomaly Detection [81.07346419422605]
Anomaly detection aims at identifying deviant samples from the normal data distribution.
Contrastive learning has provided a successful way to learn sample representations that enable effective discrimination of anomalies.
We propose a novel hierarchical semi-supervised contrastive learning framework, for contamination-resistant anomaly detection.
arXiv Detail & Related papers (2022-07-24T18:49:26Z)
- A multi-stage semi-supervised improved deep embedded clustering (MS-SSIDEC) method for bearing fault diagnosis under the situation of insufficient labeled samples [20.952315351460527]
It costs a lot of labor and time to label data in actual industrial processes, which challenges the application of intelligent fault diagnosis methods.
To solve this problem, a multi-stage semi-supervised improved deep embedded clustering (MS-SSIDEC) method is proposed.
This method includes three stages: pre-training, deep clustering and enhanced supervised learning.
arXiv Detail & Related papers (2021-09-28T06:49:40Z)
- Deep Semi-supervised Metric Learning with Dual Alignment for Cervical Cancer Cell Detection [49.78612417406883]
We propose a novel semi-supervised deep metric learning method for cervical cancer cell detection.
Our model learns an embedding metric space and conducts dual alignment of semantic features on both the proposal and prototype levels.
We construct a large-scale dataset for semi-supervised cervical cancer cell detection for the first time, consisting of 240,860 cervical cell images.
arXiv Detail & Related papers (2021-04-07T17:11:27Z)
- Minimax Active Learning [61.729667575374606]
Active learning aims to develop label-efficient algorithms by querying the most representative samples to be labeled by a human annotator.
Current active learning techniques either rely on model uncertainty to select the most uncertain samples or use clustering or reconstruction to choose the most diverse set of unlabeled examples.
We develop a semi-supervised minimax entropy-based active learning algorithm that leverages both uncertainty and diversity in an adversarial manner.
arXiv Detail & Related papers (2020-12-18T19:03:40Z)
- Semi-supervised and Unsupervised Methods for Heart Sounds Classification in Restricted Data Environments [4.712158833534046]
This study uses various supervised, semi-supervised and unsupervised approaches on the PhysioNet/CinC 2016 Challenge dataset.
A GAN based semi-supervised method is proposed, which allows the usage of unlabelled data samples to boost the learning of data distribution.
In particular, unsupervised feature extraction using a 1D CNN autoencoder coupled with a one-class SVM obtains good performance without any data labelling (a minimal sketch of this pipeline appears after the list).
arXiv Detail & Related papers (2020-06-04T02:07:35Z)
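The unlabeled pipeline mentioned in the last entry (1D CNN autoencoder features fed to a one-class SVM) can be sketched roughly as follows. This is a sketch under stated assumptions, not details taken from that paper: the architecture, signal length (1024), pooling, and hyperparameters are all illustrative.

```python
# Rough sketch: a 1D convolutional autoencoder learns features from unlabeled
# signals; a one-class SVM fitted on pooled bottleneck codes scores how
# "normal" each recording is. Architecture and hyperparameters are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F
from sklearn.svm import OneClassSVM

class Conv1dAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=9, stride=4, padding=4), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=9, stride=4, padding=4), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(32, 16, kernel_size=8, stride=4, padding=2), nn.ReLU(),
            nn.ConvTranspose1d(16, 1, kernel_size=8, stride=4, padding=2),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

if __name__ == "__main__":
    torch.manual_seed(0)
    signals = torch.randn(64, 1, 1024)          # stand-in for unlabeled 1D signals
    model = Conv1dAutoencoder()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(50):                         # unsupervised reconstruction training
        opt.zero_grad()
        recon, _ = model(signals)
        F.mse_loss(recon, signals).backward()
        opt.step()
    with torch.no_grad():
        _, codes = model(signals)
    feats = codes.mean(dim=-1).numpy()          # pool bottleneck codes into vectors
    ocsvm = OneClassSVM(nu=0.1, kernel="rbf").fit(feats)
    scores = ocsvm.decision_function(feats)     # higher score = more "normal"
```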