Optimizing Diffusion Rate and Label Reliability in a Graph-Based
Semi-supervised Classifier
- URL: http://arxiv.org/abs/2201.03456v1
- Date: Mon, 10 Jan 2022 16:58:52 GMT
- Authors: Bruno Klaus de Aquino Afonso, Lilian Berton
- Abstract summary: The Local and Global Consistency (LGC) algorithm is one of the most well-known graph-based semi-supervised learning (GSSL) classifiers.
We discuss how removing the self-influence of a labeled instance may be beneficial, and how it relates to leave-one-out error.
Within this framework, we propose methods to estimate label reliability and diffusion rate.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Semi-supervised learning has received considerable attention from researchers, as it exploits the structure of unlabeled data to achieve competitive classification results with far fewer labels than supervised approaches. The Local and Global Consistency (LGC) algorithm is one of the most well-known graph-based semi-supervised learning (GSSL) classifiers. Notably, its solution can be
written as a linear combination of the known labels. The coefficients of this
linear combination depend on a parameter $\alpha$, determining the decay of the
reward over time when reaching labeled vertices in a random walk. In this work,
we discuss how removing the self-influence of a labeled instance may be
beneficial, and how it relates to leave-one-out error. Moreover, we propose to
minimize this leave-one-out loss with automatic differentiation. Within this
framework, we propose methods to estimate label reliability and diffusion rate.
Optimizing the diffusion rate is more efficiently accomplished with a spectral
representation. Results show that the label reliability approach competes with
robust L1-norm methods and that removing diagonal entries reduces the risk of
overfitting and leads to suitable criteria for parameter selection.
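As a concrete illustration of the abstract above, the following is a minimal NumPy sketch (not the authors' code) of the closed-form LGC solution, the removal of diagonal entries that yields a leave-one-out-style prediction for labeled vertices, and the spectral representation that lets many diffusion rates be evaluated from a single eigendecomposition. The function names and the toy graph are illustrative assumptions.

```python
import numpy as np

def lgc_closed_form(W, Y, alpha=0.9):
    """Closed-form LGC solution F = (1 - alpha) (I - alpha S)^{-1} Y,
    where S = D^{-1/2} W D^{-1/2} is the normalized affinity matrix.
    Assumes every vertex has positive degree."""
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    S = D_inv_sqrt @ W @ D_inv_sqrt
    n = W.shape[0]
    # Propagation matrix: the solution is P @ Y, i.e. a linear
    # combination of the known labels, as stated in the abstract.
    P = (1.0 - alpha) * np.linalg.inv(np.eye(n) - alpha * S)
    return P @ Y, P, S

def lgc_spectral(S, Y, alphas):
    """Evaluate the LGC solution for many diffusion rates from ONE
    eigendecomposition: (I - a S)^{-1} = U diag(1/(1 - a*lam)) U^T."""
    lam, U = np.linalg.eigh(S)   # S is symmetric
    UtY = U.T @ Y                # computed once, reused for every alpha
    out = {}
    for a in alphas:
        scale = (1.0 - a) / (1.0 - a * lam)
        out[a] = U @ (scale[:, None] * UtY)
    return out

# Toy graph: two triangles {0,1,2} and {3,4,5} joined by the edge (2, 3).
W = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)
Y = np.zeros((6, 2))
Y[0, 0] = 1.0   # vertex 0 labeled as class 0
Y[5, 1] = 1.0   # vertex 5 labeled as class 1

F, P, S = lgc_closed_form(W, Y, alpha=0.9)

# Removing self-influence: zeroing the diagonal of P means a labeled
# vertex's prediction no longer depends on its own label, giving a
# leave-one-out-style estimate as discussed in the abstract.
P_loo = P - np.diag(np.diag(P))
F_loo = P_loo @ Y

# Sweep diffusion rates cheaply via the spectral representation.
F_spec = lgc_spectral(S, Y, alphas=[0.5, 0.9])
```

On the toy graph, each unlabeled vertex is assigned the class of the nearer labeled vertex, and `F_spec[0.9]` coincides with the direct matrix-inverse solution, which is what makes the one-time eigendecomposition an efficient way to tune the diffusion rate.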
Related papers
- Inaccurate Label Distribution Learning with Dependency Noise [52.08553913094809]
We introduce the Dependent Noise-based Inaccurate Label Distribution Learning (DN-ILDL) framework to tackle the challenges posed by noise in label distribution learning.
We show that DN-ILDL effectively addresses the ILDL problem and outperforms existing LDL methods.
arXiv Detail & Related papers (2024-05-26T07:58:07Z)
- Generating Unbiased Pseudo-labels via a Theoretically Guaranteed Chebyshev Constraint to Unify Semi-supervised Classification and Regression [57.17120203327993]
The threshold-to-pseudo-label (T2L) process in classification uses confidence to determine label quality.
By its nature, regression also requires unbiased methods to generate high-quality labels.
We propose a theoretically guaranteed constraint for generating unbiased labels based on Chebyshev's inequality.
arXiv Detail & Related papers (2023-11-03T08:39:35Z)
- All Points Matter: Entropy-Regularized Distribution Alignment for Weakly-supervised 3D Segmentation [67.30502812804271]
Pseudo-labels are widely employed in weakly supervised 3D segmentation tasks where only sparse ground-truth labels are available for learning.
We propose a novel learning strategy to regularize the generated pseudo-labels and effectively narrow the gaps between pseudo-labels and model predictions.
arXiv Detail & Related papers (2023-05-25T08:19:31Z)
- Class-Distribution-Aware Pseudo Labeling for Semi-Supervised Multi-Label Learning [97.88458953075205]
Pseudo-labeling has emerged as a popular and effective approach for utilizing unlabeled data.
This paper proposes a novel solution called Class-Aware Pseudo-Labeling (CAP) that performs pseudo-labeling in a class-aware manner.
arXiv Detail & Related papers (2023-05-04T12:52:18Z)
- Dist-PU: Positive-Unlabeled Learning from a Label Distribution Perspective [89.5370481649529]
This paper proposes a label distribution perspective for positive-unlabeled (PU) learning.
Motivated by this view, we pursue consistency between the predicted and ground-truth label distributions.
Experiments on three benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-12-06T07:38:29Z)
- Plug-and-Play Pseudo Label Correction Network for Unsupervised Person Re-identification [36.3733132520186]
We propose a graph-based pseudo label correction network (GLC) to refine the pseudo labels in the manner of supervised clustering.
GLC learns to rectify the initial noisy labels using relationship constraints between samples on the k-nearest-neighbor graph.
Our method is widely compatible with various clustering-based methods and consistently improves state-of-the-art performance.
arXiv Detail & Related papers (2022-06-14T05:59:37Z)
- Multi-class Probabilistic Bounds for Self-learning [13.875239300089861]
Pseudo-labeling is prone to error and runs the risk of adding noisy labels into unlabeled training data.
We present a probabilistic framework for analyzing self-learning in the multi-class classification scenario with partially labeled data.
arXiv Detail & Related papers (2021-09-29T13:57:37Z)
- In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label Selection Framework for Semi-Supervised Learning [53.1047775185362]
Pseudo-labeling (PL) is a general SSL approach free of domain-specific constraints, but it performs relatively poorly in its original formulation.
We argue that PL underperforms due to the erroneous high confidence predictions from poorly calibrated models.
We propose an uncertainty-aware pseudo-label selection (UPS) framework which improves pseudo labeling accuracy by drastically reducing the amount of noise encountered in the training process.
arXiv Detail & Related papers (2021-01-15T23:29:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.