Semi-supervised Learning with Deterministic Labeling and Large Margin Projection
- URL: http://arxiv.org/abs/2208.08058v1
- Date: Wed, 17 Aug 2022 04:09:35 GMT
- Title: Semi-supervised Learning with Deterministic Labeling and Large Margin Projection
- Authors: Ji Xu, Gang Ren, Yao Xiao, Shaobo Li, Guoyin Wang
- Abstract summary: The centrality and diversity of the labeled data strongly influence the performance of semi-supervised learning (SSL).
This study learns a kernelized large-margin metric for a small set of the most stable and most divergent data points, identified using the OLF structure.
Owing to this novel design, the accuracy and stability of the OLF-based SSL model are significantly improved over its baseline methods.
- Score: 25.398314796157933
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The centrality and diversity of the labeled data strongly influence the
performance of semi-supervised learning (SSL), yet most SSL models select the
labeled data randomly. How to guarantee the centrality and diversity of the
labeled data has so far received little research attention. The optimal leading
forest (OLF) has been observed to reveal the difference evolution within a class
when used to develop an SSL model. The key intuition of this study is to learn a
kernelized large-margin metric for a small number of the most stable and most
divergent data points, identified via the OLF structure. An optimization problem
is formulated to achieve this goal. OLF also facilitates learning multiple local
metrics, which addresses the multi-modal and mixed-modal problems in SSL. Owing
to this novel design, the accuracy and stability of the OLF-based SSL model are
significantly improved over its baseline methods without sacrificing much
efficiency. Experimental studies show that the proposed method achieves
encouraging accuracy and running time compared with state-of-the-art graph SSL
methods. Code is available at https://github.com/alanxuji/DeLaLA.
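The abstract's selection-then-metric pipeline can be made concrete with a toy sketch. The Python snippet below is a minimal illustration, not the authors' implementation (see https://github.com/alanxuji/DeLaLA for the real code): it builds a density-peaks-style leading structure in which each point's leader is its nearest higher-density neighbor, then scores points as "stable" (high density times lead distance, i.e. root-like) or "divergent" (large lead distance relative to density, i.e. late in the within-class difference evolution). The function name and both scoring rules are assumptions for illustration; the paper's deterministic OLF-based criterion differs in detail.

```python
# Minimal, illustrative sketch only -- not the authors' implementation
# (see https://github.com/alanxuji/DeLaLA). It builds a density-peaks-style
# leading structure and scores points for labeling; the scoring rules
# (rho*delta for "stable", delta/rho for "divergent") are assumptions.
import numpy as np

def select_for_labeling(X, n_stable=5, n_divergent=5, bandwidth=1.0):
    """Pick candidate points to label: central roots and far-from-leader points."""
    n = len(X)
    # Pairwise Euclidean distances and a Gaussian-kernel density estimate.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    rho = np.exp(-(d / bandwidth) ** 2).sum(axis=1)
    # Leading edge: each point's leader is its nearest higher-density neighbor.
    delta = np.full(n, d.max())
    for i in range(n):
        higher = np.flatnonzero(rho > rho[i])
        if higher.size:
            delta[i] = d[i, higher].min()
    stable = np.argsort(-(rho * delta))[:n_stable]        # root-like, most central
    divergent = np.argsort(-(delta / rho))[:n_divergent]  # outlying, leaf-like
    return np.unique(np.concatenate([stable, divergent]))

# Tiny usage example: two Gaussian blobs.
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])
print(select_for_labeling(X))
```

The returned indices are the points one would hand to an oracle for labeling before fitting the kernelized large-margin metric on just those samples.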
Related papers
- Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition [50.61991746981703]
Current state-of-the-art LTSSL approaches rely on high-quality pseudo-labels for large-scale unlabeled data.
This paper introduces a novel probabilistic framework that unifies various recent proposals in long-tail learning.
We introduce a continuous contrastive learning method, CCL, extending our framework to unlabeled data using reliable and smoothed pseudo-labels.
arXiv Detail & Related papers (2024-10-08T15:06:10Z)
- A Closer Look at Benchmarking Self-Supervised Pre-training with Image Classification [51.35500308126506]
Self-supervised learning (SSL) is a machine learning approach where the data itself provides supervision, eliminating the need for external labels.
We study how classification-based evaluation protocols for SSL correlate and how well they predict downstream performance on different dataset types.
arXiv Detail & Related papers (2024-07-16T23:17:36Z)
- Erasing the Bias: Fine-Tuning Foundation Models for Semi-Supervised Learning [4.137391543972184]
Semi-supervised learning (SSL) has witnessed remarkable progress, resulting in numerous method variations.
In this paper, we present a novel SSL approach named FineSSL that addresses a key limitation of existing methods by adapting pre-trained foundation models.
We demonstrate that FineSSL sets a new state of the art for SSL on multiple benchmark datasets, reduces the training cost by over six times, and can seamlessly integrate various fine-tuning and modern SSL algorithms.
arXiv Detail & Related papers (2024-05-20T03:33:12Z)
- Reinforcement Learning-Guided Semi-Supervised Learning [20.599506122857328]
We propose a novel Reinforcement Learning Guided SSL method, RLGSSL, that formulates SSL as a one-armed bandit problem.
RLGSSL incorporates a carefully designed reward function that balances the use of labeled and unlabeled data to enhance generalization performance; one illustrative form of such a reward is sketched after this entry.
We demonstrate the effectiveness of RLGSSL through extensive experiments on several benchmark datasets and show that our approach achieves consistent superior performance compared to state-of-the-art SSL methods.
arXiv Detail & Related papers (2024-05-02T21:52:24Z)
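As a hedged illustration of the RLGSSL entry above (not the paper's actual reward design), one plausible reward blends a supervised loss on the labeled batch with a consistency loss between two augmented views of the unlabeled batch; the function name, the weak/strong-view inputs, and the fixed weight `alpha` are all assumptions:

```python
# Illustrative only: one plausible reward that trades labeled-data fit against
# unlabeled-data consistency, in the spirit of the RLGSSL summary above.
# The loss terms, two-view setup, and the weight `alpha` are assumptions,
# not the paper's actual design.
import numpy as np

def reward(p_labeled, y_labeled, p_weak, p_strong, alpha=0.5):
    """Negative blended loss: supervised CE plus two-view consistency."""
    eps = 1e-12
    # Supervised cross-entropy on the labeled batch.
    ce = -np.mean(np.log(p_labeled[np.arange(len(y_labeled)), y_labeled] + eps))
    # Consistency: strong-view predictions should match weak-view pseudo-labels.
    pseudo = p_weak.argmax(axis=1)
    cons = -np.mean(np.log(p_strong[np.arange(len(pseudo)), pseudo] + eps))
    return -(alpha * ce + (1.0 - alpha) * cons)  # higher reward = lower loss

# Tiny usage example with random softmax outputs.
rng = np.random.default_rng(0)
def softmax(z): return np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
print(reward(softmax(rng.normal(size=(8, 3))), rng.integers(0, 3, 8),
             softmax(rng.normal(size=(16, 3))), softmax(rng.normal(size=(16, 3)))))
```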
- On Pretraining Data Diversity for Self-Supervised Learning [57.91495006862553]
We explore the impact of training with more diverse datasets on the performance of self-supervised learning (SSL) under a fixed computational budget.
Our findings consistently demonstrate that increasing pretraining data diversity enhances SSL performance, albeit only when the distribution distance to the downstream data is minimal.
arXiv Detail & Related papers (2024-03-20T17:59:58Z)
- On the Effectiveness of Out-of-Distribution Data in Self-Supervised Long-Tail Learning [15.276356824489431]
We propose Contrastive with Out-of-distribution (OOD) data for Long-Tail learning (COLT).
We empirically identify the counter-intuitive usefulness of OOD samples in SSL long-tailed learning.
Our method improves the performance of SSL on long-tailed datasets by a large margin.
arXiv Detail & Related papers (2023-06-08T04:32:10Z)
- Benchmark for Uncertainty & Robustness in Self-Supervised Learning [0.0]
Self-Supervised Learning is crucial for real-world applications, especially in data-hungry domains such as healthcare and self-driving cars.
In this paper, we explore variants of SSL methods, including Jigsaw Puzzles, Context, Rotation, Geometric Transformations Prediction for vision, as well as BERT and GPT for language tasks.
Our goal is to create a benchmark with outputs from experiments, providing a starting point for new SSL methods in Reliable Machine Learning.
arXiv Detail & Related papers (2022-12-23T15:46:23Z)
- OpenLDN: Learning to Discover Novel Classes for Open-World Semi-Supervised Learning [110.40285771431687]
Semi-supervised learning (SSL) is one of the dominant approaches to address the annotation bottleneck of supervised learning.
Recent SSL methods can effectively leverage a large repository of unlabeled data to improve performance while relying on a small set of labeled data.
This work introduces OpenLDN, which utilizes a pairwise similarity loss to discover novel classes; a generic form of such a loss is sketched after this entry.
arXiv Detail & Related papers (2022-07-05T18:51:05Z)
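The sketch below shows a generic pairwise-similarity loss of the family the OpenLDN entry above mentions, assuming pair targets derived by thresholding cosine similarity of features; that pairing rule is an illustrative choice, not necessarily OpenLDN's exact mechanism:

```python
# A generic pairwise-similarity loss of the kind the OpenLDN summary mentions:
# samples judged similar should receive agreeing class predictions. Deriving
# pair targets by thresholding cosine similarity is an assumption here, not
# necessarily OpenLDN's exact pairing rule.
import numpy as np

def pairwise_similarity_loss(feats, probs, threshold=0.9):
    """BCE between pair targets (thresholded cosine sim) and prediction agreement."""
    eps = 1e-12
    f = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + eps)
    targets = (f @ f.T > threshold).astype(float)   # 1 = treated as a same-class pair
    agree = np.clip(probs @ probs.T, eps, 1 - eps)  # prob. two samples share a class
    return -np.mean(targets * np.log(agree) + (1 - targets) * np.log(1 - agree))

# Tiny usage example.
rng = np.random.default_rng(0)
feats = rng.normal(size=(10, 4))
logits = rng.normal(size=(10, 3))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
print(pairwise_similarity_loss(feats, probs))
```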
- Self-supervised Learning is More Robust to Dataset Imbalance [65.84339596595383]
We investigate self-supervised learning under dataset imbalance.
Off-the-shelf self-supervised representations are already more robust to class imbalance than supervised representations.
We devise a re-weighted regularization technique that consistently improves the SSL representation quality on imbalanced datasets.
arXiv Detail & Related papers (2021-10-11T06:29:56Z)
- Self-Supervised Learning of Graph Neural Networks: A Unified Review [50.71341657322391]
Self-supervised learning is emerging as a new paradigm for making use of large amounts of unlabeled samples.
We provide a unified review of different ways of training graph neural networks (GNNs) using SSL.
Our treatment of SSL methods for GNNs sheds light on the similarities and differences of various methods, setting the stage for developing new methods and algorithms.
arXiv Detail & Related papers (2021-02-22T03:43:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.