Robust Deep Semi-Supervised Learning: A Brief Introduction
- URL: http://arxiv.org/abs/2202.05975v1
- Date: Sat, 12 Feb 2022 04:16:41 GMT
- Title: Robust Deep Semi-Supervised Learning: A Brief Introduction
- Authors: Lan-Zhe Guo and Zhi Zhou and Yu-Feng Li
- Abstract summary: Semi-supervised learning (SSL) aims to improve learning performance by leveraging unlabeled data when labels are insufficient.
SSL with deep models has proven successful on standard benchmark tasks.
However, these models remain vulnerable to various robustness threats in real-world applications.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semi-supervised learning (SSL) is the branch of machine learning that aims to
improve learning performance by leveraging unlabeled data when labels are
insufficient. Recently, SSL with deep models has proven to be successful on
standard benchmark tasks. However, these models remain vulnerable to various
robustness threats in real-world applications, because the benchmarks provide
clean unlabeled data whereas unlabeled data in realistic scenarios may be
corrupted. Many researchers have pointed out that SSL suffers severe
performance degradation once it exploits corrupted unlabeled data. Thus, there
is an urgent need for SSL algorithms that work robustly with corrupted
unlabeled data. To fully understand robust SSL, we conduct a survey
study. We first clarify a formal definition of robust SSL from the perspective
of machine learning. Then, we classify the robustness threats into three
categories: i) distribution corruption, i.e., unlabeled data distribution is
mismatched with labeled data; ii) feature corruption, i.e., the features of
unlabeled examples are adversarially attacked; and iii) label corruption, i.e.,
the label distribution of unlabeled data is imbalanced. Under this unified
taxonomy, we provide a thorough review and discussion of recent works that
focus on these issues. Finally, we suggest promising directions within robust
SSL to provide insights for future research.
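To ground the taxonomy, the sketch below shows one widely used safeguard against corrupted unlabeled data: a pseudo-labeling step that keeps only unlabeled examples whose predicted confidence clears a threshold, so low-confidence (possibly out-of-distribution) samples contribute no gradient. This is a minimal illustration assuming a PyTorch classifier; `model`, `tau`, and the batch tensors are hypothetical, not the survey's own method.

```python
# Illustrative sketch (not the survey's method): confidence-thresholded
# pseudo-labeling, a common safeguard when unlabeled data may be corrupted.
# `model`, `tau`, and the batch tensors are assumptions for illustration.
import torch
import torch.nn.functional as F

def robust_ssl_loss(model, x_labeled, y_labeled, x_unlabeled, tau=0.95):
    # Supervised term on the small labeled batch.
    sup_loss = F.cross_entropy(model(x_labeled), y_labeled)

    # Pseudo-label the unlabeled batch without tracking gradients.
    with torch.no_grad():
        probs = torch.softmax(model(x_unlabeled), dim=1)
        conf, pseudo = probs.max(dim=1)
        mask = conf >= tau  # drop low-confidence (possibly corrupted) samples

    # Unsupervised term only on confidently pseudo-labeled samples.
    if mask.any():
        unsup_loss = F.cross_entropy(model(x_unlabeled[mask]), pseudo[mask])
    else:
        unsup_loss = torch.zeros((), device=x_labeled.device)
    return sup_loss + unsup_loss
```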
Related papers
- OwMatch: Conditional Self-Labeling with Consistency for Open-World Semi-Supervised Learning
Semi-supervised learning (SSL) offers a robust framework for harnessing the potential of unannotated data.
The emergence of open-world SSL (OwSSL) introduces a more practical challenge, wherein unlabeled data may encompass samples from unseen classes.
We propose an effective framework called OwMatch, combining conditional self-labeling and open-world hierarchical thresholding.
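The abstract does not spell out the thresholding rule, so the following is only a rough sketch of class-wise confidence thresholds for self-labeling; the adaptation rule and all names are assumed for illustration rather than taken from OwMatch.

```python
# Rough sketch (assumed): class-wise confidence thresholds for
# self-labeling. The adaptation rule below is illustrative and is not
# OwMatch's exact hierarchical thresholding.
import torch

def classwise_pseudo_labels(probs, base_tau=0.95):
    """probs: (N, C) softmax outputs of a classifier on unlabeled data."""
    conf, pseudo = probs.max(dim=1)
    class_strength = probs.new_zeros(probs.size(1))
    for c in range(probs.size(1)):
        hit = pseudo == c
        if hit.any():
            # Average confidence of samples currently assigned to class c.
            class_strength[c] = conf[hit].mean()
    # Under-learned classes get a proportionally lower threshold.
    tau = base_tau * class_strength / class_strength.max().clamp(min=1e-8)
    mask = conf >= tau[pseudo]
    return pseudo, mask
```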
arXiv Detail & Related papers (2024-11-04T06:07:43Z)
- FlatMatch: Bridging Labeled Data and Unlabeled Data with Cross-Sharpness for Semi-Supervised Learning
Semi-Supervised Learning (SSL) has been an effective way to leverage abundant unlabeled data with extremely scarce labeled data.
Most SSL methods are based on instance-wise consistency between different data transformations.
We propose FlatMatch, which minimizes a cross-sharpness measure to ensure consistent learning performance between the labeled and unlabeled datasets.
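The exact cross-sharpness objective is not given in the abstract; as a loose illustration in the spirit of sharpness-aware minimization, one can perturb the weights along the labeled-loss gradient and penalize how much the unlabeled loss changes. Everything below is an assumption, not FlatMatch's actual formulation.

```python
# Loose sketch (assumed, SAM-style) of a cross-sharpness penalty:
# perturb weights against the labeled loss, then measure how much the
# unlabeled loss moves. Not the exact FlatMatch objective.
import torch

def cross_sharpness(model, labeled_loss_fn, unlabeled_loss_fn, rho=0.05):
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(labeled_loss_fn(model), params)
    scale = rho / (torch.norm(torch.stack([g.norm() for g in grads])) + 1e-12)

    base = unlabeled_loss_fn(model).detach()
    with torch.no_grad():                    # ascend along the labeled-loss gradient
        for p, g in zip(params, grads):
            p.add_(g, alpha=scale.item())
    perturbed = unlabeled_loss_fn(model)     # unlabeled loss at the perturbed point
    with torch.no_grad():                    # restore the original weights
        for p, g in zip(params, grads):
            p.sub_(g, alpha=scale.item())
    return perturbed - base                  # cross-sharpness penalty
```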
arXiv Detail & Related papers (2023-10-25T06:57:59Z)
- Contrastive Credibility Propagation for Reliable Semi-Supervised Learning
We propose Contrastive Credibility Propagation (CCP) for deep SSL via iterative transductive pseudo-label refinement.
CCP unifies semi-supervised learning and noisy label learning with the goal of reliably outperforming a supervised baseline in any data scenario.
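As a schematic illustration only, iterative pseudo-label refinement with a per-sample credibility weight might look like the sketch below, where credibility is approximated by the softmax margin; the weight definition and loop structure are assumptions, not CCP's actual update.

```python
# Schematic sketch (assumed): iterative pseudo-label refinement with a
# per-sample credibility weight (here: softmax margin). Not CCP's
# exact procedure.
import torch
import torch.nn.functional as F

def refine_pseudo_labels(model, x_unlabeled, optimizer, rounds=3):
    for _ in range(rounds):
        with torch.no_grad():
            probs = torch.softmax(model(x_unlabeled), dim=1)
        top2 = probs.topk(2, dim=1).values
        credibility = (top2[:, 0] - top2[:, 1]).clamp(min=0.0)  # softmax margin
        pseudo = probs.argmax(dim=1)
        # Weighted self-training step: low-credibility samples barely count.
        optimizer.zero_grad()
        loss = (credibility * F.cross_entropy(model(x_unlabeled), pseudo,
                                              reduction="none")).mean()
        loss.backward()
        optimizer.step()
    return pseudo, credibility
```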
arXiv Detail & Related papers (2022-11-17T23:01:47Z)
- Learning to Infer from Unlabeled Data: A Semi-supervised Learning Approach for Robust Natural Language Inference
Natural Language Inference (NLI) aims at predicting the relation between a pair of sentences (premise and hypothesis) as entailment, contradiction or semantic independence.
Deep learning models have shown promising performance for NLI in recent years, but they rely on large-scale, expensive human-annotated datasets.
Semi-supervised learning (SSL) is a popular technique for reducing the reliance on human annotation by leveraging unlabeled data for training.
arXiv Detail & Related papers (2022-11-05T20:34:08Z)
- Complementing Semi-Supervised Learning with Uncertainty Quantification
We propose a novel unsupervised uncertainty-aware objective that relies on aleatoric and epistemic uncertainty quantification.
Our method outperforms the state of the art on complex datasets such as CIFAR-100 and Mini-ImageNet.
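The abstract does not name the estimator; one common recipe for separating the two uncertainty types is Monte Carlo dropout, sketched below under that assumption: epistemic uncertainty as the disagreement across stochastic forward passes, and aleatoric as the expected per-pass entropy.

```python
# Assumed illustration (Monte Carlo dropout), one common way to split
# epistemic from aleatoric uncertainty; not necessarily this paper's estimator.
import torch

def entropy(p):
    return -(p * p.clamp(min=1e-12).log()).sum(dim=-1)

def mc_dropout_uncertainty(model, x, passes=10):
    model.train()  # keep dropout layers stochastic at inference time
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=1)
                             for _ in range(passes)])    # (T, N, C)
    total = entropy(probs.mean(dim=0))       # entropy of the mean prediction
    aleatoric = entropy(probs).mean(dim=0)   # expected per-pass entropy
    epistemic = total - aleatoric            # mutual information (BALD)
    return epistemic, aleatoric
```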
arXiv Detail & Related papers (2022-07-22T00:15:02Z)
- Towards Realistic Semi-Supervised Learning
We propose a novel approach to tackle SSL in the open-world setting, where we simultaneously learn to classify known and unknown classes.
Our approach substantially outperforms the existing state-of-the-art on seven diverse datasets.
arXiv Detail & Related papers (2022-07-05T19:04:43Z)
- OpenLDN: Learning to Discover Novel Classes for Open-World Semi-Supervised Learning
Semi-supervised learning (SSL) is one of the dominant approaches to address the annotation bottleneck of supervised learning.
Recent SSL methods can effectively leverage a large repository of unlabeled data to improve performance while relying on a small set of labeled data.
This work introduces OpenLDN, which utilizes a pairwise similarity loss to discover novel classes.
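As a minimal, assumed sketch of a pairwise similarity loss: pairs of unlabeled samples that look alike in feature space are treated as same-class targets, and their predicted class posteriors are pushed to agree. The pairing rule below is a simplification, not OpenLDN's exact formulation.

```python
# Assumed sketch of a pairwise similarity loss for novel-class discovery:
# pairs deemed similar in feature space should receive matching class
# posteriors. The cosine-cutoff pairing rule is a simplification.
import torch
import torch.nn.functional as F

def pairwise_similarity_loss(features, logits, sim_cutoff=0.9):
    f = F.normalize(features, dim=1)
    sim = f @ f.t()                                  # cosine similarity matrix
    target = (sim > sim_cutoff).float()              # 1 = assume same class
    probs = torch.softmax(logits, dim=1)
    agreement = probs @ probs.t()                    # prob. both predict same class
    return F.binary_cross_entropy(agreement.clamp(1e-6, 1 - 1e-6), target)
```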
arXiv Detail & Related papers (2022-07-05T18:51:05Z)
- On Non-Random Missing Labels in Semi-Supervised Learning
Semi-Supervised Learning (SSL) is fundamentally a missing label problem.
We explicitly incorporate "class" into SSL.
Our method not only significantly outperforms existing baselines but also surpasses other label bias removal SSL methods.
arXiv Detail & Related papers (2022-06-29T22:01:29Z)
- Self-supervised Learning is More Robust to Dataset Imbalance
We investigate self-supervised learning under dataset imbalance.
Off-the-shelf self-supervised representations are already more robust to class imbalance than supervised representations.
We devise a re-weighted regularization technique that consistently improves the self-supervised representation quality on imbalanced datasets.
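The abstract leaves the re-weighting unspecified; purely as an illustration, one could up-weight a per-example regularization term for rare examples, with rarity estimated here by inverse cluster frequency. Everything in the sketch below is assumed, not the paper's technique.

```python
# Loose illustration (assumed): up-weight a per-example regularization
# term for rare examples, estimated here by inverse cluster frequency.
# This is not the paper's exact technique.
import torch

def reweighted_regularizer(per_example_reg, cluster_ids):
    counts = torch.bincount(cluster_ids).float()
    weights = 1.0 / counts[cluster_ids]    # rare clusters get larger weight
    weights = weights / weights.mean()     # normalize to keep the loss scale
    return (weights * per_example_reg).mean()
```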
arXiv Detail & Related papers (2021-10-11T06:29:56Z)