Deep Low-Density Separation for Semi-Supervised Classification
- URL: http://arxiv.org/abs/2205.11995v1
- Date: Sun, 22 May 2022 11:00:55 GMT
- Title: Deep Low-Density Separation for Semi-Supervised Classification
- Authors: Michael C. Burkhart and Kyle Shan
- Abstract summary: We introduce a novel hybrid method that applies low-density separation to the embedded features.
Our approach effectively classifies thousands of unlabeled users from a relatively small number of hand-classified examples.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Given a small set of labeled data and a large set of unlabeled data,
semi-supervised learning (SSL) attempts to leverage the location of the
unlabeled datapoints in order to create a better classifier than could be
obtained from supervised methods applied to the labeled training set alone.
Effective SSL imposes structural assumptions on the data, e.g. that neighbors
are more likely to share a classification or that the decision boundary lies in
an area of low density. For complex and high-dimensional data, neural networks
can learn feature embeddings to which traditional SSL methods can then be
applied in what we call hybrid methods.
Previously-developed hybrid methods iterate between refining a latent
representation and performing graph-based SSL on this representation. In this
paper, we introduce a novel hybrid method that instead applies low-density
separation to the embedded features. We describe it in detail and discuss why
low-density separation may be better suited for SSL on neural network-based
embeddings than graph-based algorithms. We validate our method using in-house
customer survey data and compare it to other state-of-the-art learning methods.
Our approach effectively classifies thousands of unlabeled users from a
relatively small number of hand-classified examples.
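The low-density-separation idea in the abstract can be illustrated with a small sketch. True transductive SVMs are not in scikit-learn, so this example uses self-training with an SVM as a rough stand-in: confident pseudo-labels iteratively push the decision margin through low-density regions. The dataset, hyperparameters, and the use of `SelfTrainingClassifier` are illustrative assumptions, not the authors' method.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.svm import SVC

# Two-moons data: the class boundary passes through a low-density region,
# which is exactly the structural assumption low-density separation exploits.
X, y = make_moons(n_samples=300, noise=0.1, random_state=0)

# Mask most labels; -1 marks unlabeled points (scikit-learn's convention).
rng = np.random.default_rng(0)
y_semi = np.full_like(y, -1)
labeled_idx = rng.choice(len(y), size=10, replace=False)
y_semi[labeled_idx] = y[labeled_idx]

# Self-training with an RBF SVM: at each round, points classified with high
# confidence are pseudo-labeled and added to the training set, nudging the
# margin into low-density areas (a crude proxy for a transductive SVM).
base = SVC(kernel="rbf", gamma=2.0, probability=True, random_state=0)
clf = SelfTrainingClassifier(base, threshold=0.8).fit(X, y_semi)

acc = clf.score(X, y)
print(f"accuracy on all points from 10 labels: {acc:.2f}")
```

In the paper's hybrid setting, `X` would be the neural-network feature embedding of the raw data rather than the raw inputs themselves.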
Related papers
- Semi-Supervised Sparse Gaussian Classification: Provable Benefits of Unlabeled Data [6.812609988733991]
We study SSL for high dimensional Gaussian classification.
We analyze information theoretic lower bounds for accurate feature selection.
We present simulations that complement our theoretical analysis.
arXiv Detail & Related papers (2024-09-05T08:21:05Z) - A Closer Look at Benchmarking Self-Supervised Pre-training with Image Classification [51.35500308126506]
Self-supervised learning (SSL) is a machine learning approach where the data itself provides supervision, eliminating the need for external labels.
We study how classification-based evaluation protocols for SSL correlate and how well they predict downstream performance on different dataset types.
arXiv Detail & Related papers (2024-07-16T23:17:36Z) - ProtoCon: Pseudo-label Refinement via Online Clustering and Prototypical
Consistency for Efficient Semi-supervised Learning [60.57998388590556]
ProtoCon is a novel method for confidence-based pseudo-labeling.
Online nature of ProtoCon allows it to utilise the label history of the entire dataset in one training cycle.
It delivers significant gains and faster convergence over state-of-the-art methods on standard datasets.
arXiv Detail & Related papers (2023-03-22T23:51:54Z) - OpenLDN: Learning to Discover Novel Classes for Open-World
Semi-Supervised Learning [110.40285771431687]
Semi-supervised learning (SSL) is one of the dominant approaches to address the annotation bottleneck of supervised learning.
Recent SSL methods can effectively leverage a large repository of unlabeled data to improve performance while relying on a small set of labeled data.
This work introduces OpenLDN that utilizes a pairwise similarity loss to discover novel classes.
arXiv Detail & Related papers (2022-07-05T18:51:05Z) - Self-Supervised Learning of Graph Neural Networks: A Unified Review [50.71341657322391]
Self-supervised learning is emerging as a new paradigm for making use of large amounts of unlabeled samples.
We provide a unified review of different ways of training graph neural networks (GNNs) using SSL.
Our treatment of SSL methods for GNNs sheds light on the similarities and differences of various methods, setting the stage for developing new methods and algorithms.
arXiv Detail & Related papers (2021-02-22T03:43:45Z) - Matching Distributions via Optimal Transport for Semi-Supervised
Learning [31.533832244923843]
Semi-Supervised Learning (SSL) approaches have been an influential framework for the usage of unlabeled data.
We propose a new approach that adopts an Optimal Transport (OT) technique serving as a metric of similarity between discrete empirical probability measures.
We evaluate our proposed method against state-of-the-art SSL algorithms on standard datasets, demonstrating its effectiveness.
arXiv Detail & Related papers (2020-12-04T11:15:14Z) - OSLNet: Deep Small-Sample Classification with an Orthogonal Softmax
Layer [77.90012156266324]
This paper aims to find a subspace of neural networks that can facilitate a large decision margin.
We propose the Orthogonal Softmax Layer (OSL), which makes the weight vectors in the classification layer remain orthogonal during both the training and test processes.
Experimental results demonstrate that the proposed OSL has better performance than the methods used for comparison on four small-sample benchmark datasets.
arXiv Detail & Related papers (2020-04-20T02:41:01Z) - Density-Aware Graph for Deep Semi-Supervised Visual Recognition [102.9484812869054]
Semi-supervised learning (SSL) has been extensively studied to improve the generalization ability of deep neural networks for visual recognition.
This paper proposes to solve the SSL problem by building a novel density-aware graph, based on which the neighborhood information can be easily leveraged.
arXiv Detail & Related papers (2020-03-30T02:52:40Z)
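Several of the papers above, like the abstract's discussion of previously-developed hybrid methods, rely on graph-based SSL. A minimal sketch of that alternative, using scikit-learn's `LabelSpreading` (the dataset and hyperparameters are illustrative assumptions, not taken from any of the listed papers):

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.semi_supervised import LabelSpreading

# Same setup as before: mostly unlabeled two-moons data (-1 = unlabeled).
X, y = make_moons(n_samples=300, noise=0.1, random_state=0)
rng = np.random.default_rng(0)
y_semi = np.full_like(y, -1)
labeled_idx = rng.choice(len(y), size=10, replace=False)
y_semi[labeled_idx] = y[labeled_idx]

# Graph-based SSL: build a k-NN similarity graph over all points and
# diffuse the few known labels along its edges until convergence.
model = LabelSpreading(kernel="knn", n_neighbors=7).fit(X, y_semi)

# transduction_ holds the inferred label for every point, labeled or not.
acc = (model.transduction_ == y).mean()
print(f"transductive accuracy from 10 labels: {acc:.2f}")
```

The paper argues that on neural-network embeddings, low-density separation can be better suited than this kind of graph diffusion.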
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.