Overcoming the curse of dimensionality with Laplacian regularization in
semi-supervised learning
- URL: http://arxiv.org/abs/2009.04324v4
- Date: Mon, 29 Nov 2021 12:29:05 GMT
- Title: Overcoming the curse of dimensionality with Laplacian regularization in
semi-supervised learning
- Authors: Vivien Cabannes, Loucas Pillaud-Vivien, Francis Bach, Alessandro Rudi
- Abstract summary: We provide a statistical analysis to overcome drawbacks of Laplacian regularization.
We unveil a large body of spectral filtering methods that exhibit desirable behaviors.
We provide realistic computational guidelines in order to make our method usable with large amounts of data.
- Score: 80.20302993614594
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As annotations of data can be scarce in large-scale practical problems,
leveraging unlabelled examples is one of the most important aspects of machine
learning. This is the aim of semi-supervised learning. To benefit from the
access to unlabelled data, it is natural to smoothly diffuse the knowledge of
labelled data to unlabelled examples. This motivates the use of Laplacian
regularization. Yet, current implementations of Laplacian regularization suffer
from several drawbacks, notably the well-known curse of dimensionality. In this
paper, we provide a statistical analysis to overcome those issues, and unveil a
large body of spectral filtering methods that exhibit desirable behaviors. They
are implemented through (reproducing) kernel methods, for which we provide
realistic computational guidelines in order to make our method usable with
large amounts of data.
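
As a concrete illustration, below is a minimal sketch of classical graph-Laplacian regularization, the estimator this paper takes as its starting point: knowledge is diffused from the few labelled points to unlabelled ones by penalizing the variation of the function along the data graph. This is the plain (unfiltered) baseline whose curse-of-dimensionality drawbacks the paper analyzes, not the paper's spectral-filtering method; the function name, RBF bandwidth, and regularization strength are illustrative assumptions.

```python
import numpy as np

def laplacian_ssl(X, y_labeled, labeled_idx, sigma=1.0, lam=0.1):
    """Classical Laplacian-regularized least squares (illustrative sketch).

    Solves  min_f  sum_{i labeled} (f_i - y_i)^2 + lam * f^T L f
    where L = D - W is the unnormalized Laplacian of an RBF affinity graph.
    """
    n = X.shape[0]
    # RBF affinity matrix W and graph Laplacian L = D - W.
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq_dists / (2 * sigma**2))
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(1)) - W
    # Indicator of labelled points and zero-padded targets.
    J = np.zeros((n, n))
    J[labeled_idx, labeled_idx] = 1.0
    y = np.zeros(n)
    y[labeled_idx] = y_labeled
    # Closed-form minimizer of the quadratic objective.
    return np.linalg.solve(J + lam * L, y)

# Toy usage: two clusters, one label each; unlabelled points inherit smooth labels.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(3, 0.3, (20, 2))])
f = laplacian_ssl(X, y_labeled=np.array([-1.0, 1.0]), labeled_idx=np.array([0, 20]))
print(np.sign(f[:5]), np.sign(f[-5:]))  # first cluster ~ -1, second ~ +1
```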
Related papers
- The Star Geometry of Critic-Based Regularizer Learning [2.2530496464901106]
Variational regularization is a technique to solve statistical inference tasks and inverse problems.
Recent works learn task-dependent regularizers by integrating information about the measurements and ground-truth data.
There is little theory about the structure of regularizers learned via this process and how it relates to the two data distributions.
arXiv Detail & Related papers (2024-08-29T18:34:59Z)
- Late Stopping: Avoiding Confidently Learning from Mislabeled Examples [61.00103151680946]
We propose a new framework, Late Stopping, which leverages the intrinsic robust learning ability of DNNs through a prolonged training process.
We empirically observe that mislabeled and clean examples exhibit differences in the number of epochs required for them to be consistently and correctly classified.
Experimental results on benchmark-simulated and real-world noisy datasets demonstrate that the proposed method outperforms state-of-the-art counterparts.
arXiv Detail & Related papers (2023-08-26T12:43:25Z)
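
A hedged sketch of the Late Stopping observation above: given a per-epoch record of whether each training example was classified correctly, one can compute the first epoch from which each example stays consistently correct and treat late-stabilizing examples as candidates for mislabeling. The helper below is hypothetical and not the paper's actual criterion; the window size `k` is an assumption.

```python
import numpy as np

def first_stable_epoch(correct, k=3):
    """For each example, the first epoch from which it is classified
    correctly for k consecutive epochs (np.inf if that never happens).

    `correct` is a boolean array of shape (n_epochs, n_examples) recording
    whether each example was predicted correctly at each training epoch.
    """
    n_epochs, n_examples = correct.shape
    stable = np.full(n_examples, np.inf)
    for e in range(n_epochs - k + 1):
        window_ok = correct[e:e + k].all(axis=0)   # correct through the window
        newly = window_ok & np.isinf(stable)       # not yet marked stable
        stable[newly] = e
    return stable

# Examples needing many epochs to become consistently correct are candidates
# for being mislabeled, per the paper's empirical observation.
correct = np.array([[0, 0], [1, 0], [1, 0], [1, 1], [1, 1], [1, 1]], dtype=bool)
print(first_stable_epoch(correct, k=3))  # [1., 3.]
```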
- All Points Matter: Entropy-Regularized Distribution Alignment for Weakly-supervised 3D Segmentation [67.30502812804271]
Pseudo-labels are widely employed in weakly supervised 3D segmentation tasks where only sparse ground-truth labels are available for learning.
We propose a novel learning strategy to regularize the generated pseudo-labels and effectively narrow the gaps between pseudo-labels and model predictions.
arXiv Detail & Related papers (2023-05-25T08:19:31Z)
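
A minimal sketch of what an entropy-regularized pseudo-label objective for the entry above could look like: standard cross-entropy to the pseudo-labels plus a Shannon-entropy penalty that pulls predictions toward confident distributions aligned with the pseudo-labels. The paper's exact objective may differ; the function name and the weight `beta` are assumptions.

```python
import torch
import torch.nn.functional as F

def regularized_pseudo_label_loss(logits, pseudo_labels, beta=0.1):
    """Cross-entropy to the generated pseudo-labels plus a Shannon-entropy
    penalty on the predictions: a generic way to narrow the gap between
    pseudo-labels and model predictions.

    logits: (n_points, n_classes) raw scores; pseudo_labels: (n_points,) ints.
    """
    ce = F.cross_entropy(logits, pseudo_labels)
    probs = logits.softmax(dim=-1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=-1).mean()
    return ce + beta * entropy

# Toy usage with random point scores and pseudo-labels.
logits = torch.randn(128, 13, requires_grad=True)  # e.g. 13 semantic classes
pseudo = torch.randint(0, 13, (128,))
loss = regularized_pseudo_label_loss(logits, pseudo)
loss.backward()
```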
- Domain Adaptive Multiple Instance Learning for Instance-level Prediction of Pathological Images [45.132775668689604]
We propose a new task setting to improve the classification performance of the target dataset without increasing annotation costs.
In order to combine the supervisory information of both methods effectively, we propose a method to create pseudo-labels with high confidence.
arXiv Detail & Related papers (2023-04-07T08:31:06Z)
- An Embarrassingly Simple Approach to Semi-Supervised Few-Shot Learning [58.59343434538218]
We propose a simple but quite effective approach to predict accurate negative pseudo-labels of unlabeled data from an indirect learning perspective.
Our approach can be implemented in just a few lines of code using only off-the-shelf operations.
arXiv Detail & Related papers (2022-09-28T02:11:34Z)
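
In the spirit of the negative pseudo-label idea above, a hedged sketch: for each unlabeled example, take the classes the model deems least likely as negative pseudo-labels and penalize any probability mass on them. A negative label ("this is NOT class c") is far more likely to be correct than a positive guess. This illustrates the general idea, not the paper's exact procedure; the function name and `k` are assumptions.

```python
import torch

def negative_pseudo_label_loss(logits, k=1):
    """Pick the k classes the model deems least likely for each unlabeled
    example as negative pseudo-labels, and penalize probability mass on them."""
    probs = logits.softmax(dim=-1)
    neg = probs.topk(k, dim=-1, largest=False).indices  # least likely classes
    p_neg = probs.gather(-1, neg)                       # their probabilities
    return -(1.0 - p_neg).clamp_min(1e-8).log().mean()  # push p_neg toward 0

# Toy usage on random unlabeled-batch scores.
logits = torch.randn(32, 5, requires_grad=True)
loss = negative_pseudo_label_loss(logits, k=2)
loss.backward()
```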
- Graph Embedding with Data Uncertainty [113.39838145450007]
Spectral-based subspace learning is a common data preprocessing step in many machine learning pipelines.
Most subspace learning methods do not take into consideration possible measurement inaccuracies or artifacts that can lead to data with high uncertainty.
arXiv Detail & Related papers (2020-09-01T15:08:23Z)
- Semi-Supervised Learning with Meta-Gradient [123.26748223837802]
We propose a simple yet effective meta-learning algorithm in semi-supervised learning.
We find that the proposed algorithm performs favorably against state-of-the-art methods.
arXiv Detail & Related papers (2020-07-08T08:48:56Z)
- Density Fixing: Simple yet Effective Regularization Method based on the Class Prior [2.3859169601259347]
We propose a framework of regularization methods, called density-fixing, that can be used for both supervised and semi-supervised learning.
Our proposed regularization method improves generalization by forcing the model to approximate the class-prior distribution, i.e., the frequency of occurrence of each class.
arXiv Detail & Related papers (2020-07-08T04:58:22Z)
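
One plausible reading of the density-fixing entry above, sketched under assumptions: penalize the divergence between the model's batch-average predicted class distribution and the known class prior. The summary does not specify the exact divergence, so the KL formulation and names below are illustrative.

```python
import torch

def density_fixing_penalty(logits, class_prior):
    """KL divergence between the known class prior and the batch-average
    predicted class distribution: one natural way to force the model to
    approximate the class-prior frequencies."""
    marginal = logits.softmax(dim=-1).mean(dim=0)   # model's class marginal
    prior = class_prior / class_prior.sum()         # ensure a distribution
    return (prior * (prior / marginal.clamp_min(1e-8)).log()).sum()

# Toy usage: add the penalty to the supervised (or semi-supervised) task loss.
logits = torch.randn(64, 3, requires_grad=True)
prior = torch.tensor([0.7, 0.2, 0.1])               # assumed known class frequencies
penalty = density_fixing_penalty(logits, prior)
penalty.backward()
```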
- Supervised Visualization for Data Exploration [9.742277703732187]
We describe a novel supervised visualization technique based on random forest proximities and diffusion-based dimensionality reduction.
Our approach is robust to noise and parameter tuning, thus making it simple to use while producing reliable visualizations for data exploration.
arXiv Detail & Related papers (2020-06-15T19:10:17Z)
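
A rough sketch of the supervised-visualization recipe above, under assumptions: random-forest proximities (the fraction of trees in which two points share a leaf) give a label-aware similarity, which a diffusion-style eigen-embedding turns into low-dimensional coordinates. The paper's actual pipeline may differ in its normalization and kernel choices.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Random-forest proximity: fraction of trees in which two points land in the
# same leaf. Labels shape the forest, so the proximity is supervised.
X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
leaves = forest.apply(X)                                  # (n_samples, n_trees)
P = (leaves[:, None, :] == leaves[None, :, :]).mean(-1)   # proximity matrix

# Diffusion-style embedding: row-normalize the proximities into a Markov
# matrix and embed with its leading non-trivial eigenvectors.
M = P / P.sum(1, keepdims=True)
vals, vecs = np.linalg.eig(M)
order = np.argsort(-vals.real)
embedding = (vecs[:, order[1:3]] * vals[order[1:3]].real).real  # 2-D coordinates
print(embedding.shape)  # (150, 2)
```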
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.