Diffusion-based Semi-supervised Spectral Algorithm for Regression on Manifolds
- URL: http://arxiv.org/abs/2410.14539v1
- Date: Fri, 18 Oct 2024 15:29:04 GMT
- Title: Diffusion-based Semi-supervised Spectral Algorithm for Regression on Manifolds
- Authors: Weichun Xia, Jiaxin Jiang, Lei Shi,
- Abstract summary: We introduce a novel diffusion-based spectral algorithm to tackle regression analysis on high-dimensional data.
Our method uses the local estimation property of heat kernel, offering an adaptive, data-driven approach to overcome this obstacle.
Our algorithm performs in an entirely data-driven manner, operating directly within the intrinsic manifold structure of the data.
- Score: 2.0649432688817444
- License:
- Abstract: We introduce a novel diffusion-based spectral algorithm to tackle regression analysis on high-dimensional data, particularly data embedded within lower-dimensional manifolds. Traditional spectral algorithms often fall short in such contexts, primarily due to the reliance on predetermined kernel functions, which inadequately address the complex structures inherent in manifold-based data. By employing graph Laplacian approximation, our method uses the local estimation property of heat kernel, offering an adaptive, data-driven approach to overcome this obstacle. Another distinct advantage of our algorithm lies in its semi-supervised learning framework, enabling it to fully use the additional unlabeled data. This ability enhances the performance by allowing the algorithm to dig the spectrum and curvature of the data manifold, providing a more comprehensive understanding of the dataset. Moreover, our algorithm performs in an entirely data-driven manner, operating directly within the intrinsic manifold structure of the data, without requiring any predefined manifold information. We provide a convergence analysis of our algorithm. Our findings reveal that the algorithm achieves a convergence rate that depends solely on the intrinsic dimension of the underlying manifold, thereby avoiding the curse of dimensionality associated with the higher ambient dimension.
Related papers
- Structured Prediction in Online Learning [66.36004256710824]
We study a theoretical and algorithmic framework for structured prediction in the online learning setting.
We show that our algorithm is a generalisation of optimal algorithms from the supervised learning setting.
We consider a second algorithm designed especially for non-stationary data distributions, including adversarial data.
arXiv Detail & Related papers (2024-06-18T07:45:02Z) - Datacube segmentation via Deep Spectral Clustering [76.48544221010424]
Extended Vision techniques often pose a challenge in their interpretation.
The huge dimensionality of data cube spectra poses a complex task in its statistical interpretation.
In this paper, we explore the possibility of applying unsupervised clustering methods in encoded space.
A statistical dimensional reduction is performed by an ad hoc trained (Variational) AutoEncoder, while the clustering process is performed by a (learnable) iterative K-Means clustering algorithm.
arXiv Detail & Related papers (2024-01-31T09:31:28Z) - Integral Operator Approaches for Scattered Data Fitting on Spheres [16.389581549801253]
We study the approximation performance of a class of weighted spectral filter algorithms.
We derive optimal Sobolev-type error estimates of weighted spectral filter algorithms.
arXiv Detail & Related papers (2024-01-27T04:42:50Z) - A Heat Diffusion Perspective on Geodesic Preserving Dimensionality
Reduction [66.21060114843202]
We propose a more general heat kernel based manifold embedding method that we call heat geodesic embeddings.
Results show that our method outperforms existing state of the art in preserving ground truth manifold distances.
We also showcase our method on single cell RNA-sequencing datasets with both continuum and cluster structure.
arXiv Detail & Related papers (2023-05-30T13:58:50Z) - Learning Structure Aware Deep Spectral Embedding [11.509692423756448]
We propose a novel structure-aware deep spectral embedding by combining a spectral embedding loss and a structure preservation loss.
A deep neural network architecture is proposed that simultaneously encodes both types of information and aims to generate structure-aware spectral embedding.
The proposed algorithm is evaluated on six publicly available real-world datasets.
arXiv Detail & Related papers (2023-05-14T18:18:05Z) - Learning to Bound Counterfactual Inference in Structural Causal Models
from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z) - Learning Low-Dimensional Nonlinear Structures from High-Dimensional
Noisy Data: An Integral Operator Approach [5.975670441166475]
We propose a kernel-spectral embedding algorithm for learning low-dimensional nonlinear structures from high-dimensional and noisy observations.
The algorithm employs an adaptive bandwidth selection procedure which does not rely on prior knowledge of the underlying manifold.
The obtained low-dimensional embeddings can be further utilized for downstream purposes such as data visualization, clustering and prediction.
arXiv Detail & Related papers (2022-02-28T22:46:34Z) - ExClus: Explainable Clustering on Low-dimensional Data Representations [9.496898312608307]
Dimensionality reduction and clustering techniques are frequently used to analyze complex data sets, but their results are often not easy to interpret.
We consider how to support users in interpreting apparent cluster structure on scatter plots where the axes are not directly interpretable.
We propose a new method to compute an interpretable clustering automatically, where the explanation is in the original high-dimensional space and the clustering is coherent in the low-dimensional projection.
arXiv Detail & Related papers (2021-11-04T21:24:01Z) - Inferring Manifolds From Noisy Data Using Gaussian Processes [17.166283428199634]
Most existing manifold learning algorithms replace the original data with lower dimensional coordinates.
This article proposes a new methodology for addressing these problems, allowing the estimated manifold between fitted data points.
arXiv Detail & Related papers (2021-10-14T15:50:38Z) - Fractal Structure and Generalization Properties of Stochastic
Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded on the complexity' of the fractal structure that underlies its generalization measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one hidden/layered neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z) - Manifold Learning via Manifold Deflation [105.7418091051558]
dimensionality reduction methods provide a valuable means to visualize and interpret high-dimensional data.
Many popular methods can fail dramatically, even on simple two-dimensional Manifolds.
This paper presents an embedding method for a novel, incremental tangent space estimator that incorporates global structure as coordinates.
Empirically, we show our algorithm recovers novel and interesting embeddings on real-world and synthetic datasets.
arXiv Detail & Related papers (2020-07-07T10:04:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.