Posterior Consistency of Semi-Supervised Regression on Graphs
- URL: http://arxiv.org/abs/2007.12809v2
- Date: Wed, 24 Mar 2021 13:25:21 GMT
- Title: Posterior Consistency of Semi-Supervised Regression on Graphs
- Authors: Andrea L. Bertozzi, Bamdad Hosseini, Hao Li, Kevin Miller, Andrew M. Stuart
- Abstract summary: Graph-based semi-supervised regression (SSR) is the problem of estimating the value of a function on a weighted graph from its values (labels) on a small subset of the vertices.
This paper is concerned with the consistency of SSR in the context of classification, in the setting where the labels have small noise and the underlying graph weighting is consistent with well-clustered nodes.
We present a Bayesian formulation of SSR in which the weighted graph defines a Gaussian prior, using a graph Laplacian, and the labeled data defines a likelihood.
- Score: 14.65047105712853
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph-based semi-supervised regression (SSR) is the problem of estimating the
value of a function on a weighted graph from its values (labels) on a small
subset of the vertices. This paper is concerned with the consistency of SSR in
the context of classification, in the setting where the labels have small noise
and the underlying graph weighting is consistent with well-clustered nodes. We
present a Bayesian formulation of SSR in which the weighted graph defines a
Gaussian prior, using a graph Laplacian, and the labeled data defines a
likelihood. We analyze the rate of contraction of the posterior measure around
the ground truth in terms of parameters that quantify the small label error and
inherent clustering in the graph. We obtain bounds on the rates of contraction
and illustrate their sharpness through numerical experiments. The analysis also
gives insight into the choice of hyperparameters that enter the definition of
the prior.
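Because the prior and the likelihood are both Gaussian, the posterior is Gaussian and available in closed form. The sketch below illustrates this construction; the prior covariance (L + tau^2 I)^(-alpha) is a common choice in this line of work, and the hyperparameter defaults are illustrative assumptions rather than the paper's tuned values.

```python
import numpy as np

def ssr_posterior(W, labeled_idx, y, tau=1.0, alpha=2, sigma=0.1):
    """Gaussian posterior for graph-based SSR.

    W           : (n, n) symmetric weight matrix of the graph
    labeled_idx : indices of the labeled vertices
    y           : noisy observed labels at those vertices
    tau, alpha  : prior hyperparameters (assumed prior form below)
    sigma       : label noise standard deviation
    """
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W  # unnormalized graph Laplacian
    # Assumed prior: u ~ N(0, C) with C = (L + tau^2 I)^(-alpha)
    prior_prec = np.linalg.matrix_power(L + tau**2 * np.eye(n), alpha)
    # Observation operator H restricts u to the labeled vertices
    H = np.zeros((len(labeled_idx), n))
    H[np.arange(len(labeled_idx)), labeled_idx] = 1.0
    # Gaussian conjugacy: posterior precision and mean in closed form
    post_prec = prior_prec + (H.T @ H) / sigma**2
    post_cov = np.linalg.inv(post_prec)
    post_mean = post_cov @ (H.T @ y) / sigma**2
    return post_mean, post_cov
```

For small label noise sigma and a strongly clustered graph, the posterior mean concentrates near functions that are nearly constant on clusters and match the labels, which is the regime the contraction analysis quantifies.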
Related papers
- Graph Neural Networks with a Distribution of Parametrized Graphs [27.40566674759208]
We introduce latent variables to parameterize and generate multiple graphs.
We obtain the maximum likelihood estimate of the network parameters in an Expectation-Maximization framework.
arXiv Detail & Related papers (2023-10-25T06:38:24Z)
- Graph Out-of-Distribution Generalization with Controllable Data Augmentation [51.17476258673232]
Graph Neural Network (GNN) has demonstrated extraordinary performance in classifying graph properties.
Due to the selection bias of training and testing data, distribution deviation is widespread.
We propose OOD calibration to measure the distribution deviation of virtual samples.
arXiv Detail & Related papers (2023-08-16T13:10:27Z)
- The Graphical Nadaraya-Watson Estimator on Latent Position Models [0.0]
We are interested in the quality of the averaging estimator, which, for an unlabeled node, predicts the average of the observations of its labeled neighbors (sketched below).
While the estimator itself is very simple, we believe that our results will contribute towards the theoretical understanding of learning on graphs through more sophisticated methods such as Graph Neural Networks.
arXiv Detail & Related papers (2023-03-30T08:56:28Z)
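For concreteness, a minimal sketch of this averaging estimator; treating any positive edge weight as a neighbor relation is an assumption here, and the latent position model only enters through how W is generated.

```python
import numpy as np

def averaging_estimator(W, labeled_idx, y):
    """Predict, at every node, the average of the observations of its
    labeled neighbors; NaN where a node has no labeled neighbor."""
    n = W.shape[0]
    label_of = dict(zip(labeled_idx, y))
    preds = np.full(n, np.nan)
    for v in range(n):
        vals = [label_of[u] for u in labeled_idx if u != v and W[v, u] > 0]
        if vals:
            preds[v] = np.mean(vals)
    return preds
```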
- Graph Signal Sampling for Inductive One-Bit Matrix Completion: a Closed-form Solution [112.3443939502313]
We propose a unified graph signal sampling framework which enjoys the benefits of graph signal analysis and processing.
The key idea is to transform each user's ratings on the items to a function (signal) on the vertices of an item-item graph.
For the online setting, we develop a Bayesian extension, i.e., BGS-IMC, which considers continuous random Gaussian noise in the graph Fourier domain (see the transform sketch below).
arXiv Detail & Related papers (2023-02-08T08:17:43Z)
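The graph Fourier transform underlying this construction is standard: expand a vertex signal in the eigenbasis of the item-item graph Laplacian. A minimal sketch follows; it illustrates the transform itself, not the paper's sampling scheme or the BGS-IMC model.

```python
import numpy as np

def graph_fourier_basis(W):
    """Eigendecomposition of the unnormalized Laplacian of the
    item-item graph; the columns of U form the graph Fourier basis."""
    L = np.diag(W.sum(axis=1)) - W
    eigvals, U = np.linalg.eigh(L)
    return eigvals, U

# A user's rating vector x, viewed as a signal on the item vertices,
# has graph Fourier coefficients U.T @ x; Gaussian noise in the graph
# Fourier domain perturbs these coefficients.
def gft(U, x):
    return U.T @ x

def igft(U, x_hat):
    return U @ x_hat
```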
- How Does Pseudo-Labeling Affect the Generalization Error of the Semi-Supervised Gibbs Algorithm? [73.80001705134147]
We provide an exact characterization of the expected generalization error (gen-error) for semi-supervised learning (SSL) with pseudo-labeling via the Gibbs algorithm.
The gen-error is expressed in terms of the symmetrized KL information between the output hypothesis, the pseudo-labeled dataset, and the labeled dataset.
arXiv Detail & Related papers (2022-10-15T04:11:56Z)
- Variance estimation in graphs with the fused lasso [7.732474038706013]
We develop a linear time estimator for the homoscedastic case that can consistently estimate the variance in general graphs.
We show that our estimator attains minimax rates for the chain and 2D grid graphs when the mean signal has total variation with canonical scaling.
arXiv Detail & Related papers (2022-07-26T03:50:51Z)
- Graph Condensation via Receptive Field Distribution Matching [61.71711656856704]
This paper focuses on creating a small graph to represent the original graph, so that GNNs trained on the size-reduced graph can make accurate predictions.
We view the original graph as a distribution of receptive fields and aim to synthesize a small graph whose receptive fields share a similar distribution.
arXiv Detail & Related papers (2022-06-28T02:10:05Z)
- Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector Problems [98.34292831923335]
Motivated by the problem of online correlation analysis, we propose the Stochastic Scaled-Gradient Descent (SSD) algorithm.
We bring these ideas together in an application to online correlation analysis, deriving for the first time an optimal one-time-scale algorithm with an explicit rate of local convergence to normality.
arXiv Detail & Related papers (2021-12-29T18:46:52Z)
- Spectral clustering under degree heterogeneity: a case for the random walk Laplacian [83.79286663107845]
This paper shows that graph spectral embedding using the random walk Laplacian produces vector representations that are completely corrected for node degree (a sketch follows below).
In the special case of a degree-corrected block model, the embedding concentrates about K distinct points, representing communities.
arXiv Detail & Related papers (2021-05-03T16:36:27Z)
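A minimal sketch of this embedding, assuming the standard definition L_rw = I - D^{-1}W and computing it via its symmetric similarity transform for numerical stability; clustering the rows, e.g. with K-means, is the usual follow-up step for community recovery.

```python
import numpy as np

def random_walk_embedding(W, K):
    """Rows of the returned matrix are degree-corrected node embeddings
    built from the K leading eigenvectors of L_rw = I - D^{-1} W.
    Assumes every node has positive degree."""
    n = W.shape[0]
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    # Symmetric normalized Laplacian, similar to L_rw
    L_sym = np.eye(n) - D_inv_sqrt @ W @ D_inv_sqrt
    eigvals, V = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    # Eigenvectors of L_rw are D^{-1/2} times those of L_sym
    return D_inv_sqrt @ V[:, :K]
```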
- A Thorough View of Exact Inference in Graphs from the Degree-4 Sum-of-Squares Hierarchy [37.34153902687548]
We tackle the problem of exactly recovering an unknown ground-truth binary labeling of the nodes from a single corrupted observation of each edge.
We apply a hierarchy of relaxations known as the sum-of-squares hierarchy, to the problem.
We show that the solution of the dual of the relaxed problem is related to finding edge weights of the Johnson and Kneser graphs.
arXiv Detail & Related papers (2021-02-16T08:36:19Z)
- Residual Correlation in Graph Neural Network Regression [39.54530450932135]
We show that the conditional independence assumption severely limits predictive power.
We address this problem with an interpretable and efficient framework.
Our framework achieves substantially higher accuracy than competing baselines.
arXiv Detail & Related papers (2020-02-19T16:32:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.