Graphon Estimation in bipartite graphs with observable edge labels and
unobservable node labels
- URL: http://arxiv.org/abs/2304.03590v2
- Date: Mon, 4 Sep 2023 12:15:32 GMT
- Title: Graphon Estimation in bipartite graphs with observable edge labels and
unobservable node labels
- Authors: Etienne Donier-Meroz, Arnak S. Dalalyan, Francis Kramarz, Philippe
Chon\'e, Xavier D'Haultfoeuille
- Abstract summary: Many real-world data sets can be presented in the form of a matrix whose entries correspond to the interaction between two entities of different natures.
We assume in this paper that the mentioned interaction is determined by unobservable latent variables describing each entity.
Our objective is to estimate the conditional expectation of the data matrix given the unobservable variables.
- Score: 3.59986669039879
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many real-world data sets can be presented in the form of a matrix whose
entries correspond to the interaction between two entities of different natures
(number of times a web user visits a web page, a student's grade in a subject,
a patient's rating of a doctor, etc.). We assume in this paper that the
mentioned interaction is determined by unobservable latent variables describing
each entity. Our objective is to estimate the conditional expectation of the
data matrix given the unobservable variables. This is presented as a problem of
estimation of a bivariate function referred to as graphon. We study the cases
of piecewise constant and H\"older-continuous graphons. We establish finite
sample risk bounds for the least squares estimator and the exponentially
weighted aggregate. These bounds highlight the dependence of the estimation
error on the size of the data set, the maximum intensity of the interactions,
and the level of noise. As the analyzed least-squares estimator is intractable,
we propose an adaptation of Lloyd's alternating minimization algorithm to
compute an approximation of the least-squares estimator. Finally, we present
numerical experiments in order to illustrate the empirical performance of the
graphon estimator on synthetic data sets.
Related papers
- Graph-based Complexity for Causal Effect by Empirical Plug-in [56.14597641617531]
This paper focuses on the computational complexity of computing empirical plug-in estimates for causal effect queries.
We show that computation can be done efficiently, potentially in time linear in the data size, depending on the estimand's hypergraph.
arXiv Detail & Related papers (2024-11-15T07:42:01Z) - Node Regression on Latent Position Random Graphs via Local Averaging [10.962094053749093]
We study the performance of various estimators for node regression.
An alternative consists in first estimating the true distances between the latent positions, then injecting these estimated distances into a classical Nadaraya Watson estimator.
We show that this method can achieve standard nonparametric rates in certain instances even when the graph neighborhood is too large or too small.
arXiv Detail & Related papers (2024-10-29T12:17:06Z) - Entry-Specific Matrix Estimation under Arbitrary Sampling Patterns through the Lens of Network Flows [9.631640936820126]
Matrix completion tackles the task of predicting missing values in a low-rank matrix based on a sparse set of observed entries.
We introduce a matrix completion algorithm based on network flows in the bipartite graph induced by the observation pattern.
Our results show that the minimax squared error for recovery of a particular entry in the matrix is proportional to the effective resistance of the corresponding edge in the graph.
arXiv Detail & Related papers (2024-09-06T02:01:03Z) - Statistical Efficiency of Score Matching: The View from Isoperimetry [96.65637602827942]
We show a tight connection between statistical efficiency of score matching and the isoperimetric properties of the distribution being estimated.
We formalize these results both in the sample regime and in the finite regime.
arXiv Detail & Related papers (2022-10-03T06:09:01Z) - Low-Rank Covariance Completion for Graph Quilting with Applications to Functional Connectivity [8.500141848121782]
In many calcium imaging data sets, the full population of neurons is not recorded simultaneously, but instead in partially overlapping blocks.
In this paper, we introduce a novel two-step approach to Graph Quilting, which first imputes the nuclear structure matrix using low-rank co completion.
We show the efficacy of these methods for estimating connectivity from calcium imaging data.
arXiv Detail & Related papers (2022-09-17T08:03:46Z) - Collaborative likelihood-ratio estimation over graphs [55.98760097296213]
Graph-based Relative Unconstrained Least-squares Importance Fitting (GRULSIF)
We develop this idea in a concrete non-parametric method that we call Graph-based Relative Unconstrained Least-squares Importance Fitting (GRULSIF)
We derive convergence rates for our collaborative approach that highlights the role played by variables such as the number of available observations per node, the size of the graph, and how accurately the graph structure encodes the similarity between tasks.
arXiv Detail & Related papers (2022-05-28T15:37:03Z) - Graph-in-Graph (GiG): Learning interpretable latent graphs in
non-Euclidean domain for biological and healthcare applications [52.65389473899139]
Graphs are a powerful tool for representing and analyzing unstructured, non-Euclidean data ubiquitous in the healthcare domain.
Recent works have shown that considering relationships between input data samples have a positive regularizing effect for the downstream task.
We propose Graph-in-Graph (GiG), a neural network architecture for protein classification and brain imaging applications.
arXiv Detail & Related papers (2022-04-01T10:01:37Z) - Rank-one matrix estimation with groupwise heteroskedasticity [5.202966939338455]
We study the problem of estimating a rank-one matrix from Gaussian observations where different blocks of the matrix are observed under different noise levels.
We prove exact formulas for the minimum mean-squared error in estimating both the matrix and the latent variables.
We derive an approximate message passing algorithm and a gradient descent algorithm and show empirically that these algorithms achieve the information-theoretic limits in certain regimes.
arXiv Detail & Related papers (2021-06-22T17:48:36Z) - Interpolation and Learning with Scale Dependent Kernels [91.41836461193488]
We study the learning properties of nonparametric ridge-less least squares.
We consider the common case of estimators defined by scale dependent kernels.
arXiv Detail & Related papers (2020-06-17T16:43:37Z) - Asymptotic Analysis of an Ensemble of Randomly Projected Linear
Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.