Aligning Embeddings and Geometric Random Graphs: Informational Results and Computational Approaches for the Procrustes-Wasserstein Problem
- URL: http://arxiv.org/abs/2405.14532v1
- Date: Thu, 23 May 2024 13:18:51 GMT
- Title: Aligning Embeddings and Geometric Random Graphs: Informational Results and Computational Approaches for the Procrustes-Wasserstein Problem
- Authors: Mathieu Even, Luca Ganassali, Jakob Maier, Laurent Massoulié,
- Abstract summary: The Procrustes-Wstein problem consists in matching two high-dimensional point clouds in an unsupervised setting.
We consider a planted model with two datasets $X,Y$ that consist of $n$ datapoints in $mathbbRd$, where $Y$ is a noisy version of $X$.
We propose the Ping-Passerong algorithm alternatively estimating the transformation and the relabeling, via a Franke-Wolfe convex relaxation.
- Score: 12.629532305482423
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Procrustes-Wasserstein problem consists in matching two high-dimensional point clouds in an unsupervised setting, and has many applications in natural language processing and computer vision. We consider a planted model with two datasets $X,Y$ that consist of $n$ datapoints in $\mathbb{R}^d$, where $Y$ is a noisy version of $X$, up to an orthogonal transformation and a relabeling of the data points. This setting is related to the graph alignment problem in geometric models. In this work, we focus on the euclidean transport cost between the point clouds as a measure of performance for the alignment. We first establish information-theoretic results, in the high ($d \gg \log n$) and low ($d \ll \log n$) dimensional regimes. We then study computational aspects and propose the Ping-Pong algorithm, alternatively estimating the orthogonal transformation and the relabeling, initialized via a Franke-Wolfe convex relaxation. We give sufficient conditions for the method to retrieve the planted signal after one single step. We provide experimental results to compare the proposed approach with the state-of-the-art method of Grave et al. (2019).
Related papers
- Information-Theoretic Thresholds for Planted Dense Cycles [52.076657911275525]
We study a random graph model for small-world networks which are ubiquitous in social and biological sciences.
For both detection and recovery of the planted dense cycle, we characterize the information-theoretic thresholds in terms of $n$, $tau$, and an edge-wise signal-to-noise ratio $lambda$.
arXiv Detail & Related papers (2024-02-01T03:39:01Z) - Registration of algebraic varieties using Riemannian optimization [0.0]
We consider the problem of finding a transformation between two point clouds that represent the same object but are expressed in different coordinate systems.
Our approach is not based on a point-to-point correspondence, matching every point in the source point cloud to a point in the target point cloud.
arXiv Detail & Related papers (2024-01-16T18:47:38Z) - Sparse Gaussian Graphical Models with Discrete Optimization:
Computational and Statistical Perspectives [8.403841349300103]
We consider the problem of learning a sparse graph underlying an undirected Gaussian graphical model.
We propose GraphL0BnB, a new estimator based on an $ell_0$-penalized version of the pseudolikelihood function.
Our numerical experiments on real/synthetic datasets suggest that our method can solve, to near-optimality, problem instances with $p = 104$.
arXiv Detail & Related papers (2023-07-18T15:49:02Z) - An Oblivious Stochastic Composite Optimization Algorithm for Eigenvalue
Optimization Problems [76.2042837251496]
We introduce two oblivious mirror descent algorithms based on a complementary composite setting.
Remarkably, both algorithms work without prior knowledge of the Lipschitz constant or smoothness of the objective function.
We show how to extend our framework to scale and demonstrate the efficiency and robustness of our methods on large scale semidefinite programs.
arXiv Detail & Related papers (2023-06-30T08:34:29Z) - Detection-Recovery Gap for Planted Dense Cycles [72.4451045270967]
We consider a model where a dense cycle with expected bandwidth $n tau$ and edge density $p$ is planted in an ErdHos-R'enyi graph $G(n,q)$.
We characterize the computational thresholds for the associated detection and recovery problems for the class of low-degree algorithms.
arXiv Detail & Related papers (2023-02-13T22:51:07Z) - Planted Bipartite Graph Detection [13.95780443241133]
We consider the task of detecting a hidden bipartite subgraph in a given random graph.
Under the null hypothesis, the graph is a realization of an ErdHosR'enyi random graph over $n$ with edge density $q$.
Under the alternative, there exists a planted $k_mathsfR times k_mathsfL$ bipartite subgraph with edge density $p>q$.
arXiv Detail & Related papers (2023-02-07T18:18:17Z) - Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best identification in discounted linear Markov+Delta Decision in the fixed confidence setting under a generative model.
The lower bound as the solution of an intricate non- optimization program can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z) - A first-order primal-dual method with adaptivity to local smoothness [64.62056765216386]
We consider the problem of finding a saddle point for the convex-concave objective $min_x max_y f(x) + langle Ax, yrangle - g*(y)$, where $f$ is a convex function with locally Lipschitz gradient and $g$ is convex and possibly non-smooth.
We propose an adaptive version of the Condat-Vu algorithm, which alternates between primal gradient steps and dual steps.
arXiv Detail & Related papers (2021-10-28T14:19:30Z) - Learning Sparse Graph with Minimax Concave Penalty under Gaussian Markov
Random Fields [51.07460861448716]
This paper presents a convex-analytic framework to learn from data.
We show that a triangular convexity decomposition is guaranteed by a transform of the corresponding to its upper part.
arXiv Detail & Related papers (2021-09-17T17:46:12Z) - $\ell_2$-norm Flow Diffusion in Near-Linear Time [18.263667346853698]
We provide an $widetildeO(m)$-time randomized algorithm for the $ell$-norm flow diffusion problem.
It is done simply by sweeping over the dual solution vector.
It adapts the high-level algorithmic framework of Laplacian system solvers, but requires several new tools.
arXiv Detail & Related papers (2021-05-30T21:27:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.