Orthogonal Gromov-Wasserstein Discrepancy with Efficient Lower Bound
- URL: http://arxiv.org/abs/2205.05838v1
- Date: Thu, 12 May 2022 02:10:56 GMT
- Title: Orthogonal Gromov-Wasserstein Discrepancy with Efficient Lower Bound
- Authors: Hongwei Jin, Zishun Yu, Xinhua Zhang
- Abstract summary: The Gromov-Wasserstein (GW) discrepancy formulates a coupling between structured data based on optimal transport.
We take inspiration from the connection with the quadratic assignment problem, and propose the orthogonal Gromov-Wasserstein (OGW) discrepancy as a surrogate of GW.
- Score: 11.086440815804226
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Comparing structured data from possibly different metric-measure spaces is a
fundamental task in machine learning, with applications in, e.g., graph
classification. The Gromov-Wasserstein (GW) discrepancy formulates a coupling
between the structured data based on optimal transportation, tackling the
incomparability between different structures by aligning the intra-relational
geometries. Although efficient local solvers such as conditional gradient and
Sinkhorn are available, the inherent non-convexity still prevents a tractable
evaluation, and the existing lower bounds are not tight enough for practical
use. To address this issue, we take inspiration from the connection with the
quadratic assignment problem, and propose the orthogonal Gromov-Wasserstein
(OGW) discrepancy as a surrogate of GW. It admits an efficient and closed-form
lower bound with the complexity of $\mathcal{O}(n^3)$, and directly extends to
the fused Gromov-Wasserstein (FGW) distance, incorporating node features into
the coupling. Extensive experiments on both the synthetic and real-world
datasets show the tightness of our lower bounds, and both OGW and its lower
bounds efficiently deliver accurate predictions and satisfactory barycenters
for graph sets.
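As a point of reference for the setting (not the paper's OGW method or its closed-form bound), the sketch below estimates the GW discrepancy between two point sets with the POT library's local conditional-gradient solver; because the problem is non-convex, a local solver only certifies an upper bound, which is exactly what motivates tight lower bounds.

```python
# Minimal sketch (not the paper's OGW solver): estimating the GW
# discrepancy with POT's local conditional-gradient solver. The
# non-convexity means this value is an upper bound at a stationary
# coupling, not a certified evaluation.
import numpy as np
import ot  # Python Optimal Transport, `pip install pot`

rng = np.random.default_rng(0)
n, m = 20, 30

# Intra-relational geometries: pairwise distances within each space.
X, Y = rng.normal(size=(n, 2)), rng.normal(size=(m, 2))
C1 = ot.dist(X, X)
C2 = ot.dist(Y, Y)

# Uniform node measures.
p, q = ot.unif(n), ot.unif(m)

gw_val = ot.gromov.gromov_wasserstein2(C1, C2, p, q, 'square_loss')
print(f"GW estimate (upper bound from a local solver): {gw_val:.4f}")
```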
Related papers
- Combining Optimal Transport and Embedding-Based Approaches for More Expressiveness in Unsupervised Graph Alignment [19.145556156889064]
Unsupervised graph alignment finds the one-to-one node correspondence between a pair of attributed graphs by only exploiting graph structure and node features.
We propose a principled approach that combines the advantages of optimal transport and embedding-based methods, motivated by a theoretical analysis of model expressiveness.
We are the first to guarantee the one-to-one matching constraint by reducing the problem to maximum weight matching.
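A minimal sketch of the matching reduction described above (with a random matrix standing in for the model's learned node affinities), using SciPy's Hungarian solver to recover a one-to-one correspondence of maximum total weight:

```python
# Hypothetical sketch: turning a soft node-similarity matrix into a
# one-to-one alignment via maximum-weight bipartite matching. The
# similarity matrix is a placeholder for model-produced affinities.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
sim = rng.random((5, 5))  # sim[i, j]: affinity of node i (G1) to node j (G2)

rows, cols = linear_sum_assignment(sim, maximize=True)
matching = dict(zip(rows.tolist(), cols.tolist()))
print(matching)  # one-to-one correspondence maximizing total affinity
```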
arXiv Detail & Related papers (2024-06-19T04:57:35Z) - NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification [70.51126383984555]
We introduce a novel all-pair message passing scheme for efficiently propagating node signals between arbitrary nodes.
The efficient computation is enabled by a kernelized Gumbel-Softmax operator.
Experiments demonstrate the promising efficacy of the method in various tasks including node classification on graphs.
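A sketch of the Gumbel-Softmax ingredient only (the kernelization that removes the explicit all-pair computation is not reproduced here; the embeddings and temperature are illustrative):

```python
# Sketch of differentiable, near-discrete neighbor selection with
# Gumbel-Softmax. NodeFormer's kernelized operator additionally avoids
# forming the O(n^2) all-pair logits computed explicitly below.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
n, d = 6, 8
h = torch.randn(n, d)                 # node embeddings (stand-ins)
logits = h @ h.T                      # all-pair affinities

# Low temperature -> near one-hot neighbor choices, yet differentiable.
edge_weights = F.gumbel_softmax(logits, tau=0.5, hard=False, dim=-1)
messages = edge_weights @ h           # propagate signals between all pairs
print(messages.shape)                 # (n, d)
```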
arXiv Detail & Related papers (2023-06-14T09:21:15Z) - Outlier-Robust Gromov-Wasserstein for Graph Data [31.895380224961464]
We introduce a new and robust version of the Gromov-Wasserstein (GW) distance called RGW.
RGW features optimistically perturbed marginal constraints within a Kullback-Leibler divergence-based ambiguity set.
We demonstrate the effectiveness of RGW on real-world graph learning tasks, such as subgraph matching and partial shape correspondence.
arXiv Detail & Related papers (2023-02-09T12:57:29Z) - Efficient Graph Field Integrators Meet Point Clouds [59.27295475120132]
We present two new classes of algorithms for efficient field integration on graphs encoding point clouds.
The first class, SeparatorFactorization(SF), leverages the bounded genus of point cloud mesh graphs, while the second class, RFDiffusion(RFD), uses popular epsilon-nearest-neighbor graph representations for point clouds.
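A generic construction of the epsilon-nearest-neighbor graph that RFD-style methods operate on might look as follows (the fast field-integration step itself is not shown; the point cloud and radius are illustrative):

```python
# Sketch: building the eps-nearest-neighbor graph of a point cloud,
# i.e., the representation such integrators work over.
import numpy as np
from scipy.spatial import cKDTree
from scipy.sparse import coo_matrix

rng = np.random.default_rng(0)
pts = rng.random((200, 3))            # a small 3D point cloud
eps = 0.15

tree = cKDTree(pts)
pairs = np.array(list(tree.query_pairs(eps)))  # edges within radius eps
i, j = pairs[:, 0], pairs[:, 1]
n = len(pts)
A = coo_matrix((np.ones(len(pairs)), (i, j)), shape=(n, n))
A = (A + A.T).tocsr()                 # symmetric adjacency
print(f"{A.nnz} directed edges in the eps-NN graph")
```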
arXiv Detail & Related papers (2023-02-02T08:33:36Z) - Asymmetric Transfer Hashing with Adaptive Bipartite Graph Learning [95.54688542786863]
Existing hashing methods assume that the query and retrieval samples lie in homogeneous feature space within the same domain.
We propose an Asymmetric Transfer Hashing (ATH) framework with its unsupervised/semi-supervised/supervised realizations.
By jointly optimizing asymmetric hash functions and the bipartite graph, not only can knowledge transfer be achieved but information loss caused by feature alignment can also be avoided.
arXiv Detail & Related papers (2022-06-25T08:24:34Z) - Robust Training in High Dimensions via Block Coordinate Geometric Median Descent [69.47594803719333]
Geometric median (GM) is a classical method in statistics for achieving a robust estimation of uncorrupted data.
In this paper, we show that by applying GM to only a judiciously chosen block of coordinates at a time, one can retain a breakdown point of 0.5 for smooth non-convex problems.
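The geometric median has no closed form; the classical Weiszfeld iteration below approximates it and illustrates its robustness to gross corruption (the paper's block-coordinate scheme is not reproduced here):

```python
# Minimal Weiszfeld iteration for the geometric median, the classical
# robust estimator the paper applies blockwise.
import numpy as np

def geometric_median(X, iters=200, tol=1e-9):
    """X: (n, d) array of points; returns an approximate geometric median."""
    z = X.mean(axis=0)                      # start from the (non-robust) mean
    for _ in range(iters):
        d = np.linalg.norm(X - z, axis=1)
        w = 1.0 / np.maximum(d, 1e-12)      # avoid division by zero
        z_new = (w[:, None] * X).sum(axis=0) / w.sum()
        if np.linalg.norm(z_new - z) < tol:
            break
        z = z_new
    return z

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X[:10] += 50.0                              # 10% gross corruption
print(geometric_median(X))                  # stays near the origin
```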
arXiv Detail & Related papers (2021-06-16T15:55:50Z) - Linear-Time Gromov Wasserstein Distances using Low Rank Couplings and Costs [45.87981728307819]
The ability to compare and align related datasets living in heterogeneous spaces plays an increasingly important role in machine learning.
The Gromov-Wasserstein (GW) formalism can help tackle this problem.
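A sketch of why low-rank couplings help: for the square loss, the GW objective reduces to a constant minus 2<C1 T C2, T>, and if the coupling is kept in factored form T = A B^T, that inner product collapses to a trace of small matrices so the n x m coupling is never materialized (true linear time additionally requires low-rank factorizations of the cost matrices, which is not shown here):

```python
# Sketch: evaluating the square-loss GW cross term <C1 T C2, T>
# with a factored coupling T = A @ B.T, without forming T.
import numpy as np

rng = np.random.default_rng(0)
n, m, r = 500, 400, 5
C1 = rng.random((n, n)); C1 = (C1 + C1.T) / 2   # symmetric cost matrices
C2 = rng.random((m, m)); C2 = (C2 + C2.T) / 2
A, B = rng.random((n, r)), rng.random((m, r))   # coupling factors

# Naive: materializes the n x m coupling T.
T = A @ B.T
naive = np.sum((C1 @ T @ C2) * T)

# Factored: tr((A.T C1 A)(B.T C2 B)) -- only r x r matrices at the end.
factored = np.trace((A.T @ C1 @ A) @ (B.T @ C2 @ B))
print(np.allclose(naive, factored))             # True
```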
arXiv Detail & Related papers (2021-06-02T12:50:56Z) - Quantized Gromov-Wasserstein [10.592277756185046]
Quantized Gromov Wasserstein (qGW) is a metric that treats parts as fundamental objects and fits into a hierarchy of theoretical upper bounds for the problem.
We develop an algorithm for approximating optimal GW matchings which yields algorithmic speedups and reductions in memory complexity.
We not only outperform the state of the art, but also apply GW matching at scales an order of magnitude larger than in the existing literature.
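A simplified sketch of the quantization idea, using k-means to form the "parts" and POT to match only their representatives (qGW's hierarchical refinement of the matching within parts is not reproduced):

```python
# Sketch: shrink the GW problem from n x n to k x k by matching
# weighted cluster representatives instead of all points.
import numpy as np
import ot
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
Y = rng.normal(size=(1200, 2)) @ np.array([[0., -1.], [1., 0.]])  # rotated

k = 16
km_x = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
km_y = KMeans(n_clusters=k, n_init=10, random_state=0).fit(Y)

# Parts become weighted points: centroid geometry + cluster masses.
Cx = ot.dist(km_x.cluster_centers_, km_x.cluster_centers_)
Cy = ot.dist(km_y.cluster_centers_, km_y.cluster_centers_)
px = np.bincount(km_x.labels_, minlength=k) / len(X)
py = np.bincount(km_y.labels_, minlength=k) / len(Y)

gw_parts = ot.gromov.gromov_wasserstein2(Cx, Cy, px, py, 'square_loss')
print(f"GW between quantized parts: {gw_parts:.4f}")
```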
arXiv Detail & Related papers (2021-04-05T17:03:20Z) - Hogwild! over Distributed Local Data Sets with Linearly Increasing Mini-Batch Sizes [26.9902939745173]
Hogwild! implements asynchronous Stochastic Gradient Descent (SGD), where multiple threads in parallel access a common repository containing training data.
We show how local compute nodes can start choosing small mini-batch sizes which increase to larger ones in order to reduce communication cost.
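A toy sketch of the schedule (plain sequential SGD on least squares, standing in for the asynchronous multi-node setting): the mini-batch grows linearly across rounds, so later rounds amortize more local work per communication round.

```python
# Toy sketch: SGD with a linearly increasing mini-batch size, the
# communication-saving schedule the paper analyzes (here sequential,
# not the asynchronous Hogwild! setting itself).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))
w_true = rng.normal(size=10)
y = X @ w_true + 0.1 * rng.normal(size=2000)

w, lr = np.zeros(10), 0.05
for rnd in range(1, 21):
    batch = 8 * rnd                     # linearly increasing batch size
    idx = rng.choice(len(X), size=batch, replace=False)
    grad = X[idx].T @ (X[idx] @ w - y[idx]) / batch
    w -= lr * grad                      # one "communication round" per step

print(f"final parameter error: {np.linalg.norm(w - w_true):.3f}")
```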
arXiv Detail & Related papers (2020-10-27T03:46:09Z) - Towards Deeper Graph Neural Networks with Differentiable Group Normalization [61.20639338417576]
Graph neural networks (GNNs) learn the representation of a node by aggregating its neighbors.
Over-smoothing is one of the key issues which limit the performance of GNNs as the number of layers increases.
We introduce two over-smoothing metrics and a novel technique, i.e., differentiable group normalization (DGN).
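The phenomenon DGN targets can be reproduced in a few lines: repeated mean aggregation over a random graph drives node features together, measured here by mean pairwise feature distance (one simple over-smoothing metric, chosen for illustration rather than taken from the paper):

```python
# Sketch: over-smoothing under repeated mean aggregation. Pairwise
# feature distances shrink as the number of "layers" grows.
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 16
A = (rng.random((n, n)) < 0.1).astype(float)
A = np.maximum(A, A.T)                     # undirected random graph
np.fill_diagonal(A, 1.0)                   # self-loops
P = A / A.sum(axis=1, keepdims=True)       # row-normalized aggregation

H = rng.normal(size=(n, d))
for layers in [1, 2, 4, 8, 16]:
    Hk = np.linalg.matrix_power(P, layers) @ H
    dists = np.linalg.norm(Hk[:, None, :] - Hk[None, :, :], axis=-1)
    print(f"{layers:2d} layers: mean pairwise distance {dists.mean():.3f}")
```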
arXiv Detail & Related papers (2020-06-12T07:18:02Z) - Fast and Robust Comparison of Probability Measures in Heterogeneous Spaces [62.35667646858558]
We introduce the Anchor Energy (AE) and Anchor Wasserstein (AW) distances, which are respectively the energy and Wasserstein distances instantiated on anchor representations that encode each point by its distances within its own space.
Our main contribution is to propose a sweep line algorithm to compute AE exactly in log-quadratic time, where a naive implementation would be cubic.
We show that AE and AW perform well in various experimental settings at a fraction of the computational cost of popular GW approximations.
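A simplified sketch of the anchor idea as I understand it (the exact log-quadratic sweep-line algorithm is not reproduced; equal sample sizes are assumed so the 1D Wasserstein distance has a sorted closed form):

```python
# Sketch: anchor profiles for comparing points across heterogeneous
# spaces. Each point is represented by its sorted within-space
# distance profile; profiles are compared with 1D Wasserstein-1,
# which for equal-size sorted samples is a mean absolute difference.
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))      # heterogeneous ambient dimensions:
Y = rng.normal(size=(100, 5))      # 3D vs 5D

FX = np.sort(cdist(X, X), axis=1)  # sorted distance profile per point
FY = np.sort(cdist(Y, Y), axis=1)

pairwise_aw = np.abs(FX[:, None, :] - FY[None, :, :]).mean(axis=-1)
print(pairwise_aw.shape)           # (100, 100) cross-space point costs
```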
arXiv Detail & Related papers (2020-02-05T03:09:23Z)