Related papers: Disentangled Representation Learning through Geometry Preservation with the Gromov-Monge Gap

URL: http://arxiv.org/abs/2407.07829v1
Date: Wed, 10 Jul 2024 16:51:32 GMT
Title: Disentangled Representation Learning through Geometry Preservation with the Gromov-Monge Gap
Authors: Théo Uscidda, Luca Eyring, Karsten Roth, Fabian Theis, Zeynep Akata, Marco Cuturi
Abstract summary: Learning disentangled representations in an unsupervised manner is a fundamental challenge in machine learning. We propose a novel perspective on disentangled representation learning built on quadratic optimal transport. We show that geometry preservation can even encourage unsupervised disentanglement without the standard reconstruction objective.
Score: 65.73194652234848
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Learning disentangled representations in an unsupervised manner is a fundamental challenge in machine learning. Solving it may unlock other problems, such as generalization, interpretability, or fairness. While remarkably difficult to solve in general, recent works have shown that disentanglement is provably achievable under additional assumptions that can leverage geometrical constraints, such as local isometry. To use these insights, we propose a novel perspective on disentangled representation learning built on quadratic optimal transport. Specifically, we formulate the problem in the Gromov-Monge setting, which seeks isometric mappings between distributions supported on different spaces. We propose the Gromov-Monge-Gap (GMG), a regularizer that quantifies the geometry-preservation of an arbitrary push-forward map between two distributions supported on different spaces. We demonstrate the effectiveness of GMG regularization for disentanglement on four standard benchmarks. Moreover, we show that geometry preservation can even encourage unsupervised disentanglement without the standard reconstruction objective, making the underlying model decoder-free and promising a more practically viable and scalable perspective on unsupervised disentanglement.
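To make "geometry preservation" of a map between different spaces concrete, the following is a minimal sketch of a pairwise-distance distortion score. It is not the GMG itself, which the abstract describes only at a high level; the function names (`pairwise_distances`, `distortion`), the use of Euclidean distances, and the toy data are all assumptions chosen for illustration.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance matrix for an (n, d) array of points."""
    diffs = points[:, None, :] - points[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=-1))


def distortion(source, mapped):
    """Mean squared gap between pairwise distances before and after a map.

    `source` has shape (n, d_in) and `mapped` has shape (n, d_out); the two
    dimensions may differ. A value of zero means the map is an isometry on
    these samples. This is an illustrative score, not the paper's GMG.
    """
    d_src = pairwise_distances(source)
    d_map = pairwise_distances(mapped)
    return float(np.mean((d_src - d_map) ** 2))


rng = np.random.default_rng(0)
x = rng.normal(size=(64, 2))

# Isometric embedding of R^2 into R^3: a matrix with orthonormal columns.
q, _ = np.linalg.qr(rng.normal(size=(3, 2)))
isometric = x @ q.T + 3.0

# Non-isometric map: keep only the first coordinate.
squashed = x[:, :1]

iso_score = distortion(x, isometric)
squash_score = distortion(x, squashed)
print("isometric embedding:", iso_score)
print("coordinate projection:", squash_score)
```

The isometric embedding scores zero up to floating-point error, while the projection that discards a coordinate scores strictly higher. A regularizer in the spirit of the abstract would penalize maps of the second kind.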

Related papers

Scalable Hypergraph Structure Learning with Diverse Smoothness Priors [7.559720049837459]
We propose a novel hypergraph learning method that recovers a hypergraph from time-series signals based on a smoothness prior. We show improved performance, in terms of accuracy, over other state-of-the-art hypergraph inference methods.
arXiv Detail & Related papers (2025-04-04T16:47:30Z)
Geometrically Aligned Transfer Encoder for Inductive Transfer in Regression Tasks [5.038936775643437]
We propose a novel transfer technique based on differential geometry, namely the Geometrically Aligned Transfer Encoder (GATE). We find a proper diffeomorphism between pairs of tasks to ensure that every arbitrary point maps to a locally flat coordinate in the overlapping region, allowing the transfer of knowledge from the source to the target data. GATE outperforms conventional methods and exhibits stable behavior in both the latent space and extrapolation regions for various molecular graph datasets.
arXiv Detail & Related papers (2023-10-10T07:11:25Z)
Joint graph learning from Gaussian observations in the presence of hidden nodes [26.133725549667734]
We propose a joint graph learning method that takes into account the presence of hidden (latent) variables. We exploit the structure resulting from the previous considerations to propose a convex optimization problem. We compare the proposed algorithm with different baselines and evaluate its performance over synthetic and real-world graphs.
arXiv Detail & Related papers (2022-12-04T13:03:41Z)
Unveiling the Sampling Density in Non-Uniform Geometric Graphs [69.93864101024639]
We consider graphs as geometric graphs: nodes are randomly sampled from an underlying metric space, and any pair of nodes is connected if their distance is less than a specified neighborhood radius. In a social network communities can be modeled as densely sampled areas, and hubs as nodes with larger neighborhood radius. We develop methods to estimate the unknown sampling density in a self-supervised fashion.
arXiv Detail & Related papers (2022-10-15T08:01:08Z)
Geometry Contrastive Learning on Heterogeneous Graphs [50.58523799455101]
This paper proposes a novel self-supervised learning method, termed Geometry Contrastive Learning (GCL). GCL views a heterogeneous graph from Euclidean and hyperbolic perspectives simultaneously, aiming to combine the ability to model rich semantics with the ability to model complex structures. Extensive experiments on four benchmark data sets show that the proposed approach outperforms the strong baselines.
arXiv Detail & Related papers (2022-06-25T03:54:53Z)
Semi-supervised Dense Keypoints Using Unlabeled Multiview Images [22.449168666514677]
This paper presents a new end-to-end semi-supervised framework to learn a dense keypoint detector using unlabeled multiview images. A key challenge lies in finding the exact correspondences between the dense keypoints in multiple views. We derive a new probabilistic epipolar constraint that encodes the two desired properties.
arXiv Detail & Related papers (2021-09-20T04:57:57Z)
Finding Geometric Models by Clustering in the Consensus Space [61.65661010039768]
We propose a new algorithm for finding an unknown number of geometric models, e.g., homographies. We present a number of applications where the use of multiple geometric models improves accuracy. These include pose estimation from multiple generalized homographies and trajectory estimation of fast-moving objects.
arXiv Detail & Related papers (2021-03-25T14:35:07Z)
Making Affine Correspondences Work in Camera Geometry Computation [62.7633180470428]
Local features provide region-to-region rather than point-to-point correspondences. We propose guidelines for effective use of region-to-region matches in the course of a full model estimation pipeline. Experiments show that affine solvers can achieve accuracy comparable to point-based solvers at faster run-times.
arXiv Detail & Related papers (2020-07-20T12:07:48Z)
Spatial Pyramid Based Graph Reasoning for Semantic Segmentation [67.47159595239798]
We apply graph convolution to the semantic segmentation task and propose an improved Laplacian. The graph reasoning is directly performed in the original feature space organized as a spatial pyramid. We achieve comparable performance with advantages in computational and memory overhead.
arXiv Detail & Related papers (2020-03-23T12:28:07Z)
Learning Flat Latent Manifolds with VAEs [16.725880610265378]
We propose an extension to the framework of variational auto-encoders, where the Euclidean metric is a proxy for the similarity between data points. We replace the compact prior typically used in variational auto-encoders with a recently presented, more expressive hierarchical one. We evaluate our method on a range of data-sets, including a video-tracking benchmark.
arXiv Detail & Related papers (2020-02-12T09:54:52Z)

This list is automatically generated from the titles and abstracts of the papers on this site.