Self-Attention Amortized Distributional Projection Optimization for
Sliced Wasserstein Point-Cloud Reconstruction
- URL: http://arxiv.org/abs/2301.04791v2
- Date: Mon, 8 May 2023 17:11:04 GMT
- Title: Self-Attention Amortized Distributional Projection Optimization for
Sliced Wasserstein Point-Cloud Reconstruction
- Authors: Khai Nguyen and Dang Nguyen and Nhat Ho
- Abstract summary: Max sliced Wasserstein (Max-SW) distance has been widely known as a solution for less discriminative projections.
We propose to replace Max-SW with distributional sliced Wasserstein distance with von Mises-Fisher (vMF) projecting distribution.
- Score: 17.67599778907391
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Max sliced Wasserstein (Max-SW) distance has been widely known as a solution
for less discriminative projections of sliced Wasserstein (SW) distance. In
applications that have various independent pairs of probability measures,
amortized projection optimization is utilized to predict the "max" projecting
directions given two input measures instead of using projected gradient ascent
multiple times. Despite being efficient, Max-SW and its amortized version
cannot guarantee the metricity property due to the sub-optimality of the
projected gradient ascent and the amortization gap. Therefore, we propose to
replace
Max-SW with distributional sliced Wasserstein distance with von Mises-Fisher
(vMF) projecting distribution (v-DSW). Since v-DSW is a metric with any
non-degenerate vMF distribution, its amortized version can guarantee the
metricity when performing amortization. Furthermore, current amortized models
are neither permutation invariant nor symmetric. To address this issue, we
design
amortized models based on self-attention architecture. In particular, we adopt
efficient self-attention architectures to make the computation linear in the
number of supports. With the two improvements, we derive self-attention
amortized distributional projection optimization and show its appealing
performance in point-cloud reconstruction and its downstream applications.
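To make the core quantity concrete, the following is a minimal NumPy sketch (not the authors' implementation) of a Monte Carlo estimate of a sliced Wasserstein distance whose projecting directions are drawn from a fixed von Mises-Fisher distribution, the building block behind v-DSW. The self-attention amortized model that predicts the vMF parameters per pair of measures is omitted, and both function names are hypothetical.

```python
import numpy as np

def sample_vmf(mu, kappa, n, rng):
    """Draw n unit vectors from vMF(mu, kappa) on the sphere S^{d-1},
    using Wood's (1994) rejection scheme for the cosine component."""
    d = mu.shape[0]
    b = (-2.0 * kappa + np.sqrt(4.0 * kappa**2 + (d - 1) ** 2)) / (d - 1)
    x0 = (1.0 - b) / (1.0 + b)
    c = kappa * x0 + (d - 1) * np.log(1.0 - x0**2)
    ws = []
    while len(ws) < n:
        z = rng.beta((d - 1) / 2.0, (d - 1) / 2.0)
        w = (1.0 - (1.0 + b) * z) / (1.0 - (1.0 - b) * z)
        if kappa * w + (d - 1) * np.log(1.0 - x0 * w) - c >= np.log(rng.uniform()):
            ws.append(w)
    w = np.array(ws)[:, None]
    # Uniform directions in the tangent space orthogonal to mu.
    v = rng.normal(size=(n, d))
    v -= (v @ mu)[:, None] * mu
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    return w * mu + np.sqrt(1.0 - w**2) * v

def vmf_sliced_wasserstein(X, Y, mu, kappa, n_proj=128, p=2, rng=None):
    """Average p-Wasserstein distance between 1D projections of two
    equal-size point clouds, with projections drawn from vMF(mu, kappa).
    Sorting the projected values gives the closed-form 1D optimal transport."""
    if rng is None:
        rng = np.random.default_rng(0)
    thetas = sample_vmf(mu, kappa, n_proj, rng)   # (n_proj, d)
    Xp = np.sort(X @ thetas.T, axis=0)            # sorted 1D projections
    Yp = np.sort(Y @ thetas.T, axis=0)
    return float(np.mean(np.abs(Xp - Yp) ** p) ** (1.0 / p))
```

Here `mu` and `kappa` are treated as fixed; in the paper's setting they would instead be produced per pair of input measures by the amortized self-attention network, and the distance would be taken with respect to the resulting slicing distribution.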
Related papers
- Total Uncertainty Quantification in Inverse PDE Solutions Obtained with Reduced-Order Deep Learning Surrogate Models [50.90868087591973]
We propose an approximate Bayesian method for quantifying the total uncertainty in inverse PDE solutions obtained with machine learning surrogate models.
We test the proposed framework by comparing it with the iterative ensemble smoother and deep ensembling methods for a non-linear diffusion equation.
arXiv Detail & Related papers (2024-08-20T19:06:02Z)
- Sliced Wasserstein with Random-Path Projecting Directions [49.802024788196434]
We propose an optimization-free slicing distribution that provides a fast sampling for the Monte Carlo estimation of expectation.
We derive the random-path slicing distribution (RPSD) and two variants of sliced Wasserstein, i.e., the Random-Path Projection Sliced Wasserstein (RPSW) and the Importance Weighted Random-Path Projection Sliced Wasserstein (IWRPSW).
arXiv Detail & Related papers (2024-01-29T04:59:30Z)
- Energy-Based Sliced Wasserstein Distance [47.18652387199418]
A key component of the sliced Wasserstein (SW) distance is the slicing distribution.
We propose to design the slicing distribution as an energy-based distribution that is parameter-free.
We then derive a novel sliced Wasserstein metric, the energy-based sliced Wasserstein (EBSW) distance.
arXiv Detail & Related papers (2023-04-26T14:28:45Z)
- Markovian Sliced Wasserstein Distances: Beyond Independent Projections [51.80527230603978]
We introduce a new family of SW distances, named Markovian sliced Wasserstein (MSW) distance, which imposes a first-order Markov structure on projecting directions.
We compare distances with previous SW variants in various applications such as flows, color transfer, and deep generative modeling to demonstrate the favorable performance of MSW.
arXiv Detail & Related papers (2023-01-10T01:58:15Z)
- Combating Mode Collapse in GANs via Manifold Entropy Estimation [70.06639443446545]
Generative Adversarial Networks (GANs) have shown compelling results in various tasks and applications.
We propose a novel training pipeline to address the mode collapse issue of GANs.
arXiv Detail & Related papers (2022-08-25T12:33:31Z)
- Deep Equilibrium Optical Flow Estimation [80.80992684796566]
Recent state-of-the-art (SOTA) optical flow models use finite-step recurrent update operations to emulate traditional algorithms.
These RNNs impose large computation and memory overheads, and are not directly trained to model such stable estimates.
We propose deep equilibrium (DEQ) flow estimators, an approach that directly solves for the flow as the infinite-level fixed point of an implicit layer.
arXiv Detail & Related papers (2022-04-18T17:53:44Z)
- Amortized Projection Optimization for Sliced Wasserstein Generative Models [17.196369579631074]
We propose to utilize the learning-to-optimize (amortized optimization) technique to predict the informative projecting direction for any given pair of mini-batch probability measures.
To the best of our knowledge, this is the first work that bridges amortized optimization and sliced Wasserstein generative models.
arXiv Detail & Related papers (2022-03-25T02:08:51Z)
- Unbiased Gradient Estimation for Variational Auto-Encoders using Coupled Markov Chains [34.77971292478243]
The variational auto-encoder (VAE) is a deep latent variable model that has two neural networks in an autoencoder-like architecture.
We develop a training scheme for VAEs by introducing unbiased estimators of the log-likelihood gradient.
We show experimentally that VAEs fitted with unbiased estimators exhibit better predictive performance.
arXiv Detail & Related papers (2020-10-05T08:11:55Z)
- On Projection Robust Optimal Transport: Sample Complexity and Model Misspecification [101.0377583883137]
Projection robust (PR) OT seeks to maximize the OT cost between two measures by choosing a $k$-dimensional subspace onto which they can be projected.
Our first contribution is to establish several fundamental statistical properties of PR Wasserstein distances.
Next, we propose the integral PR Wasserstein (IPRW) distance as an alternative to the PRW distance, by averaging rather than optimizing on subspaces.
arXiv Detail & Related papers (2020-06-22T14:35:33Z)
- Distributional Sliced-Wasserstein and Applications to Generative Modeling [27.014748003733544]
Sliced-Wasserstein distance (SW) and its variant, Max Sliced-Wasserstein distance (Max-SW), have been used widely in recent years.
We propose a novel distance, named Distributional Sliced-Wasserstein distance (DSW)
We show that the DSW is a generalization of Max-SW, and it can be computed efficiently by searching for the optimal push-forward measure.
arXiv Detail & Related papers (2020-02-18T04:35:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.