AQuaMaM: An Autoregressive, Quaternion Manifold Model for Rapidly
Estimating Complex SO(3) Distributions
- URL: http://arxiv.org/abs/2301.08838v1
- Date: Sat, 21 Jan 2023 00:40:21 GMT
- Title: AQuaMaM: An Autoregressive, Quaternion Manifold Model for Rapidly
Estimating Complex SO(3) Distributions
- Authors: Michael A. Alcorn
- Abstract summary: AQuaMaM is a neural network capable of both learning complex distributions on the rotation manifold and calculating exact likelihoods for query rotations in a single forward pass.
When trained on a constructed dataset of 500,000 renders of a die in different rotations, AQuaMaM reaches a test log-likelihood 14% higher than IPDF.
Compared to IPDF, AQuaMaM uses 24% fewer parameters, has a prediction throughput 52$\times$ faster on a single GPU, and converges in a similar amount of time during training.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurately modeling complex, multimodal distributions is necessary for
optimal decision-making, but doing so for rotations in three dimensions, i.e.,
the SO(3) group, is challenging due to the curvature of the rotation manifold.
The recently described implicit-PDF (IPDF) is a simple, elegant, and effective
approach for learning arbitrary distributions on SO(3) up to a given precision.
However, inference with IPDF requires $N$ forward passes through the network's
final multilayer perceptron (where $N$ places an upper bound on the likelihood
that can be calculated by the model), which is prohibitively slow for those
without the computational resources necessary to parallelize the queries. In
this paper, I introduce AQuaMaM, a neural network capable of both learning
complex distributions on the rotation manifold and calculating exact
likelihoods for query rotations in a single forward pass. Specifically, AQuaMaM
autoregressively models the projected components of unit quaternions as
mixtures of uniform distributions that partition their geometrically-restricted
domain of values. When trained on an "infinite" toy dataset with ambiguous
viewpoints, AQuaMaM rapidly converges to a sampling distribution closely
matching the true data distribution. In contrast, the sampling distribution for
IPDF dramatically diverges from the true data distribution, despite IPDF
approaching its theoretical minimum evaluation loss during training. When
trained on a constructed dataset of 500,000 renders of a die in different
rotations, AQuaMaM reaches a test log-likelihood 14% higher than IPDF. Further,
compared to IPDF, AQuaMaM uses 24% fewer parameters, has a prediction
throughput 52$\times$ faster on a single GPU, and converges in a similar amount
of time during training.
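To make the abstract's method description concrete, here is a minimal sketch, in plain NumPy, of how a likelihood could be evaluated when each projected quaternion component is modeled as a mixture of uniform distributions over its geometrically restricted interval. This is an illustration of the idea as stated above, not the author's implementation; the bin count, the `logits` array, and the function name are assumptions.

```python
import numpy as np

def mixture_of_uniforms_log_density(q_xyz, logits):
    """Illustrative sketch (not AQuaMaM's code): the x, y, z components of a
    unit quaternion (with q_w >= 0) are modeled autoregressively, each as a
    mixture of uniform distributions over K bins that partition the
    component's geometrically restricted interval.

    q_xyz  : (3,) array, projected components of a unit quaternion.
    logits : (3, K) array, hypothetical per-component bin logits that a
             network would produce in a single forward pass.
    """
    K = logits.shape[1]
    log_p = 0.0
    remaining = 1.0  # 1 minus the squared components seen so far
    for i, q_i in enumerate(q_xyz):
        half_width = np.sqrt(max(remaining, 1e-12))  # q_i lies in [-half_width, half_width]
        edges = np.linspace(-half_width, half_width, K + 1)
        bin_width = edges[1] - edges[0]
        probs = np.exp(logits[i] - logits[i].max())
        probs /= probs.sum()                         # softmax over the K bins
        k = int(np.clip(np.searchsorted(edges, q_i) - 1, 0, K - 1))
        # Density of a mixture of uniforms: bin probability / bin width.
        log_p += np.log(probs[k]) - np.log(bin_width)
        remaining -= q_i ** 2
    return log_p

# Example: uniform logits over 256 bins for a query rotation's (x, y, z) components.
print(mixture_of_uniforms_log_density(np.array([0.1, -0.2, 0.3]), np.zeros((3, 256))))
```

Because each conditional is a proper density over the shrinking admissible interval, the product over the three components yields an exact likelihood from a single forward pass, which is the property the abstract contrasts with IPDF's $N$-query evaluation.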
Related papers
- Improved Convergence Rate for Diffusion Probabilistic Models [7.237817437521988]
Score-based diffusion models have achieved remarkable empirical performance in the field of machine learning and artificial intelligence.
Despite many theoretical attempts, a significant gap between theory and practice remains.
We establish an iteration complexity at the order of $d^{2/3}\varepsilon^{-2/3}$, which is better than $d^{5/12}\varepsilon^{-1}$.
Our theory accommodates $\varepsilon$-accurate score estimates, and does not require log-concavity on the target distribution.
arXiv Detail & Related papers (2024-10-17T16:37:33Z) - Accelerating Diffusion Models with Parallel Sampling: Inference at Sub-Linear Time Complexity [11.71206628091551]
Diffusion models are costly to train and evaluate, so reducing their inference cost remains a major goal.
Inspired by the recent empirical success in accelerating diffusion models via the parallel sampling technique of Shih et al. (2024), we propose to divide the sampling process into $\mathcal{O}(1)$ blocks with parallelizable Picard iterations within each block.
Our results shed light on the potential of fast and efficient sampling of high-dimensional data on fast-evolving modern large-memory GPU clusters.
arXiv Detail & Related papers (2024-05-24T23:59:41Z) - Broadening Target Distributions for Accelerated Diffusion Models via a Novel Analysis Approach [49.97755400231656]
We show that a novel accelerated DDPM sampler achieves accelerated performance for three broad distribution classes not considered before.
Our results show an improved dependency on the data dimension $d$ among accelerated DDPM type samplers.
arXiv Detail & Related papers (2024-02-21T16:11:47Z) - Comparative Study of Coupling and Autoregressive Flows through Robust
Statistical Tests [0.0]
We propose an in-depth comparison of coupling and autoregressive flows, both of the affine and rational quadratic type.
We focus on a set of multimodal target distributions of increasing dimensionality, ranging from 4 to 400.
Our results indicate that the A-RQS algorithm stands out both in terms of accuracy and training speed.
arXiv Detail & Related papers (2023-02-23T13:34:01Z) - Stochastic Approximation Approaches to Group Distributionally Robust Optimization and Beyond [89.72693227960274]
This paper investigates group distributionally robust optimization (GDRO) with the goal of learning a model that performs well over $m$ different distributions.
To reduce the number of samples in each round from $m$ to 1, we cast GDRO as a two-player game, where one player performs mirror descent and the other executes an online algorithm for non-oblivious multi-armed bandits.
In the second scenario, we propose to optimize the average top-$k$ risk instead of the maximum risk, thereby mitigating the impact of distributions.
arXiv Detail & Related papers (2023-02-18T09:24:15Z) - Unsupervised Learning of Sampling Distributions for Particle Filters [80.6716888175925]
We put forward four methods for learning sampling distributions from observed measurements.
Experiments demonstrate that learned sampling distributions exhibit better performance than designed, minimum-degeneracy sampling distributions.
arXiv Detail & Related papers (2023-02-02T15:50:21Z) - On-Demand Sampling: Learning Optimally from Multiple Distributions [63.20009081099896]
Social and real-world considerations have given rise to multi-distribution learning paradigms.
We establish the optimal sample complexity of these learning paradigms and give algorithms that meet this sample complexity.
Our algorithm design and analysis are enabled by our extensions of online learning techniques for solving zero-sum games.
arXiv Detail & Related papers (2022-10-22T19:07:26Z) - A machine learning approach to galaxy properties: joint redshift-stellar
mass probability distributions with Random Forest [0.0]
We demonstrate that highly accurate joint redshift-stellar mass probability distribution functions (PDFs) can be obtained using the Random Forest (RF) machine learning algorithm.
We use the Dark Energy Survey (DES), combined with the COSMOS2015 catalogue for redshifts and stellar masses.
In addition to accuracy, the RF is extremely fast, able to compute joint PDFs for a million galaxies in just under $6$ min with consumer computer hardware.
arXiv Detail & Related papers (2020-12-10T19:00:15Z) - Breaking the Sample Size Barrier in Model-Based Reinforcement Learning
with a Generative Model [50.38446482252857]
This paper is concerned with the sample efficiency of reinforcement learning, assuming access to a generative model (or simulator).
We first consider $\gamma$-discounted infinite-horizon Markov decision processes (MDPs) with state space $\mathcal{S}$ and action space $\mathcal{A}$.
We prove that a plain model-based planning algorithm suffices to achieve minimax-optimal sample complexity given any target accuracy level.
arXiv Detail & Related papers (2020-05-26T17:53:18Z) - SURF: A Simple, Universal, Robust, Fast Distribution Learning Algorithm [64.13217062232874]
SURF is an algorithm for approximating distributions by piecewise polynomials.
It outperforms state-of-the-art algorithms in experiments.
arXiv Detail & Related papers (2020-02-22T01:03:33Z) - Gravitational-wave parameter estimation with autoregressive neural
network flows [0.0]
We introduce the use of autoregressive normalizing flows for rapid likelihood-free inference of binary black hole system parameters from gravitational-wave data with deep neural networks.
A normalizing flow is an invertible mapping on a sample space that can be used to induce a transformation from a simple probability distribution to a more complex one (the change-of-variables identity behind this is sketched after this list).
We build a more powerful latent variable model by incorporating autoregressive flows within the variational autoencoder framework.
arXiv Detail & Related papers (2020-02-18T15:44:04Z)
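As background for the normalizing-flow definition in the last entry above (standard change-of-variables math, not specific to that paper): if $f$ is the invertible mapping from the sample space to the base space, the model density is $p_X(x) = p_Z(f(x))\,\left|\det \frac{\partial f(x)}{\partial x}\right|$; autoregressive flows choose $f$ so that this Jacobian is triangular and its determinant reduces to a product of diagonal terms.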
This list is automatically generated from the titles and abstracts of the papers in this site.