AQuaMaM: An Autoregressive, Quaternion Manifold Model for Rapidly
Estimating Complex SO(3) Distributions
- URL: http://arxiv.org/abs/2301.08838v1
- Date: Sat, 21 Jan 2023 00:40:21 GMT
- Title: AQuaMaM: An Autoregressive, Quaternion Manifold Model for Rapidly
Estimating Complex SO(3) Distributions
- Authors: Michael A. Alcorn
- Abstract summary: AQuaMaM is a neural network capable of both learning complex distributions on the rotation manifold and calculating exact likelihoods for query rotations in a single forward pass.
When trained on a constructed dataset of 500,000 renders of a die in different rotations, AQuaMaM reaches a test log-likelihood 14% higher than IPDF.
Compared to IPDF, AQuaMaM uses 24% fewer parameters, has a prediction throughput 52$\times$ faster on a single GPU, and converges in a similar amount of time during training.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurately modeling complex, multimodal distributions is necessary for
optimal decision-making, but doing so for rotations in three dimensions, i.e.,
the SO(3) group, is challenging due to the curvature of the rotation manifold.
The recently described implicit-PDF (IPDF) is a simple, elegant, and effective
approach for learning arbitrary distributions on SO(3) up to a given precision.
However, inference with IPDF requires $N$ forward passes through the network's
final multilayer perceptron (where $N$ places an upper bound on the likelihood
that can be calculated by the model), which is prohibitively slow for those
without the computational resources necessary to parallelize the queries. In
this paper, I introduce AQuaMaM, a neural network capable of both learning
complex distributions on the rotation manifold and calculating exact
likelihoods for query rotations in a single forward pass. Specifically, AQuaMaM
autoregressively models the projected components of unit quaternions as
mixtures of uniform distributions that partition their geometrically-restricted
domain of values. When trained on an "infinite" toy dataset with ambiguous
viewpoints, AQuaMaM rapidly converges to a sampling distribution closely
matching the true data distribution. In contrast, the sampling distribution for
IPDF dramatically diverges from the true data distribution, despite IPDF
approaching its theoretical minimum evaluation loss during training. When
trained on a constructed dataset of 500,000 renders of a die in different
rotations, AQuaMaM reaches a test log-likelihood 14% higher than IPDF. Further,
compared to IPDF, AQuaMaM uses 24% fewer parameters, has a prediction
throughput 52$\times$ faster on a single GPU, and converges in a similar amount
of time during training.
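To make the abstract's method description concrete, here is a minimal sketch, in plain NumPy, of how a likelihood could be evaluated when each projected quaternion component is modeled as a mixture of uniform distributions over its geometrically restricted interval. This is an illustration of the idea as stated above, not the author's implementation; the bin count, the `logits` array, and the function name are assumptions.

```python
import numpy as np

def mixture_of_uniforms_log_density(q_xyz, logits):
    """Illustrative sketch (not AQuaMaM's code): the x, y, z components of a
    unit quaternion (with q_w >= 0) are modeled autoregressively, each as a
    mixture of uniform distributions over K bins that partition the
    component's geometrically restricted interval.

    q_xyz  : (3,) array, projected components of a unit quaternion.
    logits : (3, K) array, hypothetical per-component bin logits that a
             network would produce in a single forward pass.
    """
    K = logits.shape[1]
    log_p = 0.0
    remaining = 1.0  # 1 minus the squared components seen so far
    for i, q_i in enumerate(q_xyz):
        half_width = np.sqrt(max(remaining, 1e-12))  # q_i lies in [-half_width, half_width]
        edges = np.linspace(-half_width, half_width, K + 1)
        bin_width = edges[1] - edges[0]
        probs = np.exp(logits[i] - logits[i].max())
        probs /= probs.sum()                         # softmax over the K bins
        k = int(np.clip(np.searchsorted(edges, q_i) - 1, 0, K - 1))
        # Density of a mixture of uniforms: bin probability / bin width.
        log_p += np.log(probs[k]) - np.log(bin_width)
        remaining -= q_i ** 2
    return log_p

# Example: uniform logits over 256 bins for a query rotation's (x, y, z) components.
print(mixture_of_uniforms_log_density(np.array([0.1, -0.2, 0.3]), np.zeros((3, 256))))
```

Because each conditional is a proper density over the shrinking admissible interval, the product over the three components yields an exact likelihood from a single forward pass, which is the property the abstract contrasts with IPDF's $N$-query evaluation.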
Related papers
- Improved Convergence Rate for Diffusion Probabilistic Models [7.237817437521988]
Score-based diffusion models have achieved remarkable empirical performance in the field of machine learning and artificial intelligence.
Despite many theoretical attempts, a significant gap between theory and practice remains.
We establish an iteration complexity at the order of $d^{2/3}\varepsilon^{-2/3}$, which is better than $d^{5/12}\varepsilon^{-1}$.
Our theory accommodates $\varepsilon$-accurate score estimates, and does not require log-concavity on the target distribution.
arXiv Detail & Related papers (2024-10-17T16:37:33Z) - Accelerating Diffusion Models with Parallel Sampling: Inference at Sub-Linear Time Complexity [11.71206628091551]
Diffusion models are costly to train and evaluate, so reducing their inference cost remains a major goal.
Inspired by the recent empirical success in accelerating diffusion models via the parallel sampling technique of Shih et al. (2024), we propose to divide the sampling process into $\mathcal{O}(1)$ blocks with parallelizable Picard iterations within each block.
Our results shed light on the potential of fast and efficient sampling of high-dimensional data on fast-evolving modern large-memory GPU clusters.
arXiv Detail & Related papers (2024-05-24T23:59:41Z) - Broadening Target Distributions for Accelerated Diffusion Models via a Novel Analysis Approach [49.97755400231656]
We show that a novel accelerated DDPM sampler achieves accelerated performance for three broad distribution classes not considered before.
Our results show an improved dependency on the data dimension $d$ among accelerated DDPM type samplers.
arXiv Detail & Related papers (2024-02-21T16:11:47Z) - Comparative Study of Coupling and Autoregressive Flows through Robust
Statistical Tests [0.0]
We propose an in-depth comparison of coupling and autoregressive flows, both of the affine and rational quadratic type.
We focus on a set of multimodal target distributions of increasing dimensionality, ranging from 4 to 400.
Our results indicate that the A-RQS algorithm stands out both in terms of accuracy and training speed.
arXiv Detail & Related papers (2023-02-23T13:34:01Z) - Stochastic Approximation Approaches to Group Distributionally Robust Optimization and Beyond [89.72693227960274]
This paper investigates group distributionally robust optimization (GDRO) with the goal of learning a model that performs well over $m$ different distributions.
To reduce the number of samples in each round from $m$ to 1, we cast GDRO as a two-player game, where one player performs mirror descent and the other executes an online algorithm for non-oblivious multi-armed bandits.
In the second scenario, we propose to optimize the average top-$k$ risk instead of the maximum risk, thereby mitigating the impact of distributions.
arXiv Detail & Related papers (2023-02-18T09:24:15Z) - Unsupervised Learning of Sampling Distributions for Particle Filters [80.6716888175925]
We put forward four methods for learning sampling distributions from observed measurements.
Experiments demonstrate that learned sampling distributions exhibit better performance than designed, minimum-degeneracy sampling distributions.
arXiv Detail & Related papers (2023-02-02T15:50:21Z) - On-Demand Sampling: Learning Optimally from Multiple Distributions [63.20009081099896]
Social and real-world considerations have given rise to multi-distribution learning paradigms.
We establish the optimal sample complexity of these learning paradigms and give algorithms that meet this sample complexity.
Our algorithm design and analysis are enabled by our extensions of online learning techniques for solving zero-sum games.
arXiv Detail & Related papers (2022-10-22T19:07:26Z) - A machine learning approach to galaxy properties: joint redshift-stellar
mass probability distributions with Random Forest [0.0]
We demonstrate that highly accurate joint redshift-stellar mass probability distribution functions (PDFs) can be obtained using the Random Forest (RF) machine learning algorithm.
We use the Dark Energy Survey (DES), combined with the COSMOS2015 catalogue for redshifts and stellar masses.
In addition to accuracy, the RF is extremely fast, able to compute joint PDFs for a million galaxies in just under $6$ min with consumer computer hardware.
arXiv Detail & Related papers (2020-12-10T19:00:15Z) - Breaking the Sample Size Barrier in Model-Based Reinforcement Learning
with a Generative Model [50.38446482252857]
This paper is concerned with the sample efficiency of reinforcement learning, assuming access to a generative model (or simulator).
We first consider $\gamma$-discounted infinite-horizon Markov decision processes (MDPs) with state space $\mathcal{S}$ and action space $\mathcal{A}$.
We prove that a plain model-based planning algorithm suffices to achieve minimax-optimal sample complexity given any target accuracy level.
arXiv Detail & Related papers (2020-05-26T17:53:18Z) - SURF: A Simple, Universal, Robust, Fast Distribution Learning Algorithm [64.13217062232874]
SURF is an algorithm for approximating distributions by piecewise polynomials.
It outperforms state-of-the-art algorithms in experiments.
arXiv Detail & Related papers (2020-02-22T01:03:33Z) - Gravitational-wave parameter estimation with autoregressive neural
network flows [0.0]
We introduce the use of autoregressive normalizing flows for rapid likelihood-free inference of binary black hole system parameters from gravitational-wave data with deep neural networks.
A normalizing flow is an invertible mapping on a sample space that can be used to induce a transformation from a simple probability distribution to a more complex one (the change-of-variables identity behind this is sketched after this list).
We build a more powerful latent variable model by incorporating autoregressive flows within the variational autoencoder framework.
arXiv Detail & Related papers (2020-02-18T15:44:04Z)
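As background for the normalizing-flow definition in the last entry above (standard change-of-variables math, not specific to that paper): if $f$ is the invertible mapping from the sample space to the base space, the model density is $p_X(x) = p_Z(f(x))\,\left|\det \frac{\partial f(x)}{\partial x}\right|$; autoregressive flows choose $f$ so that this Jacobian is triangular and its determinant reduces to a product of diagonal terms.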
This list is automatically generated from the titles and abstracts of the papers in this site.