Automatic Differentiation of Programs with Discrete Randomness
- URL: http://arxiv.org/abs/2210.08572v2
- Date: Tue, 18 Oct 2022 15:36:17 GMT
- Title: Automatic Differentiation of Programs with Discrete Randomness
- Authors: Gaurav Arya, Moritz Schauer, Frank Schäfer, Chris Rackauckas
- Abstract summary: We develop a new reparameterization-based methodology that allows for generating programs whose expectation is the derivative of the expectation of the original program.
We demonstrate unbiased forward-mode AD of discrete-time Markov chains, agent-based models such as Conway's Game of Life, and unbiased reverse-mode AD of a particle filter.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automatic differentiation (AD), a technique for constructing new programs
which compute the derivative of an original program, has become ubiquitous
throughout scientific computing and deep learning due to the improved
performance afforded by gradient-based optimization. However, AD systems have
been restricted to the subset of programs that have a continuous dependence on
parameters. Programs that have discrete stochastic behaviors governed by
distribution parameters, such as flipping a coin with probability $p$ of being
heads, pose a challenge to these systems because the connection between the
result (heads vs tails) and the parameters ($p$) is fundamentally discrete. In
this paper we develop a new reparameterization-based methodology that allows
for generating programs whose expectation is the derivative of the expectation
of the original program. We showcase how this method gives an unbiased and
low-variance estimator which is as automated as traditional AD mechanisms. We
demonstrate unbiased forward-mode AD of discrete-time Markov chains,
agent-based models such as Conway's Game of Life, and unbiased reverse-mode AD
of a particle filter. Our code is available at
https://github.com/gaurav-arya/StochasticAD.jl.
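The coin-flip example above can be made concrete. As a baseline, the classical score-function (REINFORCE) estimator already gives an unbiased, fully automatic estimate of d/dp E[f(X)] for X ~ Bernoulli(p), though typically at high variance; the paper's stochastic-triple construction targets the same quantity with lower variance. A minimal NumPy sketch of this baseline (illustrative only, not the paper's method):

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Program output: any function of the discrete outcome (heads=1, tails=0).
    return 3.0 * x + 1.0

def score_function_grad(p, n=100_000):
    """Score-function (REINFORCE) estimate of d/dp E[f(X)], X ~ Bernoulli(p).

    Unbiased but typically high-variance; shown here only as the classical
    baseline that the paper's stochastic triples improve on."""
    x = rng.random(n) < p                                # Bernoulli(p) samples
    score = np.where(x, 1.0 / p, -1.0 / (1.0 - p))       # d/dp log P(X = x)
    return np.mean(f(x.astype(float)) * score)

p = 0.3
# Exact: E[f(X)] = 3p + 1, so d/dp E[f(X)] = 3, with no dependence on p.
print(score_function_grad(p))  # close to 3.0
```

Note that naive AD through the sampler would report a zero gradient almost everywhere, since the sampled outcome is piecewise constant in p; the score-function identity recovers the dependence through the log-probability instead.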
Related papers
- Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding [84.3224556294803]
Diffusion models excel at capturing the natural design spaces of images, molecules, DNA, RNA, and protein sequences.
We aim to optimize downstream reward functions while preserving the naturalness of these design spaces.
Our algorithm integrates soft value functions, which look ahead to how intermediate noisy states lead to high rewards in the future.
arXiv Detail & Related papers (2024-08-15T16:47:59Z) - Accelerated Inference for Partially Observed Markov Processes using Automatic Differentiation [4.872049174955585]
Automatic differentiation (AD) has driven recent advances in machine learning.
We show how to embed two existing AD particle filter methods in a theoretical framework that extends them to a new class of algorithms.
We develop likelihood algorithms suited to the Monte Carlo properties of the AD gradient estimate.
arXiv Detail & Related papers (2024-07-03T13:06:46Z) - Smoothing Methods for Automatic Differentiation Across Conditional
Branches [0.0]
Smooth interpretation (SI) approximates the convolution of a program's output with a Gaussian kernel, thus smoothing its output in a principled manner.
We combine SI with automatic differentiation (AD) to efficiently compute gradients of smoothed programs.
We propose a novel Monte Carlo estimator that avoids the underlying assumptions by estimating the smoothed programs' gradients through a combination of AD and sampling.
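As an illustration of this smoothing idea (a sketch under simplifying assumptions, not the paper's algorithm), the derivative of a Gaussian-smoothed branchy program can be estimated by Monte Carlo via the Gaussian score identity d/dx E[f(x + σε)] = E[f(x + σε) · ε/σ]:

```python
import numpy as np

rng = np.random.default_rng(1)

def branchy(x):
    # A program with a conditional branch: plain AD sees a zero gradient a.e.
    return 1.0 if x > 0 else 0.0

def smoothed_grad(x, sigma=0.5, n=200_000):
    """Monte Carlo estimate of d/dx E[branchy(x + sigma*eps)], eps ~ N(0,1).

    Uses the Gaussian score identity; an illustrative stand-in for the
    AD-plus-sampling estimator described above."""
    eps = rng.standard_normal(n)
    vals = np.where(x + sigma * eps > 0, 1.0, 0.0)  # vectorized branchy
    return np.mean(vals * eps / sigma)

# The smoothed program is Phi(x/sigma); its derivative at x=0 equals the
# Gaussian density value 1/(sigma*sqrt(2*pi)) ≈ 0.798 for sigma=0.5.
print(smoothed_grad(0.0))
```

The smoothing width sigma trades bias against variance: larger sigma gives a smoother surrogate but moves the estimated gradient further from the original program's behavior.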
arXiv Detail & Related papers (2023-10-05T15:08:37Z) - Value-Distributional Model-Based Reinforcement Learning [59.758009422067]
Quantifying uncertainty about a policy's long-term performance is important to solve sequential decision-making tasks.
We study the problem from a model-based Bayesian reinforcement learning perspective.
We propose Epistemic Quantile-Regression (EQR), a model-based algorithm that learns a value distribution function.
arXiv Detail & Related papers (2023-08-12T14:59:19Z) - Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement
Learning: Adaptivity and Computational Efficiency [90.40062452292091]
We present the first computationally efficient algorithm for linear bandits with heteroscedastic noise.
Our algorithm is adaptive to the unknown variance of noise and achieves an $\tilde{O}\big(d\sqrt{\sum_{k=1}^{K} \sigma_k^2} + d\big)$ regret bound.
We also propose a variance-adaptive algorithm for linear mixture Markov decision processes (MDPs) in reinforcement learning.
arXiv Detail & Related papers (2023-02-21T00:17:24Z) - Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated seamlessly with neural networks.
arXiv Detail & Related papers (2021-12-07T11:26:41Z) - Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for the bias-constrained estimator (BCE) is in applications where multiple estimates of the same unknown are averaged for improved performance.
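The averaging motivation can be demonstrated numerically: when k independent estimates are averaged, variance decays as 1/k while bias does not, so the squared bias becomes an irreducible MSE floor. A self-contained toy demonstration (not code from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)

def mse_of_average(bias, sigma, k, trials=100_000):
    """Empirical MSE of the average of k independent estimates of a quantity
    whose true value is 0, each estimate having the given bias and noise std.

    Illustrates why bias constraints matter when estimates are averaged:
    variance shrinks as 1/k, but bias survives averaging untouched."""
    est = bias + sigma * rng.standard_normal((trials, k))
    avg = est.mean(axis=1)
    return np.mean(avg ** 2)  # true value is 0, so this is the MSE

# Unbiased estimator: MSE ~ sigma^2 / k, vanishing as k grows.
# Biased estimator:   MSE -> bias^2 + sigma^2 / k, floored at bias^2.
for k in (1, 10, 100):
    print(k, mse_of_average(0.0, 1.0, k), mse_of_average(0.3, 1.0, k))
```

With bias 0.3 and unit noise, the averaged biased estimator's MSE approaches 0.09 no matter how large k gets, while the unbiased one keeps improving.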
arXiv Detail & Related papers (2021-10-24T10:23:51Z) - Identification of Probability weighted ARX models with arbitrary domains [75.91002178647165]
PieceWise Affine models guarantee universal approximation, local linearity, and equivalence to other classes of hybrid systems.
In this work, we focus on the identification of PieceWise Auto Regressive with eXogenous input models with arbitrary regions (NPWARX).
The architecture is conceived following the Mixture of Experts concept developed within the machine learning field.
arXiv Detail & Related papers (2020-09-29T12:50:33Z) - Stochastically Differentiable Probabilistic Programs [18.971852464650144]
The existence of discrete random variables prohibits the use of many basic gradient-based inference engines.
We present a novel approach to run inference efficiently and robustly in such programs using the Markov chain Monte Carlo family of algorithms.
arXiv Detail & Related papers (2020-03-02T08:04:41Z) - Deep combinatorial optimisation for optimal stopping time problems:
application to swing options pricing [0.0]
A new method for control computation based on neural networks, using randomisation of discrete random variables, is proposed and applied to optimal stopping time problems.
The proposed algorithm succeeds in pricing high dimensional American and swing options in a reasonable time, which is not possible with classical algorithms.
arXiv Detail & Related papers (2020-01-30T10:39:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.