Densities of Almost Surely Terminating Probabilistic Programs are
Differentiable Almost Everywhere
- URL: http://arxiv.org/abs/2004.03924v2
- Date: Mon, 21 Jun 2021 16:00:29 GMT
- Title: Densities of Almost Surely Terminating Probabilistic Programs are
Differentiable Almost Everywhere
- Authors: Carol Mak, C.-H. Luke Ong, Hugo Paquet and Dominik Wagner
- Abstract summary: We study the differential properties of higher-order statistical probabilistic programs with recursion and conditioning.
A by-product of this work is that almost-surely terminating deterministic (S)PCF programs with real parameters denote functions that are almost-everywhere differentiable.
- Score: 1.911678487931003
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the differential properties of higher-order statistical
probabilistic programs with recursion and conditioning. Our starting point is
an open problem posed by Hongseok Yang: what class of statistical probabilistic
programs have densities that are differentiable almost everywhere? To formalise
the problem, we consider Statistical PCF (SPCF), an extension of call-by-value
PCF with real numbers, and constructs for sampling and conditioning. We give
SPCF a sampling-style operational semantics à la Borgström et al., and study
the associated weight (commonly referred to as the density) function and value
function on the set of possible execution traces. Our main result is that
almost-surely terminating SPCF programs, generated from a set of primitive
functions (e.g. the set of analytic functions) satisfying mild closure
properties, have weight and value functions that are almost-everywhere
differentiable. We use a stochastic form of symbolic execution to reason about
almost-everywhere differentiability. A by-product of this work is that
almost-surely terminating deterministic (S)PCF programs with real parameters
denote functions that are almost-everywhere differentiable. Our result is of
practical interest, as almost-everywhere differentiability of the density
function is required to hold for the correctness of major gradient-based
inference algorithms.
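As a rough sketch of the object the paper studies (not its formalism; the toy program and all names below are invented for illustration), the weight function of a simple branching program is smooth everywhere except on the branch boundary, which has Lebesgue measure zero:

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of Normal(mu, sigma) at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def weight(trace):
    """Weight (density) of a one-sample execution trace for the toy program
         x ~ Normal(0, 1)
         if x < 0: score Normal(-1, 1) at x   else: score Normal(1, 2) at x
       The branch makes `weight` non-differentiable only on {x = 0},
       a Lebesgue-null set: differentiable almost everywhere."""
    x = trace[0]
    w = normal_pdf(x, 0.0, 1.0)            # factor from the sample statement
    if x < 0:
        w *= normal_pdf(x, -1.0, 1.0)      # conditioning factor, left branch
    else:
        w *= normal_pdf(x, 1.0, 2.0)       # conditioning factor, right branch
    return w

def weight_grad(trace, h=1e-6):
    """Central finite difference; valid away from the branch boundary {x = 0}."""
    return (weight([trace[0] + h]) - weight([trace[0] - h])) / (2 * h)
```

Gradient-based inference only ever queries the gradient at randomly drawn traces, which avoid the null set with probability one; that is why almost-everywhere differentiability suffices.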
Related papers
- Generalizing Stochastic Smoothing for Differentiation and Gradient Estimation [59.86921150579892]
We deal with the problem of gradient estimation for differentiable relaxations of algorithms, operators, simulators, and other non-differentiable functions.
We develop variance reduction strategies for differentiable sorting and ranking, differentiable shortest-paths on graphs, differentiable rendering for pose estimation, as well as differentiable cryo-ET simulations.
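A minimal sketch of the general idea, assuming nothing about the paper's specific estimators: smooth a non-differentiable function by Gaussian perturbation and estimate the gradient of the smoothed function with a score-function estimator, using a mean baseline as a simple variance-reduction choice:

```python
import random

def smoothed_grad(f, x, sigma=0.1, n=4000, seed=0):
    """Monte Carlo gradient of the Gaussian smoothing
       f_sigma(x) = E[f(x + sigma * eps)],  eps ~ N(0, 1),
       via the score-function identity
       d/dx f_sigma(x) = E[(f(x + sigma * eps) - b) * eps / sigma],
       where b is a baseline (here the sample mean of f, a generic
       variance-reduction choice, not the paper's exact scheme)."""
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
    vals = [f(x + sigma * e) for e in eps]
    b = sum(vals) / n                      # baseline reduces variance
    return sum((v - b) * e for v, e in zip(vals, eps)) / (n * sigma)

# a hard step: gradient 0 everywhere it exists, non-differentiable at 0,
# yet its Gaussian smoothing has a well-defined positive derivative there
step = lambda t: 1.0 if t > 0 else 0.0
```

At the discontinuity the estimate approaches the Gaussian density at 0 (about 3.99 for sigma = 0.1), while far from the step it collapses to zero.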
arXiv Detail & Related papers (2024-10-10T17:10:00Z)
- Cumulative Hazard Function Based Efficient Multivariate Temporal Point Process Learning [0.0]
In this paper, we explore using neural networks to model a flexible but well-defined CHF.
We show that the proposed model achieves the state-of-the-art performance on data fitting and event prediction tasks.
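A sketch of the CHF-to-density relationship this summary relies on, with a hand-written Weibull-form cumulative hazard standing in for the neural parameterisation (all names here are illustrative):

```python
import math

def chf(t, alpha=1.5, beta=2.0):
    """A simple well-defined cumulative hazard function (Weibull form);
       any strictly increasing function with chf(0) = 0 would do, and this
       is the role a monotone neural network plays in the CHF approach."""
    return (t / beta) ** alpha

def hazard(t, alpha=1.5, beta=2.0):
    """Derivative of the CHF: the event intensity lambda(t)."""
    return (alpha / beta) * (t / beta) ** (alpha - 1)

def event_density(t):
    """Density of the next event time, recovered from the CHF via
       f(t) = lambda(t) * exp(-Lambda(t))."""
    return hazard(t) * math.exp(-chf(t))
```

Modelling the CHF directly keeps the density and the survival function exp(-Lambda(t)) in closed form, which is what makes likelihood evaluation cheap.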
arXiv Detail & Related papers (2024-04-21T13:51:31Z)
- Statistical Inference of Optimal Allocations I: Regularities and their Implications [3.904240476752459]
We first derive Hadamard differentiability of the value function through a detailed analysis of the general properties of the sorting operator.
Building on our Hadamard differentiability results, we demonstrate how the functional delta method can be used to directly derive the properties of the value function process.
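The delta-method step can be stated in its standard textbook form (general, not specific to the paper's allocation setting): if $\varphi$ is Hadamard differentiable at $F$ with derivative $\varphi'_F$, then weak convergence of the empirical process transfers to the plug-in estimator.

```latex
\[
\sqrt{n}\,\bigl(\widehat{F}_n - F\bigr) \rightsquigarrow \mathbb{G}
\quad\Longrightarrow\quad
\sqrt{n}\,\bigl(\varphi(\widehat{F}_n) - \varphi(F)\bigr)
\rightsquigarrow \varphi'_F(\mathbb{G}).
\]
```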
arXiv Detail & Related papers (2024-03-27T04:39:13Z)
- $\omega$PAP Spaces: Reasoning Denotationally About Higher-Order, Recursive Probabilistic and Differentiable Programs [64.25762042361839]
$\omega$PAP spaces are spaces for reasoning denotationally about expressive differentiable and probabilistic programming languages.
Our semantics is general enough to assign meanings to most practical probabilistic and differentiable programs.
We establish the almost-everywhere differentiability of probabilistic programs' trace density functions.
arXiv Detail & Related papers (2023-02-21T12:50:05Z)
- The Normalized Cross Density Functional: A Framework to Quantify Statistical Dependence for Random Processes [6.625320950808605]
We present a novel approach to measuring statistical dependence between two random processes (r.p.) using a positive-definite function called the Normalized Cross Density (NCD).
NCD is derived directly from the probability density functions of two r.p. and constructs a data-dependent Hilbert space, the Normalized Cross-Density Hilbert Space (NCD-HS).
We mathematically prove that FMCA learns the dominant eigenvalues and eigenfunctions of NCD directly from realizations.
arXiv Detail & Related papers (2022-12-09T02:12:41Z)
- Data-Driven Influence Functions for Optimization-Based Causal Inference [105.5385525290466]
We study a constructive algorithm that approximates Gateaux derivatives for statistical functionals by finite differencing.
We study the case where probability distributions are not known a priori but need to be estimated from data.
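A minimal sketch of a Gateaux derivative by finite differencing, using the mean functional, whose influence function x - E[x] is recovered exactly because the mean is linear in the distribution (function names here are illustrative, not the paper's API):

```python
def weighted_mean(points, weights):
    """A statistical functional evaluated on a weighted empirical distribution."""
    return sum(w * p for p, w in zip(points, weights))

def gateaux_fd(T, sample, x, eps=1e-4):
    """Finite-difference Gateaux derivative of T at the empirical
       distribution P_n, in the direction of a point mass at x:
         (T((1 - eps) * P_n + eps * delta_x) - T(P_n)) / eps.
       For the mean this equals the influence function x - mean(sample)."""
    n = len(sample)
    base = T(sample, [1.0 / n] * n)
    pert = T(sample + [x], [(1.0 - eps) / n] * n + [eps])
    return (pert - base) / eps
```

For nonlinear functionals the same recipe gives a numerical approximation whose quality depends on eps, which is the regime the paper analyses.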
arXiv Detail & Related papers (2022-08-29T16:16:22Z)
- Inference on Strongly Identified Functionals of Weakly Identified Functions [71.42652863687117]
We study a novel condition for the functional to be strongly identified even when the nuisance function is not.
We propose penalized minimax estimators for both the primary and debiasing nuisance functions.
arXiv Detail & Related papers (2022-08-17T13:38:31Z)
- Data-Driven Reachability Analysis and Support Set Estimation with Christoffel Functions [8.183446952097528]
We present algorithms for estimating the forward reachable set of a dynamical system.
The produced estimate is the sublevel set of a function called an empirical inverse Christoffel function.
In addition to reachability analysis, the same approach can be applied to general problems of estimating the support of a random variable.
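A toy sketch of an empirical inverse Christoffel function in one dimension with a deliberately tiny feature map (the paper uses richer polynomial features; the names here are illustrative):

```python
def inverse_christoffel_1d(samples):
    """Empirical inverse Christoffel function with degree-1 features
       phi(x) = (1, x):
         Lambda(x) = phi(x)^T M^{-1} phi(x),
         M = (1/N) * sum_i phi(x_i) phi(x_i)^T.
       Lambda is small near the data and grows quadratically away from it,
       so a sublevel set {x : Lambda(x) <= tau} estimates the support."""
    n = len(samples)
    m1 = sum(samples) / n                  # first moment
    m2 = sum(s * s for s in samples) / n   # second moment
    det = m2 - m1 * m1                     # determinant of M = [[1, m1], [m1, m2]]
    def lam(x):
        # closed-form phi^T M^{-1} phi for the 2x2 moment matrix above
        return (m2 - 2.0 * m1 * x + x * x) / det
    return lam
```

For samples clustered in [0, 1], the function stays near its average value (the number of features, 2) on the data and blows up outside it, which is what makes its sublevel sets useful reach-set estimates.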
arXiv Detail & Related papers (2021-12-18T20:25:34Z)
- Finding Global Minima via Kernel Approximations [90.42048080064849]
We consider the global minimization of smooth functions based solely on function evaluations.
In this paper, we consider an approach that jointly models the function to approximate and finds a global minimum.
arXiv Detail & Related papers (2020-12-22T12:59:30Z)
- Piecewise Linear Regression via a Difference of Convex Functions [50.89452535187813]
We present a new piecewise linear regression methodology that utilizes fitting a difference of convex functions (DC functions) to the data.
We empirically validate the method, showing it to be practically implementable, and to have comparable performance to existing regression/classification methods on real-world datasets.
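A sketch of the underlying representation, assuming only the definition of a DC function; this verifies a known decomposition on a hand-picked example rather than implementing the paper's fitting procedure:

```python
def max_affine(pieces):
    """Convex piecewise-linear function x -> max_i (a_i * x + b_i)."""
    return lambda x: max(a * x + b for a, b in pieces)

# The concave "hat" f(x) = 1 - |x| is not convex, but it is the
# difference of two convex max-affine functions:
#   f = g - h,  with g(x) = 1  and  h(x) = |x| = max(x, -x).
g = max_affine([(0.0, 1.0)])
h = max_affine([(1.0, 0.0), (-1.0, 0.0)])
f = lambda x: g(x) - h(x)
```

The regression method fits the affine pieces of g and h to data; the representation above is what guarantees every piecewise linear function is reachable in that form.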
arXiv Detail & Related papers (2020-07-05T18:58:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.