Differentiated uniformization: A new method for inferring Markov chains
on combinatorial state spaces including stochastic epidemic models
- URL: http://arxiv.org/abs/2112.10971v1
- Date: Tue, 21 Dec 2021 03:59:06 GMT
- Title: Differentiated uniformization: A new method for inferring Markov chains
on combinatorial state spaces including stochastic epidemic models
- Authors: Kevin Rupp, Rudolf Schill, Jonas Süskind, Peter Georg, Maren Klever,
Andreas Lösch, Lars Grasedyck, Tilo Wettig, Rainer Spang
- Abstract summary: We provide an analogous algorithm for computing $\partial\exp\!(tQ)/\partial\theta$.
We estimate monthly infection and recovery rates during the first wave of the COVID-19 pandemic in Austria.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Motivation: We consider continuous-time Markov chains that describe the
stochastic evolution of a dynamical system by a transition-rate matrix $Q$
which depends on a parameter $\theta$. Computing the probability distribution
over states at time $t$ requires the matrix exponential $\exp(tQ)$, and
inferring $\theta$ from data requires its derivative
$\partial\exp\!(tQ)/\partial\theta$. Both are challenging to compute when the
state space and hence the size of $Q$ is huge. This can happen when the state
space consists of all combinations of the values of several interacting
discrete variables. Often it is even impossible to store $Q$. However, when $Q$
can be written as a sum of tensor products, computing $\exp(tQ)$ becomes
feasible by the uniformization method, which does not require explicit storage
of $Q$.
Results: Here we provide an analogous algorithm for computing
$\partial\exp\!(tQ)/\partial\theta$, the differentiated uniformization method.
We demonstrate our algorithm for the stochastic SIR model of epidemic spread,
for which we show that $Q$ can be written as a sum of tensor products. We
estimate monthly infection and recovery rates during the first wave of the
COVID-19 pandemic in Austria and quantify their uncertainty in a full Bayesian
analysis.
Availability: Implementation and data are available at
https://github.com/spang-lab/TenSIR.
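As a concrete illustration of the two series the abstract refers to, the uniformization sum for $\exp(tQ)$ and its term-by-term $\theta$-derivative can be sketched with dense NumPy matrices. This is a hedged sketch with illustrative names, not the TenSIR implementation (which exploits the tensor-product structure of $Q$ to avoid storing it), and it assumes the uniformization rate $\gamma$ is held fixed when differentiating:

```python
import numpy as np
from scipy.stats import poisson

def uniformization_with_grad(Q, dQ, t, p0, tol=1e-10):
    """Compute p(t) = p0 @ expm(t*Q) and its derivative w.r.t. a scalar
    parameter theta, given the generator Q and dQ = dQ/dtheta (dense).

    Uses the identity exp(tQ) = sum_k Poisson(k; gamma*t) * P^k with
    P = I + Q/gamma, which holds for any gamma; differentiating term by
    term with gamma held fixed yields the derivative series.
    """
    gamma = np.max(-np.diag(Q))                  # dominates every exit rate
    P = np.eye(len(p0)) + Q / gamma              # stochastic "jump" matrix
    dP = dQ / gamma                              # dP/dtheta for fixed gamma
    K = int(poisson.ppf(1.0 - tol, gamma * t))   # truncate the Poisson tail
    v = p0.astype(float)                         # v_k  = p0 @ P^k
    dv = np.zeros_like(v)                        # dv_k = d(v_k)/dtheta
    p_t = np.zeros_like(v)
    dp_t = np.zeros_like(v)
    for k in range(K + 1):
        w = poisson.pmf(k, gamma * t)
        p_t += w * v
        dp_t += w * dv
        dv = dv @ P + v @ dP                     # product rule, before v update
        v = v @ P
    return p_t, dp_t
```

For a symmetric two-state chain $Q(\theta) = \theta\,Q_0$ the output can be checked against the closed form $p(t) = \big(\tfrac{1+e^{-2\theta t}}{2},\, \tfrac{1-e^{-2\theta t}}{2}\big)$; the tensor-structured setting of the paper replaces the dense matrix-vector products with products that never materialize $Q$.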
Related papers
- Log-Concave Coupling for Sampling Neural Net Posteriors [0.4604003661048266]
We present a sampling algorithm for single hidden layer neural networks.
The algorithm is based on a coupling of the posterior density for $w$ with an auxiliary random variable $\xi$.
The score of the marginal density of the auxiliary random variable $\xi$ is determined by an expectation over $w \mid \xi$.
arXiv Detail & Related papers (2024-07-26T15:05:41Z)
- Identification of Mixtures of Discrete Product Distributions in Near-Optimal Sample and Time Complexity [6.812247730094931]
We show, for any $n \geq 2k-1$, how to achieve sample complexity and run-time complexity $(1/\zeta)^{O(k)}$.
We also extend the known lower bound of $e^{\Omega(k)}$ to match our upper bound across a broad range of $\zeta$.
arXiv Detail & Related papers (2023-09-25T09:50:15Z)
- Replicable Clustering [57.19013971737493]
We propose algorithms for the statistical $k$-medians, statistical $k$-means, and statistical $k$-centers problems by utilizing approximation routines for their counterparts in a black-box manner.
We also provide experiments on synthetic distributions in 2D using the $k$-means++ implementation from sklearn as a black-box that validate our theoretical results.
arXiv Detail & Related papers (2023-02-20T23:29:43Z)
- Complexity of High-Dimensional Identity Testing with Coordinate Conditional Sampling [5.3098033683382155]
We study the identity testing problem for high-dimensional distributions.
We consider a significantly weaker conditional sampling oracle, which we call the $\mathsf{Coordinate\ Oracle}$.
We prove that if an analytic property known as approximate tensorization of entropy holds for an $n$-dimensional visible distribution $\mu$, then there is an efficient identity testing algorithm for any hidden distribution $\pi$ using $\tilde{O}(n/\varepsilon)$ queries.
arXiv Detail & Related papers (2022-07-19T06:49:24Z)
- Random matrices in service of ML footprint: ternary random features with no performance loss [55.30329197651178]
We show that the eigenspectrum of $\mathbf{K}$ is independent of the distribution of the i.i.d. entries of $\mathbf{w}$.
We propose a novel random-features technique, called Ternary Random Feature (TRF).
The computation of the proposed random features requires no multiplication and a factor of $b$ fewer bits for storage compared to classical random features.
arXiv Detail & Related papers (2021-10-05T09:33:49Z)
- Learning a Latent Simplex in Input-Sparsity Time [58.30321592603066]
We consider the problem of learning a latent $k$-vertex simplex $K \subset \mathbb{R}^{d \times n}$, given access to $A \in \mathbb{R}^{d \times n}$.
We show that the dependence on $k$ in the running time is unnecessary given a natural assumption about the mass of the top $k$ singular values of $A$.
arXiv Detail & Related papers (2021-05-17T16:40:48Z)
- Linear Time Sinkhorn Divergences using Positive Features [51.50788603386766]
Solving optimal transport with an entropic regularization requires computing an $n \times n$ kernel matrix that is repeatedly applied to a vector.
We propose to use instead ground costs of the form $c(x,y) = -\log\langle\varphi(x),\varphi(y)\rangle$, where $\varphi$ is a map from the ground space onto the positive orthant $\mathbb{R}^r_+$, with $r \ll n$.
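The point of the entry above is that with such feature maps the kernel matrix $K_{ij} = \langle\varphi(x_i),\varphi(y_j)\rangle$ never needs to be formed: every Sinkhorn matrix-vector product factors through the thin $n \times r$ feature matrices. A minimal sketch in plain NumPy with illustrative names, not the authors' implementation:

```python
import numpy as np

def sinkhorn_lowrank(Phi_x, Phi_y, a, b, iters=500):
    """Sinkhorn iterations for the kernel K = Phi_x @ Phi_y.T without
    ever forming the n-by-n matrix K: each matvec costs O(n*r).

    Phi_x, Phi_y : (n, r) feature matrices with positive entries
    a, b         : source and target marginals (probability vectors)
    """
    u = np.ones(len(a))
    v = np.ones(len(b))
    for _ in range(iters):
        u = a / (Phi_x @ (Phi_y.T @ v))   # K   @ v via two thin products
        v = b / (Phi_y @ (Phi_x.T @ u))   # K.T @ u likewise
    return u, v                           # plan is diag(u) @ K @ diag(v)
```

Each iteration costs $O(nr)$ instead of $O(n^2)$, which is the claimed linear-time behavior when $r \ll n$.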
arXiv Detail & Related papers (2020-06-12T10:21:40Z)
- Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction [63.41789556777387]
Asynchronous Q-learning aims to learn the optimal action-value function (or Q-function) of a Markov decision process (MDP)
We show that the number of samples needed to yield an entrywise $\varepsilon$-accurate estimate of the Q-function is at most on the order of $\frac{1}{\mu_{\min}(1-\gamma)^5\varepsilon^2} + \frac{t_{\mathrm{mix}}}{\mu_{\min}(1-\gamma)}$ up to some logarithmic factor.
arXiv Detail & Related papers (2020-06-04T17:51:00Z)
- A Randomized Algorithm to Reduce the Support of Discrete Measures [79.55586575988292]
Given a discrete probability measure supported on $N$ atoms and a set of $n$ real-valued functions, there exists a probability measure that is supported on a subset of $n+1$ of the original $N$ atoms.
We give a simple geometric characterization of barycenters via negative cones and derive a randomized algorithm that computes this new measure by "greedy geometric sampling".
We then study its properties, and benchmark it on synthetic and real-world data to show that it can be very beneficial in the $N \gg n$ regime.
arXiv Detail & Related papers (2020-06-02T16:38:36Z)
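The existence statement in the last entry is constructive: any null-space direction of the moment matrix (the $n$ functions plus an all-ones row for total mass) can shift the weights until one vanishes, and repeating this leaves at most $n+1$ atoms. Below is a generic recombination sketch in that spirit, using plain SVD-based null vectors and illustrative names; it is not the paper's "greedy geometric sampling" algorithm:

```python
import numpy as np

def reduce_support(w, F, tol=1e-12):
    """Reduce a discrete measure with weights w on N atoms to at most
    n+1 atoms while preserving the n integrals F @ w and the total mass.

    w : (N,) nonnegative weights
    F : (n, N) values of the n functions at the N atoms
    """
    w = w.astype(float).copy()
    n = F.shape[0]
    while True:
        active = np.flatnonzero(w > tol)
        if len(active) <= n + 1:
            return w
        # Rows: the n functions plus an all-ones row (total mass).
        A = np.vstack([F[:, active], np.ones(len(active))])
        # More columns than rows, so A has a nontrivial null vector;
        # moving w along it changes none of the preserved quantities.
        c = np.linalg.svd(A)[2][-1]
        if c.max() <= 0:
            c = -c                        # ensure a positive entry exists
        step = np.min(w[active][c > 0] / c[c > 0])
        w[active] -= step * c             # drives at least one weight to zero
        w[w <= tol] = 0.0                 # drop the vanished atoms
```

The step size is the largest move that keeps every weight nonnegative, so each pass removes at least one atom and the loop terminates after at most $N - (n+1)$ rounds.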
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.