Bayesian Learning via Q-Exponential Process
- URL: http://arxiv.org/abs/2210.07987v3
- Date: Thu, 16 Nov 2023 01:54:31 GMT
- Title: Bayesian Learning via Q-Exponential Process
- Authors: Shuyi Li, Michael O'Connor, and Shiwei Lan
- Abstract summary: Regularization is one of the most fundamental topics in optimization, statistics and machine learning.
In this work, we generalize the $q$-exponential distribution (with density proportional to $\exp{(-\frac{1}{2}|u|^q)}$) to a stochastic process, named the $Q$-exponential (Q-EP) process, that corresponds to the $L_q$ regularization of functions.
- Score: 10.551294837978363
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Regularization is one of the most fundamental topics in optimization,
statistics and machine learning. To get sparsity in estimating a parameter
$u\in\mathbb{R}^d$, an $\ell_q$ penalty term, $\Vert u\Vert_q$, is usually
added to the objective function. What is the probabilistic distribution
corresponding to such $\ell_q$ penalty? What is the correct stochastic process
corresponding to $\Vert u\Vert_q$ when we model functions $u\in L^q$? This is
important for statistically modeling large dimensional objects, e.g. images,
with a penalty that preserves certain properties, e.g. edges in the image. In this
work, we generalize the $q$-exponential distribution (with density proportional
to) $\exp{(- \frac{1}{2}|u|^q)}$ to a stochastic process named $Q$-exponential
(Q-EP) process that corresponds to the $L_q$ regularization of functions. The
key step is to specify consistent multivariate $q$-exponential distributions by
choosing from a large family of elliptic contour distributions. The work is
closely related to the Besov process, which is usually defined through a series
expansion. Q-EP can be regarded as a definition of the Besov process with an explicit
probabilistic formulation and direct control on the correlation length. From
the Bayesian perspective, Q-EP provides a flexible prior on functions with
sharper penalty ($q<2$) than the commonly used Gaussian process (GP). We
compare GP, Besov and Q-EP in modeling functional data, reconstructing images,
and solving inverse problems and demonstrate the advantage of our proposed
methodology.
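To make the elliptic-contour construction in the abstract concrete, the sketch below draws sample paths from a finite-dimensional $q$-exponential distribution q-ED$(0, C)$ on a grid, as a stand-in for a Q-EP prior. It assumes the radial-angular decomposition $u = R\,L\,S$ with $LL^\top = C$, $S$ uniform on the unit sphere, and the radius chosen so that $R^q \sim \chi^2_d$; the kernel, length-scale, and this exact radial scaling are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sample_qed_paths(x, q=1.0, ell=0.2, n_draws=3, seed=0):
    """Draw sample paths from a multivariate q-exponential distribution
    q-ED(0, C) on the grid x (a finite-dimensional sketch of a Q-EP prior).

    Assumed construction: u = R * (L @ S), where L L^T = C, S is uniform
    on the unit sphere, and R^q ~ chi^2_d; q = 2 then recovers an
    ordinary Gaussian-process draw."""
    rng = np.random.default_rng(seed)
    d = len(x)
    # Squared-exponential correlation kernel with length-scale ell (assumed choice),
    # plus a small jitter for numerical stability of the Cholesky factorization.
    C = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / ell ** 2) + 1e-8 * np.eye(d)
    L = np.linalg.cholesky(C)
    draws = np.empty((n_draws, d))
    for i in range(n_draws):
        z = rng.standard_normal(d)
        S = z / np.linalg.norm(z)              # uniform direction on the unit sphere
        R = rng.chisquare(df=d) ** (1.0 / q)   # radial part: R^q ~ chi^2_d
        draws[i] = R * (L @ S)
    return draws

x = np.linspace(0.0, 1.0, 200)
u_sharp = sample_qed_paths(x, q=1.0)   # sharper, L1-like penalty on functions
u_gauss = sample_qed_paths(x, q=2.0)   # coincides with a GP draw when q = 2
print(u_sharp.shape, u_gauss.shape)
```

Under this parameterization, $q=2$ makes $R\,S$ a standard normal vector, so the draw matches a Gaussian-process sample, while $q<2$ yields heavier-tailed, sparser-looking paths, in line with the abstract's claim of a sharper penalty than the GP.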
Related papers
- Transfer Q Star: Principled Decoding for LLM Alignment [105.89114186982972]
Transfer $Q^*$ estimates the optimal value function for a target reward $r$ through a baseline model.
Our approach significantly reduces the sub-optimality gap observed in prior SoTA methods.
arXiv Detail & Related papers (2024-05-30T21:36:12Z) - Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs [56.237917407785545]
We consider the problem of learning an $\varepsilon$-optimal policy in a general class of continuous-space Markov decision processes (MDPs) having smooth Bellman operators.
Key to our solution is a novel projection technique based on ideas from harmonic analysis.
Our result bridges the gap between two popular but conflicting perspectives on continuous-space MDPs.
arXiv Detail & Related papers (2024-05-10T09:58:47Z) - Estimation and Inference in Distributional Reinforcement Learning [28.253677740976197]
We show that a dataset of size $\widetilde{O}\left(\frac{|\mathcal{S}||\mathcal{A}|}{\epsilon^2(1-\gamma)^4}\right)$ suffices to ensure that both the Kolmogorov metric and the total variation metric between $\hat{\eta}^\pi$ and $\eta^\pi$ are below $\epsilon$ with high probability.
Our findings give rise to a unified approach to statistical inference of a wide class of statistical functionals of $\eta^\pi$.
arXiv Detail & Related papers (2023-09-29T14:14:53Z) - Differentiated uniformization: A new method for inferring Markov chains
on combinatorial state spaces including stochastic epidemic models [0.0]
We provide an analogous algorithm for computing $\partial\exp(tQ)\theta$.
We estimate monthly infection and recovery rates during the first wave of the COVID-19 pandemic in Austria.
arXiv Detail & Related papers (2021-12-21T03:59:06Z) - Reward-Free Model-Based Reinforcement Learning with Linear Function
Approximation [92.99933928528797]
We study model-based reward-free reinforcement learning with linear function approximation for episodic Markov decision processes (MDPs).
In the planning phase, the agent is given a specific reward function and uses samples collected from the exploration phase to learn a good policy.
We show that, to obtain an $\epsilon$-optimal policy for an arbitrary reward function, UCRL-RFE needs to sample at most $\tilde{O}(H^4 d(H + d)\epsilon^{-2})$ episodes.
arXiv Detail & Related papers (2021-10-12T23:03:58Z) - Random matrices in service of ML footprint: ternary random features with
no performance loss [55.30329197651178]
We show that the eigenspectrum of $\mathbf{K}$ is independent of the distribution of the i.i.d. entries of $\mathbf{w}$.
We propose a novel random features technique, called Ternary Random Feature (TRF).
Computing the proposed random features requires no multiplication and a factor of $b$ fewer bits of storage compared to classical random features.
arXiv Detail & Related papers (2021-10-05T09:33:49Z) - Sample-Efficient Reinforcement Learning for Linearly-Parameterized MDPs
with a Generative Model [3.749193647980305]
This paper considers a Markov decision process (MDP) that admits a set of state-action features.
We show that a model-based approach (resp. $Q$-learning) provably learns an $\varepsilon$-optimal policy with high probability.
arXiv Detail & Related papers (2021-05-28T17:49:39Z) - Linear Time Sinkhorn Divergences using Positive Features [51.50788603386766]
Solving optimal transport with entropic regularization requires computing an $n \times n$ kernel matrix that is repeatedly applied to a vector.
We propose to instead use ground costs of the form $c(x,y) = -\log\langle\varphi(x), \varphi(y)\rangle$, where $\varphi$ is a map from the ground space onto the positive orthant $\mathbb{R}^r_+$, with $r \ll n$.
arXiv Detail & Related papers (2020-06-12T10:21:40Z) - Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and
Variance Reduction [63.41789556777387]
Asynchronous Q-learning aims to learn the optimal action-value function (or Q-function) of a Markov decision process (MDP).
We show that the number of samples needed to yield an entrywise $\varepsilon$-accurate estimate of the Q-function is at most on the order of $\frac{1}{\mu_{\min}(1-\gamma)^5\varepsilon^2} + \frac{t_{\mathrm{mix}}}{\mu_{\min}(1-\gamma)}$ up to some logarithmic factor.
arXiv Detail & Related papers (2020-06-04T17:51:00Z) - $\pi$VAE: a stochastic process prior for Bayesian deep learning with
MCMC [2.4792948967354236]
We propose a novel variational autoencoder called the prior encoding variational autoencoder ($\pi$VAE).
We show that our framework can accurately learn expressive function classes such as Gaussian processes, but also properties of functions to enable statistical inference.
Perhaps most usefully, we demonstrate that the learnt low-dimensional distributed latent representation provides an elegant and scalable means of performing inference for processes within probabilistic programming languages such as Stan.
arXiv Detail & Related papers (2020-02-17T10:23:18Z) - Does generalization performance of $l^q$ regularization learning depend
on $q$? A negative example [19.945160684285003]
$l^q$-regularization has been demonstrated to be an attractive technique in machine learning and statistical modeling.
We show that all $l^q$ estimators for $0 < q < \infty$ attain similar generalization error bounds.
This finding tentatively reveals that, in some modeling contexts, the choice of $q$ might not have a strong impact in terms of the generalization capability.
arXiv Detail & Related papers (2013-07-25T00:48:04Z)