Scalable Approximate Inference and Some Applications
- URL: http://arxiv.org/abs/2003.03515v1
- Date: Sat, 7 Mar 2020 04:33:27 GMT
- Title: Scalable Approximate Inference and Some Applications
- Authors: Jun Han
- Abstract summary: In this thesis, we propose a new framework for approximate inference.
Our four proposed algorithms are motivated by the recent computational progress of Stein's method.
Results on simulated and real datasets indicate the statistical efficiency and wide applicability of our algorithms.
- Score: 2.6541211006790983
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Approximate inference in probability models is a fundamental task in machine
learning. Approximate inference provides powerful tools for Bayesian reasoning,
decision making, and Bayesian deep learning. The main goal is to estimate the
expectation of functions of interest w.r.t. a target distribution. When it comes
to high dimensional probability models and large datasets, efficient
approximate inference becomes critically important. In this thesis, we propose
a new framework for approximate inference, which combines the advantages of
these three frameworks and overcomes their limitations. Our proposed four
algorithms are motivated by the recent computational progress of Stein's
method. Our proposed algorithms are applied to continuous and discrete
distributions under the setting when the gradient information of the target
distribution is available or unavailable. Theoretical analysis is provided to
prove the convergence of our proposed algorithms. Our adaptive IS algorithm
iteratively improves the importance proposal by functionally decreasing the KL
divergence between the updated proposal and the target. When the gradient of
the target is unavailable, our proposed sampling algorithm leverages the
gradient of a surrogate model and corrects induced bias with importance
weights, which significantly outperforms other gradient-free sampling
algorithms. In addition, our theoretical results enable us to perform the
goodness-of-fit test on discrete distributions. At the end of the thesis, we
propose an importance-weighted method to efficiently aggregate local models in
distributed learning with one-shot communication. Results on simulated and real
datasets indicate the statistical efficiency and wide applicability of our
algorithm.
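The abstract's stated goal, estimating an expectation under a target distribution and correcting proposal bias with importance weights, can be illustrated with a minimal self-normalized importance sampling sketch. This is not the thesis's algorithm: the standard normal target, the wider Gaussian proposal, and the names `log_target`, `log_proposal`, and `snis_expectation` are all assumptions chosen for illustration.

```python
import math
import random


def log_target(x):
    # Assumed target: standard normal N(0, 1), log-density up to a constant.
    return -0.5 * x * x


def log_proposal(x, scale=2.0):
    # Assumed proposal: N(0, scale^2), log-density up to a constant.
    return -0.5 * (x / scale) ** 2


def snis_expectation(f, n_samples=20000, scale=2.0, seed=0):
    """Self-normalized importance sampling estimate of E_target[f(x)]."""
    rng = random.Random(seed)
    # Draw samples from the proposal rather than the target.
    xs = [rng.gauss(0.0, scale) for _ in range(n_samples)]
    # Log importance weights; dropped constants cancel after normalization.
    log_w = [log_target(x) - log_proposal(x, scale) for x in xs]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_w)
    ws = [math.exp(lw - m) for lw in log_w]
    total = sum(ws)
    # Weighted average of f, normalized by the sum of the weights.
    return sum(w * f(x) for w, x in zip(ws, xs)) / total


# Estimate the second moment of the target; the exact value for N(0, 1) is 1.
estimate = snis_expectation(lambda x: x * x)
```

Because the weights are normalized by their sum, neither density needs a known normalizing constant, which is why only unnormalized log-densities appear in the sketch.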
Related papers
- Adaptive importance sampling for heavy-tailed distributions via
$\alpha$-divergence minimization [2.879807093604632]
We propose an AIS algorithm that approximates the target by Student-t proposal distributions.
We adapt location and scale parameters by matching the escort moments of the target and the proposal.
These updates minimize the $\alpha$-divergence between the target and the proposal, thereby connecting with variational inference.
arXiv Detail & Related papers (2023-10-25T14:07:08Z)
- Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels [57.46832672991433]
We propose a novel equation discovery method based on Kernel learning and BAyesian Spike-and-Slab priors (KBASS).
We use kernel regression to estimate the target function, which is flexible, expressive, and more robust to data sparsity and noise.
We develop an expectation-propagation expectation-maximization algorithm for efficient posterior inference and function estimation.
arXiv Detail & Related papers (2023-10-09T03:55:09Z)
- Distributionally Robust Machine Learning with Multi-source Data [6.383451076043423]
We introduce a group distributionally robust prediction model to optimize an adversarial reward about explained variance with respect to a class of target distributions.
Compared to classical empirical risk minimization, the proposed robust prediction model improves the prediction accuracy for target populations with distribution shifts.
We demonstrate the performance of our proposed group distributionally robust method on simulated and real data with random forests and neural networks as base-learning algorithms.
arXiv Detail & Related papers (2023-09-05T13:19:40Z)
- Efficient Model-Free Exploration in Low-Rank MDPs [76.87340323826945]
Low-Rank Markov Decision Processes offer a simple, yet expressive framework for RL with function approximation.
Existing algorithms are either (1) computationally intractable, or (2) reliant upon restrictive statistical assumptions.
We propose the first provably sample-efficient algorithm for exploration in Low-Rank MDPs.
arXiv Detail & Related papers (2023-07-08T15:41:48Z)
- Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective on graph contrastive learning methods, showing that random augmentations lead to stochastic encoders.
Our proposed method represents each node by a distribution in the latent space in contrast to existing techniques which embed each node to a deterministic vector.
We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-12-15T01:45:32Z)
- Approximate Bayesian Optimisation for Neural Networks [6.921210544516486]
A body of work has been done to automate machine learning algorithms and to highlight the importance of model choice.
Addressing analytical tractability and computational feasibility in an idealistic fashion makes it possible to ensure efficiency and applicability.
arXiv Detail & Related papers (2021-08-27T19:03:32Z)
- Local policy search with Bayesian optimization [73.0364959221845]
Reinforcement learning aims to find an optimal policy by interaction with an environment.
Policy gradients for local search are often obtained from random perturbations.
We develop an algorithm utilizing a probabilistic model of the objective function and its gradient.
arXiv Detail & Related papers (2021-06-22T16:07:02Z)
- Learning while Respecting Privacy and Robustness to Distributional
Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
- A Distributional Analysis of Sampling-Based Reinforcement Learning
Algorithms [67.67377846416106]
We present a distributional approach to theoretical analyses of reinforcement learning algorithms for constant step-sizes.
We show that value-based methods such as TD($\lambda$) and $Q$-Learning have update rules which are contractive in the space of distributions of functions.
arXiv Detail & Related papers (2020-03-27T05:13:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.