Related papers: Scalable Approximate Inference and Some Applications

Scalable Approximate Inference and Some Applications

URL: http://arxiv.org/abs/2003.03515v1
Date: Sat, 7 Mar 2020 04:33:27 GMT
Title: Scalable Approximate Inference and Some Applications
Authors: Jun Han
Abstract summary: In this thesis, we propose a new framework for approximate inference. Our proposed four algorithms are motivated by the recent computational progress of Stein's method. Results on simulated and real datasets indicate the statistical efficiency and wide applicability of our algorithm.
Score: 2.6541211006790983
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Approximate inference in probability models is a fundamental task in machine learning. Approximate inference provides powerful tools to Bayesian reasoning, decision making, and Bayesian deep learning. The main goal is to estimate the expectation of interested functions w.r.t. a target distribution. When it comes to high dimensional probability models and large datasets, efficient approximate inference becomes critically important. In this thesis, we propose a new framework for approximate inference, which combines the advantages of these three frameworks and overcomes their limitations. Our proposed four algorithms are motivated by the recent computational progress of Stein's method. Our proposed algorithms are applied to continuous and discrete distributions under the setting when the gradient information of the target distribution is available or unavailable. Theoretical analysis is provided to prove the convergence of our proposed algorithms. Our adaptive IS algorithm iteratively improves the importance proposal by functionally decreasing the KL divergence between the updated proposal and the target. When the gradient of the target is unavailable, our proposed sampling algorithm leverages the gradient of a surrogate model and corrects induced bias with importance weights, which significantly outperforms other gradient-free sampling algorithms. In addition, our theoretical results enable us to perform the goodness-of-fit test on discrete distributions. At the end of the thesis, we propose an importance-weighted method to efficiently aggregate local models in distributed learning with one-shot communication. Results on simulated and real datasets indicate the statistical efficiency and wide applicability of our algorithm.

Related papers

Fisher Random Walk: Automatic Debiasing Contextual Preference Inference for Large Language Model Evaluation [46.643610591694376]
We develop a semiparametric efficient estimator that automates the debiased estimation.<n>We show that the efficiency is achieved when the weights are derived from a novel strategy called Fisher random walk.
arXiv Detail & Related papers (2025-09-06T22:29:17Z)
Approximate Bayesian Inference via Bitstring Representations [15.754246455963907]
We propose performing probabilistic inference in the quantized, discrete parameter space created by representations.<n>This work advances scalable, interpretable machine learning by utilizing discrete approximations for probabilistic computations.
arXiv Detail & Related papers (2025-08-19T08:08:17Z)
On Discriminative Probabilistic Modeling for Self-Supervised Representation Learning [85.75164588939185]
We study the discriminative probabilistic modeling on a continuous domain for the data prediction task of (multimodal) self-supervised representation learning. We conduct generalization error analysis to reveal the limitation of current InfoNCE-based contrastive loss for self-supervised representation learning. We propose a novel non-parametric method for approximating the sum of conditional probability densities required by MIS.
arXiv Detail & Related papers (2024-10-11T18:02:46Z)
Inference for an Algorithmic Fairness-Accuracy Frontier [0.9147443443422864]
We provide a consistent estimator for a theoretical fairness-accuracy frontier put forward by Liang, Lu and Mu (2023) We propose inference methods to test hypotheses that have received much attention in the fairness literature. We show that the estimated support function converges to a tight process as the sample size increases.
arXiv Detail & Related papers (2024-02-14T00:56:09Z)
Adaptive importance sampling for heavy-tailed distributions via $\alpha$-divergence minimization [2.879807093604632]
We propose an AIS algorithm that approximates the target by Student-t proposal distributions. We adapt location and scale parameters by matching the escort moments of the target and the proposal. These updates minimize the $alpha$-divergence between the target and the proposal, thereby connecting with variational inference.
arXiv Detail & Related papers (2023-10-25T14:07:08Z)
Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels [57.46832672991433]
We propose a novel equation discovery method based on Kernel learning and BAyesian Spike-and-Slab priors (KBASS) We use kernel regression to estimate the target function, which is flexible, expressive, and more robust to data sparsity and noises. We develop an expectation-propagation expectation-maximization algorithm for efficient posterior inference and function estimation.
arXiv Detail & Related papers (2023-10-09T03:55:09Z)
Distributionally Robust Machine Learning with Multi-source Data [6.383451076043423]
We introduce a group distributionally robust prediction model to optimize an adversarial reward about explained variance with respect to a class of target distributions. Compared to classical empirical risk minimization, the proposed robust prediction model improves the prediction accuracy for target populations with distribution shifts. We demonstrate the performance of our proposed group distributionally robust method on simulated and real data with random forests and neural networks as base-learning algorithms.
arXiv Detail & Related papers (2023-09-05T13:19:40Z)
Efficient Model-Free Exploration in Low-Rank MDPs [76.87340323826945]
Low-Rank Markov Decision Processes offer a simple, yet expressive framework for RL with function approximation. Existing algorithms are either (1) computationally intractable, or (2) reliant upon restrictive statistical assumptions. We propose the first provably sample-efficient algorithm for exploration in Low-Rank MDPs.
arXiv Detail & Related papers (2023-07-08T15:41:48Z)
Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective of graph contrastive learning methods showing random augmentations leads to encoders. Our proposed method represents each node by a distribution in the latent space in contrast to existing techniques which embed each node to a deterministic vector. We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-12-15T01:45:32Z)
Approximate Bayesian Optimisation for Neural Networks [6.921210544516486]
A body of work has been done to automate machine learning algorithm to highlight the importance of model choice. The necessity to solve the analytical tractability and the computational feasibility in a idealistic fashion enables to ensure the efficiency and the applicability.
arXiv Detail & Related papers (2021-08-27T19:03:32Z)
Local policy search with Bayesian optimization [73.0364959221845]
Reinforcement learning aims to find an optimal policy by interaction with an environment. Policy gradients for local search are often obtained from random perturbations. We develop an algorithm utilizing a probabilistic model of the objective function and its gradient.
arXiv Detail & Related papers (2021-06-22T16:07:02Z)
Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model. The objective is to endow the trained model with robustness against adversarially manipulated input data. Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms [67.67377846416106]
We present a distributional approach to theoretical analyses of reinforcement learning algorithms for constant step-sizes. We show that value-based methods such as TD($lambda$) and $Q$-Learning have update rules which are contractive in the space of distributions of functions.
arXiv Detail & Related papers (2020-03-27T05:13:29Z)

This list is automatically generated from the titles and abstracts of the papers in this site.