Higher-Order Generalization Bounds: Learning Deep Probabilistic Programs
via PAC-Bayes Objectives
- URL: http://arxiv.org/abs/2203.15972v1
- Date: Wed, 30 Mar 2022 01:14:56 GMT
- Title: Higher-Order Generalization Bounds: Learning Deep Probabilistic Programs
via PAC-Bayes Objectives
- Authors: Jonathan Warrell, Mark Gerstein
- Abstract summary: We offer a framework for representing and learning flexible PAC-Bayes generalization bounds as stochastic programs using Deep Probabilistic Programming (DPP) methods.
In particular, we show that DPP techniques may be leveraged to derive generalization bounds that draw on the compositionality of DPP representations.
In turn, the bounds we introduce offer principled training objectives for higher-order probabilistic programs.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Probabilistic Programming (DPP) allows powerful models based on
recursive computation to be learned using efficient deep-learning optimization
techniques. Additionally, DPP offers a unified perspective, where inference and
learning algorithms are treated on a par with models as stochastic programs.
Here, we offer a framework for representing and learning flexible PAC-Bayes
bounds as stochastic programs using DPP-based methods. In particular, we show
that DPP techniques may be leveraged to derive generalization bounds that draw
on the compositionality of DPP representations. In turn, the bounds we
introduce offer principled training objectives for higher-order probabilistic
programs. We offer a definition of a higher-order generalization bound, which
naturally encompasses single- and multi-task generalization perspectives
(including transfer- and meta-learning) and a novel class of bounds based on a
learned measure of model complexity. Further, we show how modified forms of all
higher-order bounds can be efficiently optimized as objectives for DPP
training, using variational techniques. We test our framework using single- and
multi-task generalization settings on synthetic and biological data, showing
improved performance and generalization prediction using flexible DPP model
representations and learned complexity measures.
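The following is a minimal, self-contained sketch (not the authors' DPP framework) of the abstract's core idea: using a PAC-Bayes bound directly as a differentiable training objective for a stochastic model. It optimizes a McAllester-style bound over a factorized Gaussian posterior for a linear classifier in plain PyTorch; the synthetic data, sigmoid surrogate loss, and all hyperparameters are illustrative assumptions.

```python
# Hedged sketch: a McAllester-style PAC-Bayes bound used as a training objective
# for a stochastic linear classifier. Posterior Q = N(mu, softplus(rho)^2),
# prior P = N(0, I). Data, surrogate loss, and hyperparameters are illustrative.
import math
import torch

torch.manual_seed(0)

# Synthetic binary classification data (placeholder for the paper's datasets).
n, d = 500, 10
X = torch.randn(n, d)
w_true = torch.randn(d)
y = (X @ w_true > 0).float() * 2 - 1           # labels in {-1, +1}

# Variational posterior parameters over the weight vector.
mu = torch.zeros(d, requires_grad=True)
rho = torch.full((d,), -3.0, requires_grad=True)    # sigma = softplus(rho)
delta = 0.05                                        # bound confidence level

def kl_gaussian(mu, sigma):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), closed form."""
    return 0.5 * torch.sum(sigma**2 + mu**2 - 1.0 - 2.0 * torch.log(sigma))

def pac_bayes_objective(mu, rho, n_samples=5):
    sigma = torch.nn.functional.softplus(rho)
    # Monte Carlo estimate of the expected (surrogate) empirical risk under Q,
    # via the reparameterization trick so gradients flow into mu and rho.
    risk = 0.0
    for _ in range(n_samples):
        w = mu + sigma * torch.randn(d)
        margins = y * (X @ w)
        risk = risk + torch.sigmoid(-margins).mean()   # smooth 0-1 surrogate
    risk = risk / n_samples
    kl = kl_gaussian(mu, sigma)
    # McAllester-style complexity term: with probability at least 1 - delta the
    # expected true risk is bounded by the empirical risk plus this term.
    complexity = torch.sqrt((kl + math.log(2 * math.sqrt(n) / delta)) / (2 * n))
    return risk + complexity

opt = torch.optim.Adam([mu, rho], lr=0.05)
for step in range(300):
    opt.zero_grad()
    bound = pac_bayes_objective(mu, rho)
    bound.backward()
    opt.step()

print(f"final PAC-Bayes objective (bound value): {bound.item():.3f}")
```

Minimizing the bound trades off the empirical risk against the KL complexity term; the paper's higher-order bounds extend this same trade-off to multi-task settings and to learned measures of model complexity.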
Related papers
- Machine Learning Optimized Orthogonal Basis Piecewise Polynomial Approximation [0.9208007322096533]
Piecewise Polynomials (PPs) are utilized in several engineering disciplines, like trajectory planning, to approximate position profiles given in the form of a set of points.
arXiv Detail & Related papers (2024-03-13T14:34:34Z) - Scalable PAC-Bayesian Meta-Learning via the PAC-Optimal Hyper-Posterior:
From Theory to Practice [54.03076395748459]
A central question in the meta-learning literature is how to regularize to ensure generalization to unseen tasks.
We study a PAC-Bayesian generalization bound for meta-learning, first derived by Rothfuss et al.
We provide a theoretical analysis and an empirical case study of the conditions under which, and the extent to which, these meta-learning guarantees improve upon PAC-Bayesian per-task learning bounds.
arXiv Detail & Related papers (2022-11-14T08:51:04Z) - PAC-Bayesian Learning of Optimization Algorithms [6.624726878647541]
We apply the PAC-Bayes theory to the setting of learning-to-optimize.
We learn optimization algorithms with provable generalization guarantees (PAC-bounds) and explicit trade-off between a high probability of convergence and a high convergence speed.
Our results rely on PAC-Bayes bounds for general, unbounded loss-functions based on exponential families.
arXiv Detail & Related papers (2022-10-20T09:16:36Z) - A General Framework for Sample-Efficient Function Approximation in
Reinforcement Learning [132.45959478064736]
We propose a general framework that unifies model-based and model-free reinforcement learning.
We propose a novel estimation function with decomposable structural properties for optimization-based exploration.
Under our framework, a new sample-efficient algorithm, OPtimization-based ExploRation with Approximation (OPERA), is proposed.
arXiv Detail & Related papers (2022-09-30T17:59:16Z) - PAC Reinforcement Learning for Predictive State Representations [60.00237613646686]
We study online Reinforcement Learning (RL) in partially observable dynamical systems.
We focus on the Predictive State Representations (PSRs) model, which is an expressive model that captures other well-known models.
We develop a novel model-based algorithm for PSRs that can learn a near-optimal policy with sample complexity scaling polynomially in the relevant system parameters.
arXiv Detail & Related papers (2022-07-12T17:57:17Z) - Towards Learning Universal Hyperparameter Optimizers with Transformers [57.35920571605559]
We introduce the OptFormer, the first text-based Transformer hyperparameter optimization (HPO) framework that provides a universal end-to-end interface for jointly learning policy and function prediction.
Our experiments demonstrate that the OptFormer can imitate at least 7 different HPO algorithms, which can be further improved via its function uncertainty estimates.
arXiv Detail & Related papers (2022-05-26T12:51:32Z) - Sequential Information Design: Markov Persuasion Process and Its
Efficient Reinforcement Learning [156.5667417159582]
This paper proposes a novel model of sequential information design, namely Markov persuasion processes (MPPs).
Planning in MPPs faces the unique challenge of finding a signaling policy that is simultaneously persuasive to the myopic receivers and induces the optimal long-term cumulative utility for the sender.
We design a provably efficient no-regret learning algorithm, the Optimism-Pessimism Principle for Persuasion Process (OP4), which features a novel combination of both optimism and pessimism principles.
arXiv Detail & Related papers (2022-02-22T05:41:43Z) - Deep Probabilistic Graphical Modeling [2.2691593216516863]
This thesis develops deep probabilistic graphical modeling (DPGM).
DPGM leverages deep learning (DL) to make probabilistic graphical models (PGMs) more flexible.
One model class we develop extends exponential family PCA using neural networks to improve predictive performance.
arXiv Detail & Related papers (2021-04-25T03:48:02Z) - Wasserstein Learning of Determinantal Point Processes [14.790452282691252]
We present a novel approach for learning determinantal point processes (DPPs) that minimizes the Wasserstein distance between the model and data composed of observed subsets.
We show that our Wasserstein learning approach provides significantly improved predictive performance on a generative task compared to DPPs trained using maximum likelihood estimation (MLE).
arXiv Detail & Related papers (2020-11-19T08:30:57Z) - Can We Learn Heuristics For Graphical Model Inference Using
Reinforcement Learning? [114.24881214319048]
We show that we can learn programs, i.e., policies, for solving inference in higher order Conditional Random Fields (CRFs) using reinforcement learning.
Our method solves inference tasks efficiently without imposing any constraints on the form of the potentials.
arXiv Detail & Related papers (2020-04-27T19:24:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.