Conditional Gaussian PAC-Bayes
- URL: http://arxiv.org/abs/2110.11886v1
- Date: Fri, 22 Oct 2021 16:12:03 GMT
- Title: Conditional Gaussian PAC-Bayes
- Authors: Eugenio Clerico, George Deligiannidis, and Arnaud Doucet
- Abstract summary: The present paper proposes a novel training algorithm that optimises the PAC-Bayesian bound, without relying on any surrogate loss.
Empirical results show that the bounds obtained with this approach are tighter than those found in the literature.
- Score: 19.556744028461004
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent studies have empirically investigated different methods to train a
stochastic classifier by optimising a PAC-Bayesian bound via stochastic
gradient descent. Most of these procedures need to replace the
misclassification error with a surrogate loss, leading to a mismatch between
the optimisation objective and the actual generalisation bound. The present
paper proposes a novel training algorithm that optimises the PAC-Bayesian
bound, without relying on any surrogate loss. Empirical results show that the
bounds obtained with this approach are tighter than those found in the
literature.
Related papers
- Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer [52.09480867526656]
We identify the source of misalignment as a form of distributional shift and uncertainty in learning human preferences.
To mitigate overoptimization, we first propose a theoretical algorithm that chooses the best policy for an adversarially chosen reward model.
Using the equivalence between reward models and the corresponding optimal policy, the algorithm features a simple objective that combines a preference optimization loss and a supervised learning loss.
arXiv Detail & Related papers (2024-05-26T05:38:50Z) - Improving Generalization of Complex Models under Unbounded Loss Using PAC-Bayes Bounds [10.94126149188336]
PAC-Bayes learning theory has focused extensively on establishing tight upper bounds for test errors.
A recently proposed training procedure called PAC-Bayes training, updates the model toward minimizing these bounds.
This approach is theoretically sound, in practice, it has not achieved a test error as low as those obtained by empirical risk minimization (ERM)
We introduce a new PAC-Bayes training algorithm with improved performance and reduced reliance on prior tuning.
arXiv Detail & Related papers (2023-05-30T17:31:25Z) - Limitations of Information-Theoretic Generalization Bounds for Gradient
Descent Methods in Stochastic Convex Optimization [48.12845778927164]
We consider the prospect of establishing minimax rates for gradient descent in the setting of convex optimization.
We consider a common tactic employed in studying gradient methods, whereby the final iterate is corrupted by Gaussian noise, producing a noisy "surrogate" algorithm.
Our results suggest that new ideas are required to analyze gradient descent using information-theoretic techniques.
arXiv Detail & Related papers (2022-12-27T17:16:48Z) - Constrained Reinforcement Learning via Dissipative Saddle Flow Dynamics [5.270497591225775]
In constrained reinforcement learning (C-RL), an agent seeks to learn from the environment a policy that maximizes the expected cumulative reward.
Several algorithms rooted in sampled-based primal-dual methods have been recently proposed to solve this problem in policy space.
We propose a novel algorithm for constrained RL that does not suffer from these limitations.
arXiv Detail & Related papers (2022-12-03T01:54:55Z) - Improving Robust Generalization by Direct PAC-Bayesian Bound
Minimization [27.31806334022094]
Recent research has shown an overfitting-like phenomenon in which models trained against adversarial attacks exhibit higher robustness on the training set compared to the test set.
In this paper we consider a different form of the robust PAC-Bayesian bound and directly minimize it with respect to the model posterior.
We evaluate our TrH regularization approach over CIFAR-10/100 and ImageNet using Vision Transformers (ViT) and compare against baseline adversarial robustness algorithms.
arXiv Detail & Related papers (2022-11-22T23:12:00Z) - Faster One-Sample Stochastic Conditional Gradient Method for Composite
Convex Minimization [61.26619639722804]
We propose a conditional gradient method (CGM) for minimizing convex finite-sum objectives formed as a sum of smooth and non-smooth terms.
The proposed method, equipped with an average gradient (SAG) estimator, requires only one sample per iteration. Nevertheless, it guarantees fast convergence rates on par with more sophisticated variance reduction techniques.
arXiv Detail & Related papers (2022-02-26T19:10:48Z) - Differentiable Annealed Importance Sampling and the Perils of Gradient
Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z) - Re-Assessing the "Classify and Count" Quantification Method [88.60021378715636]
"Classify and Count" (CC) is often a biased estimator.
Previous works have failed to use properly optimised versions of CC.
We argue that, while still inferior to some cutting-edge methods, they deliver near-state-of-the-art accuracy.
arXiv Detail & Related papers (2020-11-04T21:47:39Z) - Variance-Reduced Off-Policy Memory-Efficient Policy Search [61.23789485979057]
Off-policy policy optimization is a challenging problem in reinforcement learning.
Off-policy algorithms are memory-efficient and capable of learning from off-policy samples.
arXiv Detail & Related papers (2020-09-14T16:22:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.