Related papers: Sample as You Infer: Predictive Coding With Langevin Dynamics

URL: http://arxiv.org/abs/2311.13664v2
Date: Sun, 4 Feb 2024 22:29:41 GMT
Title: Sample as You Infer: Predictive Coding With Langevin Dynamics
Authors: Umais Zahid, Qinghai Guo, Zafeirios Fountas
Abstract summary: We present a novel algorithm for parameter learning in generic deep generative models. Our approach modifies the standard PC algorithm to bring performance on-par and exceeding that obtained from standard variational auto-encoder training.
Score: 11.515490109360012
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present a novel algorithm for parameter learning in generic deep generative models that builds upon the predictive coding (PC) framework of computational neuroscience. Our approach modifies the standard PC algorithm to bring performance on-par and exceeding that obtained from standard variational auto-encoder (VAE) training. By injecting Gaussian noise into the PC inference procedure we re-envision it as an overdamped Langevin sampling, which facilitates optimisation with respect to a tight evidence lower bound (ELBO). We improve the resultant encoder-free training method by incorporating an encoder network to provide an amortised warm-start to our Langevin sampling and test three different objectives for doing so. Finally, to increase robustness to the sampling step size and reduce sensitivity to curvature, we validate a lightweight and easily computable form of preconditioning, inspired by Riemann Manifold Langevin and adaptive optimizers from the SGD literature. We compare against VAEs by training like-for-like generative models using our technique against those trained with standard reparameterisation-trick-based ELBOs. We observe our method out-performs or matches performance across a number of metrics, including sample quality, while converging in a fraction of the number of SGD training iterations.
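The central idea in the abstract, adding Gaussian noise to gradient-based inference so that it becomes overdamped Langevin sampling, can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it assumes a one-dimensional linear-Gaussian model (z ~ N(0, 1), x | z ~ N(w z, sigma^2)), uses a fixed step size, and omits both the encoder warm-start and the preconditioning. All function names and parameter values are hypothetical and chosen only for illustration. With the noise term removed, the same update is plain gradient ascent on the log joint, which is the sense in which the abstract describes the sampler as a modification of standard PC inference.

```python
import math
import random


def grad_log_joint(z, x, w, sigma):
    """Gradient of log p(x, z) w.r.t. z for z ~ N(0, 1), x | z ~ N(w * z, sigma**2)."""
    # Precision-weighted prediction error between the observation and its prediction.
    prediction_error = (x - w * z) / sigma**2
    # Prior term (-z) plus the prediction error propagated back through w.
    return -z + w * prediction_error


def langevin_samples(x, w=2.0, sigma=1.0, step=0.01,
                     n_steps=200_000, burn_in=5_000, seed=0):
    """Unadjusted overdamped Langevin sampling of p(z | x) (toy settings, assumed)."""
    rng = random.Random(seed)
    z = 0.0  # cold start; the paper instead warm-starts from an encoder
    samples = []
    for t in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        # Gradient step on the log joint plus injected Gaussian noise.
        z = z + step * grad_log_joint(z, x, w, sigma) + math.sqrt(2.0 * step) * noise
        if t >= burn_in:
            samples.append(z)
    return samples


def exact_posterior(x, w=2.0, sigma=1.0):
    """Closed-form mean and variance of p(z | x) for the same toy model."""
    precision = 1.0 + w**2 / sigma**2
    mean = (w * x / sigma**2) / precision
    return mean, 1.0 / precision


samples = langevin_samples(x=1.5)
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)
true_mean, true_var = exact_posterior(x=1.5)
print(f"Langevin: mean={sample_mean:.3f}, var={sample_var:.3f}")
print(f"Exact:    mean={true_mean:.3f}, var={true_var:.3f}")
```

Because the update is not Metropolis-corrected, the sampled variance carries a small bias that grows with the step size. That sensitivity to step size and curvature is what the abstract's preconditioning is meant to reduce.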

Related papers

Repurposing Protein Language Models for Latent Flow-Based Fitness Optimization [24.267946140577806]
CHASE is a framework that repurposes the evolutionary knowledge of pretrained protein language models. It achieves state-of-the-art performance on AAV and GFP protein design benchmarks.
arXiv Detail & Related papers (2026-02-02T18:25:33Z)
Zero-Variance Gradients for Variational Autoencoders [32.818968022327866]
Training deep generative models like Variational Autoencoders (VAEs) is often hindered by the need to backpropagate gradients through sampling of their latent variables. In this paper, we propose a new perspective that sidesteps this problem, which we call Silent Gradients. Instead of improving estimators, we leverage specific decoder architectures analytically to compute the expected ELBO, yielding a gradient with zero variance.
arXiv Detail & Related papers (2025-08-05T15:54:21Z)
CoVAE: Consistency Training of Variational Autoencoders [9.358185536754537]
We propose a novel single-stage generative autoencoding framework that adopts techniques from consistency models to train a VAE architecture. We show that CoVAE can generate high-quality samples in one or few steps without the use of a learned prior. Our approach provides a unified framework for autoencoding and diffusion-style generative modeling and provides a viable route for one-step generative high-performance autoencoding.
arXiv Detail & Related papers (2025-07-12T01:32:08Z)
Inference-Time Scaling of Diffusion Language Models with Particle Gibbs Sampling [70.8832906871441]
We study how to steer generation toward desired rewards without retraining the models. Prior methods typically resample or filter within a single denoising trajectory, optimizing rewards step-by-step without trajectory-level refinement. We introduce particle Gibbs sampling for diffusion language models (PG-DLM), a novel inference-time algorithm enabling trajectory-level refinement while preserving generation perplexity.
arXiv Detail & Related papers (2025-07-11T08:00:47Z)
Neural Conformal Control for Time Series Forecasting [54.96087475179419]
We introduce a neural network conformal prediction method for time series that enhances adaptivity in non-stationary environments. Our approach acts as a neural controller designed to achieve desired target coverage, leveraging auxiliary multi-view data with neural network encoders. We empirically demonstrate significant improvements in coverage and probabilistic accuracy, and find that our method is the only one that combines good calibration with consistency in prediction intervals.
arXiv Detail & Related papers (2024-12-24T03:56:25Z)
Privacy without Noisy Gradients: Slicing Mechanism for Generative Model Training [10.229653770070202]
Training generative models with differential privacy (DP) typically involves injecting noise into gradient updates or adapting the discriminator's training procedure. We consider the slicing privacy mechanism that injects noise into random low-dimensional projections of the private data. We present a kernel-based estimator for this divergence, circumventing the need for adversarial training.
arXiv Detail & Related papers (2024-10-25T19:32:58Z)
Score-based Generative Models with Adaptive Momentum [40.84399531998246]
We propose an adaptive momentum sampling method to accelerate the transforming process. We show that our method can produce more faithful images/graphs in small sampling steps with 2 to 5 times speed up.
arXiv Detail & Related papers (2024-05-22T15:20:27Z)
Edge-Efficient Deep Learning Models for Automatic Modulation Classification: A Performance Analysis [0.7428236410246183]
We investigate optimized convolutional neural networks (CNNs) developed for automatic modulation classification (AMC) of wireless signals. We propose optimized models with the combinations of these techniques to fuse the complementary optimization benefits. The experimental results show that the proposed individual and combined optimization techniques are highly effective for developing models with significantly less complexity.
arXiv Detail & Related papers (2024-04-11T06:08:23Z)
Approximated Prompt Tuning for Vision-Language Pre-trained Models [54.326232586461614]
In vision-language pre-trained models, prompt tuning often requires a large number of learnable tokens to bridge the gap between the pre-training and downstream tasks. We propose a novel Approximated Prompt Tuning (APT) approach towards efficient VL transfer learning.
arXiv Detail & Related papers (2023-06-27T05:43:47Z)
End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures. We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
Conditional Denoising Diffusion for Sequential Recommendation [62.127862728308045]
Of two prominent generative models, Generative Adversarial Networks (GANs) and Variational AutoEncoders (VAEs), GANs suffer from unstable optimization, while VAEs are prone to posterior collapse and over-smoothed generations. We present a conditional denoising diffusion model, which includes a sequence encoder, a cross-attentive denoising decoder, and a step-wise diffuser.
arXiv Detail & Related papers (2023-04-22T15:32:59Z)
Model Selection for Bayesian Autoencoders [25.619565817793422]
We propose to optimize the distributional sliced-Wasserstein distance between the output of the autoencoder and the empirical data distribution. We turn our BAE into a generative model by fitting a flexible Dirichlet mixture model in the latent space. We evaluate our approach qualitatively and quantitatively using a vast experimental campaign on a number of unsupervised learning tasks and show that, in small-data regimes where priors matter, our approach provides state-of-the-art results.
arXiv Detail & Related papers (2021-06-11T08:55:00Z)
A Distributed Optimisation Framework Combining Natural Gradient with Hessian-Free for Discriminative Sequence Training [16.83036203524611]
This paper presents a novel natural gradient and Hessian-free (NGHF) optimisation framework for neural network training. It relies on the linear conjugate gradient (CG) algorithm to combine the natural gradient (NG) method with local curvature information from Hessian-free (HF) or other second-order methods. Experiments are reported on the multi-genre broadcast data set for a range of different acoustic model types.
arXiv Detail & Related papers (2021-03-12T22:18:34Z)
Autoencoding Variational Autoencoder [56.05008520271406]
We study the implications of this behaviour on the learned representations and also the consequences of fixing it by introducing a notion of self consistency. We show that encoders trained with our self-consistency approach lead to representations that are robust (insensitive) to perturbations in the input introduced by adversarial attacks.
arXiv Detail & Related papers (2020-12-07T14:16:14Z)
Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose. We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
Learning to Optimize Non-Rigid Tracking [54.94145312763044]
We employ learnable optimizations to improve robustness and speed up solver convergence. First, we upgrade the tracking objective by integrating an alignment data term on deep features which are learned end-to-end through CNN. Second, we bridge the gap between the preconditioning technique and learning method by introducing a ConditionNet which is trained to generate a preconditioner.
arXiv Detail & Related papers (2020-03-27T04:40:57Z)

This list is automatically generated from the titles and abstracts of the papers on this site.