Bridging Predictive Coding and MDL: A Two-Part Code Framework for Deep Learning
- URL: http://arxiv.org/abs/2505.14635v2
- Date: Wed, 16 Jul 2025 16:00:09 GMT
- Title: Bridging Predictive Coding and MDL: A Two-Part Code Framework for Deep Learning
- Authors: Benjamin Prada, Shion Matsumoto, Abdul Malik Zekri, Ankur Mali
- Abstract summary: We prove that layerwise PC performs block-coordinate descent on the minimum description length objective. We also prove that each PC sweep monotonically decreases the empirical two-part codelength, yielding tighter high-probability risk bounds. This is the first result offering formal generalization and convergence guarantees for PC-trained deep models.
- Score: 1.749935196721634
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present the first theoretical framework that connects predictive coding (PC), a biologically inspired local learning rule, with the minimum description length (MDL) principle in deep networks. We prove that layerwise PC performs block-coordinate descent on the MDL two-part code objective, thereby jointly minimizing empirical risk and model complexity. Using Hoeffding's inequality and a prefix-code prior, we derive a novel generalization bound of the form $R(\theta) \le \hat{R}(\theta) + \frac{L(\theta)}{N}$, capturing the tradeoff between fit and compression. We further prove that each PC sweep monotonically decreases the empirical two-part codelength, yielding tighter high-probability risk bounds than unconstrained gradient descent. Finally, we show that repeated PC updates converge to a block-coordinate stationary point, providing an approximate MDL-optimal solution. To our knowledge, this is the first result offering formal generalization and convergence guarantees for PC-trained deep models, positioning PC as a theoretically grounded and biologically plausible alternative to backpropagation.
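The fit-versus-compression tradeoff in the abstract can be illustrated with the textbook Occam's-razor argument, which also combines Hoeffding's inequality with a prefix-code prior. For a loss bounded in $[0, 1]$, applying Hoeffding's inequality to each hypothesis with failure probability $\delta \cdot 2^{-L(\theta)}$ and taking a union bound (valid by the Kraft inequality) gives, with probability at least $1 - \delta$, $R(\theta) \le \hat{R}(\theta) + \sqrt{\frac{L(\theta)\ln 2 + \ln(1/\delta)}{2N}}$. The sketch below evaluates this standard bound only; it is not the paper's exact statement, which the abstract quotes in the form $\hat{R}(\theta) + \frac{L(\theta)}{N}$. The function name `occam_bound` and all numbers are hypothetical.

```python
import math


def occam_bound(emp_risk, code_length_bits, n_samples, delta=0.05):
    """Textbook Occam bound for a loss in [0, 1] (illustrative assumption,
    not the bound proved in the paper).

    emp_risk         -- empirical risk R_hat(theta) on n_samples examples
    code_length_bits -- prefix-code length L(theta) of the model, in bits
    n_samples        -- number of i.i.d. training examples N
    delta            -- allowed failure probability
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    if code_length_bits < 0:
        raise ValueError("code_length_bits must be non-negative")
    # Complexity term: L(theta) * ln 2 converts bits to nats.
    complexity = code_length_bits * math.log(2) + math.log(1.0 / delta)
    slack = math.sqrt(complexity / (2.0 * n_samples))
    return emp_risk + slack


if __name__ == "__main__":
    # Hypothetical numbers: same empirical risk, increasingly long codes.
    for bits in (100, 1_000, 10_000):
        print(bits, round(occam_bound(0.10, bits, 50_000), 4))
```

At a fixed empirical risk, a shorter model code gives a smaller upper bound, which is the sense in which lowering the two-part codelength tightens the risk guarantee.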
Related papers
- Regularized Online RLHF with Generalized Bilinear Preferences [68.44113000390544]
We consider the problem of contextual online RLHF with general preferences. We adopt the Generalized Bilinear Preference Model to capture preferences via low-rank, skew-symmetric matrices. We prove that the dual gap of the greedy policy is bounded by the square of the estimation error.
arXiv Detail & Related papers (2026-02-26T15:27:53Z)
- Provable FDR Control for Deep Feature Selection: Deep MLPs and Beyond [0.0]
We develop a flexible feature selection framework based on deep neural networks that approximately controls the false discovery rate (FDR), a measure of Type-I error. We show that each coordinate of gradient-based feature vector admits a marginal normal approximation, thereby supporting the validity of FDR control.
arXiv Detail & Related papers (2025-12-04T11:46:06Z)
- The Lossy Horizon: Error-Bounded Predictive Coding for Lossy Text Compression (Episode I) [6.453417258264177]
This paper introduces Error-Bounded Predictive Coding (EPC), a lossy text compression method that leverages a Masked Language Model (MLM) as a decompressor. Instead of storing a subset of original tokens, EPC allows the model to predict masked content and stores minimal, rank-based corrections only when the model's top prediction is incorrect. We demonstrate that EPC consistently dominates Predictive Masking, offering superior fidelity at a significantly lower bit rate by more efficiently utilising the model's intrinsic knowledge.
arXiv Detail & Related papers (2025-10-25T08:18:31Z)
- MaskPro: Linear-Space Probabilistic Learning for Strict (N:M)-Sparsity on Large Language Models [53.36415620647177]
Semi-structured sparsity offers a promising solution by strategically retaining $N$ elements out of every $M$ weights. Existing (N:M)-compatible approaches typically fall into two categories: rule-based layerwise greedy search, which suffers from considerable errors, and gradient-driven learning, which incurs prohibitive training costs. We propose a novel linear-space probabilistic framework named MaskPro, which aims to learn a prior categorical distribution for every $M$ consecutive weights and subsequently leverages this distribution to generate the (N:M)-sparsity throughout an $N$-way sampling
arXiv Detail & Related papers (2025-06-15T15:02:59Z)
- Local Loss Optimization in the Infinite Width: Stable Parameterization of Predictive Coding Networks and Target Propagation [8.35644084613785]
We introduce the maximal update parameterization ($\mu$P) in the infinite-width limit for two representative designs of local targets.
By analyzing deep linear networks, we found that PC's gradients interpolate between first-order and Gauss-Newton-like gradients.
We demonstrate that, in specific standard settings, PC in the infinite-width limit behaves more similarly to the first-order gradient.
arXiv Detail & Related papers (2024-11-04T11:38:27Z)
- Provably Efficient Information-Directed Sampling Algorithms for Multi-Agent Reinforcement Learning [50.92957910121088]
This work designs and analyzes a novel set of algorithms for multi-agent reinforcement learning (MARL) based on the principle of information-directed sampling (IDS).
For episodic two-player zero-sum MGs, we present three sample-efficient algorithms for learning Nash equilibrium.
We extend Reg-MAIDS to multi-player general-sum MGs and prove that it can learn either the Nash equilibrium or coarse correlated equilibrium in a sample efficient manner.
arXiv Detail & Related papers (2024-04-30T06:48:56Z)
- Is Inverse Reinforcement Learning Harder than Standard Reinforcement Learning? A Theoretical Perspective [55.36819597141271]
Inverse Reinforcement Learning (IRL), the problem of learning reward functions from demonstrations of an expert policy, plays a critical role in developing intelligent systems.
This paper provides the first line of efficient IRL in vanilla offline and online settings using samples and runtime.
As an application, we show that the learned rewards can transfer to another target MDP with suitable guarantees.
arXiv Detail & Related papers (2023-11-29T00:09:01Z)
- Provably Efficient CVaR RL in Low-rank MDPs [58.58570425202862]
We study risk-sensitive Reinforcement Learning (RL).
We propose a novel Upper Confidence Bound (UCB) bonus-driven algorithm to balance interplay between exploration, exploitation, and representation learning in CVaR RL.
We prove that our algorithm achieves a sample complexity of $\epsilon$-optimal CVaR, where $H$ is the length of each episode, $A$ is the capacity of action space, and $d$ is the dimension of representations.
arXiv Detail & Related papers (2023-11-20T17:44:40Z)
- Fixing the NTK: From Neural Network Linearizations to Exact Convex Programs [63.768739279562105]
We show that for a particular choice of mask weights that do not depend on the learning targets, this kernel is equivalent to the NTK of the gated ReLU network on the training data.
A consequence of this lack of dependence on the targets is that the NTK cannot perform better than the optimal MKL kernel on the training set.
arXiv Detail & Related papers (2023-09-26T17:42:52Z)
- Improved Sample Complexity for Reward-free Reinforcement Learning under Low-rank MDPs [43.53286390357673]
This paper focuses on reward-free reinforcement learning under low-rank MDP models.
We first provide the first known sample complexity lower bound for any algorithm under low-rank MDPs.
We then propose a novel model-based algorithm, coined RAFFLE, and show it can both find an $\epsilon$-optimal policy and achieve an $\epsilon$-accurate system identification.
arXiv Detail & Related papers (2023-03-20T04:39:39Z)
- Curvature-Sensitive Predictive Coding with Approximate Laplace Monte Carlo [1.1470070927586016]
Predictive coding (PC) accounts of perception now form one of the dominant computational theories of the brain.
Despite this, they have enjoyed little export to the broader field of machine learning.
This has been due to the poor performance of models trained with PC when evaluated by both sample quality and marginal likelihood.
arXiv Detail & Related papers (2023-03-09T01:29:58Z)
- A Unified Algebraic Perspective on Lipschitz Neural Networks [88.14073994459586]
This paper introduces a novel perspective unifying various types of 1-Lipschitz neural networks.
We show that many existing techniques can be derived and generalized via finding analytical solutions of a common semidefinite programming (SDP) condition.
Our approach, called SDP-based Lipschitz Layers (SLL), allows us to design non-trivial yet efficient generalization of convex potential layers.
arXiv Detail & Related papers (2023-03-06T14:31:09Z)
- Near Instance-Optimal PAC Reinforcement Learning for Deterministic MDPs [24.256960622176305]
We propose the first (nearly) matching upper and lower bounds on the sample complexity of PAC RL in episodic Markov decision processes.
Our bounds feature a new notion of sub-optimality gap for state-action pairs that we call the deterministic return gap.
Their design and analyses employ novel ideas, including graph-theoretical concepts such as minimum flows and maximum cuts.
arXiv Detail & Related papers (2022-03-17T11:19:41Z)
- MDPGT: Momentum-based Decentralized Policy Gradient Tracking [29.22173174168708]
We propose a momentum-based decentralized policy gradient tracking (MDPGT) for multi-agent reinforcement learning.
MDPGT achieves the best available sample complexity of $\mathcal{O}(N^{-1}\epsilon^{-3})$ for converging to an $\epsilon$-stationary point of the global average of $N$ local performance functions.
This outperforms the state-of-the-art sample complexity in decentralized model-free reinforcement learning.
arXiv Detail & Related papers (2021-12-06T06:55:51Z)
- MMCGAN: Generative Adversarial Network with Explicit Manifold Prior [78.58159882218378]
We propose to employ explicit manifold learning as prior to alleviate mode collapse and stabilize training of GAN.
Our experiments on both the toy data and real datasets show the effectiveness of MMCGAN in alleviating mode collapse, stabilizing training, and improving the quality of generated samples.
arXiv Detail & Related papers (2020-06-18T07:38:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.