Constraint Breeds Generalization: Temporal Dynamics as an Inductive Bias
- URL: http://arxiv.org/abs/2512.23916v1
- Date: Tue, 30 Dec 2025 00:34:24 GMT
- Title: Constraint Breeds Generalization: Temporal Dynamics as an Inductive Bias
- Authors: Xia Chen
- Abstract summary: We show that constraints shape dynamics to function not as limitations, but as a temporal inductive bias that breeds generalization. We show that robust AI development requires not only scaling and removing limitations, but also computationally mastering the temporal characteristics that naturally promote generalization.
- Score: 1.219017431258669
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conventional deep learning prioritizes unconstrained optimization, yet biological systems operate under strict metabolic constraints. We propose that these physical constraints shape dynamics to function not as limitations, but as a temporal inductive bias that breeds generalization. Through a phase-space analysis of signal propagation, we reveal a fundamental asymmetry: expansive dynamics amplify noise, whereas proper dissipative dynamics compress phase space in a way that aligns with the network's spectral bias, compelling the abstraction of invariant features. This condition can be imposed externally via input encoding, or intrinsically through the network's own temporal dynamics. Both pathways require architectures capable of temporal integration and proper constraints to decode induced invariants, whereas static architectures fail to capitalize on temporal structure. Through comprehensive evaluations across supervised classification, unsupervised reconstruction, and zero-shot reinforcement learning, we demonstrate that a critical "transition" regime maximizes generalization capability. These findings establish dynamical constraints as a distinct class of inductive bias, suggesting that robust AI development requires not only scaling and removing limitations, but also computationally mastering the temporal characteristics that naturally promote generalization.
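The asymmetry described here can be illustrated in a few lines. A minimal numpy sketch (not the paper's code; the tanh map and the `gain` knob are illustrative stand-ins): two trajectories started a tiny perturbation apart merge under contracting (dissipative) dynamics and separate under expansive ones.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64
W = rng.normal(0.0, 1.0 / np.sqrt(N), (N, N))  # random recurrent weights

def perturbation_gap(gain, steps=200, eps=1e-6):
    """Distance between two trajectories started eps apart.

    Update: x_{t+1} = tanh(gain * W @ x_t + u_t), with the input u_t shared.
    gain < 1 is contracting (dissipative); gain > 1 is expansive/chaotic.
    """
    x = rng.normal(size=N)
    y = x + eps * rng.normal(size=N)
    for _ in range(steps):
        u = rng.normal(size=N)          # shared input drive
        x = np.tanh(gain * W @ x + u)
        y = np.tanh(gain * W @ y + u)
    return np.linalg.norm(x - y)

for gain in (0.5, 1.0, 2.0):
    print(f"gain={gain}: gap after 200 steps = {perturbation_gap(gain):.2e}")
```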
Related papers
- KoopGen: Koopman Generator Networks for Representing and Predicting Dynamical Systems with Continuous Spectra [65.11254608352982]
We introduce a generator-based neural Koopman framework that models dynamics through a structured, state-dependent representation of Koopman generators. By exploiting the intrinsic Cartesian decomposition into skew-adjoint and self-adjoint components, KoopGen separates conservative transport from irreversible dissipation.
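The Cartesian decomposition invoked here is ordinary linear algebra and easy to make concrete. A hypothetical sketch (the 4x4 generator is a random stand-in, not a trained KoopGen model):

```python
import numpy as np

rng = np.random.default_rng(1)
G = rng.normal(size=(4, 4))   # a stand-in Koopman generator matrix

A = 0.5 * (G - G.T)   # skew-adjoint part: conservative rotation/transport
S = 0.5 * (G + G.T)   # self-adjoint part: irreversible dissipation/growth

assert np.allclose(G, A + S)
# Eigenvalues of A are purely imaginary (energy-preserving modes);
# eigenvalues of S are real (decay or growth rates).
print("skew eigenvalues:", np.linalg.eigvals(A).round(3))
print("symmetric eigenvalues:", np.linalg.eigvalsh(S).round(3))
```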
arXiv Detail & Related papers (2026-02-15T06:32:23Z)
- Smooth embeddings in contracting recurrent networks driven by regular dynamics: A synthesis for neural representation [45.88028371034407]
Recent empirical work has documented topology-preserving latent organization in trained recurrent models. Recent theoretical results in reservoir computing establish conditions under which the synchronization map is an embedding. Our contribution is an integrated framework that assembles generalized synchronization and embedding guarantees for contracting reservoirs.
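Generalized synchronization in a contracting reservoir can be checked directly: under the echo state property, the state becomes a function of the input history alone, so two copies started from different initial conditions converge when driven by the same regular input. A sketch under those standard assumptions (reservoir size and spectral radius are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 100
W = rng.normal(size=(N, N))
W *= 0.8 / np.abs(np.linalg.eigvals(W)).max()   # spectral radius 0.8: contracting
w_in = rng.normal(size=N)

def drive(x, inputs):
    """Run the reservoir x_{t+1} = tanh(W x_t + w_in u_t) over an input stream."""
    for u in inputs:
        x = np.tanh(W @ x + w_in * u)
    return x

inputs = np.sin(0.1 * np.arange(500))           # a regular (quasi-periodic) drive
xa = drive(rng.normal(size=N), inputs)
xb = drive(rng.normal(size=N), inputs)          # different initial condition
print("state gap after washout:", np.linalg.norm(xa - xb))  # ~0: echo state
```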
arXiv Detail & Related papers (2026-01-26T23:10:39Z)
- Random-Matrix-Induced Simplicity Bias in Over-parameterized Variational Quantum Circuits [72.0643009153473]
We show that expressive variational ansätze enter a Haar-like universality class in which both observable expectation values and parameter gradients concentrate exponentially with system size. As a consequence, the hypothesis class induced by such circuits collapses with high probability to a narrow family of near-constant functions. We further show that this collapse is not unavoidable: tensor-structured VQCs, including tensor-network-based and tensor-hypernetwork parameterizations, lie outside the Haar-like universality class.
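The concentration claim has a textbook counterpart that is cheap to verify numerically: for a Haar-random n-qubit state, the variance of a traceless Pauli expectation is 1/(2^n + 1), i.e. exponentially small in system size. A sketch that samples states directly rather than simulating any particular ansatz:

```python
import numpy as np

rng = np.random.default_rng(3)

def haar_state(n):
    """Sample a Haar-random pure state on n qubits (normalized complex Gaussian)."""
    d = 2 ** n
    psi = rng.normal(size=d) + 1j * rng.normal(size=d)
    return psi / np.linalg.norm(psi)

def z0_expectation(psi, n):
    """<Z on qubit 0>: +|c|^2 for basis states whose leading bit is 0, else -|c|^2."""
    signs = np.where(np.arange(2 ** n) < 2 ** (n - 1), 1.0, -1.0)
    return float(signs @ np.abs(psi) ** 2)

for n in (2, 4, 6, 8, 10):
    vals = [z0_expectation(haar_state(n), n) for _ in range(2000)]
    print(f"n={n:2d}: Var[<Z_0>] = {np.var(vals):.2e}"
          f"  (theory 1/(2^n+1) = {1 / (2 ** n + 1):.2e})")
```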
arXiv Detail & Related papers (2026-01-05T08:04:33Z)
- A Mechanistic Analysis of Transformers for Dynamical Systems [4.590170084532207]
We study the representational capabilities and limitations of single-layer Transformers when applied to dynamical data. For linear systems, we show that the convexity constraint imposed by softmax attention fundamentally restricts the class of dynamics that can be represented. For nonlinear systems under partial observability, attention instead acts as an adaptive delay-embedding mechanism.
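The convexity restriction is concrete: a softmax-attention readout is a convex combination of its value vectors, so its prediction can never leave their convex hull, which rules out representing expanding linear dynamics. A small numpy illustration (the values stand in for past states in the context):

```python
import numpy as np

rng = np.random.default_rng(4)
T, d = 16, 3
V = rng.normal(size=(T, d))       # values = past states in the context
w = np.exp(rng.normal(size=T))
w /= w.sum()                      # softmax scores -> convex weights

out = w @ V
# A convex combination stays inside the coordinate-wise bounds of the values:
assert np.all(out >= V.min(axis=0)) and np.all(out <= V.max(axis=0))
# Consequence: an expanding linear system x_{t+1} = 2 * x_t needs predictions
# outside the hull of past states, which no softmax weighting can produce.
print("prediction stays in the convex hull of context states:", out.round(3))
```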
arXiv Detail & Related papers (2025-12-24T11:21:07Z)
- When Does Learning Renormalize? Sufficient Conditions for Power Law Spectral Dynamics [2.779943773196378]
Empirical power-law scaling has been widely observed across modern deep learning systems. We show that power-law scaling does not follow from renormalizability alone, but instead arises as a rigidity consequence.
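For readers who want to check whether their own system sits in such a power-law regime, a minimal diagnostic is to fit a line to the covariance spectrum on log-log axes. A sketch on synthetic features (the exponent alpha and the sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
# Synthetic features whose covariance spectrum follows lambda_k ~ k^(-alpha):
n, d, alpha = 5000, 256, 1.5
spectrum = np.arange(1, d + 1) ** (-alpha)
X = rng.normal(size=(n, d)) * np.sqrt(spectrum)   # scale each column

eigs = np.linalg.eigvalsh(X.T @ X / n)[::-1]      # empirical spectrum, descending
k = np.arange(1, d + 1)
slope, _ = np.polyfit(np.log(k[:100]), np.log(eigs[:100]), 1)  # log-log fit
print(f"fitted spectral exponent: {-slope:.2f} (generated with alpha = {alpha})")
```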
arXiv Detail & Related papers (2025-12-20T04:15:07Z)
- RainDiff: End-to-end Precipitation Nowcasting Via Token-wise Attention Diffusion [64.49056527678606]
We propose a Token-wise Attention module integrated into not only the U-Net diffusion model but also the radar-temporal encoder. Unlike prior approaches, our method integrates attention into the architecture without incurring the high resource cost typical of pixel-space diffusion. Our experiments and evaluations demonstrate that the proposed method significantly outperforms state-of-the-art approaches in robustness, local fidelity, and generalization in complex precipitation forecasting scenarios.
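A generic token-wise self-attention block (not the RainDiff module; patch size and weights are illustrative) shows why attending over patch tokens rather than pixels keeps the cost low: a 64x64 field becomes 64 tokens instead of 4096 pixels.

```python
import numpy as np

rng = np.random.default_rng(6)

def token_attention(x, wq, wk, wv):
    """Plain token-wise self-attention over a (tokens, channels) array."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)   # softmax over tokens
    return w @ v

# A 64x64 radar frame split into an 8x8 grid of 8x8 patches -> 64 tokens:
frame = rng.normal(size=(64, 64))
tokens = frame.reshape(8, 8, 8, 8).transpose(0, 2, 1, 3).reshape(64, 64)
c = tokens.shape[-1]
out = token_attention(tokens, *(rng.normal(size=(c, c)) * c ** -0.5 for _ in range(3)))
print(out.shape)  # (64, 64): one updated feature vector per token
```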
arXiv Detail & Related papers (2025-10-16T17:59:13Z)
- Information-Theoretic Bounds and Task-Centric Learning Complexity for Real-World Dynamic Nonlinear Systems [0.6875312133832079]
Dynamic nonlinear systems exhibit distortions arising from coupled static and dynamic effects. This paper presents a theoretical framework grounded in structured decomposition, variance analysis, and task-centric complexity bounds.
arXiv Detail & Related papers (2025-09-08T12:08:02Z)
- Mutual Information Free Topological Generalization Bounds via Stability [46.63069403118614]
We introduce a novel learning-theoretic framework that departs from existing strategies. We prove that the generalization error of trajectory-stable algorithms can be upper bounded in terms of topological data analysis (TDA) quantities.
arXiv Detail & Related papers (2025-07-09T12:03:25Z)
- Generative System Dynamics in Recurrent Neural Networks [56.958984970518564]
We investigate the continuous-time dynamics of Recurrent Neural Networks (RNNs). We show that skew-symmetric weight matrices are fundamental to enable stable limit cycles in both linear and nonlinear configurations. Numerical simulations showcase how nonlinear activation functions not only maintain limit cycles, but also enhance the numerical stability of the system integration process.
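A quick sanity check of the skew-symmetry claim: with W^T = -W, the linear flow dx/dt = W x conserves ||x|| exactly (pure rotation), so every orbit is periodic. The forward-Euler sketch below probes the norm drift, alongside a tanh variant of the same flow; the integrator and step sizes are illustrative, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(7)
N = 8
A = rng.normal(size=(N, N))
W = A - A.T                              # skew-symmetric: W^T = -W

def simulate(nonlinear, steps=20000, dt=1e-3):
    """Euler-integrate dx/dt = f(W x); track ||x|| to probe the orbit."""
    x = rng.normal(size=N)
    norms = []
    for _ in range(steps):
        drive = np.tanh(W @ x) if nonlinear else W @ x
        x = x + dt * drive
        norms.append(np.linalg.norm(x))
    return np.array(norms)

# The exact linear flow conserves ||x||; forward Euler adds a small O(dt)
# drift per unit time, and the tanh variant also stays bounded.
for nl in (False, True):
    n = simulate(nl)
    print(f"nonlinear={nl}: norm drift over the run = {n[-1] - n[0]:+.4f}")
```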
arXiv Detail & Related papers (2025-04-16T10:39:43Z)
- Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics on the Attention Landscape [40.78854925996]
Large language models based on the Transformer architecture have demonstrated impressive ability to learn in context.
We show that a common nonlinear representation or feature map can be used to enhance the power of in-context learning.
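A toy version of that claim (not the paper's mean-field analysis): with a fixed nonlinear feature map phi, a linear in-context readout fitted on the context pairs recovers quadratic tasks that a raw-linear readout cannot.

```python
import numpy as np

rng = np.random.default_rng(8)
phi = lambda x: np.stack([x, x ** 2], axis=-1)   # a fixed nonlinear feature map

def in_context_predict(xs, ys, xq, feats):
    """Ridge regression on the context pairs, applied to the query point."""
    F = feats(xs)
    w = np.linalg.solve(F.T @ F + 1e-6 * np.eye(F.shape[1]), F.T @ ys)
    return feats(np.array([xq]))[0] @ w

errs_lin, errs_phi = [], []
for _ in range(200):                             # 200 random quadratic tasks
    w_true = rng.normal(size=2)
    xs = rng.uniform(-1, 1, size=32)
    ys = phi(xs) @ w_true
    xq = rng.uniform(-1, 1)
    yq = phi(np.array([xq]))[0] @ w_true
    errs_lin.append((in_context_predict(xs, ys, xq, lambda x: x[:, None]) - yq) ** 2)
    errs_phi.append((in_context_predict(xs, ys, xq, phi) - yq) ** 2)
print(f"linear-feature MSE: {np.mean(errs_lin):.3f}, "
      f"nonlinear-phi MSE: {np.mean(errs_phi):.6f}")
```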
arXiv Detail & Related papers (2024-02-02T09:29:40Z)
- On the Dynamics Under the Unhinged Loss and Beyond [104.49565602940699]
We introduce the unhinged loss, a concise loss function that offers more mathematical opportunities to analyze closed-form dynamics.
The unhinged loss allows for considering more practical techniques, such as time-varying learning rates and feature normalization.
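The closed-form point is easy to make concrete for a linear model. Taking the standard unhinged loss l(f(x), y) = 1 - y f(x) (an assumption; this is the usual definition, not a quote from the paper), the gradient is constant in the parameters, so the entire descent trajectory, even with a time-varying learning rate, collapses to a sum of step sizes:

```python
import numpy as np

rng = np.random.default_rng(9)
n, d = 200, 5
X = rng.normal(size=(n, d))
y = np.sign(X @ rng.normal(size=d))

# Unhinged loss: L(w) = mean(1 - y * (X @ w)).  Its gradient, -mean(y_i x_i),
# does not depend on w, so gradient descent has an exact closed form:
g = -(y[:, None] * X).mean(axis=0)
lrs = 0.1 / np.sqrt(1 + np.arange(100))   # a time-varying learning rate schedule

w = np.zeros(d)
for lr in lrs:
    w -= lr * g                           # step-by-step descent ...
w_closed = -lrs.sum() * g                 # ... equals the summed closed form
assert np.allclose(w, w_closed)
print("train accuracy:", (np.sign(X @ w) == y).mean())
```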
arXiv Detail & Related papers (2023-12-13T02:11:07Z)
- On dissipative symplectic integration with applications to gradient-based optimization [77.34726150561087]
We propose a geometric framework in which discretizations can be realized systematically.
We show that a generalization of symplectic integrators to nonconservative, and in particular dissipative, Hamiltonian systems is able to preserve rates of convergence up to a controlled error.
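A well-known instance of such a scheme is conformal symplectic integration of a dissipative Hamiltonian system, which recovers a heavy-ball-style optimizer. A minimal sketch (step size, friction, and the quadratic test problem are illustrative choices, not the paper's experiments):

```python
import numpy as np

def conformal_symplectic_gd(grad, q0, h=0.05, gamma=1.0, steps=500):
    """Conformal-symplectic Euler for the dissipative Hamiltonian
    H(q, p) = |p|^2 / 2 + f(q) with friction gamma: the momentum is
    contracted by exp(-gamma * h) each step, then a symplectic-Euler update.
    """
    q = np.asarray(q0, float)
    p = np.zeros_like(q)
    for _ in range(steps):
        p = np.exp(-gamma * h) * p - h * grad(q)  # dissipate, then kick
        q = q + h * p                             # drift
    return q

# Quadratic test problem f(q) = 0.5 * q^T A q (condition number 100):
A = np.diag([1.0, 100.0])
q_star = conformal_symplectic_gd(lambda q: A @ q, q0=[1.0, 1.0])
print("final iterate:", q_star.round(6))          # converges toward the minimum 0
```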
arXiv Detail & Related papers (2020-04-15T00:36:49Z)