Recurrent Expansion: A Pathway Toward the Next Generation of Deep Learning
- URL: http://arxiv.org/abs/2507.08828v1
- Date: Fri, 04 Jul 2025 19:26:48 GMT
- Title: Recurrent Expansion: A Pathway Toward the Next Generation of Deep Learning
- Authors: Tarek Berghout
- Abstract summary: Recurrent Expansion (RE) is a new learning paradigm that advances beyond conventional Machine Learning (ML) and Deep Learning (DL). RE emphasizes multiple mappings of data through identical deep architectures and analyzes their internal representations (i.e., feature maps) in conjunction with observed performance signals such as loss. A scalable and adaptive variant, Sc-HMVRE, introduces selective mechanisms and scale diversity for real-world deployment.
- Score: 0.26107298043931204
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper introduces Recurrent Expansion (RE) as a new learning paradigm that advances beyond conventional Machine Learning (ML) and Deep Learning (DL). While DL focuses on learning from static data representations, RE proposes an additional dimension: learning from the evolving behavior of models themselves. RE emphasizes multiple mappings of data through identical deep architectures and analyzes their internal representations (i.e., feature maps) in conjunction with observed performance signals such as loss. By incorporating these behavioral traces, RE enables iterative self-improvement, allowing each model version to gain insight from its predecessors. The framework is extended through Multiverse RE (MVRE), which aggregates signals from parallel model instances, and further through Heterogeneous MVRE (HMVRE), where models of varying architectures contribute diverse perspectives. A scalable and adaptive variant, Sc-HMVRE, introduces selective mechanisms and scale diversity for real-world deployment. Altogether, RE presents a shift in DL: from purely representational learning to behavior-aware, self-evolving systems. It lays the groundwork for a new class of intelligent models capable of reasoning over their own learning dynamics, offering a path toward scalable, introspective, and adaptive artificial intelligence. A simple code example to support beginners in running their own experiments is provided in Code Availability Section of this paper.
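The abstract describes the RE loop only at a high level, so the following is a minimal, hypothetical sketch of that idea in PyTorch: each round trains a fresh copy of the same architecture on the original inputs concatenated with the previous model version's feature maps and final loss. All names (SmallNet, train_round), the toy data, and the hyperparameters are illustrative assumptions, not taken from the paper; the authors' own beginner example is provided in the paper's Code Availability section.

```python
# Hedged sketch of a Recurrent Expansion-style loop (illustrative only).
import torch
import torch.nn as nn


class SmallNet(nn.Module):
    """The identical deep architecture reused in every expansion round."""

    def __init__(self, in_dim, hidden_dim=32, out_dim=1):
        super().__init__()
        self.feat = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
        self.head = nn.Linear(hidden_dim, out_dim)

    def forward(self, x):
        feats = self.feat(x)  # internal representation ("feature map")
        return self.head(feats), feats


def train_round(model, x, y, epochs=200, lr=1e-2):
    """Train one model version and return its final loss (the behavioral signal)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        pred, _ = model(x)
        loss = loss_fn(pred, y)
        loss.backward()
        opt.step()
    return loss.item()


# Toy regression data (stand-in for a real dataset).
torch.manual_seed(0)
x = torch.randn(256, 8)
y = x.sum(dim=1, keepdim=True) + 0.1 * torch.randn(256, 1)

inputs = x
for r in range(3):  # three expansion rounds
    model = SmallNet(in_dim=inputs.shape[1])
    final_loss = train_round(model, inputs, y)
    with torch.no_grad():
        _, feats = model(inputs)
    # Behavioral trace: concatenate the previous round's feature maps and loss
    # onto the original inputs so the next model version can learn from them.
    loss_col = torch.full((x.shape[0], 1), final_loss)
    inputs = torch.cat([x, feats, loss_col], dim=1)
    print(f"round {r}: final loss = {final_loss:.4f}")
```

In the Multiverse variants (MVRE/HMVRE) described above, the concatenation step would instead aggregate feature maps and losses from several parallel, and possibly heterogeneous, model instances.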
Related papers
- From Generation to Generalization: Emergent Few-Shot Learning in Video Diffusion Models [65.0487600936788]
Video Diffusion Models (VDMs) have emerged as powerful generative tools capable of synthesizing high-quality content. We argue that VDMs naturally acquire structured representations and an implicit understanding of the visual world. Our method transforms each task into a visual transition, enabling the training of LoRA weights on short input sequences.
arXiv Detail & Related papers (2025-06-08T20:52:34Z) - Exploring Training and Inference Scaling Laws in Generative Retrieval [50.82554729023865]
Generative retrieval reformulates retrieval as an autoregressive generation task, where large language models generate target documents directly from a query. We systematically investigate training and inference scaling laws in generative retrieval, exploring how model size, training data scale, and inference-time compute jointly influence performance.
arXiv Detail & Related papers (2025-03-24T17:59:03Z) - Analyzing Fine-tuning Representation Shift for Multimodal LLMs Steering alignment [53.90425382758605]
We show how fine-tuning alters the internal structure of a model to specialize in new multimodal tasks. Our work sheds light on how multimodal representations evolve through fine-tuning and offers a new perspective for interpreting model adaptation in multimodal tasks.
arXiv Detail & Related papers (2025-01-06T13:37:13Z) - State-space models can learn in-context by gradient descent [1.3087858009942543]
We show that state-space models can perform gradient-based learning and use it for in-context learning in much the same way as transformers. Specifically, we prove that a single structured state-space model layer, augmented with multiplicative input and output gating, can reproduce the outputs of an implicit linear model. We also provide novel insights into the relationship between state-space models and linear self-attention, and their ability to learn in-context.
arXiv Detail & Related papers (2024-10-15T15:22:38Z) - A Probabilistic Model Behind Self-Supervised Learning [53.64989127914936]
In self-supervised learning (SSL), representations are learned via an auxiliary task without annotated labels.
We present a generative latent variable model for self-supervised learning.
We show that several families of discriminative SSL, including contrastive methods, induce a comparable distribution over representations.
arXiv Detail & Related papers (2024-02-02T13:31:17Z) - RePo: Resilient Model-Based Reinforcement Learning by Regularizing Posterior Predictability [25.943330238941602]
We propose a visual model-based RL method that learns a latent representation resilient to spurious variations.
Our training objective encourages the representation to be maximally predictive of dynamics and reward.
Our effort is a step towards making model-based RL a practical and useful tool for dynamic, diverse domains.
arXiv Detail & Related papers (2023-08-31T18:43:04Z) - Latent Variable Representation for Reinforcement Learning [131.03944557979725]
It remains unclear theoretically and empirically how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of model-based reinforcement learning.
We provide a representation view of the latent variable models for state-action value functions, which allows both tractable variational learning algorithm and effective implementation of the optimism/pessimism principle.
In particular, we propose a computationally efficient planning algorithm with UCB exploration by incorporating kernel embeddings of latent variable models.
arXiv Detail & Related papers (2022-12-17T00:26:31Z) - Constructing Effective Machine Learning Models for the Sciences: A Multidisciplinary Perspective [77.53142165205281]
We show how flexible non-linear solutions will not always improve upon manually adding transforms and interactions between variables to linear regression models.
We discuss how to recognize this before constructing a data-driven model and how such analysis can help us move to intrinsically interpretable regression models.
arXiv Detail & Related papers (2022-11-21T17:48:44Z) - Learning Variational Data Assimilation Models and Solvers [34.22350850350653]
We introduce end-to-end neural network architectures for data assimilation.
A key feature of the proposed end-to-end learning architecture is that we may train the NN models using both supervised and unsupervised strategies.
arXiv Detail & Related papers (2020-07-25T14:28:48Z) - A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning [32.59760685342343]
Probabilistic Latent Variable Models provide an alternative to self-supervised learning approaches for linguistic representation learning from speech.
In this work, we propose ConvDMM, a Gaussian state-space model with non-linear emission and transition functions modelled by deep neural networks.
When trained on a large scale speech dataset (LibriSpeech), ConvDMM produces features that significantly outperform multiple self-supervised feature extracting methods.
arXiv Detail & Related papers (2020-06-03T21:50:20Z)