Dreaming Learning
- URL: http://arxiv.org/abs/2410.18156v1
- Date: Wed, 23 Oct 2024 09:17:31 GMT
- Title: Dreaming Learning
- Authors: Alessandro Londei, Matteo Benati, Denise Lanzieri, Vittorio Loreto
- Abstract summary: Introducing new information to a machine learning system can interfere with previously stored data.
We propose a training algorithm inspired by Stuart Kauffman's notion of the Adjacent Possible.
It predisposes the neural network to smoothly accept and integrate data sequences whose statistical characteristics differ from those expected.
- Score: 41.94295877935867
- Abstract: Incorporating novelties into deep learning systems remains a challenging problem. Introducing new information to a machine learning system can interfere with previously stored data and potentially alter the global model paradigm, especially when dealing with non-stationary sources. In such cases, traditional approaches based on validation error minimization offer limited advantages. To address this, we propose a training algorithm inspired by Stuart Kauffman's notion of the Adjacent Possible. This novel training methodology explores new data spaces during the learning phase. It predisposes the neural network to smoothly accept and integrate data sequences whose statistical characteristics differ from those expected. The maximum distance compatible with such inclusion depends on a specific parameter: the sampling temperature used in the exploratory phase of the present method. This algorithm, called Dreaming Learning, anticipates potential regime shifts over time, enhancing the neural network's responsiveness to non-stationary events that alter statistical properties. To assess the advantages of this approach, we apply the methodology to unexpected statistical changes in Markov chains and to non-stationary dynamics in textual sequences. We demonstrate its ability to improve the auto-correlation of generated textual sequences by $\sim 29\%$ and to speed up loss convergence by $\sim 100\%$ in the case of a paradigm shift in Markov chains.
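The abstract describes the procedure only at a high level, so the following is a minimal sketch, under assumptions, of how a dreaming-style training loop might look: a small autoregressive model is periodically sampled at a chosen temperature, and the sampled ("dreamed") sequences are mixed into the training batches so the network is exposed to data slightly outside the observed distribution. The model, the `dream` routine, and parameters such as `dream_ratio` are illustrative and not taken from the paper.

```python
# Minimal, illustrative sketch of a "dreaming"-style training loop (assumed details,
# not the authors' implementation): the model is periodically sampled at a chosen
# temperature and the sampled sequences are mixed back into the training batches.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, SEQ_LEN, DEVICE = 16, 32, "cpu"

class NextTokenModel(nn.Module):
    """Small autoregressive next-token model over a discrete alphabet."""
    def __init__(self, vocab=VOCAB, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab)

    def forward(self, x):                      # x: (batch, seq)
        h, _ = self.rnn(self.embed(x))
        return self.head(h)                    # logits: (batch, seq, vocab)

@torch.no_grad()
def dream(model, n_seqs, temperature=1.2):
    """Sample 'dreamed' sequences from the current model at a given temperature."""
    seqs = torch.randint(VOCAB, (n_seqs, 1), device=DEVICE)
    for _ in range(SEQ_LEN - 1):
        logits = model(seqs)[:, -1] / temperature
        nxt = torch.multinomial(F.softmax(logits, dim=-1), 1)
        seqs = torch.cat([seqs, nxt], dim=1)
    return seqs

def train_step(model, opt, real_batch, dream_ratio=0.25, temperature=1.2):
    """One optimization step on a mix of observed data and dreamed data."""
    n_dream = max(1, int(dream_ratio * real_batch.size(0)))
    dreamed = dream(model, n_dream, temperature)
    batch = torch.cat([real_batch, dreamed], dim=0)
    logits = model(batch[:, :-1])
    loss = F.cross_entropy(logits.reshape(-1, VOCAB), batch[:, 1:].reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

model = NextTokenModel().to(DEVICE)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
real = torch.randint(VOCAB, (32, SEQ_LEN), device=DEVICE)   # stand-in for real sequences
for step in range(100):
    train_step(model, opt, real)
```

In this reading, the sampling temperature plays the role the abstract attributes to it: raising it widens the gap between the dreamed sequences and the observed distribution, and hence the maximum statistical shift the network is predisposed to absorb.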
Related papers
- Non-Stationary Learning of Neural Networks with Automatic Soft Parameter Reset [98.52916361979503]
We introduce a novel learning approach that automatically models and adapts to non-stationarity.
We show empirically that our approach performs well in non-stationary supervised and off-policy reinforcement learning settings.
arXiv Detail & Related papers (2024-11-06T16:32:40Z)
- Temporal-Difference Variational Continual Learning [89.32940051152782]
A crucial capability of Machine Learning models in real-world applications is the ability to continuously learn new tasks.
In Continual Learning settings, models often struggle to balance learning new tasks with retaining previous knowledge.
We propose new learning objectives that integrate the regularization effects of multiple previous posterior estimations.
arXiv Detail & Related papers (2024-10-10T10:58:41Z)
- Leveraging viscous Hamilton-Jacobi PDEs for uncertainty quantification in scientific machine learning [1.8175282137722093]
Uncertainty quantification (UQ) in scientific machine learning (SciML) combines the predictive power of SciML with methods for quantifying the reliability of the learned models.
We provide a new interpretation for UQ problems by establishing a new theoretical connection between some Bayesian inference problems arising in SciML and viscous Hamilton-Jacobi partial differential equations (HJ PDEs).
We develop a new Riccati-based methodology that provides computational advantages when continuously updating the model predictions.
arXiv Detail & Related papers (2024-04-12T20:54:01Z)
- Liquid Neural Network-based Adaptive Learning vs. Incremental Learning for Link Load Prediction amid Concept Drift due to Network Failures [37.66676003679306]
Adapting to concept drift is a challenging task in machine learning.
In communication networks, such an issue emerges when performing traffic forecasting following a failure event.
We propose an approach that exploits adaptive learning algorithms, namely, liquid neural networks, which are capable of self-adaptation to abrupt changes in data patterns without requiring any retraining.
arXiv Detail & Related papers (2024-04-08T08:47:46Z)
- Tracking changes using Kullback-Leibler divergence for the continual learning [2.0305676256390934]
This article introduces a novel method for monitoring changes in the probabilistic distribution of multi-dimensional data streams.
As a measure of the rapidity of changes, we analyze the popular Kullback-Leibler divergence.
We show how to use this metric to predict the occurrence of concept drift and understand its nature (a minimal illustrative sketch follows this entry).
arXiv Detail & Related papers (2022-10-10T17:30:41Z)
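Since the entry above only names the metric, here is a minimal sketch (assumed details, not the paper's method) of drift monitoring with the Kullback-Leibler divergence $D_{\mathrm{KL}}(P\|Q)=\sum_x P(x)\log\frac{P(x)}{Q(x)}$ between a reference window and the most recent window of a data stream; the window size, binning, and threshold are illustrative choices.

```python
# Illustrative sketch of drift tracking with the Kullback-Leibler divergence
# (assumed details, not the paper's method): compare histograms of a reference
# window and the most recent window of the stream and flag drift above a threshold.
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) for two discrete distributions given as (unnormalized) count vectors."""
    p, q = np.asarray(p, float) + eps, np.asarray(q, float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def detect_drift(stream, window=500, bins=20, threshold=0.1):
    """Yield (start_index, divergence, drift_flag) for each new window of the stream."""
    reference = np.asarray(stream[:window])
    edges = np.histogram_bin_edges(reference, bins=bins)
    ref_hist, _ = np.histogram(reference, bins=edges)
    for start in range(window, len(stream) - window + 1, window):
        cur_hist, _ = np.histogram(stream[start:start + window], bins=edges)
        d = kl_divergence(cur_hist, ref_hist)
        yield start, d, d > threshold

# Toy stream whose mean shifts halfway through: the divergence jumps at the shift.
rng = np.random.default_rng(0)
stream = np.concatenate([rng.normal(0, 1, 5000), rng.normal(2, 1, 5000)])
for idx, d, drift in detect_drift(stream):
    print(f"window at {idx}: KL={d:.3f} drift={drift}")
```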
- Augmented Bilinear Network for Incremental Multi-Stock Time-Series Classification [83.23129279407271]
We propose a method to efficiently retain the knowledge available in a neural network pre-trained on a set of securities.
In our method, the prior knowledge encoded in a pre-trained neural network is maintained by keeping existing connections fixed.
This knowledge is adjusted for the new securities by a set of augmented connections, which are optimized using the new data.
arXiv Detail & Related papers (2022-07-23T18:54:10Z)
- On Generalizing Beyond Domains in Cross-Domain Continual Learning [91.56748415975683]
Deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task.
Our proposed approach learns new tasks under domain shift with accuracy boosts of up to 10% on challenging datasets such as DomainNet and OfficeHome.
arXiv Detail & Related papers (2022-03-08T09:57:48Z)
- Learning Invariant Weights in Neural Networks [16.127299898156203]
Many commonly used models in machine learning are constrained to respect certain symmetries in the data.
We propose a weight-space equivalent to this approach, by minimizing a lower bound on the marginal likelihood to learn invariances in neural networks.
arXiv Detail & Related papers (2022-02-25T00:17:09Z)
- Convolutional generative adversarial imputation networks for spatio-temporal missing data in storm surge simulations [86.5302150777089]
Generative Adversarial Imputation Nets (GAIN) and GAN-based techniques have attracted attention as unsupervised machine learning methods.
We name our proposed method Convolutional Generative Adversarial Imputation Nets (Conv-GAIN).
arXiv Detail & Related papers (2021-11-03T03:50:48Z)
- Stochastic embeddings of dynamical phenomena through variational autoencoders [1.7205106391379026]
We use a recognition network to increase the observed space dimensionality during the reconstruction of the phase space.
Our validation shows that this approach not only recovers a state space that resembles the original one, but is also able to synthesize new time series.
arXiv Detail & Related papers (2020-10-13T10:10:24Z)
This list is automatically generated from the titles and abstracts of the papers listed on this site.