Statistical mechanics of continual learning: variational principle and
mean-field potential
- URL: http://arxiv.org/abs/2212.02846v4
- Date: Tue, 20 Jun 2023 04:41:22 GMT
- Title: Statistical mechanics of continual learning: variational principle and
mean-field potential
- Authors: Chan Li and Zhenye Huang and Wenxuan Zou and Haiping Huang
- Abstract summary: We focus on continual learning in single-layered and multi-layered neural networks with binary weights.
A variational Bayesian learning setting is proposed, in which the networks are trained in a field space rather than in the discrete-weight space.
Weight uncertainty is naturally incorporated and modulates synaptic resources among tasks.
The proposed framework also connects to elastic weight consolidation, weight-uncertainty modulated learning, and neuroscience-inspired metaplasticity.
- Score: 1.559929646151698
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Continual learning of multiple tasks of a different nature remains an
obstacle to artificial general intelligence. Various heuristic tricks, from both
machine learning and neuroscience perspectives, have recently been proposed, but
they lack a unified theoretical grounding. Here, we focus on continual learning in
single-layered and multi-layered neural networks with binary weights. A variational
Bayesian learning setting is proposed, in which the networks are trained in a field
space rather than in the discrete-weight space, where gradients are ill-defined;
moreover, weight uncertainty is naturally incorporated and modulates synaptic
resources among tasks. From a physics perspective, we translate variational
continual learning into the Franz-Parisi thermodynamic-potential framework, in
which knowledge of previous tasks acts both as a prior and as a reference. We thus
interpret continual learning of the binary perceptron in a teacher-student setting
as a Franz-Parisi potential computation. The learning performance can then be
studied analytically with mean-field order parameters, whose predictions coincide
with numerical experiments using stochastic gradient descent. Based on the
variational principle and a Gaussian-field approximation of the internal
preactivations in hidden layers, we also derive a learning algorithm that accounts
for weight uncertainty, solves continual learning with binary weights in
multi-layered neural networks, and outperforms the currently available
metaplasticity algorithm. Our principled frameworks also connect to elastic weight
consolidation, weight-uncertainty modulated learning, and neuroscience-inspired
metaplasticity, providing a theory-grounded method for real-world multi-task
learning with deep networks.
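To make the field-space idea concrete, here is a minimal, hypothetical sketch in PyTorch-style Python: each binary weight w = ±1 carries a continuous field theta, the variational mean is tanh(theta) with variance 1 - tanh(theta)^2, hidden preactivations are treated as Gaussians in the spirit of the Gaussian-field approximation, and an uncertainty-modulated quadratic anchor to the previous task's fields plays the role of the stated connection to elastic weight consolidation. The class and function names, the penalty form, and the toy tasks are illustrative assumptions, not the authors' exact algorithm.

```python
# Minimal sketch (assumed, not the authors' exact algorithm): a binary-weight
# layer parameterized by continuous fields, trained with SGD, plus an
# uncertainty-modulated consolidation penalty for continual learning.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MeanFieldBinaryLinear(nn.Module):
    """Binary weights w = ±1 represented by fields theta; the forward pass uses
    a Gaussian approximation of the preactivation, with mean and variance taken
    from the variational Bernoulli posterior over each weight."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.theta = nn.Parameter(0.1 * torch.randn(out_features, in_features))

    def forward(self, x):
        m = torch.tanh(self.theta)          # E[w] under the field
        v = 1.0 - m ** 2                    # Var[w] for w = ±1
        mean = F.linear(x, m)               # mean of the preactivation
        var = F.linear(x ** 2, v) + 1e-8    # variance of the preactivation
        eps = torch.randn_like(mean)
        return mean + eps * var.sqrt()      # sampled Gaussian preactivation


def consolidation_penalty(layer, theta_prev, strength=1.0):
    """Quadratic anchor to the previous task's fields, weighted by how certain
    each weight was (low variance => strongly protected), loosely mirroring
    elastic weight consolidation and metaplasticity."""
    certainty = torch.tanh(theta_prev) ** 2            # in [0, 1)
    return strength * (certainty * (layer.theta - theta_prev) ** 2).sum()


# Toy usage: two sequential binary-classification tasks on random data.
torch.manual_seed(0)
layer = MeanFieldBinaryLinear(20, 1)
opt = torch.optim.SGD(layer.parameters(), lr=0.05)
theta_prev = None
for task in range(2):
    x = torch.randn(256, 20)
    y = (x[:, task] > 0).float().unsqueeze(1)           # task-dependent labels
    for _ in range(200):
        loss = F.binary_cross_entropy_with_logits(layer(x), y)
        if theta_prev is not None:
            loss = loss + consolidation_penalty(layer, theta_prev)
        opt.zero_grad()
        loss.backward()
        opt.step()
    theta_prev = layer.theta.detach().clone()            # posterior -> next prior
print("sign(theta) gives the trained binary weights:", torch.sign(layer.theta))
```

In this sketch, weights that were confidently set after the previous task (|tanh(theta)| close to 1) are strongly protected, while uncertain weights remain free to adapt to the new task; this is the qualitative role the abstract assigns to weight uncertainty as a modulator of synaptic resources.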
Related papers
- From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks [47.13391046553908]
The effectiveness of artificial networks relies on their ability to build task-specific representations.
Prior studies highlight that different initializations can place networks in either a lazy regime, where representations remain static, or a rich/feature-learning regime, where representations evolve dynamically.
The derived exact solutions capture the evolution of representations and the Neural Tangent Kernel across the spectrum from the rich to the lazy regime.
arXiv Detail & Related papers (2024-09-22T23:19:04Z)
- ShadowNet for Data-Centric Quantum System Learning [188.683909185536]
We propose a data-centric learning paradigm combining the strength of neural-network protocols and classical shadows.
Capitalizing on the generalization power of neural networks, this paradigm can be trained offline and excel at predicting previously unseen systems.
We present the instantiation of our paradigm in quantum state tomography and direct fidelity estimation tasks and conduct numerical analysis up to 60 qubits.
arXiv Detail & Related papers (2023-08-22T09:11:53Z)
- IF2Net: Innately Forgetting-Free Networks for Continual Learning [49.57495829364827]
Continual learning aims to incrementally absorb new concepts without interfering with previously learned knowledge.
Motivated by the characteristics of neural networks, we investigate how to design an Innately Forgetting-Free Network (IF2Net).
IF2Net allows a single network to inherently learn unlimited mapping rules without being told task identities at test time.
arXiv Detail & Related papers (2023-06-18T05:26:49Z)
- ConCerNet: A Contrastive Learning Based Framework for Automated Conservation Law Discovery and Trustworthy Dynamical System Prediction [82.81767856234956]
This paper proposes a new learning framework named ConCerNet to improve the trustworthiness of DNN-based dynamics modeling.
We show that our method consistently outperforms the baseline neural networks in both coordinate error and conservation metrics.
arXiv Detail & Related papers (2023-02-11T21:07:30Z)
- Bayesian Continual Learning via Spiking Neural Networks [38.518936229794214]
We take steps towards the design of neuromorphic systems that are capable of adaptation to changing learning tasks.
We derive online learning rules for spiking neural networks (SNNs) within a Bayesian continual learning framework.
We instantiate the proposed approach for both real-valued and binary synaptic weights.
arXiv Detail & Related papers (2022-08-29T17:11:14Z)
- The least-control principle for learning at equilibrium [65.2998274413952]
We present a new principle for learning at equilibrium that applies to equilibrium recurrent neural networks, deep equilibrium models, and meta-learning.
Our results shed light on how the brain might learn and offer new ways of approaching a broad class of machine learning problems.
arXiv Detail & Related papers (2022-07-04T11:27:08Z)
- Dynamic Neural Diversification: Path to Computationally Sustainable Neural Networks [68.8204255655161]
Small neural networks with a constrained number of trainable parameters can be suitable, resource-efficient candidates for many simple tasks.
We explore the diversity of the neurons within the hidden layer during the learning process.
We analyze how the diversity of the neurons affects predictions of the model.
arXiv Detail & Related papers (2021-09-20T15:12:16Z)
- Training multi-objective/multi-task collocation physics-informed neural network with student/teachers transfer learnings [0.0]
This paper presents a PINN training framework that employs pre-training steps and a net-to-net knowledge transfer algorithm.
A multi-objective optimization algorithm may improve the performance of a physics-informed neural network with competing constraints.
arXiv Detail & Related papers (2021-07-24T00:43:17Z)
- Identifying Learning Rules From Neural Network Observables [26.96375335939315]
We show that different classes of learning rules can be separated solely on the basis of aggregate statistics of the weights, activations, or instantaneous layer-wise activity changes.
Our results suggest that activation patterns, available from electrophysiological recordings of post-synaptic activities, may provide a good basis on which to identify learning rules.
arXiv Detail & Related papers (2020-10-22T14:36:54Z)
- Understanding and mitigating gradient pathologies in physics-informed neural networks [2.1485350418225244]
This work focuses on the effectiveness of physics-informed neural networks in predicting outcomes of physical systems and discovering hidden physics from noisy data.
We present a learning rate annealing algorithm that utilizes gradient statistics during model training to balance the interplay between different terms in composite loss functions.
We also propose a novel neural network architecture that is more resilient to such gradient pathologies.
arXiv Detail & Related papers (2020-01-13T21:23:49Z)
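The learning rate annealing idea in the last entry can be illustrated with a small, hedged sketch: the weight on the boundary term of a composite physics-informed loss is set from gradient statistics (here, the maximum gradient magnitude of the residual term divided by the mean gradient magnitude of the boundary term) and smoothed with a moving average. The toy problem, variable names, and the exact statistic are assumptions for illustration, not the paper's precise algorithm.

```python
# Illustrative sketch (assumed details): adaptive weighting of composite PINN
# loss terms from gradient statistics, with a moving-average update.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
lam, alpha = 1.0, 0.9                       # boundary-loss weight, smoothing


def grads_of(loss):
    # Gradients of one loss term with respect to all network parameters.
    return torch.autograd.grad(loss, [p for p in net.parameters()],
                               retain_graph=True, allow_unused=True)


for step in range(1000):
    x = torch.rand(64, 1, requires_grad=True)
    u = net(x)
    du = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    loss_res = ((du - torch.cos(x)) ** 2).mean()      # toy ODE residual: u' = cos(x)
    loss_bc = (net(torch.zeros(1, 1)) ** 2).mean()    # toy boundary condition: u(0) = 0

    # Gradient statistics: max |grad| of the residual term vs. mean |grad| of
    # the boundary term, used to rebalance the composite loss.
    g_res = torch.cat([g.abs().flatten() for g in grads_of(loss_res) if g is not None])
    g_bc = torch.cat([g.abs().flatten() for g in grads_of(loss_bc) if g is not None])
    lam_hat = (g_res.max() / (g_bc.mean() + 1e-8)).item()
    lam = alpha * lam + (1 - alpha) * lam_hat          # moving-average update

    loss = loss_res + lam * loss_bc
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The same weighting recipe extends to any number of competing loss terms by computing one ratio per term before forming the weighted sum.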
This list is automatically generated from the titles and abstracts of the papers on this site.