A multifidelity approach to continual learning for physical systems
- URL: http://arxiv.org/abs/2304.03894v2
- Date: Fri, 9 Feb 2024 22:38:25 GMT
- Title: A multifidelity approach to continual learning for physical systems
- Authors: Amanda Howard, Yucheng Fu, and Panos Stinis
- Abstract summary: We introduce a novel continual learning method based on multifidelity deep neural networks.
This method learns the correlation between the output of previously trained models and the desired output of the model on the current training dataset.
- Score: 1.4218223473363278
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a novel continual learning method based on multifidelity deep
neural networks. This method learns the correlation between the output of
previously trained models and the desired output of the model on the current
training dataset, limiting catastrophic forgetting. On its own the
multifidelity continual learning method shows robust results that limit
forgetting across several datasets. Additionally, we show that the
multifidelity method can be combined with existing continual learning methods,
including replay and memory aware synapses, to further limit catastrophic
forgetting. The proposed continual learning method is especially suited for
physical problems where the data satisfy the same physical laws on each domain,
or for physics-informed neural networks, because in these cases we expect there
to be a strong correlation between the output of the previous model and the
model on the current training domain.
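The sketch below is a minimal illustration of the multifidelity continual-learning idea described in the abstract, not the authors' implementation. It assumes a frozen, previously trained model acting as a "low-fidelity" predictor, a small network that learns a mostly linear correlation (plus a small nonlinear correction) between that output and the data on the current training domain, and PyTorch as the framework; all names, architectures, and hyperparameters are illustrative assumptions.
```python
# Minimal sketch (not the authors' code) of multifidelity continual learning:
# the previously trained model is frozen and treated as a low-fidelity predictor,
# and a new network learns the correlation between its output and the current data.
import torch
import torch.nn as nn


class CorrelationNet(nn.Module):
    """Learns u_new(x) ~ linear(x, u_prev(x)) + eps * nonlinear(x, u_prev(x))."""

    def __init__(self, in_dim: int, hidden: int = 32):
        super().__init__()
        # Linear correlation between the previous model's output and the current target.
        self.linear = nn.Linear(in_dim + 1, 1)
        # Small nonlinear correction.
        self.nonlinear = nn.Sequential(
            nn.Linear(in_dim + 1, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )
        # Learned blending weight for the nonlinear term (assumed form, kept small at init).
        self.eps = nn.Parameter(torch.tensor(0.1))

    def forward(self, x: torch.Tensor, u_prev: torch.Tensor) -> torch.Tensor:
        z = torch.cat([x, u_prev], dim=-1)
        return self.linear(z) + self.eps * self.nonlinear(z)


def train_on_new_domain(prev_model: nn.Module, x: torch.Tensor, y: torch.Tensor,
                        steps: int = 2000, lr: float = 1e-3) -> CorrelationNet:
    """Train a correlation network on the current domain; prev_model stays frozen."""
    prev_model.eval()
    for p in prev_model.parameters():
        p.requires_grad_(False)  # earlier knowledge is never overwritten

    corr = CorrelationNet(in_dim=x.shape[-1])
    opt = torch.optim.Adam(corr.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        with torch.no_grad():
            u_prev = prev_model(x)  # low-fidelity prediction from earlier tasks
        loss = nn.functional.mse_loss(corr(x, u_prev), y)
        # A physics-informed residual, replay term, or memory-aware-synapses penalty
        # could be added to this loss, as the abstract suggests.
        loss.backward()
        opt.step()
    return corr
```
Because only the correlation network is trained on the new domain while the previous model's weights never change, forgetting is limited by construction; replay or a regularization penalty would be added to the loss where indicated.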
Related papers
- Transferable Post-training via Inverse Value Learning [83.75002867411263]
We propose modeling changes at the logits level during post-training using a separate neural network (i.e., the value network).
After training this network on a small base model using demonstrations, this network can be seamlessly integrated with other pre-trained models during inference.
We demonstrate that the resulting value network has broad transferability across pre-trained models of different parameter sizes.
arXiv Detail & Related papers (2024-10-28T13:48:43Z) - Temporal-Difference Variational Continual Learning [89.32940051152782]
A crucial capability of Machine Learning models in real-world applications is the ability to continuously learn new tasks.
In Continual Learning settings, models often struggle to balance learning new tasks with retaining previous knowledge.
We propose new learning objectives that integrate the regularization effects of multiple previous posterior estimations.
arXiv Detail & Related papers (2024-10-10T10:58:41Z) - Learning to Continually Learn with the Bayesian Principle [36.75558255534538]
In this work, we adopt the meta-learning paradigm to combine the strong representational power of neural networks and simple statistical models' robustness to forgetting.
Since the neural networks remain fixed during continual learning, they are protected from catastrophic forgetting.
arXiv Detail & Related papers (2024-05-29T04:53:31Z) - Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z) - Diffusion-Generative Multi-Fidelity Learning for Physical Simulation [24.723536390322582]
We develop a diffusion-generative multi-fidelity learning method based on stochastic differential equations (SDEs), where the generation is a continuous denoising process.
By conditioning on additional inputs (temporal or spatial variables), our model can efficiently learn and predict multi-dimensional solution arrays.
arXiv Detail & Related papers (2023-11-09T18:59:05Z) - Federated Unlearning via Active Forgetting [24.060724751342047]
We propose a novel federated unlearning framework based on incremental learning.
Our framework differs from existing federated unlearning methods that rely on approximate retraining or data influence estimation.
arXiv Detail & Related papers (2023-07-07T03:07:26Z) - Continual Learning with Bayesian Model based on a Fixed Pre-trained Feature Extractor [55.9023096444383]
Current deep learning models are characterised by catastrophic forgetting of old knowledge when learning new classes.
Inspired by the process of learning new knowledge in human brains, we propose a Bayesian generative model for continual learning.
arXiv Detail & Related papers (2022-04-28T08:41:51Z) - Consistency Training of Multi-exit Architectures for Sensor Data [0.07614628596146598]
We present a novel and architecture-agnostic approach for robust training of multi-exit architectures termed consistent exit training.
We leverage weak supervision to align model output with consistency training and jointly optimize dual-losses in a multi-task learning fashion over the exits in a network.
arXiv Detail & Related papers (2021-09-27T17:11:25Z) - Learning Dynamics from Noisy Measurements using Deep Learning with a Runge-Kutta Constraint [9.36739413306697]
We discuss a methodology to learn differential equation(s) using noisy and sparsely sampled measurements.
In our methodology, the main innovation lies in the integration of deep neural networks with a classical numerical integration method.
arXiv Detail & Related papers (2021-09-23T15:43:45Z) - Emerging Trends in Federated Learning: From Model Fusion to Federated X Learning [65.06445195580622]
Federated learning is a new paradigm that decouples data collection and model training via multi-party computation and model aggregation.
We conduct a focused survey of federated learning in conjunction with other learning algorithms.
arXiv Detail & Related papers (2021-02-25T15:18:13Z) - Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques.
Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance.
We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches.
arXiv Detail & Related papers (2020-12-15T16:29:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.