Transfer Learning Enhanced DeepONet for Long-Time Prediction of
Evolution Equations
- URL: http://arxiv.org/abs/2212.04663v1
- Date: Fri, 9 Dec 2022 04:37:08 GMT
- Title: Transfer Learning Enhanced DeepONet for Long-Time Prediction of
Evolution Equations
- Authors: Wuzhe Xu, Yulong Lu and Li Wang
- Abstract summary: Deep operator network (DeepONet) has demonstrated great success in various learning tasks.
This paper proposes a transfer-learning aided DeepONet to enhance the stability.
- Score: 9.748550197032785
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep operator network (DeepONet) has demonstrated great success in various
learning tasks, including learning solution operators of partial differential
equations. In particular, it provides an efficient approach to predict the
evolution equations in a finite time horizon. Nevertheless, the vanilla
DeepONet suffers from the issue of stability degradation in the long-time
prediction. This paper proposes a transfer-learning aided DeepONet to
enhance the stability. Our idea is to use transfer learning to sequentially
update the DeepONets as the surrogates for propagators learned in different
time frames. The evolving DeepONets can better track the varying complexities
of the evolution equations, while only needing to be updated by efficient training
of a tiny fraction of the operator networks. Through systematic experiments, we
show that the proposed method not only improves the long-time accuracy of
DeepONet while maintaining similar computational cost but also substantially
reduces the sample size of the training set.
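To make the sequential-update idea in the abstract concrete, the following is a minimal, hypothetical Python sketch (not the authors' code) of the time-marching loop: a DeepONet surrogate for the short-time propagator is trained once, composed with itself to march within a time frame, and at the start of each later frame only a small fraction of its parameters (here, illustratively, the last branch and trunk layers) is fine-tuned on data from that frame. Names such as SimpleDeepONet, train_subset, and frames are assumptions for illustration.

```python
# Hypothetical sketch (not the authors' implementation) of a transfer-learning
# aided DeepONet for long-time prediction: the surrogate propagator maps the
# solution at time t_k, sampled at m sensor points, to the solution at t_k + dt
# at the same points, so it can be composed with itself.
import torch
import torch.nn as nn


class SimpleDeepONet(nn.Module):
    """Minimal DeepONet: branch net encodes u(t_k) at m sensors, trunk net
    encodes query locations y, and the output is their inner product."""

    def __init__(self, m_sensors: int, p: int = 64):
        super().__init__()
        self.branch = nn.Sequential(nn.Linear(m_sensors, 128), nn.Tanh(),
                                    nn.Linear(128, p))
        self.trunk = nn.Sequential(nn.Linear(1, 128), nn.Tanh(),
                                   nn.Linear(128, p))

    def forward(self, u_sensors, y):
        b = self.branch(u_sensors)      # (batch, p)
        t = self.trunk(y)               # (n_query, p)
        return b @ t.T                  # (batch, n_query)


def train_subset(model, params, data, epochs, lr=1e-3):
    """Fit only the given parameter subset on (u_in, y, u_out) triples."""
    opt = torch.optim.Adam(params, lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for u_in, y, u_out in data:
            opt.zero_grad()
            loss_fn(model(u_in, y), u_out).backward()
            opt.step()


def long_time_prediction(model, frames, u0, y, steps_per_frame=10):
    """Sequentially updated surrogates: full training on the first frame, then
    only the last branch/trunk layers are fine-tuned for each later frame
    (the cheap transfer-learning step), and the surrogate is composed with
    itself to march the state through the frame."""
    u, trajectory = u0, [u0]
    for k, frame_data in enumerate(frames):
        if k == 0:
            train_subset(model, model.parameters(), frame_data, epochs=200)
        else:
            last_layers = (list(model.branch[-1].parameters())
                           + list(model.trunk[-1].parameters()))
            train_subset(model, last_layers, frame_data, epochs=20)
        with torch.no_grad():
            for _ in range(steps_per_frame):
                u = model(u, y)         # requires query points == sensor points
                trajectory.append(u)
    return trajectory
```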
Related papers
- DeepOSets: Non-Autoregressive In-Context Learning of Supervised Learning Operators [11.913853433712855]
We introduce DeepSets Operator Networks (DeepOSets), an efficient, non-autoregressive neural network architecture for in-context learning of permutation-invariant operators.
DeepOSets combines the operator learning capabilities of Deep Operator Networks (DeepONets) with the set learning capabilities of DeepSets.
arXiv Detail & Related papers (2024-10-11T23:07:19Z) - On discretisation drift and smoothness regularisation in neural network
training [0.0]
We aim to take steps towards an improved understanding of deep learning, with a focus on optimisation and model regularisation.
We start by investigating gradient descent (GD), a discrete-time algorithm at the basis of most popular deep learning optimisation algorithms.
We derive novel continuous-time flows that account for discretisation drift. Unlike the NGF, these new flows can be used to describe learning rate specific behaviours of GD, such as training instabilities observed in supervised learning and two-player games.
We then translate insights from continuous time into mitigation strategies for unstable GD dynamics, by constructing novel learning rate schedules and regularisers.
arXiv Detail & Related papers (2023-10-21T15:21:36Z) - Efficient Bayesian Updates for Deep Learning via Laplace Approximations [1.5996841879821277]
We propose a novel Bayesian update method for deep neural networks.
We leverage second-order optimization techniques on the Gaussian posterior distribution of a Laplace approximation.
A large-scale evaluation study confirms that our updates are a fast and competitive alternative to costly retraining.
arXiv Detail & Related papers (2022-10-12T12:16:46Z) - Learning Fast and Slow for Online Time Series Forecasting [76.50127663309604]
Fast and Slow learning Networks (FSNet) is a holistic framework for online time-series forecasting.
FSNet balances fast adaptation to recent changes with the retrieval of similar old knowledge.
Our code will be made publicly available.
arXiv Detail & Related papers (2022-02-23T18:23:07Z) - Training Efficiency and Robustness in Deep Learning [2.6451769337566406]
We study approaches to improve the training efficiency and robustness of deep learning models.
We find that prioritizing learning on more informative training data increases convergence speed and improves generalization performance on test data.
We show that a redundancy-aware modification to the sampling of training data improves training speed, and we develop an efficient method for detecting the diversity of the training signal.
arXiv Detail & Related papers (2021-12-02T17:11:33Z) - Improved architectures and training algorithms for deep operator
networks [0.0]
Operator learning techniques have emerged as a powerful tool for learning maps between infinite-dimensional Banach spaces.
We analyze the training dynamics of deep operator networks (DeepONets) through the lens of Neural Tangent Kernel (NTK) theory.
arXiv Detail & Related papers (2021-10-04T18:34:41Z) - Improving the Accuracy of Early Exits in Multi-Exit Architectures via
Curriculum Learning [88.17413955380262]
Multi-exit architectures allow deep neural networks to terminate their execution early in order to adhere to tight deadlines at the cost of accuracy.
We introduce a novel method called Multi-Exit Curriculum Learning that utilizes curriculum learning.
Our method consistently improves the accuracy of early exits compared to the standard training approach.
arXiv Detail & Related papers (2021-04-21T11:12:35Z) - RIFLE: Backpropagation in Depth for Deep Transfer Learning through
Re-Initializing the Fully-connected LayEr [60.07531696857743]
Fine-tuning a deep convolutional neural network (CNN) using a pre-trained model helps transfer knowledge learned from larger datasets to the target task.
We propose RIFLE - a strategy that deepens backpropagation in transfer learning settings.
RIFLE brings meaningful updates to the weights of deep CNN layers and improves low-level feature learning (a minimal sketch of the re-initialization idea appears after this list).
arXiv Detail & Related papers (2020-07-07T11:27:43Z) - AdaS: Adaptive Scheduling of Stochastic Gradients [50.80697760166045]
We introduce the notions of "knowledge gain" and "mapping condition" and propose a new algorithm called Adaptive Scheduling (AdaS).
Experimentation reveals that, using the derived metrics, AdaS exhibits: (a) faster convergence and superior generalization over existing adaptive learning methods; and (b) lack of dependence on a validation set to determine when to stop training.
arXiv Detail & Related papers (2020-06-11T16:36:31Z) - Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z) - Large-Scale Gradient-Free Deep Learning with Recursive Local
Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)
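As noted in the RIFLE entry above, the following is a minimal, hypothetical sketch of the re-initialization trick, assuming the pre-trained model exposes its task head as a single nn.Linear called fc (as in torchvision ResNets); the function name, schedule, and hyperparameters are illustrative assumptions, not the authors' exact recipe.

```python
# Hypothetical sketch of the RIFLE idea: during fine-tuning of a pre-trained
# network, periodically re-initialize the final fully-connected layer so that
# the deeper (earlier) layers keep receiving substantial gradient updates.
import torch
import torch.nn as nn


def finetune_with_rifle(model, train_loader, epochs=30, reinit_every=10, lr=1e-3):
    """`model` is assumed to be a pre-trained backbone whose task head is a
    single nn.Linear stored as `model.fc`."""
    loss_fn = nn.CrossEntropyLoss()
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)

    for epoch in range(epochs):
        # RIFLE step: cyclic re-initialization of the fully-connected head,
        # skipped at epoch 0 and near the end so the final head is still trained.
        if 0 < epoch <= epochs - reinit_every and epoch % reinit_every == 0:
            model.fc.reset_parameters()   # in-place re-init; optimizer keeps the same tensors

        for x, y in train_loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

    return model
```

Because reset_parameters() re-initializes the existing weight tensors in place, the optimizer continues to track the same parameters; in practice one might also clear momentum buffers after each reset.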
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.