DeepONet as a Multi-Operator Extrapolation Model: Distributed Pretraining with Physics-Informed Fine-Tuning
- URL: http://arxiv.org/abs/2411.07239v1
- Date: Mon, 11 Nov 2024 18:58:46 GMT
- Title: DeepONet as a Multi-Operator Extrapolation Model: Distributed Pretraining with Physics-Informed Fine-Tuning
- Authors: Zecheng Zhang, Christian Moya, Lu Lu, Guang Lin, Hayden Schaeffer,
- Abstract summary: We propose a novel fine-tuning method to achieve multi-operator learning.
Our approach combines distributed learning to integrate data from various operators in pre-training, while physics-informed methods enable zero-shot fine-tuning.
- Score: 6.635683993472882
- License:
- Abstract: We propose a novel fine-tuning method to achieve multi-operator learning through training a distributed neural operator with diverse function data and then zero-shot fine-tuning the neural network using physics-informed losses for downstream tasks. Operator learning effectively approximates solution operators for PDEs and various PDE-related problems, yet it often struggles to generalize to new tasks. To address this, we investigate fine-tuning a pretrained model, while carefully selecting an initialization that enables rapid adaptation to new tasks with minimal data. Our approach combines distributed learning to integrate data from various operators in pre-training, while physics-informed methods enable zero-shot fine-tuning, minimizing the reliance on downstream data. We investigate standard fine-tuning and Low-Rank Adaptation fine-tuning, applying both to train complex nonlinear target operators that are difficult to learn only using random initialization. Through comprehensive numerical examples, we demonstrate the advantages of our approach, showcasing significant improvements in accuracy. Our findings provide a robust framework for advancing multi-operator learning and highlight the potential of transfer learning techniques in this domain.
Related papers
- LeMON: Learning to Learn Multi-Operator Networks [0.6554326244334868]
Single-operator learning involves training a deep neural network to learn a specific operator.
Recent work in multi-operator learning uses an operator embedding structure to train a single neural network on data from multiple operators.
We propose pretraining and fine-tuning strategies for solving PDEs using multi-operator learning.
arXiv Detail & Related papers (2024-08-28T23:20:03Z) - Denoising Pre-Training and Customized Prompt Learning for Efficient Multi-Behavior Sequential Recommendation [69.60321475454843]
We propose DPCPL, the first pre-training and prompt-tuning paradigm tailored for Multi-Behavior Sequential Recommendation.
In the pre-training stage, we propose a novel Efficient Behavior Miner (EBM) to filter out the noise at multiple time scales.
Subsequently, we propose to tune the pre-trained model in a highly efficient manner with the proposed Customized Prompt Learning (CPL) module.
arXiv Detail & Related papers (2024-08-21T06:48:38Z) - Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models [79.28821338925947]
Domain-Class Incremental Learning is a realistic but challenging continual learning scenario.
To handle these diverse tasks, pre-trained Vision-Language Models (VLMs) are introduced for their strong generalizability.
This incurs a new problem: the knowledge encoded in the pre-trained VLMs may be disturbed when adapting to new tasks, compromising their inherent zero-shot ability.
Existing methods tackle it by tuning VLMs with knowledge distillation on extra datasets, which demands heavy overhead.
We propose the Distribution-aware Interference-free Knowledge Integration (DIKI) framework, retaining pre-trained knowledge of
arXiv Detail & Related papers (2024-07-07T12:19:37Z) - DeltaPhi: Learning Physical Trajectory Residual for PDE Solving [54.13671100638092]
We propose and formulate the Physical Trajectory Residual Learning (DeltaPhi)
We learn the surrogate model for the residual operator mapping based on existing neural operator networks.
We conclude that, compared to direct learning, physical residual learning is preferred for PDE solving.
arXiv Detail & Related papers (2024-06-14T07:45:07Z) - Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z) - PILOT: A Pre-Trained Model-Based Continual Learning Toolbox [71.63186089279218]
This paper introduces a pre-trained model-based continual learning toolbox known as PILOT.
On the one hand, PILOT implements some state-of-the-art class-incremental learning algorithms based on pre-trained models, such as L2P, DualPrompt, and CODA-Prompt.
On the other hand, PILOT fits typical class-incremental learning algorithms within the context of pre-trained models to evaluate their effectiveness.
arXiv Detail & Related papers (2023-09-13T17:55:11Z) - Continual Learning with Pretrained Backbones by Tuning in the Input
Space [44.97953547553997]
The intrinsic difficulty in adapting deep learning models to non-stationary environments limits the applicability of neural networks to real-world tasks.
We propose a novel strategy to make the fine-tuning procedure more effective, by avoiding to update the pre-trained part of the network and learning not only the usual classification head, but also a set of newly-introduced learnable parameters.
arXiv Detail & Related papers (2023-06-05T15:11:59Z) - Variational operator learning: A unified paradigm marrying training
neural operators and solving partial differential equations [9.148052787201797]
We propose a novel paradigm that provides a unified framework of training neural operators and solving PDEs with the variational form.
With a label-free training set and a 5-label-only shift set, VOL learns solution operators with its test errors decreasing in a power law with respect to the amount of unlabeled data.
arXiv Detail & Related papers (2023-04-09T13:20:19Z) - Understanding and Improving Transfer Learning of Deep Models via Neural Collapse [37.483109067209504]
This work investigates the relationship between neural collapse (NC) and transfer learning for classification problems.
We find strong correlation between feature collapse and downstream performance.
Our proposed fine-tuning methods deliver good performances while reducing fine-tuning parameters by at least 90%.
arXiv Detail & Related papers (2022-12-23T08:48:34Z) - CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep
Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance.
Sample re-weighting methods are popularly used to alleviate this data bias issue.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z) - Learning the Travelling Salesperson Problem Requires Rethinking
Generalization [9.176056742068813]
End-to-end training of neural network solvers for graph optimization problems such as the Travelling Salesperson Problem (TSP) have seen a surge of interest recently.
While state-of-the-art learning-driven approaches perform closely to classical solvers when trained on trivially small sizes, they are unable to generalize the learnt policy to larger instances at practical scales.
This work presents an end-to-end neural optimization pipeline that unifies several recent papers in order to identify the principled biases, model architectures and learning algorithms that promote generalization to instances larger than those seen in training.
arXiv Detail & Related papers (2020-06-12T10:14:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.