Initial Model Incorporation for Deep Learning FWI: Pretraining or Denormalization?
- URL: http://arxiv.org/abs/2506.05484v1
- Date: Thu, 05 Jun 2025 18:06:37 GMT
- Title: Initial Model Incorporation for Deep Learning FWI: Pretraining or Denormalization?
- Authors: Ruihua Chen, Bangyu Wu, Meng Li, Kai Yang
- Abstract summary: Subsurface property neural network reparameterized full waveform inversion (FWI) has emerged as an effective unsupervised learning framework. It updates the trainable neural network parameters instead of fine-tuning the subsurface model directly. Pretraining first regulates the neural networks' parameters by fitting the initial velocity model. Denormalization directly adds the outputs of the network to the initial models without pretraining.
- Score: 7.032076746858836
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Subsurface property neural network reparameterized full waveform inversion (FWI) has emerged as an effective unsupervised learning framework, which can invert stably from an inaccurate starting model. It updates the trainable neural network parameters instead of fine-tuning the subsurface model directly. There are primarily two ways to embed the prior knowledge of the initial model into neural networks: pretraining and denormalization. Pretraining first regulates the neural networks' parameters by fitting the initial velocity model; denormalization directly adds the outputs of the network to the initial models without pretraining. In this letter, we systematically investigate the influence of these two ways of incorporating the initial model into neural network reparameterized FWI. We demonstrate that pretraining requires inverting the model perturbation relative to a constant velocity value (the mean) in a two-stage implementation. This leads to a complex workflow and an inconsistency of objective functions between the two stages, causing the network parameters to become inactive and lose plasticity. Experimental results demonstrate that, compared with pretraining, denormalization simplifies workflows, accelerates convergence, and enhances inversion accuracy.
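The contrast between the two incorporation strategies can be made concrete with a minimal numerical sketch. All shapes, scales, and variable names below are illustrative assumptions, not the paper's implementation: a random array stands in for the raw network output, and the two strategies differ only in whether the network is first fitted to the initial model (pretraining) or its rescaled output is added to the initial model directly (denormalization).

```python
import numpy as np

# Hypothetical setup: a 10x10 initial velocity model v0 (m/s) and a raw
# network output in a roughly unit range, standing in for a real CNN output.
rng = np.random.default_rng(0)
v0 = np.full((10, 10), 2500.0)           # initial velocity model (assumed constant here)
net_out = rng.standard_normal((10, 10))  # placeholder for the network's raw output

# (a) Pretraining: stage one fits the network so its output reproduces v0
#     (only the regression target is shown, not the training loop);
#     stage two then continues FWI from those pretrained weights.
pretrain_target = v0

# (b) Denormalization: no pretraining stage. The raw output is rescaled and
#     added to v0, so the predicted model is v0 plus a learned perturbation
#     and inversion starts in the neighborhood of the initial model.
scale = 100.0                            # assumed perturbation scale (m/s)
v_denorm = v0 + scale * net_out

print(v_denorm.mean())                   # stays near v0's mean of 2500 m/s
```

The single-stage form in (b) is what lets denormalization avoid the two-stage workflow and the objective-function mismatch the abstract attributes to pretraining.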
Related papers
- PreAdaptFWI: Pretrained-Based Adaptive Residual Learning for Full-Waveform Inversion Without Dataset Dependency [8.719356558714246]
Full-waveform inversion (FWI) is a method that utilizes seismic data to invert the physical parameters of subsurface media. Due to its ill-posed nature, FWI is susceptible to getting trapped in local minima. Various research efforts have attempted to combine neural networks with FWI to stabilize the inversion process.
arXiv Detail & Related papers (2025-02-17T15:30:17Z) - Initialization Matters: Unraveling the Impact of Pre-Training on Federated Learning [21.440470901377182]
Initializing with pre-trained models is becoming standard practice in machine learning. We study the class of two-layer convolutional neural networks (CNNs) and provide bounds on the training error convergence and test error of such a network trained with FedAvg.
arXiv Detail & Related papers (2025-02-11T23:53:16Z) - MILP initialization for solving parabolic PDEs with PINNs [2.5932373010465364]
Physics-Informed Neural Networks (PINNs) are a powerful deep learning method capable of providing solutions and parameter estimations of physical systems. Given the complexity of their neural network structure, the convergence speed is still limited compared to numerical methods.
arXiv Detail & Related papers (2025-01-27T15:46:38Z) - Just How Flexible are Neural Networks in Practice? [89.80474583606242]
It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters.
In practice, however, we only find solutions via our training procedure, including the gradient and regularizers, limiting flexibility.
arXiv Detail & Related papers (2024-06-17T12:24:45Z) - Epistemic Modeling Uncertainty of Rapid Neural Network Ensembles for
Adaptive Learning [0.0]
A new type of neural network is presented using the rapid neural network paradigm.
It is found that the proposed emulator embedded neural network trains near-instantaneously, typically without loss of prediction accuracy.
arXiv Detail & Related papers (2023-09-12T22:34:34Z) - Globally Optimal Training of Neural Networks with Threshold Activation
Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z) - On feedforward control using physics-guided neural networks: Training
cost regularization and optimized initialization [0.0]
Performance of model-based feedforward controllers is typically limited by the accuracy of the inverse system dynamics model.
This paper proposes a regularization method via identified physical parameters.
It is validated on a real-life industrial linear motor, where it delivers better tracking accuracy and extrapolation.
arXiv Detail & Related papers (2022-01-28T12:51:25Z) - A Bayesian Perspective on Training Speed and Model Selection [51.15664724311443]
We show that a measure of a model's training speed can be used to estimate its marginal likelihood.
We verify our results in model selection tasks for linear models and for the infinite-width limit of deep neural networks.
Our results suggest a promising new direction towards explaining why neural networks trained with gradient descent are biased towards functions that generalize well.
arXiv Detail & Related papers (2020-10-27T17:56:14Z) - Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z) - MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks.
The use of gradient-based optimization combined with nonconvexity renders learning susceptible to novel problems.
We propose fusing neighboring layers of deeper networks that are trained with random variables.
arXiv Detail & Related papers (2020-01-28T18:25:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.