Residual Multi-Fidelity Neural Network Computing
- URL: http://arxiv.org/abs/2310.03572v3
- Date: Fri, 20 Dec 2024 18:17:44 GMT
- Title: Residual Multi-Fidelity Neural Network Computing
- Authors: Owen Davis, Mohammad Motamed, Raul Tempone,
- Abstract summary: We consider the general problem of constructing a neural network surrogate model using multi-fidelity information.
Motivated by error-complexity estimates for ReLU neural networks, we formulate the correlation between an inexpensive low-fidelity model and an expensive high-fidelity model.
We present four numerical examples to demonstrate the power of the proposed framework.
- Score: 0.0
- License:
- Abstract: In this work, we consider the general problem of constructing a neural network surrogate model using multi-fidelity information. Motivated by error-complexity estimates for ReLU neural networks, we formulate the correlation between an inexpensive low-fidelity model and an expensive high-fidelity model as a possibly non-linear residual function. This function defines a mapping between 1) the shared input space of the models along with the low-fidelity model output, and 2) the discrepancy between the outputs of the two models. The computational framework proceeds by training two neural networks to work in concert. The first network learns the residual function on a small set of high- and low-fidelity data. Once trained, this network is used to generate additional synthetic high-fidelity data, which is used in the training of the second network. The trained second network then acts as our surrogate for the high-fidelity quantity of interest. We present four numerical examples to demonstrate the power of the proposed framework, showing that significant savings in computational cost may be achieved when the output predictions are desired to be accurate within small tolerances.
Related papers
- Practical multi-fidelity machine learning: fusion of deterministic and Bayesian models [0.34592277400656235]
Multi-fidelity machine learning methods integrate scarce, resource-intensive high-fidelity data with abundant but less accurate low-fidelity data.
We propose a practical multi-fidelity strategy for problems spanning low- and high-dimensional domains.
arXiv Detail & Related papers (2024-07-21T10:40:50Z) - Multi-Fidelity Bayesian Neural Network for Uncertainty Quantification in Transonic Aerodynamic Loads [0.0]
This paper implements a multi-fidelity Bayesian neural network model that applies transfer learning to fuse data generated by models at different fidelities.
The results demonstrate that the multi-fidelity Bayesian model outperforms the state-of-the-art Co-Kriging in terms of overall accuracy and robustness on unseen data.
arXiv Detail & Related papers (2024-07-08T07:34:35Z) - Dr$^2$Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning [81.0108753452546]
We propose Dynamic Reversible Dual-Residual Networks, or Dr$2$Net, to finetune a pretrained model with substantially reduced memory consumption.
Dr$2$Net contains two types of residual connections, one maintaining the residual structure in the pretrained models, and the other making the network reversible.
We show that Dr$2$Net can reach comparable performance to conventional finetuning but with significantly less memory usage.
arXiv Detail & Related papers (2024-01-08T18:59:31Z) - Robust Training and Verification of Implicit Neural Networks: A
Non-Euclidean Contractive Approach [64.23331120621118]
This paper proposes a theoretical and computational framework for training and robustness verification of implicit neural networks.
We introduce a related embedded network and show that the embedded network can be used to provide an $ell_infty$-norm box over-approximation of the reachable sets of the original network.
We apply our algorithms to train implicit neural networks on the MNIST dataset and compare the robustness of our models with the models trained via existing approaches in the literature.
arXiv Detail & Related papers (2022-08-08T03:13:24Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - Mitigating Performance Saturation in Neural Marked Point Processes:
Architectures and Loss Functions [50.674773358075015]
We propose a simple graph-based network structure called GCHP, which utilizes only graph convolutional layers.
We show that GCHP can significantly reduce training time and the likelihood ratio loss with interarrival time probability assumptions can greatly improve the model performance.
arXiv Detail & Related papers (2021-07-07T16:59:14Z) - SignalNet: A Low Resolution Sinusoid Decomposition and Estimation
Network [79.04274563889548]
We propose SignalNet, a neural network architecture that detects the number of sinusoids and estimates their parameters from quantized in-phase and quadrature samples.
We introduce a worst-case learning threshold for comparing the results of our network relative to the underlying data distributions.
In simulation, we find that our algorithm is always able to surpass the threshold for three-bit data but often cannot exceed the threshold for one-bit data.
arXiv Detail & Related papers (2021-06-10T04:21:20Z) - Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z) - Neural Network Training Using $\ell_1$-Regularization and Bi-fidelity
Data [0.0]
We study the effects of sparsity promoting $ell_$-regularization on training neural networks when only a small training dataset from a high-fidelity model is available.
We consider two variants of $ell_$-regularization informed by the parameters of an identical network trained using data from lower-fidelity models of the problem at hand.
These bifidelity strategies are generalizations of transfer learning of neural networks that uses the parameters learned from a large low-fidelity dataset to efficiently train networks for a small high-fidelity dataset.
arXiv Detail & Related papers (2021-05-27T08:56:17Z) - Multi-fidelity regression using artificial neural networks: efficient
approximation of parameter-dependent output quantities [0.17499351967216337]
We present the use of artificial neural networks applied to multi-fidelity regression problems.
The introduced models are compared against a traditional multi-fidelity scheme, co-kriging.
We also show an application of multi-fidelity regression to an engineering problem.
arXiv Detail & Related papers (2021-02-26T11:29:00Z) - On transfer learning of neural networks using bi-fidelity data for
uncertainty propagation [0.0]
We explore the application of transfer learning techniques using training data generated from both high- and low-fidelity models.
In the former approach, a neural network model mapping the inputs to the outputs of interest is trained based on the low-fidelity data.
The high-fidelity data is then used to adapt the parameters of the upper layer(s) of the low-fidelity network, or train a simpler neural network to map the output of the low-fidelity network to that of the high-fidelity model.
arXiv Detail & Related papers (2020-02-11T15:56:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.