Analyzing the Performance of Deep Encoder-Decoder Networks as Surrogates
for a Diffusion Equation
- URL: http://arxiv.org/abs/2302.03786v1
- Date: Tue, 7 Feb 2023 22:53:19 GMT
- Title: Analyzing the Performance of Deep Encoder-Decoder Networks as Surrogates
for a Diffusion Equation
- Authors: J. Quetzalcoatl Toledo-Marin, James A. Glazier, Geoffrey Fox
- Abstract summary: We study the use of encoder-decoder convolutional neural networks (CNNs) as surrogates for steady-state diffusion solvers.
Our results indicate that increasing the size of the training set has a substantial effect on reducing performance fluctuations and overall error.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural networks (NNs) have proven to be a viable alternative to traditional
direct numerical algorithms, with the potential to accelerate computational
time by several orders of magnitude. In the present paper we study the use of
encoder-decoder convolutional neural networks (CNNs) as surrogates for
steady-state diffusion solvers. The construction of such surrogates requires
the selection of an appropriate task, network architecture, training set
structure and size, loss function, and training algorithm hyperparameters. It
is well known that each of these factors can have a significant impact on the
performance of the resultant model. Our approach employs an encoder-decoder CNN
architecture, which we posit is particularly well-suited for this task due to
its ability to effectively transform data, as opposed to merely compressing it.
We systematically evaluate a range of loss functions, hyperparameters, and
training set sizes. Our results indicate that increasing the size of the
training set has a substantial effect on reducing performance fluctuations and
overall error. Additionally, we observe that the performance of the model
exhibits a logarithmic dependence on the training set size. Furthermore, we
investigate the effect on model performance by using different subsets of data
with varying features. Our results highlight the importance of sampling the
configurational space in an optimal manner, as this can have a significant
impact on the performance of the model and the required training time. In
conclusion, our results suggest that training a model with a pre-determined
error performance bound is not a viable approach, as it does not guarantee that
edge cases with errors larger than the bound do not exist. Furthermore, since
most surrogate tasks involve a high-dimensional landscape, an ever-increasing
training set size is, in principle, needed; this, however, is not a practical
solution.
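For reference, the kind of direct numerical solver such a surrogate would replace can be sketched in a few lines: Jacobi iteration for the steady-state diffusion (Laplace) equation on a small 2-D grid. The grid size, boundary values, and tolerance below are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch of a direct steady-state diffusion solver: Jacobi
# iteration for Laplace's equation on an n x n grid with fixed
# (Dirichlet) boundaries. A CNN surrogate would map the boundary
# configuration directly to the converged field, skipping this loop.

def solve_steady_diffusion(n=16, top=1.0, tol=1e-6, max_iter=20000):
    """Solve Laplace's equation on an n x n grid: value `top` on the
    top edge, 0 on the other edges. Returns the converged field."""
    u = [[0.0] * n for _ in range(n)]
    for j in range(n):
        u[0][j] = top  # Dirichlet condition on the top edge
    for _ in range(max_iter):
        delta = 0.0
        new = [row[:] for row in u]
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                # Jacobi update: average of the four neighbours
                new[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                    + u[i][j - 1] + u[i][j + 1])
                delta = max(delta, abs(new[i][j] - u[i][j]))
        u = new
        if delta < tol:  # stop once the largest update is small
            break
    return u

field = solve_steady_diffusion()
```

Each forward pass of a trained surrogate replaces the entire iteration loop, which is where the claimed orders-of-magnitude speedup comes from.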
Related papers
- A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical
Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
- Solving Large-scale Spatial Problems with Convolutional Neural Networks [88.31876586547848]
We employ transfer learning to improve training efficiency for large-scale spatial problems.
We propose that a convolutional neural network (CNN) can be trained on small windows of signals, but evaluated on arbitrarily large signals with little to no performance degradation.
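The size-agnostic property behind this claim can be illustrated directly: a convolution's weights are independent of input length, so a kernel fit on short windows applies unchanged to a signal of any length. The kernel and signals below are arbitrary stand-ins for learned weights and real data.

```python
# Illustration (not from the paper) of why a CNN trained on small
# windows can be evaluated on arbitrarily large signals: the same
# kernel slides over any input length.

def conv1d_valid(signal, kernel):
    """'Valid'-mode 1-D correlation: slide the kernel over the signal."""
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]

kernel = [0.25, 0.5, 0.25]                  # stand-in for learned weights
short = [1.0, 2.0, 3.0, 4.0]                # a "training-sized" window
long = [float(i % 5) for i in range(100)]   # a much longer signal

out_short = conv1d_valid(short, kernel)     # works on the small window
out_long = conv1d_valid(long, kernel)       # same weights, larger input
```

The output length adapts to the input (`len(signal) - len(kernel) + 1` in valid mode), while the parameter count stays fixed.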
arXiv Detail & Related papers (2023-06-14T01:24:42Z)
- A Generic Performance Model for Deep Learning in a Distributed Environment [0.7829352305480285]
We propose a generic performance model of an application in a distributed environment with a generic expression of the application execution time.
We have evaluated the proposed model on three deep learning frameworks (including MXNet and PyTorch).
arXiv Detail & Related papers (2023-05-19T13:30:34Z)
- Dual adaptive training of photonic neural networks [30.86507809437016]
Photonic neural networks (PNNs) compute with photons instead of electrons, offering low latency, high energy efficiency, and high parallelism.
Existing training approaches cannot address the extensive accumulation of systematic errors in large-scale PNNs.
We propose dual adaptive training (DAT), which allows the PNN model to adapt to substantial systematic errors.
arXiv Detail & Related papers (2022-12-09T05:03:45Z)
- LegoNet: A Fast and Exact Unlearning Architecture [59.49058450583149]
Machine unlearning aims to erase the impact of specific training samples from a trained model upon deletion requests.
We present a novel network, namely LegoNet, which adopts the framework of "fixed encoder + multiple adapters".
We show that LegoNet accomplishes fast and exact unlearning while maintaining acceptable performance, outperforming unlearning baselines.
arXiv Detail & Related papers (2022-10-28T09:53:05Z) - Mixed-Privacy Forgetting in Deep Networks [114.3840147070712]
We show that the influence of a subset of the training samples can be removed from the weights of a network trained on large-scale image classification tasks.
Inspired by real-world applications of forgetting techniques, we introduce a novel notion of forgetting in mixed-privacy setting.
We show that our method allows forgetting without having to trade off the model accuracy.
arXiv Detail & Related papers (2020-12-24T19:34:56Z) - SpaceNet: Make Free Space For Continual Learning [15.914199054779438]
We propose a novel architecture-based method, referred to as SpaceNet, for the class-incremental learning scenario.
SpaceNet trains sparse deep neural networks from scratch in an adaptive way that compresses the sparse connections of each task in a compact number of neurons.
Experimental results show the robustness of our proposed method against catastrophic forgetting of old tasks and the efficiency of SpaceNet in utilizing the available capacity of the model.
arXiv Detail & Related papers (2020-07-15T11:21:31Z)
- Influence Functions in Deep Learning Are Fragile [52.31375893260445]
Influence functions approximate the effect of training samples on test-time predictions.
Influence estimates are fairly accurate for shallow networks.
Hessian regularization is important for obtaining high-quality influence estimates.
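The influence-function idea can be made concrete with a toy model of my own construction (not from the paper): for a model that fits the mean of its training targets, the first-order influence estimate of removing one sample can be compared against exact retraining.

```python
# Toy illustration of influence functions. For the squared loss
# L(theta) = 0.5 * sum_i (theta - y_i)^2, the minimizer is the mean
# and the average per-sample Hessian is 1, so the estimated change in
# theta from removing sample i is (1/n) * H^{-1} * grad_i
# = (theta - y_i) / n, versus the exact change (theta - y_i) / (n - 1).

def mean_model(ys):
    return sum(ys) / len(ys)

def influence_estimate(ys, i):
    """First-order estimate of the change in theta if sample i is removed."""
    theta = mean_model(ys)
    return (theta - ys[i]) / len(ys)

ys = [1.0, 2.0, 3.0, 10.0]            # last point is an outlier
theta = mean_model(ys)                # fitted parameter: 4.0
est = influence_estimate(ys, 3)       # estimated shift from dropping y_3
exact = mean_model(ys[:3]) - theta    # exact shift from retraining
```

The estimate and the exact retrained shift agree in sign and rough magnitude here; the paper's point is that this agreement degrades for deep networks, where the Hessian is large and ill-conditioned.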
arXiv Detail & Related papers (2020-06-25T18:25:59Z)
- A Survey on Impact of Transient Faults on BNN Inference Accelerators [0.9667631210393929]
The big data boom enables easy access to and analysis of very large data sets.
Deep learning models require significant computation power and an extremely high number of memory accesses.
In this study, we demonstrate that the impact of soft errors on a customized deep learning algorithm might cause drastic image misclassification.
arXiv Detail & Related papers (2020-04-10T16:15:55Z)
- Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs).
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.