Mini-data-driven Deep Arbitrary Polynomial Chaos Expansion for
Uncertainty Quantification
- URL: http://arxiv.org/abs/2107.10428v1
- Date: Thu, 22 Jul 2021 02:49:07 GMT
- Title: Mini-data-driven Deep Arbitrary Polynomial Chaos Expansion for
Uncertainty Quantification
- Authors: Xiaohu Zheng, Jun Zhang, Ning Wang, Guijian Tang, Wen Yao
- Abstract summary: This paper proposes a deep arbitrary polynomial chaos expansion (Deep aPCE) method to improve the balance between surrogate model accuracy and training data cost.
Four numerical examples and an actual engineering problem are used to verify the effectiveness of the Deep aPCE method.
- Score: 9.586968666707529
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: The surrogate model-based uncertainty quantification method has drawn a lot
of attention in recent years. Both polynomial chaos expansion (PCE) and deep
learning (DL) are powerful methods for building a surrogate model.
However, PCE needs to increase the expansion order to improve the accuracy
of the surrogate model, which requires more labeled data to solve for the expansion
coefficients, and DL also needs a large amount of labeled data to train the neural
network model. This paper proposes a deep arbitrary polynomial chaos expansion
(Deep aPCE) method to improve the balance between surrogate model accuracy and
training data cost. On the one hand, a multilayer perceptron (MLP) model is
used to solve for the adaptive expansion coefficients of the arbitrary polynomial
chaos expansion, which improves the accuracy of the Deep aPCE model at a lower
expansion order. On the other hand, the properties of the adaptive arbitrary
polynomial chaos expansion are used to construct the MLP training cost function
based on only a small amount of labeled data and a large amount of unlabeled data, which can
significantly reduce the training data cost. Four numerical examples and an
actual engineering problem are used to verify the effectiveness of the Deep
aPCE method.
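To make the two ideas in the abstract concrete, the following is a minimal PyTorch-style sketch, not the authors' implementation. It assumes a one-dimensional standard-normal input and uses normalized Hermite polynomials as a stand-in for the data-driven aPC basis that the paper constructs from sample moments; the names (DeepAPCE, hermite_basis, semi_supervised_loss) are illustrative, and the unlabeled loss term is only a simplified placeholder for the paper's aPCE-property-based cost function.

```python
import torch
import torch.nn as nn

def hermite_basis(x):
    """Normalized probabilists' Hermite polynomials He_0..He_3 for x ~ N(0, 1).
    They stand in for the data-driven aPC basis built from sample moments."""
    he = torch.stack([torch.ones_like(x), x, x ** 2 - 1, x ** 3 - 3 * x], dim=-1)
    norm = torch.sqrt(torch.tensor([1.0, 1.0, 2.0, 6.0], dtype=x.dtype))  # sqrt(k!)
    return he / norm

class DeepAPCE(nn.Module):
    """MLP that maps an input sample to adaptive expansion coefficients; the
    surrogate prediction is the coefficient-weighted sum of basis values.
    The order is fixed at 3 here to match hermite_basis above."""
    def __init__(self, order=3, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(1, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, order + 1),
        )

    def forward(self, x):                          # x: (n, 1)
        coeffs = self.mlp(x)                       # (n, order + 1) adaptive coefficients
        basis = hermite_basis(x.squeeze(-1))       # (n, order + 1) basis values
        return (coeffs * basis).sum(dim=-1), coeffs

def semi_supervised_loss(model, x_lab, y_lab, x_unlab):
    """Labeled mean-squared error plus an unlabeled term. The unlabeled term
    only ties the surrogate's sample mean to the mean implied by the zeroth
    coefficient (a property of an orthonormal expansion); it is a simplified
    placeholder for the paper's aPCE-property-based cost."""
    y_hat, _ = model(x_lab)
    labeled = ((y_hat - y_lab) ** 2).mean()
    y_unlab, c_unlab = model(x_unlab)
    unlabeled = (y_unlab.mean() - c_unlab[:, 0].mean()) ** 2
    return labeled + unlabeled
```

A training loop would draw a few labeled evaluations of the true model and many cheap unlabeled input samples, then minimize semi_supervised_loss with any standard optimizer.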
Related papers
- Pushing the Limits of Large Language Model Quantization via the Linearity Theorem [71.3332971315821]
We present a "line theoremarity" establishing a direct relationship between the layer-wise $ell$ reconstruction error and the model perplexity increase due to quantization.
This insight enables two novel applications: (1) a simple data-free LLM quantization method using Hadamard rotations and MSE-optimal grids, dubbed HIGGS, and (2) an optimal solution to the problem of finding non-uniform per-layer quantization levels.
arXiv Detail & Related papers (2024-11-26T15:35:44Z) - Computation-Aware Gaussian Processes: Model Selection And Linear-Time Inference [55.150117654242706]
We show that model selection for computation-aware GPs trained on 1.8 million data points can be done within a few hours on a single GPU.
As a result of this work, Gaussian processes can be trained on large-scale datasets without significantly compromising their ability to quantify uncertainty.
arXiv Detail & Related papers (2024-11-01T21:11:48Z) - Federated Bayesian Deep Learning: The Application of Statistical Aggregation Methods to Bayesian Models [0.9940108090221528]
Aggregation strategies have been developed to pool or fuse the weights and biases of distributed deterministic models.
We show that simply applying the aggregation methods associated with federated learning (FL) schemes for deterministic models is either impossible or results in sub-optimal performance.
arXiv Detail & Related papers (2024-03-22T15:02:24Z) - REMEDI: Corrective Transformations for Improved Neural Entropy Estimation [0.7488108981865708]
We introduce $\texttt{REMEDI}$ for efficient and accurate estimation of differential entropy.
Our approach demonstrates improvement across a broad spectrum of estimation tasks.
It can be naturally extended to information theoretic supervised learning models.
arXiv Detail & Related papers (2024-02-08T14:47:37Z) - Towards a Better Theoretical Understanding of Independent Subnetwork Training [56.24689348875711]
We take a closer theoretical look at Independent Subnetwork Training (IST).
IST is a recently proposed and highly effective technique for reducing the cost of distributed training.
We identify fundamental differences between IST and alternative approaches, such as distributed methods with compressed communication.
arXiv Detail & Related papers (2023-06-28T18:14:22Z) - To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis [50.31589712761807]
Large language models (LLMs) are notoriously token-hungry during pre-training, and high-quality text data on the web is approaching its scaling limit for LLMs.
We investigate the consequences of repeating pre-training data, revealing that the model is susceptible to overfitting.
We then examine the key factors contributing to multi-epoch degradation, finding that dataset size, model parameters, and training objectives all play a significant role.
arXiv Detail & Related papers (2023-05-22T17:02:15Z) - Quantized Adaptive Subgradient Algorithms and Their Applications [39.103587572626026]
We propose quantized composite mirror descent adaptive subgradient (QCMD adagrad) and quantized regularized dual average adaptive subgradient (QRDA adagrad) for distributed training.
A quantized gradient-based adaptive learning rate matrix is constructed to achieve a balance between communication costs, accuracy, and model sparsity.
arXiv Detail & Related papers (2022-08-11T04:04:03Z) - Parameterized Consistency Learning-based Deep Polynomial Chaos Neural
Network Method for Reliability Analysis in Aerospace Engineering [3.541245871465521]
Polynomial chaos expansion (PCE) is a powerful surrogate-model-based reliability analysis method in aerospace engineering, but a high-order PCE model requires a large amount of labeled data to solve for its expansion coefficients.
To alleviate this problem, this paper proposes a parameterized consistency learning-based deep polynomial chaos neural network (Deep PCNN) method.
The Deep PCNN method can significantly reduce the training data cost in constructing a high-order PCE model.
arXiv Detail & Related papers (2022-03-29T15:15:12Z) - Deep Probabilistic Graphical Modeling [2.2691593216516863]
This thesis develops deep probabilistic graphical modeling (DPGM)
DPGM leverages deep learning (DL) to make probabilistic graphical models (PGMs) more flexible.
One model class we develop extends exponential family PCA using neural networks to improve predictive performance.
arXiv Detail & Related papers (2021-04-25T03:48:02Z) - Adaptive Subcarrier, Parameter, and Power Allocation for Partitioned
Edge Learning Over Broadband Channels [69.18343801164741]
Partitioned edge learning (PARTEL) implements parameter-server training, a well-known distributed learning method, in a wireless network.
We consider the case of deep neural network (DNN) models which can be trained using PARTEL by introducing some auxiliary variables.
arXiv Detail & Related papers (2020-10-08T15:27:50Z) - Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs)
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.