Improving the performance of Stein variational inference through extreme sparsification of physically-constrained neural network models
- URL: http://arxiv.org/abs/2407.00761v1
- Date: Sun, 30 Jun 2024 16:50:11 GMT
- Title: Improving the performance of Stein variational inference through extreme sparsification of physically-constrained neural network models
- Authors: Govinda Anantha Padmanabha, Jan Niklas Fuhg, Cosmin Safta, Reese E. Jones, Nikolaos Bouklas
- Abstract summary: We show that $L_0$ sparsification prior to Stein variational gradient descent ($L_0$+SVGD) is a more robust and efficient means of uncertainty quantification.
Specifically, $L_0$+SVGD demonstrates superior resilience to noise, the ability to perform well in extrapolated regions, and a faster convergence rate to an optimal solution.
- Score: 0.815557531820863
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Most scientific machine learning (SciML) applications of neural networks involve hundreds to thousands of parameters, and hence, uncertainty quantification for such models is plagued by the curse of dimensionality. Using physical applications, we show that $L_0$ sparsification prior to Stein variational gradient descent ($L_0$+SVGD) is a more robust and efficient means of uncertainty quantification, in terms of computational cost and performance, than the direct application of SVGD or projected SVGD methods. Specifically, $L_0$+SVGD demonstrates superior resilience to noise, the ability to perform well in extrapolated regions, and a faster convergence rate to an optimal solution.
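For orientation, below is a minimal NumPy sketch of one SVGD step applied to an $L_0$-sparsified parameter vector, assuming an RBF kernel with the median bandwidth heuristic; the `mask` (from a prior $L_0$ pruning pass), the function names, and the step size are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def svgd_step(X, grad_log_p, mask, eps=1e-2):
    """One SVGD update on n particles X of shape (n, d).

    grad_log_p: callable returning the per-particle score, shape (n, d).
    mask: binary (d,) vector from a prior L0 pruning pass (assumption);
          pruned coordinates are held at zero throughout.
    """
    n = X.shape[0]
    diffs = X[:, None, :] - X[None, :, :]              # (n, n, d)
    sq = np.sum(diffs ** 2, axis=-1)                   # (n, n)
    h = np.median(sq) / np.log(n + 1) + 1e-12          # median heuristic
    K = np.exp(-sq / h)                                # RBF kernel matrix
    dK = (-2.0 / h) * K[..., None] * diffs             # grad_{x_j} k(x_j, x_i)
    # phi(x_i) = (1/n) sum_j [k(x_j, x_i) grad log p(x_j) + grad_{x_j} k(x_j, x_i)]
    phi = (K @ grad_log_p(X) + dK.sum(axis=0)) / n
    return (X + eps * phi) * mask                      # keep pruned weights at 0
```

Sparsifying first shrinks the effective dimension d that the particles must cover, which is exactly the curse of dimensionality the abstract refers to.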
Related papers
- Fast Cell Library Characterization for Design Technology Co-Optimization Based on Graph Neural Networks [0.1752969190744922]
Design technology co-optimization (DTCO) plays a critical role in achieving optimal power, performance, and area.
We propose a graph neural network (GNN)-based machine learning model for rapid and accurate cell library characterization.
arXiv Detail & Related papers (2023-12-20T06:10:27Z)
- Stochastic Gradient Langevin Dynamics Based on Quantization with Increasing Resolution [0.0]
We propose an alternative descent learning equation based on quantized optimization for non-convex objective functions.
We demonstrate the effectiveness of the proposed method on vanilla convolutional neural network (CNN) models and other architectures across various data sets.
arXiv Detail & Related papers (2023-05-30T08:55:59Z)
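To make the idea concrete, here is a minimal sketch of a stochastic gradient Langevin dynamics (SGLD) step with parameters snapped to a quantization grid; the coupling of step size and grid resolution is an assumption for illustration, and the paper derives its own increasing-resolution schedule.

```python
import numpy as np

def quantized_sgld_step(theta, grad_U, eta, resolution, rng):
    """One SGLD step followed by projection onto a uniform grid.

    grad_U: gradient of the potential (negative log posterior).
    resolution: grid spacing; shrinking it over iterations mimics the
    paper's increasing-resolution schedule (assumption).
    """
    noise = np.sqrt(2.0 * eta) * rng.standard_normal(theta.shape)
    theta = theta - eta * grad_U(theta) + noise        # vanilla SGLD step
    return np.round(theta / resolution) * resolution   # quantize parameters
```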
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been demonstrated to be effective in solving forward and inverse differential equation problems.
However, PINNs can become trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
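As a sketch of the implicit update: instead of the explicit step $\theta_{k+1} = \theta_k - \eta \nabla L(\theta_k)$, ISGD solves $\theta_{k+1} = \theta_k - \eta \nabla L(\theta_{k+1})$, which damps instabilities at large step sizes. A minimal fixed-point version follows; the inner solver is an assumption, and the paper may use a different one.

```python
import numpy as np

def implicit_sgd_step(theta, grad_L, eta, inner_iters=20):
    """Solve theta_new = theta - eta * grad_L(theta_new) by fixed point.

    grad_L: (stochastic) gradient of the PINN loss at a parameter vector.
    """
    theta_new = theta.copy()
    for _ in range(inner_iters):
        theta_new = theta - eta * grad_L(theta_new)
    return theta_new
```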
- NeuralStagger: Accelerating Physics-constrained Neural PDE Solver with Spatial-temporal Decomposition [67.46012350241969]
This paper proposes a general acceleration methodology called NeuralStagger.
It decomposes the original learning task into several coarser-resolution subtasks.
We demonstrate the successful application of NeuralStagger on 2D and 3D fluid dynamics simulations.
arXiv Detail & Related papers (2023-02-20T19:36:52Z)
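A minimal sketch of the spatial half of the decomposition, assuming a factor-of-s staggering of a 2D field with dimensions divisible by s; NeuralStagger also staggers in time, and the function names are illustrative.

```python
import numpy as np

def stagger_2d(u, s=2):
    """Split a 2D field into s*s interleaved coarse sub-fields.

    Each sub-field covers the whole domain at 1/s the resolution, so a
    cheaper solver can be learned per subtask and run in parallel.
    """
    return [u[i::s, j::s] for i in range(s) for j in range(s)]

def unstagger_2d(subs, shape, s=2):
    """Reassemble the full-resolution field from the sub-fields."""
    u = np.empty(shape, dtype=subs[0].dtype)
    k = 0
    for i in range(s):
        for j in range(s):
            u[i::s, j::s] = subs[k]
            k += 1
    return u
```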
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
- Physics-enhanced deep surrogates for partial differential equations [30.731686639510517]
We present a "physics-enhanced deep-surrogate" ("PEDS") approach towards developing fast surrogate models for complex physical systems.
Specifically, a combination of a low-fidelity, explainable physics simulator and a neural network generator is proposed, which is trained end-to-end to globally match the output of an expensive high-fidelity numerical solver.
arXiv Detail & Related papers (2021-11-10T18:43:18Z)
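A minimal PyTorch-style sketch of the PEDS composition, in which a stand-in differentiable low-fidelity solver is appended to a neural network generator and the pair is trained end-to-end; the solver, layer sizes, and names are all assumptions for illustration.

```python
import torch
import torch.nn as nn

def coarse_solver(g):
    # stand-in low-fidelity "physics": a fixed, differentiable stencil
    return 0.5 * (g + g.roll(1, dims=-1))

class PEDS(nn.Module):
    def __init__(self, n_design, n_coarse):
        super().__init__()
        # generator: maps design parameters to a coarse-grid geometry
        self.generator = nn.Sequential(
            nn.Linear(n_design, 64), nn.ReLU(), nn.Linear(64, n_coarse))

    def forward(self, p):
        return coarse_solver(self.generator(p))

# trained end-to-end against high-fidelity targets y_hf, e.g.:
#   loss = ((model(p) - y_hf) ** 2).mean(); loss.backward()
```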
- Differentially private training of neural networks with Langevin dynamics for calibrated predictive uncertainty [58.730520380312676]
We show that differentially private stochastic gradient descent (DP-SGD) can yield poorly calibrated, overconfident deep learning models.
This represents a serious issue for safety-critical applications, e.g. in medical diagnosis.
arXiv Detail & Related papers (2021-07-09T08:14:45Z)
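For context, the standard DP-SGD step (clip each per-example gradient, then add Gaussian noise) looks roughly like the sketch below; it is this clipping-plus-noise mechanism that the paper links to poor calibration. Shapes and the noise convention are the usual ones, not taken from the paper.

```python
import numpy as np

def dp_sgd_step(theta, per_example_grads, eta, rng, clip=1.0, sigma=1.0):
    """One DP-SGD step on per-example gradients of shape (batch, d)."""
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
    noisy_sum = clipped.sum(axis=0) + sigma * clip * rng.standard_normal(theta.shape)
    return theta - eta * noisy_sum / per_example_grads.shape[0]
```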
- Stabilizing Training of Generative Adversarial Nets via Langevin Stein Variational Gradient Descent [11.329376606876101]
We propose to stabilize GAN training via a novel particle-based variational inference method -- Langevin Stein variational gradient descent (LSVGD).
We show that the LSVGD dynamics has an implicit regularization effect that enhances the spread and diversity of the particles.
arXiv Detail & Related papers (2020-04-22T11:20:04Z)
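Relative to plain SVGD, the Langevin variant injects diffusion noise into the particle update, which is the source of the extra spread. A minimal sketch follows; the noise scale alpha is an assumption, and the paper derives the exact term from Langevin dynamics.

```python
import numpy as np

def lsvgd_step(X, phi, rng, eps=1e-2, alpha=0.1):
    """One LSVGD update: SVGD direction phi plus Langevin noise.

    X: particles (n, d); phi: SVGD update direction for X, e.g. computed
    as in the svgd_step sketch earlier on this page.
    """
    noise = rng.standard_normal(X.shape)
    return X + eps * phi + np.sqrt(2.0 * eps * alpha) * noise
```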
- Highly Efficient Salient Object Detection with 100K Parameters [137.74898755102387]
We propose a flexible convolutional module, namely generalized OctConv (gOctConv), to efficiently utilize both in-stage and cross-stage multi-scale features.
We build an extremely lightweight model, namely CSNet, which achieves performance comparable to large models with only about 0.2% of their parameters (100k) on popular salient object detection benchmarks.
arXiv Detail & Related papers (2020-03-12T07:00:46Z)
- Stochasticity in Neural ODEs: An Empirical Study [68.8204255655161]
Regularization of neural networks (e.g. dropout) is a widespread technique in deep learning that allows for better generalization.
We show that data augmentation during training improves the performance of both the deterministic and stochastic versions of the same model.
However, the improvement obtained by data augmentation completely eliminates the empirical gains from stochastic regularization, making the difference in performance between neural ODEs and neural SDEs negligible.
arXiv Detail & Related papers (2020-02-22T22:12:56Z)