Speeding up and reducing memory usage for scientific machine learning
via mixed precision
- URL: http://arxiv.org/abs/2401.16645v1
- Date: Tue, 30 Jan 2024 00:37:57 GMT
- Title: Speeding up and reducing memory usage for scientific machine learning
via mixed precision
- Authors: Joel Hayford, Jacob Goldman-Wetzler, Eric Wang, and Lu Lu
- Abstract summary: Training neural networks for partial differential equations requires large amounts of memory and computational resources.
In search of computational efficiency, training neural networks using half precision (float16) has gained substantial interest.
We explore mixed precision, which combines the float16 and float32 numerical formats to reduce memory usage and increase computational speed.
Our experiments showcase that mixed precision training not only substantially decreases training times and memory demands but also maintains model accuracy.
- Score: 3.746841257785099
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scientific machine learning (SciML) has emerged as a versatile approach to
address complex computational science and engineering problems. Within this
field, physics-informed neural networks (PINNs) and deep operator networks
(DeepONets) stand out as the leading techniques for solving partial
differential equations by incorporating both physical equations and
experimental data. However, training PINNs and DeepONets requires significant
computational resources, including long computational times and large amounts
of memory. In search of computational efficiency, training neural networks
using half precision (float16) rather than the conventional single (float32) or
double (float64) precision has gained substantial interest, given the inherent
benefits of reduced computational time and memory consumed. However, we find
that float16 cannot be applied to SciML methods, because of gradient divergence
at the start of training, weight updates going to zero, and the inability to
converge to a local minimum. To overcome these limitations, we explore mixed
precision, which is an approach that combines the float16 and float32 numerical
formats to reduce memory usage and increase computational speed. Our
experiments showcase that mixed precision training not only substantially
decreases training times and memory demands but also maintains model accuracy.
We also reinforce our empirical observations with a theoretical analysis. The
research has broad implications for SciML in various computational
applications.
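As a concrete illustration of the recipe described in the abstract, the sketch below applies the standard mixed-precision pattern, float16 autocasting for the forward pass plus dynamic loss scaling so that small float16 gradients do not underflow to zero, to a toy regression problem in PyTorch. It is a minimal sketch of the general technique, not the authors' code: the network, data, and hyperparameters are placeholders, and the same wrapper applies to PINN or DeepONet losses.

```python
import torch
import torch.nn as nn

# Minimal mixed-precision training loop (placeholder problem, not the paper's setup).
device = "cuda" if torch.cuda.is_available() else "cpu"
use_amp = device == "cuda"  # float16 autocast is only active on the GPU

model = nn.Sequential(
    nn.Linear(1, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 1),
).to(device)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)  # dynamic loss scaling

x = torch.linspace(0.0, 1.0, 256, device=device).unsqueeze(1)
y = torch.sin(torch.pi * x)  # stand-in for experimental or simulation data

for step in range(1000):
    optimizer.zero_grad(set_to_none=True)
    # Forward pass and loss are computed in float16 where it is safe;
    # numerically sensitive ops and the master weights stay in float32.
    with torch.autocast(device_type="cuda", dtype=torch.float16, enabled=use_amp):
        loss = torch.mean((model(x) - y) ** 2)
    # Scale the loss before backward so small gradients survive float16,
    # then unscale and apply the float32 weight update.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```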
Related papers
- Guaranteed Approximation Bounds for Mixed-Precision Neural Operators [83.64404557466528]
We build on the intuition that neural operator learning inherently induces an approximation error.
We show that our approach reduces GPU memory usage by up to 50% and improves throughput by 58% with little or no reduction in accuracy.
arXiv Detail & Related papers (2023-07-27T17:42:06Z)
- NeuralStagger: Accelerating Physics-constrained Neural PDE Solver with Spatial-temporal Decomposition [67.46012350241969]
This paper proposes a general acceleration methodology called NeuralStagger.
It decomposes the original learning tasks into several coarser-resolution subtasks.
We demonstrate the successful application of NeuralStagger on 2D and 3D fluid dynamics simulations.
arXiv Detail & Related papers (2023-02-20T19:36:52Z)
- Deep learning applied to computational mechanics: A comprehensive review, state of the art, and the classics [77.34726150561087]
Recent developments in artificial neural networks, particularly deep learning (DL), are reviewed in detail.
Both hybrid and pure machine learning (ML) methods are discussed.
History and limitations of AI are recounted and discussed, with particular attention to pointing out misstatements or misconceptions of the classics.
arXiv Detail & Related papers (2022-12-18T02:03:00Z)
- Is Integer Arithmetic Enough for Deep Learning Training? [2.9136421025415205]
Replacing floating-point arithmetic with low-bit integer arithmetic is a promising approach to reduce the energy consumption, memory footprint, and latency of deep learning models.
We propose a fully functional integer training pipeline including forward pass, back-propagation, and gradient descent.
Our experimental results show that our proposed method is effective in a wide variety of tasks such as classification (including vision transformers), object detection, and semantic segmentation.
arXiv Detail & Related papers (2022-07-18T22:36:57Z)
- Deep Neural Networks to Correct Sub-Precision Errors in CFD [0.0]
Several machine learning techniques have been successful in correcting the errors arising from spatial discretization.
We employ a Convolutional Neural Network together with a fully differentiable numerical solver performing 16-bit arithmetic to learn a tightly-coupled ML-CFD hybrid solver.
Compared to the 16-bit solver, we demonstrate the efficacy of the ML-CFD hybrid solver towards reducing the error accumulation in the velocity field and improving the kinetic energy spectrum at higher frequencies.
arXiv Detail & Related papers (2022-02-09T02:32:40Z)
- Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State [66.2457134675891]
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware.
Most existing methods imitate the backpropagation framework and feedforward architectures for artificial neural networks.
We propose a novel training method that does not rely on the exact reverse of the forward computation.
arXiv Detail & Related papers (2021-09-29T07:46:54Z)
- PositNN: Training Deep Neural Networks with Mixed Low-Precision Posit [5.534626267734822]
The presented research aims to evaluate the feasibility of training deep convolutional neural networks using posits.
A software framework was developed to use simulated posits and quires in end-to-end training and inference.
Results suggest that 8-bit posits can substitute 32-bit floats during training with no negative impact on the resulting loss and accuracy.
arXiv Detail & Related papers (2021-04-30T19:30:37Z)
- Reduced Precision Strategies for Deep Learning: A High Energy Physics Generative Adversarial Network Use Case [0.19788841311033123]
A promising approach to make deep learning more efficient is to quantize the parameters of the neural networks to reduced precision.
In this paper we analyse the effects of low precision inference on a complex deep generative adversarial network model.
arXiv Detail & Related papers (2021-03-18T10:20:23Z)
- Ps and Qs: Quantization-aware pruning for efficient low latency neural network inference [56.24109486973292]
We study the interplay between pruning and quantization during the training of neural networks for ultra low latency applications.
We find that quantization-aware pruning yields more computationally efficient models than either pruning or quantization alone for our task.
arXiv Detail & Related papers (2021-02-22T19:00:05Z)
- Large-scale Neural Solvers for Partial Differential Equations [48.7576911714538]
Solving partial differential equations (PDEs) is an indispensable part of many branches of science, as many processes can be modelled in terms of PDEs.
Recent numerical solvers require manual discretization of the underlying equation as well as sophisticated, tailored code for distributed computing.
We examine the applicability of continuous, mesh-free neural solvers for partial differential equations, namely physics-informed neural networks (PINNs); a minimal sketch of the PINN loss follows this list.
We discuss the accuracy of GatedPINN with respect to analytical solutions -- as well as state-of-the-art numerical solvers, such as spectral solvers.
arXiv Detail & Related papers (2020-09-08T13:26:51Z)
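Following up on the PINN-based entries above and the PINN losses referenced in the main abstract, here is a minimal sketch of how "incorporating physical equations" looks in code: the loss combines the PDE residual, obtained by automatic differentiation of the network output, with the boundary conditions. The 1D Poisson problem, network width, and optimizer settings are illustrative assumptions, not taken from any paper listed here; in a mixed-precision run, this loop could be wrapped in the autocast/GradScaler pattern sketched after the abstract.

```python
import torch
import torch.nn as nn

# Hypothetical 1D Poisson problem: -u''(x) = pi^2 * sin(pi * x) on (0, 1),
# with u(0) = u(1) = 0, whose exact solution is u(x) = sin(pi * x).
net = nn.Sequential(
    nn.Linear(1, 32), nn.Tanh(),
    nn.Linear(32, 32), nn.Tanh(),
    nn.Linear(32, 1),
)

def pinn_loss(net: nn.Module, n_collocation: int = 128) -> torch.Tensor:
    """PDE residual + boundary loss; derivatives come from automatic differentiation."""
    x = torch.rand(n_collocation, 1, requires_grad=True)  # interior collocation points
    u = net(x)
    # First and second derivatives of the network output w.r.t. its input.
    u_x = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x, x, torch.ones_like(u_x), create_graph=True)[0]
    f = torch.pi ** 2 * torch.sin(torch.pi * x)
    residual = -u_xx - f                      # vanishes for the exact solution
    boundary = torch.tensor([[0.0], [1.0]])
    loss_pde = torch.mean(residual ** 2)
    loss_bc = torch.mean(net(boundary) ** 2)  # enforces u(0) = u(1) = 0
    return loss_pde + loss_bc

optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(2000):
    optimizer.zero_grad(set_to_none=True)
    loss = pinn_loss(net)
    loss.backward()
    optimizer.step()
```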