Distributed Multigrid Neural Solvers on Megavoxel Domains
- URL: http://arxiv.org/abs/2104.14538v1
- Date: Thu, 29 Apr 2021 17:53:22 GMT
- Title: Distributed Multigrid Neural Solvers on Megavoxel Domains
- Authors: Aditya Balu, Sergio Botelho, Biswajit Khara, Vinay Rao, Chinmay Hegde,
Soumik Sarkar, Santi Adavani, Adarsh Krishnamurthy, Baskar
Ganapathysubramanian
- Abstract summary: We consider distributed training of PDE solvers producing full field outputs.
A scalable framework is presented that integrates two distinct advances.
This approach is deployed to train a generalized 3D Poisson solver that scales well and predicts full-field output solutions.
- Score: 27.412837974378597
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We consider the distributed training of large-scale neural networks that
serve as PDE solvers producing full field outputs. We specifically consider
neural solvers for the generalized 3D Poisson equation over megavoxel domains.
A scalable framework is presented that integrates two distinct advances. First,
we accelerate the training of a large model via a method analogous to the multigrid
technique used in numerical linear algebra. Here, the network is trained using
a hierarchy of increasing-resolution inputs in sequence, analogous to the 'V',
'W', 'F', and 'Half-V' cycles used in multigrid approaches. In conjunction with
the multigrid approach, we implement a distributed deep learning framework
that significantly reduces the time to solution. We show the scalability of this
approach on both GPU clusters (Azure cloud VMs) and CPU clusters (PSC Bridges-2). This
approach is deployed to train a generalized 3D Poisson solver that scales well,
predicting full-field output solutions up to a resolution of 512x512x512 for
a high-dimensional family of inputs.
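As a rough illustration of the multigrid-style training schedule described in the abstract, the sketch below trains a single resolution-agnostic fully convolutional network over a coarse-to-fine-to-coarse ('V-cycle'-like) hierarchy of input resolutions. The names (PoissonNet, v_cycle, train_multigrid), the random placeholder data, and the loss are illustrative assumptions, not the authors' implementation; the distributed aspect would additionally wrap the model, e.g. with torch.nn.parallel.DistributedDataParallel.

```python
# Minimal sketch (assumed, not the authors' code) of multigrid-style training:
# the same fully convolutional solver is trained on a hierarchy of input
# resolutions visited in a V-cycle-like order (coarse -> fine -> coarse).
import torch
import torch.nn as nn

class PoissonNet(nn.Module):
    """Toy fully convolutional 3D network; resolution-agnostic by construction."""
    def __init__(self, channels=16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(1, channels, 3, padding=1), nn.ReLU(),
            nn.Conv3d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv3d(channels, 1, 3, padding=1),
        )

    def forward(self, x):
        return self.body(x)

def v_cycle(resolutions):
    """Coarse-to-fine then fine-to-coarse visit order, e.g. [32, 64, 128, 64, 32]."""
    return resolutions + resolutions[-2::-1]

def train_multigrid(model, loss_fn, optimizer, epochs_per_level=2,
                    resolutions=(32, 64, 128), batch_size=4, device="cpu"):
    model.to(device)
    for res in v_cycle(list(resolutions)):
        for _ in range(epochs_per_level):
            # Placeholder data: random source terms f on a res^3 voxel grid.
            # In the paper the target is the full-field Poisson solution u;
            # here a zero tensor stands in for the reference/physics-based loss.
            f = torch.randn(batch_size, 1, res, res, res, device=device)
            u_ref = torch.zeros_like(f)
            optimizer.zero_grad()
            loss = loss_fn(model(f), u_ref)
            loss.backward()
            optimizer.step()

if __name__ == "__main__":
    net = PoissonNet()
    # On GPU/CPU clusters the model would be wrapped with
    # torch.nn.parallel.DistributedDataParallel after
    # torch.distributed.init_process_group(...).
    train_multigrid(net, nn.MSELoss(), torch.optim.Adam(net.parameters(), lr=1e-3))
```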
Related papers
- Error Analysis of Three-Layer Neural Network Trained with PGD for Deep Ritz Method [7.723218675113336]
We employ a three-layer tanh neural network within the framework of the deep Ritz method to solve second-order elliptic equations.
We perform projected gradient descent to train the three-layer network and we establish its global convergence.
We present an error bound in terms of the sample size $n$, and our work provides guidance on how to set the network depth, width, step size, and number of iterations for the projected gradient descent algorithm.
arXiv Detail & Related papers (2024-05-19T05:07:09Z)
- Transferable Neural Wavefunctions for Solids [5.219568203653524]
We show how to optimize a single ansatz across all of these variations.
We successfully transfer a network, pre-trained on 2x2x2 supercells of LiH, to 3x3x3 supercells.
arXiv Detail & Related papers (2024-05-13T09:59:59Z)
- Solving the Discretised Multiphase Flow Equations with Interface Capturing on Structured Grids Using Machine Learning Libraries [0.6299766708197884]
This paper solves the discretised multiphase flow equations using tools and methods from machine-learning libraries.
For the first time, finite element discretisations of multiphase flows can be solved using an approach based on (untrained) convolutional neural networks.
arXiv Detail & Related papers (2024-01-12T18:42:42Z)
- Multigrid-Augmented Deep Learning Preconditioners for the Helmholtz Equation using Compact Implicit Layers [7.56372030029358]
We present a deep learning-based iterative approach to solve the discrete heterogeneous Helmholtz equation for high wavenumbers.
We construct a multilevel U-Net-like encoder-solver CNN with an implicit layer on the coarsest grid of the U-Net, where convolution kernels are inverted.
Our architecture can be used to generalize over different slowness models of various difficulties and is efficient at solving for many right-hand sides per slowness model.
arXiv Detail & Related papers (2023-06-30T08:56:51Z)
- On Optimizing the Communication of Model Parallelism [74.15423270435949]
We study a novel and important communication pattern in large-scale model-parallel deep learning (DL).
In cross-mesh resharding, a sharded tensor needs to be sent from a source device mesh to a destination device mesh.
We propose two contributions to address cross-mesh resharding: an efficient broadcast-based communication system, and an "overlapping-friendly" pipeline schedule.
arXiv Detail & Related papers (2022-11-10T03:56:48Z)
- Variable Bitrate Neural Fields [75.24672452527795]
We present a dictionary method for compressing feature grids, reducing their memory consumption by up to 100x.
We formulate the dictionary optimization as a vector-quantized auto-decoder problem which lets us learn end-to-end discrete neural representations in a space where no direct supervision is available.
arXiv Detail & Related papers (2022-06-15T17:58:34Z)
- Multigrid-augmented deep learning preconditioners for the Helmholtz equation [4.18804572788063]
We present a data-driven approach to solve the discrete heterogeneous Helmholtz equation at high wavenumbers.
We combine classical iterative solvers with convolutional neural networks (CNNs) to form a preconditioner which is applied within a Krylov solver.
arXiv Detail & Related papers (2022-03-14T10:31:11Z)
- ResNet-LDDMM: Advancing the LDDMM Framework Using Deep Residual Networks [86.37110868126548]
In this work, we make use of deep residual neural networks to solve the non-stationary ODE (flow equation) based on Euler's discretization scheme.
We illustrate these ideas on diverse registration problems of 3D shapes under complex topology-preserving transformations.
arXiv Detail & Related papers (2021-02-16T04:07:13Z)
- Implicit Convex Regularizers of CNN Architectures: Convex Optimization of Two- and Three-Layer Networks in Polynomial Time [70.15611146583068]
We study training of Convolutional Neural Networks (CNNs) with ReLU activations.
We introduce exact convex optimization formulations whose complexity is polynomial in the number of data samples, the number of neurons, and the data dimension.
arXiv Detail & Related papers (2020-06-26T04:47:20Z)
- Neural Networks are Convex Regularizers: Exact Polynomial-time Convex Optimization Formulations for Two-layer Networks [70.15611146583068]
We develop exact representations of training two-layer neural networks with rectified linear units (ReLUs).
Our theory utilizes semi-infinite duality and minimum norm regularization.
arXiv Detail & Related papers (2020-02-24T21:32:41Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.