Towards Training Billion Parameter Graph Neural Networks for Atomic
Simulations
- URL: http://arxiv.org/abs/2203.09697v1
- Date: Fri, 18 Mar 2022 01:54:34 GMT
- Title: Towards Training Billion Parameter Graph Neural Networks for Atomic
Simulations
- Authors: Anuroop Sriram, Abhishek Das, Brandon M. Wood, Siddharth Goyal, C.
Lawrence Zitnick
- Abstract summary: We introduce Graph Parallelism, a method to distribute input graphs across multiple GPUs, enabling us to train very large GNNs with hundreds of millions or billions of parameters.
On the large-scale Open Catalyst 2020 dataset, these graph-parallelized models lead to relative improvements of 1) 15% on the force MAE metric for the S2EF task and 2) 21% on the AFbT metric for the IS2RS task.
- Score: 11.328193255838986
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent progress in Graph Neural Networks (GNNs) for modeling atomic
simulations has the potential to revolutionize catalyst discovery, which is a
key step in making progress towards the energy breakthroughs needed to combat
climate change. However, the GNNs that have proven most effective for this task
are memory intensive as they model higher-order interactions in the graphs such
as those between triplets or quadruplets of atoms, making it challenging to
scale these models. In this paper, we introduce Graph Parallelism, a method to
distribute input graphs across multiple GPUs, enabling us to train very large
GNNs with hundreds of millions or billions of parameters. We empirically
evaluate our method by scaling up the number of parameters of the recently
proposed DimeNet++ and GemNet models by over an order of magnitude. On the
large-scale Open Catalyst 2020 (OC20) dataset, these graph-parallelized models
lead to relative improvements of 1) 15% on the force MAE metric for the S2EF
task and 2) 21% on the AFbT metric for the IS2RS task, establishing new
state-of-the-art results.
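To make the idea concrete, here is a minimal sketch of the general pattern behind graph parallelism: the edges of a single input graph are sharded across workers, each worker aggregates messages for its shard, and the partial results are summed (an all-reduce across GPUs in a real system). The toy layer, function names, and partitioning scheme below are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch of graph parallelism: the edges of one input graph are
# split across workers, each worker aggregates messages for its edge shard,
# and the partial sums are combined (an all-reduce in a real multi-GPU setup).
# Names and the partitioning scheme are assumptions, not the paper's code.
import torch

def message_passing_sharded(x, edge_index, num_shards=4):
    """x: [N, F] node features; edge_index: [2, E] (src, dst) pairs."""
    n, f = x.shape
    src, dst = edge_index
    shards = torch.chunk(torch.arange(src.numel()), num_shards)
    out = torch.zeros(n, f)
    for shard in shards:  # each iteration stands in for one GPU
        partial = torch.zeros(n, f)
        partial.index_add_(0, dst[shard], x[src[shard]])  # local aggregation
        out += partial  # all-reduce across GPUs in a real implementation
    return out

x = torch.randn(5, 8)
edge_index = torch.tensor([[0, 1, 2, 3, 4, 0], [1, 2, 3, 4, 0, 2]])
sharded = message_passing_sharded(x, edge_index)
dense = torch.zeros_like(x).index_add_(0, edge_index[1], x[edge_index[0]])
assert torch.allclose(sharded, dense)  # sharding preserves the result
```

Because message aggregation is a sum, sharding the edge set changes where the work happens but not the result, which is what lets the memory of higher-order interactions be spread over many devices.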
Related papers
- Scalable Training of Trustworthy and Energy-Efficient Predictive Graph Foundation Models for Atomistic Materials Modeling: A Case Study with HydraGNN [5.386946356430465]
We develop and train scalable, trustworthy, and energy-efficient predictive graph foundation models (GFMs) using HydraGNN.
HydraGNN expands the boundaries of graph neural network (GNN) computations in both training scale and data diversity.
Our GFMs use multi-task learning (MTL) to simultaneously learn graph-level and node-level properties of atomistic structures.
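A minimal sketch of the multi-task setup described above, with node-level and graph-level heads sharing one GNN trunk; the layer sizes, mean-pool readout, and dummy loss are illustrative assumptions, not HydraGNN's actual architecture.

```python
# Minimal sketch of multi-task heads on a shared GNN trunk, in the spirit of
# the MTL setup described above. Sizes and readout are assumptions.
import torch
import torch.nn as nn

class TwoHeadGNN(nn.Module):
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.encoder = nn.Linear(in_dim, hidden)   # stand-in for MP layers
        self.node_head = nn.Linear(hidden, 1)      # node-level property
        self.graph_head = nn.Linear(hidden, 1)     # graph-level property

    def forward(self, x, edge_index):
        src, dst = edge_index
        h = torch.relu(self.encoder(x))
        agg = torch.zeros_like(h).index_add_(0, dst, h[src])  # one MP step
        h = h + agg
        node_pred = self.node_head(h).squeeze(-1)  # e.g. per-atom quantity
        graph_pred = self.graph_head(h.mean(dim=0))  # e.g. per-structure quantity
        return node_pred, graph_pred

model = TwoHeadGNN(in_dim=16)
x = torch.randn(10, 16)
edge_index = torch.randint(0, 10, (2, 30))
node_pred, graph_pred = model(x, edge_index)
loss = node_pred.pow(2).mean() + graph_pred.pow(2).sum()  # joint MTL loss (dummy)
loss.backward()
```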
arXiv Detail & Related papers (2024-06-12T21:21:42Z)
- On the Scalability of GNNs for Molecular Graphs [7.402389334892391]
Graph Neural Networks (GNNs) are yet to show the benefits of scale due to the lower efficiency of sparse operations, large data requirements, and lack of clarity about the effectiveness of various architectures.
We analyze message-passing networks, graph Transformers, and hybrid architectures on the largest public collection of 2D molecular graphs.
For the first time, we observe that GNNs benefit tremendously from increasing scale in depth, width, number of molecules, number of labels, and the diversity of the pretraining datasets.
arXiv Detail & Related papers (2024-04-17T17:11:31Z)
- Graph Transformers for Large Graphs [57.19338459218758]
This work advances representation learning on single large-scale graphs with a focus on identifying model characteristics and critical design constraints.
A key innovation of this work lies in the creation of a fast neighborhood sampling technique coupled with a local attention mechanism.
We report a 3x speedup and a 16.8% performance gain on ogbn-products and snap-patents, and scale LargeGT to ogbn-papers100M with a 5.9% performance improvement.
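A hedged sketch of the general "sample a small neighborhood, then attend only within it" pattern named above; the uniform sampler and single attention layer are illustrative stand-ins, not LargeGT's actual components.

```python
# Hedged sketch: sample a small neighborhood, then run attention only inside
# it. The sampler and attention layer are assumptions, not LargeGT itself.
import torch
import torch.nn as nn

def sample_neighbors(adj_list, node, k):
    """Uniformly sample up to k neighbors of `node` from an adjacency list."""
    nbrs = adj_list[node]
    idx = torch.randperm(len(nbrs))[:k]
    return [nbrs[i] for i in idx]

adj_list = {0: [1, 2, 3, 4], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2], 4: [0]}
x = torch.randn(5, 16)                    # node features
attn = nn.MultiheadAttention(embed_dim=16, num_heads=2, batch_first=True)

node = 0
ctx = [node] + sample_neighbors(adj_list, node, k=3)
tokens = x[ctx].unsqueeze(0)              # [1, k+1, F]: local token set
out, _ = attn(tokens, tokens, tokens)     # attention restricted to the sample
h_node = out[0, 0]                        # updated representation of `node`
```

Restricting attention to a sampled neighborhood keeps the per-node cost bounded by the sample size instead of the graph size.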
arXiv Detail & Related papers (2023-12-18T11:19:23Z)
- Training Graph Neural Networks on Growing Stochastic Graphs [114.75710379125412]
Graph Neural Networks (GNNs) rely on graph convolutions to exploit meaningful patterns in networked data.
We propose to learn GNNs on very large graphs by leveraging the limit object of a sequence of growing graphs, the graphon.
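A toy sketch of the graphon idea, assuming a simple smooth graphon W(u, v): graphs of growing size are sampled from the same limit object, and a GNN would be trained across the sequence. The graphon choice and training loop below are illustrative, not the paper's procedure.

```python
# Toy sketch of training on a sequence of growing graphs sampled from a
# graphon W(u, v), the limit object mentioned above. Assumptions throughout.
import numpy as np

def sample_graph_from_graphon(n, W, rng):
    """Sample an n-node graph: latents u_i ~ U[0,1], edge prob W(u_i, u_j)."""
    u = rng.uniform(size=n)
    probs = W(u[:, None], u[None, :])
    adj = (rng.uniform(size=(n, n)) < probs).astype(float)
    adj = np.triu(adj, 1)
    return adj + adj.T, u

W = lambda a, b: 0.8 * np.exp(-3.0 * np.abs(a - b))  # an assumed smooth graphon
rng = np.random.default_rng(0)
for n in (50, 100, 200, 400):            # growing graph sizes
    adj, u = sample_graph_from_graphon(n, W, rng)
    # ... one or more GNN training steps on (adj, u) would go here ...
    print(n, adj.sum() / (n * (n - 1)))  # edge density stays roughly stable
```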
arXiv Detail & Related papers (2022-10-27T16:00:45Z)
- Dynamic Graph Message Passing Networks for Visual Recognition [112.49513303433606]
Modelling long-range dependencies is critical for scene understanding tasks in computer vision.
A fully-connected graph is beneficial for such modelling, but its computational overhead is prohibitive.
We propose a dynamic graph message passing network that significantly reduces the computational complexity.
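A rough sketch of the underlying idea of restricting message passing to a small, input-dependent node subset; the top-k similarity rule is an assumption (and the dense scoring step is kept only for brevity), not the paper's sampling mechanism.

```python
# Rough sketch of passing messages over a small, dynamically chosen node
# subset instead of a fully-connected graph. The top-k rule is an assumption,
# and the dense similarity scan is kept only to keep the sketch short.
import torch

def dynamic_message_passing(x, k=4):
    """x: [N, F]. Each node receives messages from its k most similar nodes."""
    sim = x @ x.t()                        # [N, N] similarity scores
    sim.fill_diagonal_(float("-inf"))      # no self-messages
    topk = sim.topk(k, dim=1).indices      # input-dependent neighbor choice
    msgs = x[topk].mean(dim=1)             # aggregate only k messages per node
    return x + msgs                        # message cost O(N*k), not O(N^2)

x = torch.randn(100, 32)
out = dynamic_message_passing(x)
```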
arXiv Detail & Related papers (2022-09-20T14:41:37Z)
- SCARA: Scalable Graph Neural Networks with Feature-Oriented Optimization [23.609017952951454]
We propose SCARA, a scalable Graph Neural Network (GNN) with feature-oriented optimization for graph computation.
SCARA efficiently computes graph embedding from node features, and further selects and reuses feature results to reduce overhead.
SCARA completes precomputation on Papers100M (111M nodes, 1.6B edges), the largest available billion-scale GNN dataset, in 100 seconds.
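A minimal sketch of feature-side precomputation in this spirit: each feature column is propagated once through a truncated PPR-style recursion, and the result is reused across all of training. The recursion and constants are assumptions, not SCARA's algorithm.

```python
# Minimal sketch of feature-oriented precomputation: propagate each feature
# column once, then reuse the result. Constants are assumptions, not SCARA.
import numpy as np

def propagate_feature(P, col, alpha=0.15, steps=10):
    """Truncated alpha * sum_t (1-alpha)^t P^t col for one feature column."""
    out, v = np.zeros_like(col), col.copy()
    for _ in range(steps):
        out += alpha * v
        v = (1 - alpha) * (P @ v)
    return out

rng = np.random.default_rng(0)
n, f = 200, 8
A = (rng.uniform(size=(n, n)) < 0.05).astype(float)
P = A / np.maximum(A.sum(axis=1, keepdims=True), 1)   # row-normalized adjacency
X = rng.normal(size=(n, f))

emb = np.stack([propagate_feature(P, X[:, j]) for j in range(f)], axis=1)
# emb is computed once and then reused across all training epochs and models,
# which is the point of decoupled, feature-side precomputation.
```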
arXiv Detail & Related papers (2022-07-19T10:32:11Z)
- Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks [52.566735716983956]
We propose a graph gradual pruning framework termed CGP to dynamically prune GNNs.
Unlike LTH-based methods, the proposed CGP approach requires no re-training, which significantly reduces the computation costs.
Our proposed strategy greatly improves both training and inference efficiency while matching or even exceeding the accuracy of existing methods.
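A hedged sketch of gradual magnitude pruning applied during training with no separate retraining phase, using the cubic sparsity schedule common in the pruning literature; the schedule and threshold rule are assumptions, not CGP's criterion.

```python
# Hedged sketch of gradual magnitude pruning during training (no separate
# retraining phase). Schedule and threshold rule are assumptions, not CGP.
import torch
import torch.nn as nn

def sparsity_at(step, total, final=0.9):
    """Cubic schedule: target sparsity ramps smoothly from 0 to `final`."""
    return final * (1 - (1 - step / total) ** 3)

layer = nn.Linear(64, 64)
opt = torch.optim.SGD(layer.parameters(), lr=0.1)
total = 100
for step in range(total):
    x = torch.randn(32, 64)
    loss = layer(x).pow(2).mean()          # dummy objective
    opt.zero_grad(); loss.backward(); opt.step()
    # prune in place: zero the smallest-magnitude weights at this step's level
    s = sparsity_at(step + 1, total)
    w = layer.weight.data
    thresh = w.abs().flatten().kthvalue(int(s * w.numel()) + 1).values
    w[w.abs() < thresh] = 0.0
```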
arXiv Detail & Related papers (2022-07-18T14:23:31Z)
- Neighbor2Seq: Deep Learning on Massive Graphs by Transforming Neighbors to Sequences [55.329402218608365]
We propose Neighbor2Seq, which transforms the hierarchical neighborhood of each node into a sequence.
We evaluate our method on a massive graph with more than 111 million nodes and 1.6 billion edges.
Results show that our proposed method is scalable to massive graphs and achieves superior performance across massive and medium-scale graphs.
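An illustrative sketch of flattening a hierarchical neighborhood into a sequence, with one aggregated token per hop; the mean aggregator and hop count are assumptions, not Neighbor2Seq's exact operator.

```python
# Illustrative sketch: turn a node's hierarchical neighborhood into a
# sequence, one aggregated token per hop, which a sequence model can consume.
# The mean aggregator and hop count are assumptions, not Neighbor2Seq's op.
import numpy as np

def neighborhood_to_sequence(adj, x, node, hops=3):
    """Return [hops+1, F]: token h is the mean feature over hop-h neighbors."""
    n = adj.shape[0]
    reach = np.zeros(n, bool); reach[node] = True
    frontier = reach.copy()
    seq = [x[node]]
    for _ in range(hops):
        nxt = adj[frontier].any(axis=0) & ~reach       # nodes at the next hop
        seq.append(x[nxt].mean(axis=0) if nxt.any() else np.zeros(x.shape[1]))
        reach |= nxt; frontier = nxt
    return np.stack(seq)            # feed this to an MLP/Transformer offline

rng = np.random.default_rng(0)
adj = (rng.uniform(size=(50, 50)) < 0.08)
adj = adj | adj.T
x = rng.normal(size=(50, 16))
seq = neighborhood_to_sequence(adj, x, node=0)         # shape (4, 16)
```

Since the sequences can be precomputed offline, the expensive graph traversal is removed from the training loop entirely.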
arXiv Detail & Related papers (2022-02-07T16:38:36Z)
- Rotation Invariant Graph Neural Networks using Spin Convolutions [28.4962005849904]
Machine learning approaches have the potential to approximate Density Functional Theory (DFT) in a computationally efficient manner.
We introduce a novel approach to modeling angular information between sets of neighboring atoms in a graph neural network.
Results are demonstrated on the large-scale Open Catalyst 2020 dataset.
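A minimal sketch of the kind of rotation-invariant angular signal such models capture, assuming a generic pairwise bond-angle featurization; this stand-in is not the paper's spin-convolution operator.

```python
# Minimal sketch of extracting angular information between neighboring atoms.
# Explicit pairwise angles are a generic stand-in, not the paper's operator.
import numpy as np

def neighbor_angles(pos, center, neighbors):
    """Angles (radians) between every pair of center->neighbor bond vectors."""
    vecs = pos[neighbors] - pos[center]
    vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)
    cos = np.clip(vecs @ vecs.T, -1.0, 1.0)
    i, j = np.triu_indices(len(neighbors), k=1)
    return np.arccos(cos[i, j])     # rotation-invariant angular features

pos = np.array([[0.0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]])
angles = neighbor_angles(pos, center=0, neighbors=[1, 2, 3])  # all ~pi/2
```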
arXiv Detail & Related papers (2021-06-17T14:59:34Z)
- Scaling Graph Neural Networks with Approximate PageRank [64.92311737049054]
We present the PPRGo model which utilizes an efficient approximation of information diffusion in GNNs.
In addition to being faster, PPRGo is inherently scalable, and can be trivially parallelized for large datasets like those found in industry settings.
We show that training PPRGo and predicting labels for all nodes in a large graph takes under 2 minutes on a single machine, far outpacing other baselines on the same graph.
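A hedged sketch of the PPRGo pattern as described above: approximate a node's personalized PageRank vector, then form its prediction as a PPR-weighted combination of cheap per-node logits. The power-iteration PPR and random weights are assumptions, not PPRGo's exact code.

```python
# Hedged sketch of the PPRGo pattern: approximate personalized PageRank, then
# predict via PPR-weighted averaging of per-node logits. Assumptions throughout.
import numpy as np

def approx_ppr(P, seed, alpha=0.15, steps=20):
    """Truncated power iteration for one personalized PageRank vector."""
    n = P.shape[0]
    e = np.zeros(n); e[seed] = 1.0
    pi = e.copy()
    for _ in range(steps):
        pi = alpha * e + (1 - alpha) * (P.T @ pi)
    return pi

rng = np.random.default_rng(0)
n, f, c = 100, 16, 4
A = (rng.uniform(size=(n, n)) < 0.05).astype(float)
P = A / np.maximum(A.sum(axis=1, keepdims=True), 1)  # row-stochastic
X = rng.normal(size=(n, f))
Wout = rng.normal(size=(f, c))
logits = X @ Wout                 # cheap, purely per-node predictions
pi = approx_ppr(P, seed=0)
pred_0 = pi @ logits              # diffusion applied to predictions only
```

Because the expensive diffusion acts on predictions rather than on every hidden layer, the per-node work parallelizes trivially, which matches the scalability claim above.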
arXiv Detail & Related papers (2020-07-03T09:30:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.