Equiformer: Equivariant Graph Attention Transformer for 3D Atomistic
Graphs
- URL: http://arxiv.org/abs/2206.11990v1
- Date: Thu, 23 Jun 2022 21:40:37 GMT
- Title: Equiformer: Equivariant Graph Attention Transformer for 3D Atomistic
Graphs
- Authors: Yi-Lun Liao and Tess Smidt
- Abstract summary: 3D-related inductive biases are indispensable to graph neural networks operating on 3D atomistic graphs such as molecules.
Inspired by the success of Transformers in various domains, we study how to incorporate these inductive biases into Transformers.
We present Equiformer, a graph neural network leveraging the strength of Transformer architectures.
- Score: 3.0603554929274908
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: 3D-related inductive biases like translational invariance and rotational
equivariance are indispensable to graph neural networks operating on 3D
atomistic graphs such as molecules. Inspired by the success of Transformers in
various domains, we study how to incorporate these inductive biases into
Transformers. In this paper, we present Equiformer, a graph neural network
leveraging the strength of Transformer architectures and incorporating
$SE(3)/E(3)$-equivariant features based on irreducible representations
(irreps). Irreps features encode equivariant information in channel dimensions
without complicating graph structures. The simplicity enables us to directly
incorporate them by replacing original operations with equivariant
counterparts. Moreover, to better adapt Transformers to 3D graphs, we propose a
novel equivariant graph attention, which considers both content and geometric
information such as relative position contained in irreps features. To improve
expressivity of the attention, we replace dot product attention with
multi-layer perceptron attention and include non-linear message passing. We
benchmark Equiformer on two quantum properties prediction datasets, QM9 and
OC20. For QM9, among models trained with the same data partition, Equiformer
achieves best results on 11 out of 12 regression tasks. For OC20, under the
setting of training with IS2RE data and optionally IS2RS data, Equiformer
improves upon state-of-the-art models. Code reproducing all main results will
be available soon.
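As a rough illustration of two ideas from the abstract, attention that mixes node content with geometric information, and MLP attention with non-linear message passing, the sketch below implements a toy graph attention layer in PyTorch. It uses only rotation- and translation-invariant scalars (interatomic distances) rather than the paper's $SE(3)/E(3)$ irreps features, and every class, function, and shape here is an illustrative assumption, not Equiformer's actual code.

```python
# Toy graph attention where the attention weights come from an MLP over
# (content + geometric) features instead of a query-key dot product.
# NOTE: this sketch uses only invariant scalars (pairwise distances), not the
# irreps features of the actual Equiformer; all names/shapes are illustrative.
import torch
import torch.nn as nn


class MLPAttentionLayer(nn.Module):
    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.src_proj = nn.Linear(dim, hidden)
        self.dst_proj = nn.Linear(dim, hidden)
        self.geo_proj = nn.Linear(1, hidden)   # embeds the pairwise distance
        self.attn_mlp = nn.Sequential(         # MLP attention instead of dot product
            nn.SiLU(), nn.Linear(hidden, hidden), nn.SiLU(), nn.Linear(hidden, 1)
        )
        self.msg_mlp = nn.Sequential(           # non-linear message passing
            nn.Linear(hidden, hidden), nn.SiLU(), nn.Linear(hidden, dim)
        )

    def forward(self, x, pos, edge_index):
        # x: (N, dim) node features; pos: (N, 3) coordinates;
        # edge_index: (2, E) source/destination node indices.
        src, dst = edge_index
        dist = (pos[src] - pos[dst]).norm(dim=-1, keepdim=True)  # invariant geometry
        z = self.src_proj(x[src]) + self.dst_proj(x[dst]) + self.geo_proj(dist)
        logits = self.attn_mlp(z).squeeze(-1)                    # (E,)
        # softmax over the incoming edges of each destination node
        alpha = torch.zeros_like(logits)
        for node in dst.unique():
            mask = dst == node
            alpha[mask] = torch.softmax(logits[mask], dim=0)
        msg = alpha.unsqueeze(-1) * self.msg_mlp(z)              # weighted non-linear messages
        out = torch.zeros_like(x)
        out.index_add_(0, dst, msg)                              # aggregate per destination
        return x + out


# Tiny usage example on a random 5-atom "molecule".
if __name__ == "__main__":
    layer = MLPAttentionLayer(dim=16)
    x = torch.randn(5, 16)
    pos = torch.randn(5, 3)
    edge_index = torch.tensor([[0, 1, 2, 3, 4, 1], [1, 0, 1, 2, 3, 4]])
    print(layer(x, pos, edge_index).shape)  # torch.Size([5, 16])
```

Producing the attention weight with a small MLP over the combined content and geometric embedding, rather than a dot product, is the design choice the abstract highlights; the full model applies the same idea to irreps features via equivariant operations, which this invariant-only toy omits.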
Related papers
- S^2Former-OR: Single-Stage Bi-Modal Transformer for Scene Graph Generation in OR [50.435592120607815]
Scene graph generation (SGG) of surgical procedures is crucial in enhancing holistic cognitive intelligence in the operating room (OR).
Previous works have primarily relied on multi-stage learning, where the generated semantic scene graphs depend on intermediate processes with pose estimation and object detection.
In this study, we introduce a novel single-stage bi-modal transformer framework for SGG in the OR, termed S2Former-OR.
arXiv Detail & Related papers (2024-02-22T11:40:49Z) - Triplet Interaction Improves Graph Transformers: Accurate Molecular Graph Learning with Triplet Graph Transformers [26.11060210663556]
We propose the Triplet Graph Transformer (TGT) that enables direct communication between pairs within a 3-tuple of nodes.
TGT is applied to molecular property prediction by first predicting interatomic distances from 2D graphs and then using these distances for downstream tasks.
arXiv Detail & Related papers (2024-02-07T02:53:06Z) - B-cos Alignment for Inherently Interpretable CNNs and Vision
Transformers [97.75725574963197]
We present a new direction for increasing the interpretability of deep neural networks (DNNs) by promoting weight-input alignment during training.
We show that a sequence of such transformations induces a single linear transformation that faithfully summarises the full model computations.
We show that the resulting explanations are of high visual quality and perform well under quantitative interpretability metrics.
arXiv Detail & Related papers (2023-06-19T12:54:28Z) - Transformers over Directed Acyclic Graphs [6.263470141349622]
We study transformers over directed acyclic graphs (DAGs) and propose architecture adaptations tailored to DAGs.
We show that it is effective in making graph transformers generally outperform graph neural networks tailored to DAGs and in improving SOTA graph transformer performance in terms of both quality and efficiency.
arXiv Detail & Related papers (2022-10-24T12:04:52Z) - The Lie Derivative for Measuring Learned Equivariance [84.29366874540217]
We study the equivariance properties of hundreds of pretrained models, spanning CNNs, transformers, and Mixer architectures.
We find that many violations of equivariance can be linked to spatial aliasing in ubiquitous network layers, such as pointwise non-linearities.
For example, transformers can be more equivariant than convolutional neural networks after training.
arXiv Detail & Related papers (2022-10-06T15:20:55Z) - Pure Transformers are Powerful Graph Learners [51.36884247453605]
We show that standard Transformers without graph-specific modifications can lead to promising results in graph learning both in theory and practice.
We prove that this approach is theoretically at least as expressive as an invariant graph network (2-IGN) composed of equivariant linear layers.
Our method coined Tokenized Graph Transformer (TokenGT) achieves significantly better results compared to GNN baselines and competitive results.
arXiv Detail & Related papers (2022-07-06T08:13:06Z) - Transformer for Graphs: An Overview from Architecture Perspective [86.3545861392215]
It's imperative to sort out the existing Transformer models for graphs and systematically investigate their effectiveness on various graph tasks.
We first disassemble the existing models and identify three typical ways to incorporate the graph information into the vanilla Transformer.
Our experiments confirm the benefits of current graph-specific modules on Transformers and reveal their advantages on different kinds of graph tasks.
arXiv Detail & Related papers (2022-02-17T06:02:06Z) - Gophormer: Ego-Graph Transformer for Node Classification [27.491500255498845]
In this paper, we propose a novel Gophormer model which applies transformers on ego-graphs instead of full-graphs.
Specifically, a Node2Seq module is proposed to sample ego-graphs as the input of transformers, which alleviates the scalability challenge.
In order to handle the uncertainty introduced by the ego-graph sampling, we propose a consistency regularization and a multi-sample inference strategy.
arXiv Detail & Related papers (2021-10-25T16:43:32Z) - Nonlinearities in Steerable SO(2)-Equivariant CNNs [7.552100672006172]
We apply harmonic distortion analysis to illuminate the effect of nonlinearities on representations of SO(2).
We develop a novel FFT-based algorithm for computing representations of non-linearly transformed activations.
In experiments with 2D and 3D data, we obtain results that compare favorably to the state-of-the-art in terms of accuracy while maintaining continuous symmetry and exact equivariance.
arXiv Detail & Related papers (2021-09-14T17:53:45Z) - SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks [71.55002934935473]
We introduce the SE(3)-Transformer, a variant of the self-attention module for 3D point clouds and graphs, which is equivariant under continuous 3D roto-translations.
We evaluate our model on a toy N-body particle simulation dataset, showcasing the robustness of the predictions under rotations of the input; a minimal sketch of such a rotation-invariance check appears after this list.
arXiv Detail & Related papers (2020-06-18T13:23:01Z)
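Several of the papers above (the Lie-derivative study and the SE(3)-Transformer in particular) revolve around whether a model's predictions are unchanged, or change predictably, under 3D rotations and translations. The snippet below is a minimal sketch of such an empirical invariance check; the `invariance_error` helper and the toy distance-sum model are hypothetical stand-ins, not any paper's published code.

```python
# Empirically check rotation/translation invariance of a scalar-output model:
# apply random rigid motions to the input point cloud and compare outputs.
import torch


def random_rotation() -> torch.Tensor:
    # QR decomposition of a random matrix yields an orthogonal matrix;
    # flip one column's sign if needed so that det(R) = +1 (a proper rotation).
    q, _ = torch.linalg.qr(torch.randn(3, 3))
    if torch.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q


def invariance_error(model, pos: torch.Tensor, trials: int = 10) -> float:
    # Largest deviation of model(pos @ R^T + t) from model(pos) over random
    # rigid motions; close to zero for an E(3)-invariant model.
    base = model(pos)
    err = 0.0
    for _ in range(trials):
        rot, shift = random_rotation(), torch.randn(3)
        err = max(err, (model(pos @ rot.T + shift) - base).abs().max().item())
    return err


if __name__ == "__main__":
    # Hypothetical invariant model: sum of all pairwise distances.
    invariant_model = lambda p: torch.cdist(p, p).sum()
    print(invariance_error(invariant_model, torch.randn(8, 3)))  # small (float32 precision)
```

For an equivariant rather than invariant output, such as per-atom forces, the analogous check compares model(pos @ R.T) against model(pos) @ R.T instead of expecting identical outputs.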
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.