Equiformer: Equivariant Graph Attention Transformer for 3D Atomistic
Graphs
- URL: http://arxiv.org/abs/2206.11990v1
- Date: Thu, 23 Jun 2022 21:40:37 GMT
- Title: Equiformer: Equivariant Graph Attention Transformer for 3D Atomistic
Graphs
- Authors: Yi-Lun Liao and Tess Smidt
- Abstract summary: 3D-related inductive biases are indispensable to graph neural networks operating on 3D atomistic graphs such as molecules.
Inspired by the success of Transformers in various domains, we study how to incorporate these inductive biases into Transformers.
We present Equiformer, a graph neural network leveraging the strength of Transformer architectures.
- Score: 3.0603554929274908
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: 3D-related inductive biases like translational invariance and rotational
equivariance are indispensable to graph neural networks operating on 3D
atomistic graphs such as molecules. Inspired by the success of Transformers in
various domains, we study how to incorporate these inductive biases into
Transformers. In this paper, we present Equiformer, a graph neural network
leveraging the strength of Transformer architectures and incorporating
$SE(3)/E(3)$-equivariant features based on irreducible representations
(irreps). Irreps features encode equivariant information in channel dimensions
without complicating graph structures. The simplicity enables us to directly
incorporate them by replacing original operations with equivariant
counterparts. Moreover, to better adapt Transformers to 3D graphs, we propose a
novel equivariant graph attention, which considers both content and geometric
information such as relative position contained in irreps features. To improve
expressivity of the attention, we replace dot product attention with
multi-layer perceptron attention and include non-linear message passing. We
benchmark Equiformer on two quantum properties prediction datasets, QM9 and
OC20. For QM9, among models trained with the same data partition, Equiformer
achieves best results on 11 out of 12 regression tasks. For OC20, under the
setting of training with IS2RE data and optionally IS2RS data, Equiformer
improves upon state-of-the-art models. Code reproducing all main results will
be available soon.
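As a rough illustration of two ideas from the abstract, attention that mixes node content with geometric information, and MLP attention with non-linear message passing, the sketch below implements a toy graph attention layer in PyTorch. It uses only rotation- and translation-invariant scalars (interatomic distances) rather than the paper's $SE(3)/E(3)$ irreps features, and every class, function, and shape here is an illustrative assumption, not Equiformer's actual code.

```python
# Toy graph attention where the attention weights come from an MLP over
# (content + geometric) features instead of a query-key dot product.
# NOTE: this sketch uses only invariant scalars (pairwise distances), not the
# irreps features of the actual Equiformer; all names/shapes are illustrative.
import torch
import torch.nn as nn


class MLPAttentionLayer(nn.Module):
    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.src_proj = nn.Linear(dim, hidden)
        self.dst_proj = nn.Linear(dim, hidden)
        self.geo_proj = nn.Linear(1, hidden)   # embeds the pairwise distance
        self.attn_mlp = nn.Sequential(         # MLP attention instead of dot product
            nn.SiLU(), nn.Linear(hidden, hidden), nn.SiLU(), nn.Linear(hidden, 1)
        )
        self.msg_mlp = nn.Sequential(           # non-linear message passing
            nn.Linear(hidden, hidden), nn.SiLU(), nn.Linear(hidden, dim)
        )

    def forward(self, x, pos, edge_index):
        # x: (N, dim) node features; pos: (N, 3) coordinates;
        # edge_index: (2, E) source/destination node indices.
        src, dst = edge_index
        dist = (pos[src] - pos[dst]).norm(dim=-1, keepdim=True)  # invariant geometry
        z = self.src_proj(x[src]) + self.dst_proj(x[dst]) + self.geo_proj(dist)
        logits = self.attn_mlp(z).squeeze(-1)                    # (E,)
        # softmax over the incoming edges of each destination node
        alpha = torch.zeros_like(logits)
        for node in dst.unique():
            mask = dst == node
            alpha[mask] = torch.softmax(logits[mask], dim=0)
        msg = alpha.unsqueeze(-1) * self.msg_mlp(z)              # weighted non-linear messages
        out = torch.zeros_like(x)
        out.index_add_(0, dst, msg)                              # aggregate per destination
        return x + out


# Tiny usage example on a random 5-atom "molecule".
if __name__ == "__main__":
    layer = MLPAttentionLayer(dim=16)
    x = torch.randn(5, 16)
    pos = torch.randn(5, 3)
    edge_index = torch.tensor([[0, 1, 2, 3, 4, 1], [1, 0, 1, 2, 3, 4]])
    print(layer(x, pos, edge_index).shape)  # torch.Size([5, 16])
```

Producing the attention weight with a small MLP over the combined content and geometric embedding, rather than a dot product, is the design choice the abstract highlights; the full model applies the same idea to irreps features via equivariant operations, which this invariant-only toy omits.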
Related papers
- S^2Former-OR: Single-Stage Bi-Modal Transformer for Scene Graph Generation in OR [50.435592120607815]
Scene graph generation (SGG) of surgical procedures is crucial in enhancing holistic cognitive intelligence in the operating room (OR).
Previous works have primarily relied on multi-stage learning, where the generated semantic scene graphs depend on intermediate processes with pose estimation and object detection.
In this study, we introduce a novel single-stage bi-modal transformer framework for SGG in the OR, termed S2Former-OR.
arXiv Detail & Related papers (2024-02-22T11:40:49Z) - Triplet Interaction Improves Graph Transformers: Accurate Molecular Graph Learning with Triplet Graph Transformers [26.11060210663556]
We propose the Triplet Graph Transformer (TGT) that enables direct communication between pairs within a 3-tuple of nodes.
TGT is applied to molecular property prediction by first predicting interatomic distances from 2D graphs and then using these distances for downstream tasks.
arXiv Detail & Related papers (2024-02-07T02:53:06Z) - B-cos Alignment for Inherently Interpretable CNNs and Vision
Transformers [97.75725574963197]
We present a new direction for increasing the interpretability of deep neural networks (DNNs) by promoting weight-input alignment during training.
We show that a sequence of such transformations induces a single linear transformation that faithfully summarises the full model computations.
We show that the resulting explanations are of high visual quality and perform well under quantitative interpretability metrics.
arXiv Detail & Related papers (2023-06-19T12:54:28Z) - Transformers over Directed Acyclic Graphs [6.263470141349622]
We study transformers over directed acyclic graphs (DAGs) and propose architecture adaptations tailored to DAGs.
We show that it is effective in making graph transformers generally outperform graph neural networks tailored to DAGs and in improving SOTA graph transformer performance in terms of both quality and efficiency.
arXiv Detail & Related papers (2022-10-24T12:04:52Z) - The Lie Derivative for Measuring Learned Equivariance [84.29366874540217]
We study the equivariance properties of hundreds of pretrained models, spanning CNNs, transformers, and Mixer architectures.
We find that many violations of equivariance can be linked to spatial aliasing in ubiquitous network layers, such as pointwise non-linearities.
For example, transformers can be more equivariant than convolutional neural networks after training.
arXiv Detail & Related papers (2022-10-06T15:20:55Z) - Pure Transformers are Powerful Graph Learners [51.36884247453605]
We show that standard Transformers without graph-specific modifications can lead to promising results in graph learning both in theory and practice.
We prove that this approach is theoretically at least as expressive as an invariant graph network (2-IGN) composed of equivariant linear layers.
Our method coined Tokenized Graph Transformer (TokenGT) achieves significantly better results compared to GNN baselines and competitive results.
arXiv Detail & Related papers (2022-07-06T08:13:06Z) - Transformer for Graphs: An Overview from Architecture Perspective [86.3545861392215]
It's imperative to sort out the existing Transformer models for graphs and systematically investigate their effectiveness on various graph tasks.
We first disassemble the existing models and identify three typical ways to incorporate the graph information into the vanilla Transformer.
Our experiments confirm the benefits of current graph-specific modules on Transformers and reveal their advantages on different kinds of graph tasks.
arXiv Detail & Related papers (2022-02-17T06:02:06Z) - Gophormer: Ego-Graph Transformer for Node Classification [27.491500255498845]
In this paper, we propose a novel Gophormer model which applies transformers on ego-graphs instead of full-graphs.
Specifically, a Node2Seq module is proposed to sample ego-graphs as the input of transformers, which alleviates the scalability challenge.
In order to handle the uncertainty introduced by the ego-graph sampling, we propose a consistency regularization and a multi-sample inference strategy.
arXiv Detail & Related papers (2021-10-25T16:43:32Z) - Nonlinearities in Steerable SO(2)-Equivariant CNNs [7.552100672006172]
We apply harmonic distortion analysis to illuminate the effect of nonlinearities on representations of SO(2).
We develop a novel FFT-based algorithm for computing representations of non-linearly transformed activations.
In experiments with 2D and 3D data, we obtain results that compare favorably to the state-of-the-art in terms of accuracy while maintaining continuous symmetry and exact equivariance.
arXiv Detail & Related papers (2021-09-14T17:53:45Z) - SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks [71.55002934935473]
We introduce the SE(3)-Transformer, a variant of the self-attention module for 3D point clouds and graphs, which is equivariant under continuous 3D roto-translations.
We evaluate our model on a toy N-body particle simulation dataset, showcasing the robustness of the predictions under rotations of the input; a minimal sketch of such a rotation-invariance check appears after this list.
arXiv Detail & Related papers (2020-06-18T13:23:01Z)
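Several of the papers above (the Lie-derivative study and the SE(3)-Transformer in particular) revolve around whether a model's predictions are unchanged, or change predictably, under 3D rotations and translations. The snippet below is a minimal sketch of such an empirical invariance check; the `invariance_error` helper and the toy distance-sum model are hypothetical stand-ins, not any paper's published code.

```python
# Empirically check rotation/translation invariance of a scalar-output model:
# apply random rigid motions to the input point cloud and compare outputs.
import torch


def random_rotation() -> torch.Tensor:
    # QR decomposition of a random matrix yields an orthogonal matrix;
    # flip one column's sign if needed so that det(R) = +1 (a proper rotation).
    q, _ = torch.linalg.qr(torch.randn(3, 3))
    if torch.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q


def invariance_error(model, pos: torch.Tensor, trials: int = 10) -> float:
    # Largest deviation of model(pos @ R^T + t) from model(pos) over random
    # rigid motions; close to zero for an E(3)-invariant model.
    base = model(pos)
    err = 0.0
    for _ in range(trials):
        rot, shift = random_rotation(), torch.randn(3)
        err = max(err, (model(pos @ rot.T + shift) - base).abs().max().item())
    return err


if __name__ == "__main__":
    # Hypothetical invariant model: sum of all pairwise distances.
    invariant_model = lambda p: torch.cdist(p, p).sum()
    print(invariance_error(invariant_model, torch.randn(8, 3)))  # small (float32 precision)
```

For an equivariant rather than invariant output, such as per-atom forces, the analogous check compares model(pos @ R.T) against model(pos) @ R.T instead of expecting identical outputs.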
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.