SE(3)-Hyena Operator for Scalable Equivariant Learning
- URL: http://arxiv.org/abs/2407.01049v1
- Date: Mon, 1 Jul 2024 07:56:48 GMT
- Title: SE(3)-Hyena Operator for Scalable Equivariant Learning
- Authors: Artem Moskalev, Mangal Prakash, Rui Liao, Tommaso Mansi
- Abstract summary: We introduce SE(3)-Hyena, an equivariant long-convolutional model based on the Hyena operator.
Our model processes the geometric context of 20k tokens 3.5 times faster than the equivariant transformer.
- Score: 5.354533854744212
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modeling global geometric context while maintaining equivariance is crucial for accurate predictions in many fields such as biology, chemistry, or vision. Yet, this is challenging due to the computational demands of processing high-dimensional data at scale. Existing approaches, such as equivariant self-attention or distance-based message passing, suffer from quadratic complexity with respect to sequence length, while localized methods sacrifice global information. Inspired by the recent success of state-space and long-convolutional models, in this work, we introduce the SE(3)-Hyena operator, an equivariant long-convolutional model based on the Hyena operator. SE(3)-Hyena captures global geometric context at sub-quadratic complexity while maintaining equivariance to rotations and translations. Evaluated on equivariant associative recall and n-body modeling, SE(3)-Hyena matches or outperforms equivariant self-attention while requiring significantly less memory and computational resources for long sequences. Our model processes the geometric context of 20k tokens 3.5 times faster than the equivariant transformer and allows a context 175 times longer within the same memory budget.
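The sub-quadratic complexity mentioned in the abstract is typical of long-convolutional models, which can evaluate a sequence-length convolution with the FFT instead of a pairwise sum. The sketch below is a generic NumPy illustration of that idea only, not the SE(3)-Hyena operator itself; the function names `direct_causal_conv` and `fft_causal_conv` are hypothetical and do not come from the paper.

```python
import numpy as np

def direct_causal_conv(u, k):
    """Reference O(N^2) causal convolution: y[t] = sum_{s<=t} k[s] * u[t-s]."""
    n = len(u)
    y = np.zeros(n)
    for t in range(n):
        for s in range(t + 1):
            y[t] += k[s] * u[t - s]
    return y

def fft_causal_conv(u, k):
    """Same convolution in O(N log N) using a zero-padded FFT."""
    n = len(u)
    size = 2 * n  # padding prevents circular wrap-around
    spectrum = np.fft.rfft(u, n=size) * np.fft.rfft(k, n=size)
    return np.fft.irfft(spectrum, n=size)[:n]

# Illustrative data (assumption): a random signal and a kernel as long as the signal.
rng = np.random.default_rng(0)
u = rng.standard_normal(256)
k = rng.standard_normal(256)

# The two implementations should agree up to floating-point error.
max_abs_diff = np.max(np.abs(direct_causal_conv(u, k) - fft_causal_conv(u, k)))
print(max_abs_diff)
```

The direct version touches every pair of positions, which is the quadratic cost that attention-style mixing also pays, while the FFT version scales as N log N with sequence length.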
Related papers
- Approximately Equivariant Neural Processes [47.14384085714576]
We consider the use of approximately equivariant architectures in neural processes.
We demonstrate the effectiveness of our approach on a number of synthetic and real-world regression experiments.
arXiv Detail & Related papers (2024-06-19T12:17:14Z)
- Multivector Neurons: Better and Faster O(n)-Equivariant Clifford Graph Neural Networks [17.716680490388306]
In this work, we test a few novel message passing graph neural networks (GNNs) based on Clifford multivectors.
We push the state-of-the-art error on the N-body dataset to 0.0035; an 8% improvement over recent methods.
arXiv Detail & Related papers (2024-06-06T13:17:44Z)
- LongVQ: Long Sequence Modeling with Vector Quantization on Structured Memory [63.41820940103348]
The self-attention mechanism's computational cost limits its practicality for long sequences.
We propose a new method called LongVQ to compress the global abstraction as a fixed-length codebook.
LongVQ effectively maintains dynamic global and local patterns, which helps to compensate for the lack of long-range dependencies.
arXiv Detail & Related papers (2024-04-17T08:26:34Z)
- Hyena Hierarchy: Towards Larger Convolutional Language Models [115.82857881546089]
Hyena is a subquadratic drop-in replacement for attention constructed by interleaving implicitly parametrized long convolutions and data-controlled gating.
In recall and reasoning tasks on sequences of thousands to hundreds of thousands of tokens, Hyena improves accuracy by more than 50 points over operators relying on state-spaces and other implicit and explicit methods.
arXiv Detail & Related papers (2023-02-21T18:29:25Z)
- The Lie Derivative for Measuring Learned Equivariance [84.29366874540217]
We study the equivariance properties of hundreds of pretrained models, spanning CNNs, transformers, and Mixer architectures.
We find that many violations of equivariance can be linked to spatial aliasing in ubiquitous network layers, such as pointwise non-linearities.
For example, transformers can be more equivariant than convolutional neural networks after training.
arXiv Detail & Related papers (2022-10-06T15:20:55Z)
- Design equivariant neural networks for 3D point cloud [0.0]
This work seeks to improve the generalization and robustness of existing neural networks for 3D point clouds.
The main challenge when designing equivariant models for point clouds is how to trade off the performance of the model against its complexity.
The proposed procedure is general and forms a fundamental approach to group equivariant neural networks.
arXiv Detail & Related papers (2022-05-02T02:57:13Z)
- Equivariant vector field network for many-body system modeling [65.22203086172019]
Equivariant Vector Field Network (EVFN) is built on a novel equivariant basis and the associated scalarization and vectorization layers.
We evaluate our method on predicting trajectories of simulated Newton mechanics systems with both full and partially observed data.
arXiv Detail & Related papers (2021-10-26T14:26:25Z)
- Frame Averaging for Invariant and Equivariant Network Design [50.87023773850824]
We introduce Frame Averaging (FA), a framework for adapting known (backbone) architectures to become invariant or equivariant to new symmetry types.
We show that FA-based models have maximal expressive power in a broad setting.
We propose a new class of universal Graph Neural Networks (GNNs), universal Euclidean motion invariant point cloud networks, and Euclidean motion invariant Message Passing (MP) GNNs.
arXiv Detail & Related papers (2021-10-07T11:05:23Z)
- The Effects of Invertibility on the Representational Complexity of Encoders in Variational Autoencoders [16.27499951949733]
We show that if the generative map is "strongly invertible" (in a sense we suitably formalize), the inferential model need not be much more complex.
Importantly, we do not require the generative model to be layerwise invertible.
We provide theoretical support for the empirical wisdom that learning deep generative models is harder when data lies on a low-dimensional manifold.
arXiv Detail & Related papers (2021-07-09T19:53:29Z)
- Equivariant Point Network for 3D Point Cloud Analysis [17.689949017410836]
We propose an effective and practical SE(3) (3D translation and rotation) equivariant network for point cloud analysis.
First, we present SE(3) separable point convolution, a novel framework that breaks down the 6D convolution into two separable convolutional operators.
Second, we introduce an attention layer to effectively harness the expressiveness of the equivariant features.
arXiv Detail & Related papers (2021-03-25T21:57:10Z)