SE(3)-Hyena Operator for Scalable Equivariant Learning
- URL: http://arxiv.org/abs/2407.01049v2
- Date: Tue, 13 Aug 2024 15:06:41 GMT
- Title: SE(3)-Hyena Operator for Scalable Equivariant Learning
- Authors: Artem Moskalev, Mangal Prakash, Rui Liao, Tommaso Mansi
- Abstract summary: We introduce SE(3)-Hyena, an equivariant long-convolutional model based on the Hyena operator.
Our model processes the geometric context of 20k tokens 3.5x faster than the equivariant transformer.
- Score: 5.354533854744212
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modeling global geometric context while maintaining equivariance is crucial for accurate predictions in many fields such as biology, chemistry, or vision. Yet, this is challenging due to the computational demands of processing high-dimensional data at scale. Existing approaches, such as equivariant self-attention or distance-based message passing, suffer from quadratic complexity with respect to sequence length, while localized methods sacrifice global information. Inspired by the recent success of state-space and long-convolutional models, in this work, we introduce the SE(3)-Hyena operator, an equivariant long-convolutional model based on the Hyena operator. The SE(3)-Hyena captures global geometric context at sub-quadratic complexity while maintaining equivariance to rotations and translations. Evaluated on equivariant associative recall and n-body modeling, SE(3)-Hyena matches or outperforms equivariant self-attention while requiring significantly less memory and computational resources for long sequences. Our model processes the geometric context of 20k tokens 3.5x faster than the equivariant transformer and allows a 175x longer context within the same memory budget.
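The abstract only names the ingredients; as a hedged illustration of how an FFT-based long convolution can supply global context at sub-quadratic cost while keeping SE(3) equivariance, the PyTorch sketch below combines an O(N log N) circular convolution over rotation-invariant features with gating of centered coordinates by invariant scalars. Function names, shapes, and the gating scheme are assumptions for illustration, not the authors' implementation.

```python
# Hedged sketch, not the authors' implementation: an FFT long convolution over
# rotation-invariant features plus gating of centered coordinates by invariant
# scalars. Shapes, names, and the gating scheme are illustrative assumptions.
import torch


def fft_long_conv(u, k):
    """Circular convolution of u and k (both (N, C)) along the token axis via FFT, O(N log N)."""
    n = u.shape[0]
    return torch.fft.irfft(torch.fft.rfft(u, n=n, dim=0) * torch.fft.rfft(k, n=n, dim=0), n=n, dim=0)


def se3_equivariant_long_conv(f, x, k):
    """f: (N, C) invariant token features, x: (N, 3) coordinates, k: (N, C + 1) long-conv kernel.

    Returns updated invariant features (N, C) and SE(3)-equivariantly updated coordinates (N, 3).
    """
    centroid = x.mean(dim=0, keepdim=True)
    x_c = x - centroid                                    # subtracting the centroid removes translation
    radial = x_c.norm(dim=-1, keepdim=True)               # rotation-invariant scalar per token
    h = fft_long_conv(torch.cat([f, radial], dim=-1), k)  # global context in O(N log N)
    gate = torch.sigmoid(h[:, :1])                        # invariant gate per token
    x_out = centroid + gate * x_c                         # scaling vectors by invariants preserves equivariance
    return h[:, 1:], x_out
```

Subtracting the centroid removes translations, and scaling the centered vectors by rotation-invariant gates commutes with any rotation of the input, which is what keeps the block equivariant while the FFT convolution mixes information across all tokens.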
Related papers
- Does equivariance matter at scale? [15.247352029530523]
We study how equivariant and non-equivariant networks scale with compute and training samples.
First, equivariance improves data efficiency, but training non-equivariant models with data augmentation can close this gap given sufficient epochs.
Second, scaling with compute follows a power law, with equivariant models outperforming non-equivariant ones at each tested compute budget.
arXiv Detail & Related papers (2024-10-30T16:36:59Z)
- Relaxed Equivariance via Multitask Learning [7.905957228045955]
We introduce REMUL, a training procedure for approximating equivariance with multitask learning.
We show that unconstrained models can learn approximate symmetries by minimizing an additional simple equivariance loss.
Our method achieves competitive performance compared to equivariant baselines while being $10\times$ faster at inference and $2.5\times$ at training.
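The summary refers to an "additional simple equivariance loss" without stating it; one common way to write such a penalty (a hedged illustration, not necessarily REMUL's exact objective) is

```latex
\mathcal{L}(\theta) \;=\; \mathcal{L}_{\mathrm{task}}(\theta)
  \;+\; \lambda\, \mathbb{E}_{x,\,g}\,
  \big\| f_\theta\!\big(\rho_{\mathrm{in}}(g)\,x\big) - \rho_{\mathrm{out}}(g)\, f_\theta(x) \big\|^2,
```

where $\lambda$ controls how strongly the approximate symmetry is enforced and $\rho_{\mathrm{in}}$, $\rho_{\mathrm{out}}$ denote the group actions on inputs and outputs.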
arXiv Detail & Related papers (2024-10-23T13:50:27Z)
- Approximately Equivariant Neural Processes [47.14384085714576]
We consider the use of approximately equivariant architectures in neural processes.
We demonstrate the effectiveness of our approach on a number of synthetic and real-world regression experiments.
arXiv Detail & Related papers (2024-06-19T12:17:14Z)
- LongVQ: Long Sequence Modeling with Vector Quantization on Structured Memory [63.41820940103348]
The self-attention mechanism's computational cost limits its practicality for long sequences.
We propose a new method called LongVQ to compress the global abstraction as a length-fixed codebook.
LongVQ effectively maintains dynamic global and local patterns, which helps to compensate for the lack of long-range dependencies.
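The summary does not detail LongVQ's construction; as a hedged illustration of the generic idea of compressing a representation into a length-fixed codebook, a standard vector-quantization lookup (names and shapes are assumptions) looks like this:

```python
# Hedged sketch of generic vector quantization, not LongVQ's exact method:
# each token embedding is replaced by its nearest entry in a fixed-size codebook.
import torch


def vector_quantize(z, codebook):
    """z: (N, D) token embeddings, codebook: (K, D) learned codes.

    Returns quantized embeddings (N, D) and the chosen code indices (N,).
    """
    # Squared Euclidean distance from every token to every code: (N, K)
    d = z.pow(2).sum(-1, keepdim=True) - 2.0 * z @ codebook.t() + codebook.pow(2).sum(-1)
    idx = d.argmin(dim=-1)              # nearest code per token
    z_q = codebook[idx]                 # sequence summarized by at most K distinct vectors
    return z + (z_q - z).detach(), idx  # straight-through estimator keeps encoder gradients
```

Each token is snapped to its nearest of the K codes, so the global pattern is summarized by a fixed-size vocabulary regardless of sequence length.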
arXiv Detail & Related papers (2024-04-17T08:26:34Z)
- Hyena Hierarchy: Towards Larger Convolutional Language Models [115.82857881546089]
Hyena is a subquadratic drop-in replacement for attention constructed by interleaving implicitly parametrized long convolutions and data-controlled gating.
In recall and reasoning tasks on sequences of thousands to hundreds of thousands of tokens, Hyena improves accuracy by more than 50 points over operators relying on state-spaces and other implicit and explicit methods.
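A hedged, minimal sketch of the mechanism this summary describes, interleaving FFT-based long convolutions with data-controlled elementwise gating (an order-2 block); the implicit filter parametrization and normalization details of the real operator are omitted:

```python
# Hedged, minimal order-2 Hyena-style block: long convolutions (via FFT) interleaved
# with data-controlled elementwise gating. Real implementations generate the filters
# k1, k2 implicitly from positional features rather than storing them explicitly.
import torch


def fft_conv(u, k):
    """Circular convolution along the sequence axis via FFT, O(N log N)."""
    n = u.shape[0]
    return torch.fft.irfft(torch.fft.rfft(u, n=n, dim=0) * torch.fft.rfft(k, n=n, dim=0), n=n, dim=0)


def hyena_block(u, w_proj, k1, k2):
    """u: (N, D) tokens, w_proj: (D, 3 * D) input projection, k1/k2: (N, D) long-conv filters."""
    v, g1, g2 = (u @ w_proj).chunk(3, dim=-1)  # value branch and two data-controlled gates
    z = g1 * fft_conv(v, k1)                   # long convolution followed by elementwise gating
    return g2 * fft_conv(z, k2)                # interleave once more (order-2 recurrence)
```

Because the convolutions run through the FFT, the cost scales as O(N log N) in sequence length rather than the O(N^2) of attention.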
arXiv Detail & Related papers (2023-02-21T18:29:25Z)
- The Lie Derivative for Measuring Learned Equivariance [84.29366874540217]
We study the equivariance properties of hundreds of pretrained models, spanning CNNs, transformers, and Mixer architectures.
We find that many violations of equivariance can be linked to spatial aliasing in ubiquitous network layers, such as pointwise non-linearities.
For example, transformers can be more equivariant than convolutional neural networks after training.
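The summary does not state the measure itself; one standard way to define the Lie derivative used for this purpose (a hedged paraphrase rather than the paper's exact notation) is, for a one-parameter subgroup $g_t = \exp(tX)$ acting on inputs and outputs,

```latex
(\mathcal{L}_X f)(x) \;=\; \left.\frac{\mathrm{d}}{\mathrm{d}t}\right|_{t=0}
  \rho_{\mathrm{out}}(g_t)^{-1}\, f\big(\rho_{\mathrm{in}}(g_t)\, x\big),
  \qquad g_t = \exp(tX),
```

which vanishes for every generator $X$ exactly when $f$ is equivariant, so its magnitude quantifies the local equivariance error.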
arXiv Detail & Related papers (2022-10-06T15:20:55Z)
- Design equivariant neural networks for 3D point cloud [0.0]
This work seeks to improve the generalization and robustness of existing neural networks for 3D point clouds.
The main challenge when designing equivariant models for point clouds is how to trade off model performance against complexity.
The proposed procedure is general and forms a fundamental approach to group equivariant neural networks.
arXiv Detail & Related papers (2022-05-02T02:57:13Z)
- Equivariant vector field network for many-body system modeling [65.22203086172019]
The Equivariant Vector Field Network (EVFN) is built on a novel equivariant basis and the associated scalarization and vectorization layers.
We evaluate our method on predicting trajectories of simulated Newton mechanics systems with both full and partially observed data.
arXiv Detail & Related papers (2021-10-26T14:26:25Z)
- Frame Averaging for Invariant and Equivariant Network Design [50.87023773850824]
We introduce Frame Averaging (FA), a framework for adapting known (backbone) architectures to become invariant or equivariant to new symmetry types.
We show that FA-based models have maximal expressive power in a broad setting.
We propose a new class of universal Graph Neural Networks (GNNs), universal Euclidean motion invariant point cloud networks, and Euclidean motion invariant Message Passing (MP) GNNs.
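The FA construction can be stated compactly (a hedged restatement; the paper gives the precise conditions on the frame): a backbone $\phi$ is symmetrized over a small input-dependent frame $F(x) \subset G$ as

```latex
\langle \phi \rangle_{F}(x) \;=\; \frac{1}{|F(x)|} \sum_{g \in F(x)}
  \rho_{\mathrm{out}}(g)\; \phi\!\big(\rho_{\mathrm{in}}(g)^{-1} x\big),
```

which is equivariant whenever the frame itself is equivariant, $F(\rho_{\mathrm{in}}(g)\,x) = g\,F(x)$, and is far cheaper than averaging over the whole group when $|F(x)|$ is small.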
arXiv Detail & Related papers (2021-10-07T11:05:23Z)
- The Effects of Invertibility on the Representational Complexity of Encoders in Variational Autoencoders [16.27499951949733]
We show that if the generative map is "strongly invertible" (in a sense we suitably formalize), the inferential model need not be much more complex.
Importantly, we do not require the generative model to be layerwise invertible.
We provide theoretical support for the empirical wisdom that learning deep generative models is harder when data lies on a low-dimensional manifold.
arXiv Detail & Related papers (2021-07-09T19:53:29Z)