NEAT: Neighborhood-Guided, Efficient, Autoregressive Set Transformer for 3D Molecular Generation
- URL: http://arxiv.org/abs/2512.05844v1
- Date: Fri, 05 Dec 2025 16:18:07 GMT
- Title: NEAT: Neighborhood-Guided, Efficient, Autoregressive Set Transformer for 3D Molecular Generation
- Authors: Daniel Rose, Roxane Axel Jacob, Johannes Kirchmair, Thierry Langer
- Abstract summary: We introduce NEAT, a Neighborhood-guided, Efficient, Autoregressive, Set Transformer that treats molecular graphs as sets of atoms. NEAT approaches state-of-the-art performance in 3D molecular generation with high computational efficiency.
- Score: 3.0919057031368506
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Autoregressive models are a promising alternative to diffusion-based models for 3D molecular structure generation. However, a key limitation is the assumption of a token order: while text has a natural sequential order, the next token prediction given a molecular graph prefix should be invariant to atom permutations. Previous works sidestepped this mismatch by using canonical orders or focus atoms. We argue that this is unnecessary. We introduce NEAT, a Neighborhood-guided, Efficient, Autoregressive, Set Transformer that treats molecular graphs as sets of atoms and learns the order-agnostic distribution over admissible tokens at the graph boundary with an autoregressive flow model. NEAT approaches state-of-the-art performance in 3D molecular generation with high computational efficiency and atom-level permutation invariance, establishing a practical foundation for scalable molecular design.
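The permutation-invariance requirement stated in the abstract can be illustrated with a toy set encoder. The sketch below is not NEAT's architecture: the names `ELEMENT_INDEX`, `atom_features`, and `encode_prefix` and the hand-written features are hypothetical, chosen only to show that pooling per-atom features with a symmetric function (here a sum) yields a prefix representation that does not depend on the order in which atoms are listed.

```python
import math
import random

# Hypothetical toy vocabulary; not taken from the paper.
ELEMENT_INDEX = {"H": 0, "C": 1, "N": 2, "O": 3}


def atom_features(symbol, xyz, centroid):
    """Per-atom features: one-hot element type plus distance to the centroid."""
    one_hot = [0.0] * len(ELEMENT_INDEX)
    one_hot[ELEMENT_INDEX[symbol]] = 1.0
    dist = math.dist(xyz, centroid)
    return one_hot + [dist, dist * dist]


def encode_prefix(atoms):
    """Encode a molecular prefix given as (symbol, (x, y, z)) tuples.

    The centroid and the pooled vector are both sums over atoms, so the
    result is the same for any ordering of the input list.
    """
    n = len(atoms)
    centroid = tuple(math.fsum(xyz[k] for _, xyz in atoms) / n for k in range(3))
    feats = [atom_features(symbol, xyz, centroid) for symbol, xyz in atoms]
    # Sum pooling over the atom axis (a symmetric aggregation).
    return [math.fsum(column) for column in zip(*feats)]


# Assumed example prefix (coordinates are illustrative only).
prefix = [("C", (0.0, 0.0, 0.0)), ("O", (1.2, 0.0, 0.0)), ("H", (-0.6, 0.9, 0.0))]
shuffled = prefix[:]
random.shuffle(shuffled)

original_code = encode_prefix(prefix)
shuffled_code = encode_prefix(shuffled)
assert all(math.isclose(a, b) for a, b in zip(original_code, shuffled_code))
```

Any next-atom predictor that reads only such a pooled representation assigns the same output to every ordering of the prefix, which is the property that canonical orders and focus atoms were previously used to work around.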
Related papers
- Molecular Representations in Implicit Functional Space via Hyper-Networks [53.70982267248536]
We argue that molecular learning can instead be formulated as learning in function space. We instantiate this formulation with MolField, a hyper-network-based framework that learns distributions over molecular fields. Our results show that treating molecules as continuous functions fundamentally changes how molecular representations generalize across tasks.
arXiv Detail & Related papers (2026-01-29T21:13:37Z) - InertialAR: Autoregressive 3D Molecule Generation with Inertial Frames [28.64470338973616]
InertialAR devises a canonical tokenization that aligns molecules to their inertial frames. It also equips the attention mechanism with geometric rotary positional encoding (GeoRoPE). InertialAR achieves state-of-the-art performance on 7 of the 10 evaluation metrics for unconditional molecule generation.
arXiv Detail & Related papers (2025-10-31T14:19:50Z) - Towards Unified and Lossless Latent Space for 3D Molecular Latent Diffusion Modeling [90.23688195918432]
3D molecule generation is crucial for drug discovery and material science. Existing approaches typically maintain separate latent spaces for invariant and equivariant modalities. We propose UAE-3D, a multi-modal VAE that compresses 3D molecules into latent sequences from a unified latent space.
arXiv Detail & Related papers (2025-03-19T08:56:13Z) - Learning-Order Autoregressive Models with Application to Molecular Graph Generation [52.44913282062524]
We introduce a variant of ARM that generates high-dimensional data using a probabilistic ordering that is sequentially inferred from data. We demonstrate experimentally that our method can learn meaningful autoregressive orderings in image and graph generation.
arXiv Detail & Related papers (2025-03-07T23:24:24Z) - Pre-trained Molecular Language Models with Random Functional Group Masking [54.900360309677794]
We propose a SMILES-based Molecular Language Model, which randomly masks SMILES subsequences corresponding to specific molecular atoms.
This technique aims to compel the model to better infer molecular structures and properties, thus enhancing its predictive capabilities.
arXiv Detail & Related papers (2024-11-03T01:56:15Z) - Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding [84.3224556294803]
Diffusion models excel at capturing the natural design spaces of images, molecules, DNA, RNA, and protein sequences.
We aim to optimize downstream reward functions while preserving the naturalness of these design spaces.
Our algorithm integrates soft value functions, which look ahead to how intermediate noisy states lead to high rewards in the future.
arXiv Detail & Related papers (2024-08-15T16:47:59Z) - Lift Your Molecules: Molecular Graph Generation in Latent Euclidean Space [46.11163798008912]
We introduce a new framework for molecular graph generation with 3D molecular generative models.
Our framework maps molecular graphs to Euclidean point clouds via synthetic conformer coordinates and learns the inverse map using an E(n)-Equivariant Graph Neural Network (EGNN).
The induced point cloud-structured latent space is well-suited to apply existing 3D molecular generative models.
arXiv Detail & Related papers (2024-06-15T05:29:07Z) - BindGPT: A Scalable Framework for 3D Molecular Design via Language Modeling and Reinforcement Learning [11.862370962277938]
We present a novel generative model, BindGPT, which uses a conceptually simple but powerful approach to create 3D molecules within the protein's binding site.
We show how such a simple conceptual approach, combined with pretraining and scaling, can perform on par with or better than the current best specialized diffusion models.
arXiv Detail & Related papers (2024-06-06T02:10:50Z) - Pard: Permutation-Invariant Autoregressive Diffusion for Graph Generation [36.896680500652536]
We introduce a Permutation-invariant Auto Regressive Diffusion model that integrates diffusion models with autoregressive methods. Pard achieves state-of-the-art performance on molecular and non-molecular datasets, and scales to large datasets like MOSES containing 1.9M molecules.
arXiv Detail & Related papers (2024-02-06T04:17:44Z) - MUDiff: Unified Diffusion for Complete Molecule Generation [104.7021929437504]
We present a new model for generating a comprehensive representation of molecules, including atom features, 2D discrete molecule structures, and 3D continuous molecule coordinates.
We propose a novel graph transformer architecture to denoise the diffusion process.
Our model is a promising approach for designing stable and diverse molecules and can be applied to a wide range of tasks in molecular modeling.
arXiv Detail & Related papers (2023-04-28T04:25:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.