InertialAR: Autoregressive 3D Molecule Generation with Inertial Frames
- URL: http://arxiv.org/abs/2510.27497v1
- Date: Fri, 31 Oct 2025 14:19:50 GMT
- Title: InertialAR: Autoregressive 3D Molecule Generation with Inertial Frames
- Authors: Haorui Li, Weitao Du, Yuqiang Li, Hongyu Guo, Shengchao Liu,
- Abstract summary: InertialAR devises a canonical tokenization that aligns molecules to their inertial frames. It also equips the attention mechanism with geometric rotary positional encoding (GeoRoPE). InertialAR achieves state-of-the-art performance on 7 of the 10 evaluation metrics for unconditional molecule generation.
- Score: 28.64470338973616
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformer-based autoregressive models have emerged as a unifying paradigm across modalities such as text and images, but their extension to 3D molecule generation remains underexplored. The gap stems from two fundamental challenges: (1) tokenizing molecules into a canonical 1D sequence of tokens that is invariant to both SE(3) transformations and atom index permutations, and (2) designing an architecture capable of modeling hybrid atom-based tokens that couple discrete atom types with continuous 3D coordinates. To address these challenges, we introduce InertialAR. InertialAR devises a canonical tokenization that aligns molecules to their inertial frames and reorders atoms to ensure SE(3) and permutation invariance. Moreover, InertialAR equips the attention mechanism with geometric awareness via geometric rotary positional encoding (GeoRoPE). In addition, it utilizes a hierarchical autoregressive paradigm to predict the next atom-based token, predicting the atom type first and then its 3D coordinates via Diffusion loss. Experimentally, InertialAR achieves state-of-the-art performance on 7 of the 10 evaluation metrics for unconditional molecule generation across QM9, GEOM-Drugs, and B3LYP. Moreover, it significantly outperforms strong baselines in controllable generation for targeted chemical functionality, attaining state-of-the-art results across all 5 metrics.
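The inertial-frame alignment mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `canonicalize`, the third-moment rule used to fix axis signs, and the lexicographic atom ordering are all assumptions made for illustration, and the abstract does not specify how InertialAR handles these details. The sketch also assumes a molecule with three distinct principal moments and nonzero third moments; symmetric molecules would need extra tie-breaking.

```python
import numpy as np

def canonicalize(coords, masses):
    """Hypothetical helper: align atoms to principal axes and sort them.

    Returns (aligned_sorted_coords, order), where `order` is the permutation
    applied to the input atoms.
    """
    coords = np.asarray(coords, dtype=float)
    masses = np.asarray(masses, dtype=float)

    # Translation invariance: move the center of mass to the origin.
    com = (masses[:, None] * coords).sum(axis=0) / masses.sum()
    x = coords - com

    # Inertia tensor: sum_i m_i (|r_i|^2 * Id - r_i r_i^T).
    r2 = (x ** 2).sum(axis=1)
    outer = masses[:, None, None] * x[:, :, None] * x[:, None, :]
    inertia = (masses * r2).sum() * np.eye(3) - outer.sum(axis=0)

    # Principal axes are the eigenvectors (columns), eigenvalues ascending.
    _, axes = np.linalg.eigh(inertia)
    axes = axes.copy()

    # Assumed sign convention: make the mass-weighted third moment
    # along the first two axes non-negative.
    for k in (0, 1):
        proj = x @ axes[:, k]
        if (masses * proj ** 3).sum() < 0:
            axes[:, k] *= -1

    # Third axis from a cross product, so the frame stays right-handed
    # and the molecule is never mirrored.
    axes[:, 2] = np.cross(axes[:, 0], axes[:, 1])

    aligned = x @ axes

    # Assumed ordering rule: sort atoms by x, then y, then z.
    order = np.lexsort((aligned[:, 2], aligned[:, 1], aligned[:, 0]))
    return aligned[order], order
```

Under these assumptions, rotating, translating, or reindexing the input atoms leaves the returned coordinates unchanged up to floating-point error, which is the invariance property the abstract asks of a canonical tokenization. Atom types would be reordered by the caller with the returned `order`.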
Related papers
- PRGCN: A Graph Memory Network for Cross-Sequence Pattern Reuse in 3D Human Pose Estimation [18.771349697842947]
This work introduces the Pattern Reuse Graph Convolutional Network (PRGCN), a novel framework that formalizes pose estimation as a problem of pattern retrieval and adaptation. At its core, PRGCN features a graph memory bank that learns and stores a compact set of pose prototypes, encoded as relational graphs, which are dynamically retrieved via an attention mechanism to provide structured priors. Our work posits that PRGCN establishes a new state-of-the-art, achieving an MPJPE of 37.1 mm and 13.4 mm, respectively, while exhibiting enhanced cross-domain generalization capability.
arXiv Detail & Related papers (2025-10-22T11:12:07Z) - Aligned Manifold Property and Topology Point Clouds for Learning Molecular Properties [55.2480439325792]
This work introduces AMPTCR, a molecular surface representation that combines local quantum-derived scalar fields and custom topological descriptors within an aligned point cloud format. For molecular weight, results confirm that AMPTCR encodes physically meaningful data, with a validation R² of 0.87. In the bacterial inhibition task, AMPTCR enables both classification and direct regression of E. coli inhibition values.
arXiv Detail & Related papers (2025-07-22T04:35:50Z) - Sampling 3D Molecular Conformers with Diffusion Transformers [13.536503487456622]
Diffusion Transformers (DiTs) have demonstrated strong performance in generative modeling. Applying DiTs to molecules introduces novel challenges, such as integrating discrete molecular graph information with continuous 3D geometry. We propose DiTMC, a framework that adapts DiTs to address these challenges through a modular architecture.
arXiv Detail & Related papers (2025-06-18T11:47:59Z) - Towards Unified and Lossless Latent Space for 3D Molecular Latent Diffusion Modeling [90.23688195918432]
3D molecule generation is crucial for drug discovery and material science. Existing approaches typically maintain separate latent spaces for invariant and equivariant modalities. We propose UAE-3D, a multi-modal VAE that compresses 3D molecules into latent sequences from a unified latent space.
arXiv Detail & Related papers (2025-03-19T08:56:13Z) - Tokenizing 3D Molecule Structure with Quantized Spherical Coordinates [28.452581855002855]
Mol-StrucTok is a novel method for tokenizing 3D molecular structures. We design a line notation for 3D molecules by extracting local atomic coordinates in a spherical coordinate system. We employ a Vector Quantized Variational Autoencoder (VQ-VAE) to tokenize these coordinates, treating them as generation descriptors.
arXiv Detail & Related papers (2024-12-02T14:50:44Z) - E(3)-equivariant models cannot learn chirality: Field-based molecular generation [51.327048911864885]
Chirality plays a key role in determining drug safety and potency. We introduce a novel field-based representation, proposing reference rotations that replace rotational symmetry constraints. The proposed model captures all molecular geometries including chirality, while still achieving highly competitive performance with E(3)-based methods across standard benchmarking metrics.
arXiv Detail & Related papers (2024-02-24T17:13:58Z) - Diffusion-Driven Generative Framework for Molecular Conformation Prediction [0.66567375919026]
The rapid advancement of machine learning has revolutionized the precision of predictive modeling in this context.
This research introduces a cutting-edge generative framework named method.
Method views atoms as discrete entities and excels in guiding the reversal of diffusion.
arXiv Detail & Related papers (2023-12-22T11:49:39Z) - CoarsenConf: Equivariant Coarsening with Aggregated Attention for Molecular Conformer Generation [3.31521245002301]
We introduce CoarsenConf, which integrates molecular graphs based on torsional angles into an SE(3)-equivariant hierarchical variational autoencoder.
Through equivariant coarse-graining, we aggregate the fine-grained atomic coordinates of subgraphs connected via rotatable bonds, creating a variable-length coarse-grained latent representation.
Our model uses a novel aggregated attention mechanism to restore fine-grained coordinates from the coarse-grained latent representation, enabling efficient generation of accurate conformers.
arXiv Detail & Related papers (2023-06-26T17:02:54Z) - Molecular Geometry-aware Transformer for accurate 3D Atomic System modeling [51.83761266429285]
We propose a novel Transformer architecture that takes nodes (atoms) and edges (bonds and nonbonding atom pairs) as inputs and models the interactions among them.
Moleformer achieves state-of-the-art on the initial state to relaxed energy prediction of OC20 and is very competitive in QM9 on predicting quantum chemical properties.
arXiv Detail & Related papers (2023-02-02T03:49:57Z) - Geometric Transformer for End-to-End Molecule Properties Prediction [92.28929858529679]
We introduce a Transformer-based architecture for molecule property prediction, which is able to capture the geometry of the molecule.
We modify the classical positional encoder by an initial encoding of the molecule geometry, as well as a learned gated self-attention mechanism.
arXiv Detail & Related papers (2021-10-26T14:14:40Z) - Learning 3D Representations of Molecular Chirality with Invariance to Bond Rotations [2.17167311150369]
We design an SE(3)-invariant model that processes torsion angles of a 3D molecular conformer.
We test our model on four benchmarks: contrastive learning to distinguish conformers of different stereoisomers in a learned latent space, classification of chiral centers as R/S, prediction of how enantiomers rotate circularly polarized light, and ranking enantiomers by their docking scores in an enantiosensitive protein pocket.
arXiv Detail & Related papers (2021-10-08T21:25:47Z) - GeoMol: Torsional Geometric Generation of Molecular 3D Conformer Ensembles [60.12186997181117]
Prediction of a molecule's 3D conformer ensemble from the molecular graph holds a key role in areas of cheminformatics and drug discovery.
Existing generative models have several drawbacks including lack of modeling important molecular geometry elements.
We propose GeoMol, an end-to-end, non-autoregressive and SE(3)-invariant machine learning approach to generate 3D conformers.
arXiv Detail & Related papers (2021-06-08T14:17:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.