TABASCO: A Fast, Simplified Model for Molecular Generation with Improved Physical Quality
- URL: http://arxiv.org/abs/2507.00899v1
- Date: Tue, 01 Jul 2025 16:01:58 GMT
- Title: TABASCO: A Fast, Simplified Model for Molecular Generation with Improved Physical Quality
- Authors: Carlos Vonessen, Charles Harris, Miruna Cretu, Pietro Liò
- Abstract summary: TABASCO is a non-equivariant transformer model for 3D molecular generation. It treats atoms in a molecule as sequences and reconstructs bonds deterministically after generation. On the GEOM-Drugs benchmark TABASCO achieves state-of-the-art PoseBusters validity and delivers inference roughly 10x faster than the strongest baseline.
- Score: 15.030633864521562
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: State-of-the-art models for 3D molecular generation are based on significant inductive biases: SE(3) and permutation equivariance to respect symmetry, and graph message-passing networks to capture local chemistry. Yet the generated molecules still struggle with physical plausibility. We introduce TABASCO, which relaxes these assumptions: the model has a standard non-equivariant transformer architecture, treats atoms in a molecule as sequences and reconstructs bonds deterministically after generation. The absence of equivariant layers and message passing allows us to significantly simplify the model architecture and scale data throughput. On the GEOM-Drugs benchmark TABASCO achieves state-of-the-art PoseBusters validity and delivers inference roughly 10x faster than the strongest baseline, while exhibiting emergent rotational equivariance despite symmetry not being hard-coded. Our work offers a blueprint for training minimalist, high-throughput generative models suited to specialised tasks such as structure- and pharmacophore-based drug design. We provide a link to our implementation at github.com/carlosinator/tabasco.
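The abstract reports emergent rotational equivariance but does not say how it is measured. The following is a minimal sketch of one common way to quantify it, assuming a model that maps an (N, 3) coordinate array to an (N, 3) array: apply the model to rotated input and compare against the rotated output for the original input. The function names and the two toy models are hypothetical stand-ins for illustration and are not taken from the TABASCO implementation.

```python
import numpy as np


def random_rotation(rng):
    """Draw a random 3x3 rotation matrix (orthogonal, det = +1) via QR."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))  # fix column signs so the factorisation is unique
    if np.linalg.det(q) < 0:     # flip one axis to turn a reflection into a rotation
        q[:, 0] = -q[:, 0]
    return q


def equivariance_error(model, coords, rng, n_trials=8):
    """Mean relative gap between model(coords @ R.T) and model(coords) @ R.T."""
    base = model(coords)
    errors = []
    for _ in range(n_trials):
        rot = random_rotation(rng)
        rotate_then_model = model(coords @ rot.T)
        model_then_rotate = base @ rot.T
        gap = np.linalg.norm(rotate_then_model - model_then_rotate)
        errors.append(gap / (np.linalg.norm(model_then_rotate) + 1e-12))
    return float(np.mean(errors))


# Hypothetical stand-ins for a trained network (assumptions, not the paper's model).
def equivariant_toy(x):
    return x - x.mean(axis=0, keepdims=True)  # centring commutes with rotation


def non_equivariant_toy(x):
    return x * np.array([1.0, 2.0, 3.0])      # axis-dependent scaling does not


rng = np.random.default_rng(0)
coords = rng.normal(size=(12, 3))             # 12 placeholder "atoms"
err_eq = equivariance_error(equivariant_toy, coords, rng)
err_non = equivariance_error(non_equivariant_toy, coords, rng)
print(f"equivariant toy error:     {err_eq:.2e}")
print(f"non-equivariant toy error: {err_non:.2e}")
```

An error near zero indicates that the function commutes with rotations on the tested inputs. For a learned non-equivariant model the value would typically be small but nonzero, so it is best read as a diagnostic over many samples and rotations, not as a guarantee.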
Related papers
- Sampling 3D Molecular Conformers with Diffusion Transformers [13.536503487456622]
Diffusion Transformers (DiTs) have demonstrated strong performance in generative modeling.
Applying DiTs to molecules introduces novel challenges, such as integrating discrete molecular graph information with continuous 3D geometry.
We propose DiTMC, a framework that adapts DiTs to address these challenges through a modular architecture.
arXiv Detail & Related papers (2025-06-18T11:47:59Z)
- Towards Unified and Lossless Latent Space for 3D Molecular Latent Diffusion Modeling [77.26556208024633]
3D molecule generation is crucial for drug discovery and material science.
Existing approaches typically maintain separate latent spaces for invariant and equivariant modalities.
We propose UAE-3D, a multi-modal VAE that compresses 3D molecules into latent sequences from a unified latent space.
arXiv Detail & Related papers (2025-03-19T08:56:13Z)
- Conditional Synthesis of 3D Molecules with Time Correction Sampler [58.0834973489875]
Time-Aware Conditional Synthesis (TACS) is a novel approach to conditional generation on diffusion models.
It integrates adaptively controlled plug-and-play "online" guidance into a diffusion model, driving samples toward the desired properties.
arXiv Detail & Related papers (2024-11-01T12:59:25Z)
- SemlaFlow -- Efficient 3D Molecular Generation with Latent Attention and Equivariant Flow Matching [43.56824843205882]
Semla is a scalable E(3)-equivariant message passing architecture.
SemlaFlow is trained to generate a joint distribution over atom types, coordinates, bond types and formal charges.
Our model produces state-of-the-art results on benchmark datasets with as few as 20 sampling steps.
arXiv Detail & Related papers (2024-06-11T13:51:51Z)
- DecompDiff: Diffusion Models with Decomposed Priors for Structure-Based Drug Design [62.68420322996345]
Existing structure-based drug design methods treat all ligand atoms equally.
We propose a new diffusion model, DecompDiff, with decomposed priors over arms and scaffold.
Our approach achieves state-of-the-art performance in generating high-affinity molecules.
arXiv Detail & Related papers (2024-02-26T05:21:21Z)
- E(3)-equivariant models cannot learn chirality: Field-based molecular generation [51.327048911864885]
Chirality plays a key role in determining drug safety and potency.
We introduce a novel field-based representation, proposing reference rotations that replace rotational symmetry constraints.
The proposed model captures all molecular geometries including chirality, while still achieving highly competitive performance with E(3)-based methods across standard benchmarking metrics.
arXiv Detail & Related papers (2024-02-24T17:13:58Z)
- Navigating the Design Space of Equivariant Diffusion-Based Generative Models for De Novo 3D Molecule Generation [1.3124513975412255]
Deep generative diffusion models are a promising avenue for 3D de novo molecular design in materials science and drug discovery.
We explore the design space of E(3)-equivariant diffusion models, focusing on previously unexplored areas.
We present the EQGAT-diff model, which consistently outperforms established models for the QM9 and GEOM-Drugs datasets.
arXiv Detail & Related papers (2023-09-29T14:53:05Z)
- CoarsenConf: Equivariant Coarsening with Aggregated Attention for Molecular Conformer Generation [3.31521245002301]
We introduce CoarsenConf, which integrates molecular graphs based on torsional angles into an SE(3)-equivariant hierarchical variational autoencoder.
Through equivariant coarse-graining, we aggregate the fine-grained atomic coordinates of subgraphs connected via rotatable bonds, creating a variable-length coarse-grained latent representation.
Our model uses a novel aggregated attention mechanism to restore fine-grained coordinates from the coarse-grained latent representation, enabling efficient generation of accurate conformers.
arXiv Detail & Related papers (2023-06-26T17:02:54Z)
- Geometric Latent Diffusion Models for 3D Molecule Generation [172.15028281732737]
Generative models, especially diffusion models (DMs), have achieved promising results for generating feature-rich geometries.
We propose a novel and principled method for 3D molecule generation named Geometric Latent Diffusion Models (GeoLDM).
arXiv Detail & Related papers (2023-05-02T01:07:22Z)
- 3D Equivariant Diffusion for Target-Aware Molecule Generation and Affinity Prediction [9.67574543046801]
The inclusion of 3D structures during targeted drug design shows superior performance to other target-free models.
We develop a 3D equivariant diffusion model to solve the above challenges.
Our model could generate molecules with more realistic 3D structures and better affinities towards the protein targets, and improve binding affinity ranking and prediction without retraining.
arXiv Detail & Related papers (2023-03-06T23:01:43Z)
- Structure-based Drug Design with Equivariant Diffusion Models [40.73626627266543]
We present DiffSBDD, an SE(3)-equivariant diffusion model that generates novel ligands conditioned on protein pockets.
Our in silico experiments demonstrate that DiffSBDD captures the statistics of the ground truth data effectively.
These results support the assumption that diffusion models represent the complex distribution of structural data more accurately than previous methods.
arXiv Detail & Related papers (2022-10-24T15:51:21Z)
- GeoMol: Torsional Geometric Generation of Molecular 3D Conformer Ensembles [60.12186997181117]
Prediction of a molecule's 3D conformer ensemble from the molecular graph holds a key role in areas of cheminformatics and drug discovery.
Existing generative models have several drawbacks including lack of modeling important molecular geometry elements.
We propose GeoMol, an end-to-end, non-autoregressive and SE(3)-invariant machine learning approach to generate 3D conformers.
arXiv Detail & Related papers (2021-06-08T14:17:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.