Kernel-Elastic Autoencoder for Molecular Design
- URL: http://arxiv.org/abs/2310.08685v2
- Date: Sat, 23 Mar 2024 18:52:44 GMT
- Title: Kernel-Elastic Autoencoder for Molecular Design
- Authors: Haote Li, Yu Shee, Brandon Allen, Federica Maschietto, Victor Batista,
- Abstract summary: Kernel-Elastic Autoencoder (KAE) is a self-supervised generative model based on the transformer architecture with enhanced performance for molecular design.
KAE achieves remarkable diversity in molecule generation while maintaining near-perfect reconstructions.
We anticipate that KAE could be applied to generative problem-solving in a wide range of applications.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We introduce the Kernel-Elastic Autoencoder (KAE), a self-supervised generative model based on the transformer architecture with enhanced performance for molecular design. KAE is formulated based on two novel loss functions: modified maximum mean discrepancy and weighted reconstruction. KAE addresses the long-standing challenge of achieving valid generation and accurate reconstruction at the same time. KAE achieves remarkable diversity in molecule generation while maintaining near-perfect reconstructions on the independent testing dataset, surpassing previous molecule-generating models. KAE enables conditional generation and allows for decoding based on beam search, resulting in state-of-the-art performance in constrained optimizations. Furthermore, KAE can generate molecules conditioned on favorable binding affinities in docking applications, as confirmed by AutoDock Vina and Glide scores, outperforming all existing candidates from the training dataset. Beyond molecular design, we anticipate that KAE could be applied to generative problem-solving in a wide range of applications.
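The abstract names a "modified maximum mean discrepancy" loss but does not spell out the modification. As an illustrative sketch only, a standard (unmodified) MMD between encoder latents and samples from a Gaussian prior, using a Gaussian kernel, can be written in NumPy; the array shapes, bandwidth, and variable names here are hypothetical, not taken from the paper:

```python
import numpy as np

def gaussian_kernel(x, y, sigma=2.0):
    """Gaussian RBF kernel matrix between rows of x and rows of y."""
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mmd2(z, prior_samples, sigma=2.0):
    """Biased estimate of squared MMD between latents z and prior samples.

    The bandwidth sigma matters: it should be on the order of typical
    pairwise distances, or the kernel becomes uninformative.
    """
    k_zz = gaussian_kernel(z, z, sigma).mean()
    k_pp = gaussian_kernel(prior_samples, prior_samples, sigma).mean()
    k_zp = gaussian_kernel(z, prior_samples, sigma).mean()
    return k_zz + k_pp - 2.0 * k_zp

rng = np.random.default_rng(0)
prior = rng.normal(size=(256, 8))              # samples from the N(0, I) prior
matched = rng.normal(size=(256, 8))            # latents already matching the prior
shifted = rng.normal(loc=3.0, size=(256, 8))   # latents far from the prior

print(mmd2(matched, prior) < mmd2(shifted, prior))  # True
```

Minimizing such a term alongside reconstruction pulls the encoder's latent distribution toward the prior, which is the general role a kernel-based distribution-matching loss plays in a generative autoencoder.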
Related papers
- Integrating Predictive and Generative Capabilities by Latent Space Design via the DKL-VAE Model [0.22795086293129713]
We introduce a framework that integrates the generative power of a Variational Autoencoder (VAE) with the predictive nature of Deep Kernel Learning (DKL).
VAE learns a latent representation of high-dimensional data, enabling the generation of novel structures.
DKL refines this latent space by structuring it in alignment with target properties through Gaussian Process (GP) regression.
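The summary above says only that GP regression structures the latent space toward target properties; the DKL-VAE internals are not given here. As a minimal hedged sketch, GP regression over a hypothetical one-dimensional latent code with an RBF kernel looks like this (all names and values are illustrative):

```python
import numpy as np

def rbf(a, b, length=1.0):
    """RBF kernel matrix between 1-D latent codes a and b."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)

def gp_posterior_mean(z_train, y_train, z_test, noise=1e-2):
    """GP regression posterior mean of a property y over latent codes z."""
    K = rbf(z_train, z_train) + noise * np.eye(len(z_train))  # jittered Gram matrix
    k_star = rbf(z_test, z_train)
    return k_star @ np.linalg.solve(K, y_train)

# Hypothetical latent codes with a smooth target property y = sin(z).
z_train = np.linspace(-3, 3, 40)
y_train = np.sin(z_train)
z_test = np.array([0.0, 1.5])

print(gp_posterior_mean(z_train, y_train, z_test))
```

The posterior mean closely tracks sin(z) at the test points; in a DKL-style setup, the gradient of such a GP prediction with respect to z is what lets property targets reshape the latent space.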
arXiv Detail & Related papers (2025-03-04T20:05:04Z)
- Reward-Guided Iterative Refinement in Diffusion Models at Test-Time with Applications to Protein and DNA Design [87.58981407469977]
We propose a novel framework for inference-time reward optimization with diffusion models inspired by evolutionary algorithms.
Our approach employs an iterative refinement process consisting of two steps in each iteration: noising and reward-guided denoising.
arXiv Detail & Related papers (2025-02-20T17:48:45Z)
- Cliqueformer: Model-Based Optimization with Structured Transformers [102.55764949282906]
Large neural networks excel at prediction tasks, but their application to design problems, such as protein engineering or materials discovery, requires solving offline model-based optimization (MBO) problems.
We present Cliqueformer, a transformer-based architecture that learns the black-box function's structure through functional graphical models (FGM).
Across various domains, including chemical and genetic design tasks, Cliqueformer demonstrates superior performance compared to existing methods.
arXiv Detail & Related papers (2024-10-17T00:35:47Z)
- Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction [88.65168366064061]
We introduce Discrete Denoising Posterior Prediction (DDPP), a novel framework that casts the task of steering pre-trained MDMs as a problem of probabilistic inference.
Our framework leads to a family of three novel objectives that are all simulation-free, and thus scalable.
We substantiate our designs via wet-lab validation, where we observe transient expression of reward-optimized protein sequences.
arXiv Detail & Related papers (2024-10-10T17:18:30Z)
- Scoreformer: A Surrogate Model For Large-Scale Prediction of Docking Scores [0.0]
We present ScoreFormer, a novel graph transformer model designed to accurately predict molecular docking scores.
ScoreFormer achieves competitive performance in docking score prediction and offers a substantial 1.65-fold reduction in inference time compared to existing models.
arXiv Detail & Related papers (2024-06-13T17:31:02Z)
- Enhancing Generative Molecular Design via Uncertainty-guided Fine-tuning of Variational Autoencoders [2.0701439270461184]
A critical challenge for pre-trained generative molecular design models is to fine-tune them to be better suited for downstream design tasks.
In this work, we propose a novel approach for fine-tuning a variational autoencoder (VAE)-based GMD model through uncertainty-guided performance feedback in an active learning setting.
arXiv Detail & Related papers (2024-05-31T02:00:25Z)
- DecompOpt: Controllable and Decomposed Diffusion Models for Structure-based Molecular Optimization [49.85944390503957]
DecompOpt is a structure-based molecular optimization method based on a controllable and decomposed diffusion model.
We show that DecompOpt can efficiently generate molecules with improved properties compared to strong de novo baselines.
arXiv Detail & Related papers (2024-03-07T02:53:40Z)
- Molecule Design by Latent Prompt Transformer [76.2112075557233]
This work explores the challenging problem of molecule design by framing it as a conditional generative modeling task.
We propose a novel generative model comprising three components: (1) a latent vector with a learnable prior distribution; (2) a molecule generation model based on a causal Transformer, which uses the latent vector as a prompt; and (3) a property prediction model that predicts a molecule's target properties and/or constraint values using the latent prompt.
arXiv Detail & Related papers (2024-02-27T03:33:23Z)
- Generative Modeling of Regular and Irregular Time Series Data via Koopman VAEs [50.25683648762602]
We introduce Koopman VAE, a new generative framework that is based on a novel design for the model prior.
Inspired by Koopman theory, we represent the latent conditional prior dynamics using a linear map.
KoVAE outperforms state-of-the-art GAN and VAE methods across several challenging synthetic and real-world time series generation benchmarks.
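The Koopman VAE summary states only that the latent conditional prior dynamics are represented by a linear map. A toy sketch of such linear latent dynamics, with an entirely hypothetical choice of map, is:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical linear latent prior in the Koopman spirit: the latent state
# evolves by a fixed linear map, z_{t+1} = A @ z_t.
dim = 4
A = np.linalg.qr(rng.normal(size=(dim, dim)))[0] * 0.95  # contractive orthogonal map

def rollout(z0, steps):
    """Generate a latent trajectory from z0 under the linear map A."""
    traj = [z0]
    for _ in range(steps):
        traj.append(A @ traj[-1])
    return np.stack(traj)

traj = rollout(rng.normal(size=dim), steps=50)
# Norms decay geometrically because every singular value of A is 0.95.
print(traj.shape, np.linalg.norm(traj[-1]) < np.linalg.norm(traj[0]))
```

The appeal of a linear prior is exactly this kind of analyzability: stability and long-horizon behavior follow directly from the spectrum of A, which is hard to guarantee for a nonlinear prior.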
arXiv Detail & Related papers (2023-10-04T07:14:43Z)
- CoarsenConf: Equivariant Coarsening with Aggregated Attention for Molecular Conformer Generation [3.31521245002301]
We introduce CoarsenConf, which integrates molecular graphs based on torsional angles into an SE(3)-equivariant hierarchical variational autoencoder.
Through equivariant coarse-graining, we aggregate the fine-grained atomic coordinates of subgraphs connected via rotatable bonds, creating a variable-length coarse-grained latent representation.
Our model uses a novel aggregated attention mechanism to restore fine-grained coordinates from the coarse-grained latent representation, enabling efficient generation of accurate conformers.
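The coarse-graining step described above, aggregating fine-grained atomic coordinates per subgraph, can be illustrated with a trivial mean-pooling stand-in; CoarsenConf's actual equivariant aggregation is more involved, and the coordinates and groupings below are made up:

```python
import numpy as np

def coarse_grain(coords, groups):
    """Mean-pool fine-grained atomic coordinates within each subgraph group."""
    return np.stack([coords[g].mean(axis=0) for g in groups])

# Four hypothetical atoms and two subgraphs joined by a rotatable bond.
coords = np.array([[0., 0., 0.], [2., 0., 0.], [0., 4., 0.], [0., 0., 6.]])
groups = [[0, 1], [2, 3]]

print(coarse_grain(coords, groups))  # [[1. 0. 0.] [0. 2. 3.]]
```

Because the groups can differ in size from molecule to molecule, the coarse-grained representation is variable-length, matching the paper's description.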
arXiv Detail & Related papers (2023-06-26T17:02:54Z)
- Conditional deep generative models as surrogates for spatial field solution reconstruction with quantified uncertainty in Structural Health Monitoring applications [0.0]
In problems related to Structural Health Monitoring (SHM), models capable of both handling high-dimensional data and quantifying uncertainty are required.
We propose a conditional deep generative model as a surrogate aimed at such applications and high-dimensional structural simulations in general.
The model is able to achieve high reconstruction accuracy compared to the reference Finite Element (FE) solutions, while at the same time successfully encoding the load uncertainty.
arXiv Detail & Related papers (2023-02-14T20:13:24Z)
- Retrieval-based Controllable Molecule Generation [63.44583084888342]
We propose a new retrieval-based framework for controllable molecule generation.
We use a small set of molecules to steer the pre-trained generative model towards synthesizing molecules that satisfy the given design criteria.
Our approach is agnostic to the choice of generative models and requires no task-specific fine-tuning.
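The retrieval step this framework relies on, finding exemplar molecules most similar to a query, can be sketched with Tanimoto similarity over binary fingerprints. Real pipelines would compute fingerprints with a cheminformatics library (e.g., RDKit); random bit vectors stand in for them here, and all names are illustrative:

```python
import numpy as np

def tanimoto(a, b):
    """Tanimoto similarity between binary fingerprint vectors."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

def retrieve(query_fp, exemplar_fps, k=2):
    """Return indices of the k exemplars most similar to the query."""
    sims = np.array([tanimoto(query_fp, fp) for fp in exemplar_fps])
    return np.argsort(sims)[::-1][:k]

rng = np.random.default_rng(2)
exemplars = rng.integers(0, 2, size=(5, 32)).astype(bool)
query = exemplars[3].copy()
query[:2] ^= True  # slightly perturbed copy of exemplar 3

print(retrieve(query, exemplars))  # exemplar 3 should rank first
```

The retrieved exemplars then serve as steering signals for the frozen generative model, which is what makes the approach model-agnostic and free of task-specific fine-tuning.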
arXiv Detail & Related papers (2022-08-23T17:01:16Z)
- Score-Based Generative Models for Molecule Generation [0.8808021343665321]
We train a Transformer-based score function on representations of 1.5 million samples from the ZINC dataset.
We use the Moses benchmarking framework to evaluate the generated samples on a suite of metrics.
arXiv Detail & Related papers (2022-03-07T13:46:02Z)
- Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short python expressions which evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.