Enhancing Generative Molecular Design via Uncertainty-guided Fine-tuning of Variational Autoencoders
- URL: http://arxiv.org/abs/2405.20573v1
- Date: Fri, 31 May 2024 02:00:25 GMT
- Title: Enhancing Generative Molecular Design via Uncertainty-guided Fine-tuning of Variational Autoencoders
- Authors: A N M Nafiz Abeer, Sanket Jantre, Nathan M Urban, Byung-Jun Yoon
- Abstract summary: A critical challenge for pre-trained generative molecular design models is to fine-tune them to be better suited for downstream design tasks.
In this work, we propose a novel approach for model uncertainty-guided fine-tuning of a pre-trained variational autoencoder (VAE)-based GMD model through performance feedback in an active learning setting.
- Score: 2.0701439270461184
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, deep generative models have been successfully adopted for various molecular design tasks, particularly in the life and material sciences. A critical challenge for pre-trained generative molecular design (GMD) models is to fine-tune them to be better suited for downstream design tasks aimed at optimizing specific molecular properties. However, redesigning and training an existing effective generative model from scratch for each new design task is impractical. Furthermore, the black-box nature of typical downstream tasks, such as property prediction, makes it nontrivial to optimize the generative model in a task-specific manner. In this work, we propose a novel approach for model uncertainty-guided fine-tuning of a pre-trained variational autoencoder (VAE)-based GMD model through performance feedback in an active learning setting. The main idea is to quantify model uncertainty in the generative model, which is made efficient by working within a low-dimensional active subspace of the high-dimensional VAE parameters that explains most of the variability in the model's output. The inclusion of model uncertainty expands the space of viable molecules through decoder diversity. We then explore the resulting model uncertainty class via black-box optimization, made tractable by the low dimensionality of the active subspace. This enables us to identify and leverage a diverse set of high-performing models to generate enhanced molecules. Empirical results across six target molecular properties, using multiple VAE-based generative models, demonstrate that our uncertainty-guided fine-tuning approach consistently outperforms the original pre-trained models.
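To make the workflow concrete, here is a minimal schematic sketch of the loop described above, assuming the active-subspace basis W has already been estimated. The names `decode_molecules` and `property_oracle` are hypothetical placeholders for the VAE decoder and the black-box property predictor, and plain random search stands in for whichever black-box optimizer the authors actually use; this is a sketch of the idea, not the paper's implementation.

```python
import numpy as np

def perturb_decoder(theta_pretrained, W, z):
    """Map active-subspace coordinates z (d_active,) back to full decoder
    parameters: theta = theta_0 + W @ z, where the columns of
    W (n_params, d_active) span the active subspace."""
    return theta_pretrained + W @ z

def score_decoder(z, theta_pretrained, W, decode_molecules, property_oracle):
    """Black-box objective: decode a batch of molecules with the perturbed
    decoder and average their predicted property values."""
    theta = perturb_decoder(theta_pretrained, W, z)
    molecules = decode_molecules(theta)              # hypothetical decode routine
    return np.mean([property_oracle(m) for m in molecules])

def explore_uncertainty_class(theta_pretrained, W, decode_molecules,
                              property_oracle, n_trials=100, scale=1.0, top_k=5):
    """Search the low-dimensional active subspace for high-performing decoders;
    random search stands in for the black-box optimizer."""
    rng = np.random.default_rng(0)
    d_active = W.shape[1]
    candidates = [rng.normal(0.0, scale, size=d_active) for _ in range(n_trials)]
    scored = [(score_decoder(z, theta_pretrained, W,
                             decode_molecules, property_oracle), z)
              for z in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]                            # a diverse set of strong decoders
```

The returned coordinates correspond to distinct decoders drawn from the model uncertainty class, which can then be used jointly to generate enhanced molecules.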
Related papers
- Cliqueformer: Model-Based Optimization with Structured Transformers [102.55764949282906]
We develop Cliqueformer, a model that learns the structure of a model-based optimization (MBO) task and empirically leads to improved designs.
We evaluate Cliqueformer on various tasks, ranging from high-dimensional black-box functions to real-world tasks of chemical and genetic design.
arXiv Detail & Related papers (2024-10-17T00:35:47Z)
- Bridging Model-Based Optimization and Generative Modeling via Conservative Fine-Tuning of Diffusion Models [54.132297393662654]
We introduce a hybrid method that fine-tunes cutting-edge diffusion models via RL to optimize learned reward models.
We demonstrate the capability of our approach to outperform the best designs in offline data, leveraging the extrapolation capabilities of reward models.
arXiv Detail & Related papers (2024-05-30T03:57:29Z)
- Leveraging Active Subspaces to Capture Epistemic Model Uncertainty in Deep Generative Models for Molecular Design [2.0701439270461184]
Generative molecular design models have seen fewer efforts on uncertainty quantification (UQ) due to computational challenges in Bayesian inference posed by their large number of parameters.
In this work, we focus on the junction-tree variational autoencoder (JT-VAE), a popular model for generative molecular design, and address this issue by leveraging the low dimensional active subspace to capture the uncertainty in the model parameters.
arXiv Detail & Related papers (2024-04-30T21:10:51Z)
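A minimal sketch of the active-subspace construction referenced in the entry above: the subspace is spanned by the dominant eigenvectors of the expected gradient outer-product matrix. The explicit matrix build below is purely illustrative; for parameter counts like JT-VAE's, a matrix-free or randomized decomposition of the gradient matrix would be needed in practice.

```python
import numpy as np

def active_subspace(grad_samples, d_active):
    """Estimate an active subspace from Monte Carlo gradient samples.

    grad_samples: (n_samples, n_params) array of gradients of the model
    objective with respect to the parameters.
    Returns W (n_params, d_active): the top eigenvectors of the empirical
    gradient outer-product matrix C = E[g g^T], which capture most of the
    variability in the model's output."""
    G = np.asarray(grad_samples)
    C = G.T @ G / G.shape[0]                  # empirical estimate of E[g g^T]
    eigvals, eigvecs = np.linalg.eigh(C)      # eigh returns ascending eigenvalues
    return eigvecs[:, np.argsort(eigvals)[::-1][:d_active]]
```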
- Diffusion Model for Data-Driven Black-Box Optimization [54.25693582870226]
We focus on diffusion models, a powerful generative AI technology, and investigate their potential for black-box optimization.
We study two practical types of labels: 1) noisy measurements of a real-valued reward function and 2) human preference based on pairwise comparisons.
Our proposed method reformulates the design optimization problem into a conditional sampling problem, which allows us to leverage the power of diffusion models.
arXiv Detail & Related papers (2024-03-20T00:41:12Z)
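The reformulation in the entry above, sampling designs conditioned on a desired label rather than optimizing them directly, can be pictured with a toy DDPM-style ancestral sampler. Here `eps_model` is a hypothetical trained conditional noise predictor and the noise schedule is illustrative; this is a sketch of the idea, not the paper's algorithm.

```python
import numpy as np

def conditional_sample(eps_model, y_target, dim, n_steps=50, rng=None):
    """Draw x ~ p(x | y = y_target) with a toy DDPM ancestral sampler,
    turning design optimization into conditional generation.
    eps_model(x, t, y) is an assumed trained conditional denoiser."""
    rng = rng or np.random.default_rng()
    betas = np.linspace(1e-4, 0.02, n_steps)       # illustrative linear schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x = rng.normal(size=dim)                       # start from pure noise
    for t in reversed(range(n_steps)):
        eps = eps_model(x, t, y_target)            # noise prediction given the label
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:                                  # no noise added on the final step
            x += np.sqrt(betas[t]) * rng.normal(size=dim)
    return x
```

Setting `y_target` to a high reward value (or a preferred comparison outcome) steers sampling toward promising designs.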
- Masked Particle Modeling on Sets: Towards Self-Supervised High Energy Physics Foundation Models [4.299997052226609]
Masked particle modeling (MPM) is a self-supervised method for learning generic, transferable, and reusable representations on unordered sets of inputs.
We study the efficacy of the method in samples of high energy jets at collider physics experiments.
arXiv Detail & Related papers (2024-01-24T15:46:32Z)
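A minimal sketch of the masking step behind MPM, assuming jets are stored as (n_particles, n_features) arrays; the mask ratio and mask value are illustrative choices, not the paper's.

```python
import numpy as np

def mask_particles(jet, mask_ratio=0.3, mask_value=0.0, rng=None):
    """Randomly mask a fraction of particles (rows) in a jet, returning the
    corrupted set plus the boolean mask so a model can be trained to
    reconstruct the hidden particles. Since the inputs form an unordered
    set, which rows are masked carries no positional meaning."""
    rng = rng or np.random.default_rng()
    n = jet.shape[0]
    n_mask = max(1, int(mask_ratio * n))
    idx = rng.choice(n, size=n_mask, replace=False)
    mask = np.zeros(n, dtype=bool)
    mask[idx] = True
    corrupted = jet.copy()
    corrupted[mask] = mask_value          # replace masked particles with a token value
    return corrupted, mask                # the model learns to predict jet[mask]
```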
- Revisiting Implicit Models: Sparsity Trade-offs Capability in Weight-tied Model for Vision Tasks [4.872984658007499]
Implicit models such as Deep Equilibrium Models (DEQs) have garnered significant attention in the community for their ability to train infinite-layer models.
We revisit the line of implicit models and trace them back to the original weight-tied models.
Surprisingly, we observe that weight-tied models are more effective, stable, and efficient on vision tasks than the DEQ variants.
arXiv Detail & Related papers (2023-07-16T11:45:35Z)
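The weight-tied versus DEQ contrast in the entry above amounts to unrolling one shared layer a fixed number of times instead of solving its fixed point implicitly. A toy unrolled block (the tanh layer and iteration count are illustrative assumptions) looks like this:

```python
import numpy as np

def weight_tied_forward(x, W, n_iters=20):
    """Unrolled weight-tied block: the same weights W are reused at every
    'layer', z_{k+1} = tanh(W @ z_k + x). A DEQ would instead solve the
    fixed-point equation z* = tanh(W @ z* + x) implicitly; the entry above
    reports the plain unrolled version is often more effective and stable."""
    z = np.zeros_like(x)
    for _ in range(n_iters):
        z = np.tanh(W @ z + x)    # tied weights: an infinite-layer model in the limit
    return z
```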
- Molecule Design by Latent Space Energy-Based Modeling and Gradual Distribution Shifting [53.44684898432997]
Generation of molecules with desired chemical and biological properties is critical for drug discovery.
We propose a probabilistic generative model to capture the joint distribution of molecules and their properties.
Our method achieves very strong performance on various molecule design tasks.
arXiv Detail & Related papers (2023-06-09T03:04:21Z)
- A Survey on Generative Diffusion Model [75.93774014861978]
Diffusion models are an emerging class of deep generative models.
They have certain limitations, including a time-consuming iterative generation process and confinement to high-dimensional Euclidean space.
This survey presents a wide range of advanced techniques aimed at enhancing diffusion models.
arXiv Detail & Related papers (2022-09-06T16:56:21Z)
- Molecular Attributes Transfer from Non-Parallel Data [57.010952598634944]
We formulate molecular optimization as a style transfer problem and present a novel generative model that automatically learns internal differences between two groups of non-parallel data.
Experiments on two molecular optimization tasks, toxicity modification and synthesizability improvement, demonstrate that our model significantly outperforms several state-of-the-art methods.
arXiv Detail & Related papers (2021-11-30T06:10:22Z)
- Physics-Constrained Predictive Molecular Latent Space Discovery with Graph Scattering Variational Autoencoder [0.0]
We develop a molecular generative model based on variational inference and graph theory in the small data regime.
The model's performance is evaluated by generating molecules with desired target properties.
arXiv Detail & Related papers (2020-09-29T09:05:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.