Score-Based Generative Models for Molecule Generation
- URL: http://arxiv.org/abs/2203.04698v1
- Date: Mon, 7 Mar 2022 13:46:02 GMT
- Title: Score-Based Generative Models for Molecule Generation
- Authors: Dwaraknath Gnaneshwar, Bharath Ramsundar, Dhairya Gandhi, Rachel
Kurchin, Venkatasubramanian Viswanathan
- Abstract summary: We train a Transformer-based score function on representations of 1.5 million samples from the ZINC dataset.
We use the Moses benchmarking framework to evaluate the generated samples on a suite of metrics.
- Score: 0.8808021343665321
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in generative models have made exploring design spaces easier
for de novo molecule generation. However, popular generative models like GANs
and normalizing flows face challenges such as training instabilities due to
adversarial training and architectural constraints, respectively. Score-based
generative models sidestep these challenges by modelling the gradient of the
log probability density using a score function approximation, as opposed to
modelling the density function directly, and sampling from it using annealed
Langevin Dynamics. We believe that score-based generative models could open up
new opportunities in molecule generation due to their architectural
flexibility, such as replacing the score function with an SE(3) equivariant
model. In this work, we lay the foundations by testing the efficacy of
score-based models for molecule generation. We train a Transformer-based score
function on Self-Referencing Embedded Strings (SELFIES) representations of 1.5
million samples from the ZINC dataset and use the Moses benchmarking framework
to evaluate the generated samples on a suite of metrics.
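To make the sampling procedure concrete, here is a minimal NumPy sketch of annealed Langevin dynamics, assuming a trained score network score_fn(x, sigma) that approximates the score grad_x log p_sigma(x); the noise schedule, step-size rule, and the analytic toy score are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def annealed_langevin_sample(score_fn, x0, sigmas, n_steps=100, eps=2e-5, seed=0):
    """Annealed Langevin dynamics (Song & Ermon, 2019).

    score_fn(x, sigma) should approximate grad_x log p_sigma(x);
    sigmas is a decreasing sequence of noise levels."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for sigma in sigmas:
        alpha = eps * (sigma / sigmas[-1]) ** 2  # per-level step size
        for _ in range(n_steps):
            z = rng.standard_normal(x.shape)
            x = x + 0.5 * alpha * score_fn(x, sigma) + np.sqrt(alpha) * z
    return x

# Toy check with an analytically known score: if the data are N(0, I) and are
# perturbed with noise of scale sigma, then grad_x log p_sigma(x) = -x / (1 + sigma**2).
toy_score = lambda x, sigma: -x / (1.0 + sigma ** 2)
x0 = np.random.default_rng(1).uniform(-1.0, 1.0, size=(64, 16))
samples = annealed_langevin_sample(toy_score, x0, sigmas=np.geomspace(1.0, 0.01, 10))
print(samples.mean(), samples.std())  # should land near 0 and 1
```

The step size alpha_i = eps * (sigma_i / sigma_L)^2 follows the schedule suggested by Song & Ermon (2019): smaller noise levels take proportionally smaller steps.

On the data side, a hedged sketch of the SELFIES encoding and Moses evaluation steps, using the public selfies and molsets (moses) packages; the aspirin SMILES and single-element sample list are placeholders for the ZINC training data and the model's actual generations:

```python
import selfies as sf  # pip install selfies
import moses          # pip install molsets

# Encode a SMILES string into SELFIES; any syntactically valid SELFIES string
# decodes back to a valid molecule, which is why it suits unconstrained sampling.
smiles = "CC(=O)Oc1ccccc1C(=O)O"  # aspirin, an illustrative stand-in for ZINC data
selfies_str = sf.encoder(smiles)
tokens = list(sf.split_selfies(selfies_str))  # token sequence fed to the Transformer

# After sampling, decode generated SELFIES back to SMILES and score them with
# the Moses benchmark suite (validity, uniqueness, novelty, FCD, ...).
generated = [sf.decoder(s) for s in [selfies_str]]  # placeholder for real samples
metrics = moses.get_all_metrics(generated)
print(metrics)
```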
Related papers
- Supervised Score-Based Modeling by Gradient Boosting [49.556736252628745]
We propose a Supervised Score-based Model (SSM), which can be viewed as a gradient boosting algorithm combined with score matching.
We provide a theoretical analysis of learning and sampling for SSM to balance inference time and prediction accuracy.
Our model outperforms existing models in both accuracy and inference time.
arXiv Detail & Related papers (2024-11-02T07:06:53Z)
- Flow Generator Matching [35.371071097381346]
Flow Generator Matching (FGM) is designed to distill the sampling of flow-matching models into one-step generation.
On the CIFAR10 unconditional generation benchmark, our one-step FGM model achieves a new record Fréchet Inception Distance (FID) score of 3.08.
Our MM-DiT-FGM one-step text-to-image model demonstrates outstanding industry-level performance.
arXiv Detail & Related papers (2024-10-25T05:41:28Z)
- Embedding-based statistical inference on generative models [10.948308354932639]
We extend results related to embedding-based representations of generative models to classical statistical inference settings.
We demonstrate that using the perspective space as the basis for a notion of similarity is effective for multiple model-level inference tasks.
arXiv Detail & Related papers (2024-10-01T22:28:39Z)
- Improving Non-autoregressive Generation with Mixup Training [51.61038444990301]
We present a non-autoregressive generation model based on pre-trained transformer models.
We propose a simple and effective iterative training method called MIx Source and pseudo Target (MIST).
Our experiments on three generation benchmarks, including question generation, summarization, and paraphrase generation, show that the proposed framework achieves new state-of-the-art results.
arXiv Detail & Related papers (2021-10-21T13:04:21Z)
- Anomaly Detection of Time Series with Smoothness-Inducing Sequential Variational Auto-Encoder [59.69303945834122]
We present a Smoothness-Inducing Sequential Variational Auto-Encoder (SISVAE) model for robust estimation and anomaly detection of time series.
Our model parameterizes mean and variance for each time-stamp with flexible neural networks.
We show the effectiveness of our model on both synthetic datasets and public real-world benchmarks.
arXiv Detail & Related papers (2021-02-02T06:15:15Z)
- Score-Based Generative Modeling through Stochastic Differential Equations [114.39209003111723]
We present a stochastic differential equation (SDE) that smoothly transforms a complex data distribution into a known prior distribution by slowly injecting noise.
A corresponding reverse-time SDE transforms the prior distribution back into the data distribution by slowly removing the noise.
By leveraging advances in score-based generative modeling, we can accurately estimate these scores with neural networks.
We demonstrate high fidelity generation of 1024 x 1024 images for the first time from a score-based generative model.
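For orientation, the forward (noising) and reverse-time SDEs in this framework take the standard form

    $dx = f(x, t)\,dt + g(t)\,dw$

    $dx = \big[f(x, t) - g(t)^2\,\nabla_x \log p_t(x)\big]\,dt + g(t)\,d\bar{w}$

where $w$ is a forward-time and $\bar{w}$ a reverse-time Wiener process; the score $\nabla_x \log p_t(x)$ is exactly the quantity the neural network is trained to estimate.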
arXiv Detail & Related papers (2020-11-26T19:39:10Z)
- Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose an FMR model that finds sample clusters and jointly models multiple incomplete mixed-type targets.
We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework.
The results show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-10-12T03:27:07Z)
- Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short Python expressions that evaluate to a given target value.
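For context, here is a minimal PyTorch sketch of the standard score-function (REINFORCE) estimator for such an expected-reward objective; this is the generic baseline form under assumed tensor shapes, not necessarily the estimator this paper proposes.

```python
import torch

def reinforce_loss(log_probs: torch.Tensor, rewards: torch.Tensor) -> torch.Tensor:
    """Score-function gradient for maximizing E[R(x)] over sampled sequences x.

    log_probs: (batch, seq_len) per-token log-probs of the sampled sequences.
    rewards:   (batch,) scalar reward per sequence, e.g. a molecular property.
    """
    baseline = rewards.mean()            # simple baseline for variance reduction
    seq_log_prob = log_probs.sum(dim=1)  # log p_theta(x) of each sequence
    # Minimizing this loss ascends E[(R - baseline) * grad log p_theta(x)].
    return -((rewards - baseline).detach() * seq_log_prob).mean()
```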
arXiv Detail & Related papers (2020-10-05T20:03:13Z)
- Variational Mixture of Normalizing Flows [0.0]
Deep generative models, such as generative adversarial networks (GANs), variational autoencoders (VAEs), and their variants, have seen wide adoption for the task of modelling complex data distributions.
Normalizing flows address the lack of tractable exact likelihoods in such models by leveraging the change-of-variables formula for probability density functions.
The present work goes further by using normalizing flows as components in a mixture model and devising an end-to-end training procedure for such a model.
arXiv Detail & Related papers (2020-09-01T17:20:08Z)
- Improving Molecular Design by Stochastic Iterative Target Augmentation [38.44457632751997]
Generative models in molecular design tend to be richly parameterized, data-hungry neural models.
We propose a surprisingly effective self-training approach for iteratively creating additional molecular targets.
Our approach outperforms the previous state-of-the-art in conditional molecular design by over 10% in absolute gain.
arXiv Detail & Related papers (2020-02-11T22:40:04Z)