Score-Based Generative Models for Molecule Generation
- URL: http://arxiv.org/abs/2203.04698v1
- Date: Mon, 7 Mar 2022 13:46:02 GMT
- Title: Score-Based Generative Models for Molecule Generation
- Authors: Dwaraknath Gnaneshwar, Bharath Ramsundar, Dhairya Gandhi, Rachel
Kurchin, Venkatasubramanian Viswanathan
- Abstract summary: We train a Transformer-based score function on representations of 1.5 million samples from the ZINC dataset.
We use the Moses benchmarking framework to evaluate the generated samples on a suite of metrics.
- Score: 0.8808021343665321
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in generative models have made exploring design spaces easier
for de novo molecule generation. However, popular generative models like GANs
and normalizing flows face challenges such as training instabilities due to
adversarial training and architectural constraints, respectively. Score-based
generative models sidestep these challenges by modelling the gradient of the
log probability density using a score function approximation, as opposed to
modelling the density function directly, and sampling from it using annealed
Langevin Dynamics. We believe that score-based generative models could open up
new opportunities in molecule generation due to their architectural
flexibility, such as replacing the score function with an SE(3) equivariant
model. In this work, we lay the foundations by testing the efficacy of
score-based models for molecule generation. We train a Transformer-based score
function on Self-Referencing Embedded Strings (SELFIES) representations of 1.5
million samples from the ZINC dataset and use the Moses benchmarking framework
to evaluate the generated samples on a suite of metrics.
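To make the sampling procedure concrete, here is a minimal NumPy sketch of annealed Langevin dynamics, assuming a trained score network score_fn(x, sigma) that approximates the score grad_x log p_sigma(x); the noise schedule, step-size rule, and the analytic toy score are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def annealed_langevin_sample(score_fn, x0, sigmas, n_steps=100, eps=2e-5, seed=0):
    """Annealed Langevin dynamics (Song & Ermon, 2019).

    score_fn(x, sigma) should approximate grad_x log p_sigma(x);
    sigmas is a decreasing sequence of noise levels."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for sigma in sigmas:
        alpha = eps * (sigma / sigmas[-1]) ** 2  # per-level step size
        for _ in range(n_steps):
            z = rng.standard_normal(x.shape)
            x = x + 0.5 * alpha * score_fn(x, sigma) + np.sqrt(alpha) * z
    return x

# Toy check with an analytically known score: if the data are N(0, I) and are
# perturbed with noise of scale sigma, then grad_x log p_sigma(x) = -x / (1 + sigma**2).
toy_score = lambda x, sigma: -x / (1.0 + sigma ** 2)
x0 = np.random.default_rng(1).uniform(-1.0, 1.0, size=(64, 16))
samples = annealed_langevin_sample(toy_score, x0, sigmas=np.geomspace(1.0, 0.01, 10))
print(samples.mean(), samples.std())  # should land near 0 and 1
```

The step size alpha_i = eps * (sigma_i / sigma_L)^2 follows the schedule suggested by Song & Ermon (2019): smaller noise levels take proportionally smaller steps.

On the data side, a hedged sketch of the SELFIES encoding and Moses evaluation steps, using the public selfies and molsets (moses) packages; the aspirin SMILES and single-element sample list are placeholders for the ZINC training data and the model's actual generations:

```python
import selfies as sf  # pip install selfies
import moses          # pip install molsets

# Encode a SMILES string into SELFIES; any syntactically valid SELFIES string
# decodes back to a valid molecule, which is why it suits unconstrained sampling.
smiles = "CC(=O)Oc1ccccc1C(=O)O"  # aspirin, an illustrative stand-in for ZINC data
selfies_str = sf.encoder(smiles)
tokens = list(sf.split_selfies(selfies_str))  # token sequence fed to the Transformer

# After sampling, decode generated SELFIES back to SMILES and score them with
# the Moses benchmark suite (validity, uniqueness, novelty, FCD, ...).
generated = [sf.decoder(s) for s in [selfies_str]]  # placeholder for real samples
metrics = moses.get_all_metrics(generated)
print(metrics)
```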
Related papers
- Supervised Score-Based Modeling by Gradient Boosting [49.556736252628745]
We propose a Supervised Score-based Model (SSM), which can be viewed as a gradient boosting algorithm combined with score matching.
We provide a theoretical analysis of learning and sampling for SSM to balance inference time and prediction accuracy.
Our model outperforms existing models in both accuracy and inference time.
arXiv Detail & Related papers (2024-11-02T07:06:53Z)
- Flow Generator Matching [35.371071097381346]
Flow Generator Matching (FGM) is designed to distill the sampling of flow-matching models into one-step generation.
On the CIFAR10 unconditional generation benchmark, our one-step FGM model achieves a new record Fréchet Inception Distance (FID) score of 3.08.
Our MM-DiT-FGM one-step text-to-image model demonstrates outstanding industry-level performance.
arXiv Detail & Related papers (2024-10-25T05:41:28Z)
- Embedding-based statistical inference on generative models [10.948308354932639]
We extend results related to embedding-based representations of generative models to classical statistical inference settings.
We demonstrate that using the perspective space as the basis for a notion of similarity is effective for multiple model-level inference tasks.
arXiv Detail & Related papers (2024-10-01T22:28:39Z)
- Improving Non-autoregressive Generation with Mixup Training [51.61038444990301]
We present a non-autoregressive generation model based on pre-trained transformer models.
We propose a simple and effective iterative training method called MIx Source and pseudo Target (MIST).
Our experiments on three generation benchmarks, including question generation, summarization, and paraphrase generation, show that the proposed framework achieves new state-of-the-art results.
arXiv Detail & Related papers (2021-10-21T13:04:21Z)
- Anomaly Detection of Time Series with Smoothness-Inducing Sequential Variational Auto-Encoder [59.69303945834122]
We present a Smoothness-Inducing Sequential Variational Auto-Encoder (SISVAE) model for robust estimation and anomaly detection of time series.
Our model parameterizes mean and variance for each time-stamp with flexible neural networks.
We show the effectiveness of our model on both synthetic datasets and public real-world benchmarks.
arXiv Detail & Related papers (2021-02-02T06:15:15Z)
- Score-Based Generative Modeling through Stochastic Differential Equations [114.39209003111723]
We present a stochastic differential equation (SDE) that smoothly transforms a complex data distribution into a known prior distribution by slowly injecting noise.
A corresponding reverse-time SDE transforms the prior distribution back into the data distribution by slowly removing the noise.
By leveraging advances in score-based generative modeling, we can accurately estimate these scores with neural networks.
We demonstrate high fidelity generation of 1024 x 1024 images for the first time from a score-based generative model.
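For orientation, the forward (noising) and reverse-time SDEs in this framework take the standard form

    $dx = f(x, t)\,dt + g(t)\,dw$

    $dx = \big[f(x, t) - g(t)^2\,\nabla_x \log p_t(x)\big]\,dt + g(t)\,d\bar{w}$

where $w$ is a forward-time and $\bar{w}$ a reverse-time Wiener process; the score $\nabla_x \log p_t(x)$ is exactly the quantity the neural network is trained to estimate.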
arXiv Detail & Related papers (2020-11-26T19:39:10Z)
- Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose an FMR model that finds sample clusters and jointly models multiple incomplete mixed-type targets.
We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework.
The results show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-10-12T03:27:07Z)
- Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short Python expressions that evaluate to a given target value.
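For context, here is a minimal PyTorch sketch of the standard score-function (REINFORCE) estimator for such an expected-reward objective; this is the generic baseline form under assumed tensor shapes, not necessarily the estimator this paper proposes.

```python
import torch

def reinforce_loss(log_probs: torch.Tensor, rewards: torch.Tensor) -> torch.Tensor:
    """Score-function gradient for maximizing E[R(x)] over sampled sequences x.

    log_probs: (batch, seq_len) per-token log-probs of the sampled sequences.
    rewards:   (batch,) scalar reward per sequence, e.g. a molecular property.
    """
    baseline = rewards.mean()            # simple baseline for variance reduction
    seq_log_prob = log_probs.sum(dim=1)  # log p_theta(x) of each sequence
    # Minimizing this loss ascends E[(R - baseline) * grad log p_theta(x)].
    return -((rewards - baseline).detach() * seq_log_prob).mean()
```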
arXiv Detail & Related papers (2020-10-05T20:03:13Z)
- Variational Mixture of Normalizing Flows [0.0]
Deep generative models, such as generative adversarial networks (GANs), variational autoencoders (VAEs), and their variants, have seen wide adoption for the task of modelling complex data distributions.
Normalizing flows address the lack of tractable exact likelihoods in such models by leveraging the change-of-variables formula for probability density functions.
The present work goes further by using normalizing flows as components in a mixture model and devising an end-to-end training procedure for such a model.
arXiv Detail & Related papers (2020-09-01T17:20:08Z)
- Improving Molecular Design by Stochastic Iterative Target Augmentation [38.44457632751997]
Generative models in molecular design tend to be richly parameterized, data-hungry neural models.
We propose a surprisingly effective self-training approach for iteratively creating additional molecular targets.
Our approach outperforms the previous state-of-the-art in conditional molecular design by over 10% in absolute gain.
arXiv Detail & Related papers (2020-02-11T22:40:04Z)