Bioptic -- A Target-Agnostic Potency-Based Small Molecules Search Engine
- URL: http://arxiv.org/abs/2406.14572v3
- Date: Mon, 1 Jul 2024 01:33:10 GMT
- Title: Bioptic -- A Target-Agnostic Potency-Based Small Molecules Search Engine
- Authors: Vlad Vinogradov, Ivan Izmailov, Simon Steshin, Kong T. Nguyen,
- Abstract summary: We develop a target-agnostic, efficacy-based molecule search model.
We screen the ultra-large 40B Enamine REAL library with a 100% recall rate.
We benchmarked our model and several state-of-the-art models for both speed performance and retrieval quality of novel molecules.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent successes in virtual screening have been made possible by large models and extensive chemical libraries. However, combining these elements is challenging: the larger the model, the more expensive it is to run, making it infeasible to screen ultra-large libraries. To address this, we developed a target-agnostic, efficacy-based molecule search model, which allows us to find structurally dissimilar molecules with similar biological activities. We used best practices to design a fast retrieval system, based on processor-optimized SIMD instructions, enabling us to screen the ultra-large 40B Enamine REAL library with a 100% recall rate. We extensively benchmarked our model and several state-of-the-art models for both speed performance and retrieval quality of novel molecules.
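A 100% recall rate is what an exhaustive scan gives: every library embedding is scored against the query, so no true top hit can be dropped by an approximate index. The abstract does not describe the actual implementation, so the following is only a minimal sketch under stated assumptions: molecules are already embedded as fixed-length vectors, similarity is a dot product, and the names `exact_top_k` and `chunk_size` are hypothetical. NumPy's vectorized matrix product stands in here for the hand-tuned SIMD kernels mentioned in the abstract.

```python
import numpy as np


def exact_top_k(query, library, k, chunk_size=1024):
    """Return (indices, scores) of the k library rows with the highest
    dot-product similarity to `query`.

    Hypothetical helper, not the paper's code. Every row is scored, so
    recall with respect to the embedding similarity is 100%.
    """
    query = np.asarray(query, dtype=np.float32)
    best_idx = np.empty(0, dtype=np.int64)
    best_score = np.empty(0, dtype=np.float32)

    # Stream the library in chunks so only one chunk is scored at a time.
    for start in range(0, len(library), chunk_size):
        chunk = np.asarray(library[start:start + chunk_size], dtype=np.float32)
        scores = chunk @ query  # vectorized scoring of the whole chunk
        idx = np.arange(start, start + len(chunk), dtype=np.int64)

        # Merge the chunk with the running top-k and keep the k best.
        all_idx = np.concatenate([best_idx, idx])
        all_score = np.concatenate([best_score, scores])
        keep = np.argsort(-all_score, kind="stable")[:k]
        best_idx, best_score = all_idx[keep], all_score[keep]

    return best_idx, best_score


# Toy usage: 8 one-hot "embeddings" scaled so that row i scores i + 1.
toy_library = np.eye(8, dtype=np.float32) * np.arange(1, 9, dtype=np.float32)[:, None]
toy_query = np.ones(8, dtype=np.float32)
top_idx, top_score = exact_top_k(toy_query, toy_library, k=3, chunk_size=3)
```

The running top-k merge keeps memory bounded by one chunk plus k entries, which is the property that matters when the library is far larger than RAM. A production system at the 40B scale would presumably also compress the embeddings and shard the scan, which this sketch does not attempt.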
Related papers
- Generative Modeling of Molecular Dynamics Trajectories [12.255021091552441]
We introduce generative modeling of molecular trajectories as a paradigm for learning flexible multi-task surrogate models of MD from data.
We show such generative models can be adapted to diverse tasks such as forward simulation, transition path sampling, and trajectory upsampling.
arXiv Detail & Related papers (2024-09-26T13:02:28Z)
- RGFN: Synthesizable Molecular Generation Using GFlowNets [51.33672611338754]
We propose Reaction-GFlowNet, an extension of the GFlowNet framework that operates directly in the space of chemical reactions.
RGFN allows out-of-the-box synthesizability while maintaining comparable quality of generated candidates.
We demonstrate the effectiveness of the proposed approach across a range of oracle models, including pretrained proxy models and GPU-accelerated docking.
arXiv Detail & Related papers (2024-06-01T13:11:11Z)
- Guided Multi-objective Generative AI to Enhance Structure-based Drug Design [0.0]
We describe IDOLpro, a generative chemistry AI combining diffusion with multi-objective optimization for structure-based drug design.
IDOLpro produces ligands with binding affinities over 10%-20% better than the next best state-of-the-art method on each test set.
We show that IDOLpro can generate molecules for a range of important disease-related targets with better binding affinity and synthetic accessibility than any molecule found in the virtual screen.
arXiv Detail & Related papers (2024-05-20T05:08:55Z)
- Implicit Geometry and Interaction Embeddings Improve Few-Shot Molecular Property Prediction [53.06671763877109]
We develop molecular embeddings that encode complex molecular characteristics to improve the performance of few-shot molecular property prediction.
Our approach leverages large amounts of synthetic data, namely the results of molecular docking calculations.
On multiple molecular property prediction benchmarks, training from the embedding space substantially improves Multi-Task, MAML, and Prototypical Network few-shot learning performance.
arXiv Detail & Related papers (2023-02-04T01:32:40Z)
- Molecular Fingerprints for Robust and Efficient ML-Driven Molecular Generation [0.0]
We propose a novel molecular fingerprint-based variational autoencoder applied for molecular generation on real-world drug molecules.
We observe a substantial improvement in chemical synthetic accessibility ($\Delta\bar{SAS}$ = -0.83) and in computational efficiency up to 5.9x in comparison to an existing state-of-the-art SMILES-based architecture.
arXiv Detail & Related papers (2022-11-16T18:07:43Z)
- Retrieval-based Controllable Molecule Generation [63.44583084888342]
We propose a new retrieval-based framework for controllable molecule generation.
We use a small set of molecules to steer the pre-trained generative model towards synthesizing molecules that satisfy the given design criteria.
Our approach is agnostic to the choice of generative models and requires no task-specific fine-tuning.
arXiv Detail & Related papers (2022-08-23T17:01:16Z)
- CorpusBrain: Pre-train a Generative Retrieval Model for Knowledge-Intensive Language Tasks [62.22920673080208]
A single-step generative model can dramatically simplify the search process and be optimized in an end-to-end manner.
We name the pre-trained generative retrieval model CorpusBrain, as all information about the corpus is encoded in its parameters without the need to construct an additional index.
arXiv Detail & Related papers (2022-08-16T10:22:49Z)
- FastFlows: Flow-Based Models for Molecular Graph Generation [4.9252608053969675]
FastFlows generates thousands of chemically valid molecules in seconds.
Our model is significantly simpler and easier to train than autoregressive molecular generative models.
arXiv Detail & Related papers (2022-01-28T21:08:31Z)
- Molecular Attributes Transfer from Non-Parallel Data [57.010952598634944]
We formulate molecular optimization as a style transfer problem and present a novel generative model that could automatically learn internal differences between two groups of non-parallel data.
Experiments on two molecular optimization tasks, toxicity modification and synthesizability improvement, demonstrate that our model significantly outperforms several state-of-the-art methods.
arXiv Detail & Related papers (2021-11-30T06:10:22Z)
- Hybrid modeling: Applications in real-time diagnosis [64.5040763067757]
We outline a novel hybrid modeling approach that combines machine learning inspired models and physics-based models.
We are using such models for real-time diagnosis applications.
arXiv Detail & Related papers (2020-03-04T00:44:57Z)