Towards DNA-Encoded Library Generation with GFlowNets
- URL: http://arxiv.org/abs/2404.10094v1
- Date: Mon, 15 Apr 2024 19:01:20 GMT
- Title: Towards DNA-Encoded Library Generation with GFlowNets
- Authors: Michał Koziarski, Mohammed Abukalam, Vedant Shah, Louis Vaillancourt, Doris Alexandra Schuetz, Moksh Jain, Almer van der Sloot, Mathieu Bourgey, Anne Marinier, Yoshua Bengio
- Abstract summary: One of the key challenges in using DELs is library design.
In this paper we consider the task of protein-protein interaction (PPI) biased DEL design.
We evaluate several machine learning algorithms on the PPI modulation task and use them as a reward for the proposed GFlowNet-based generative approach.
- Score: 35.09890349911668
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: DNA-encoded libraries (DELs) are a powerful approach for rapidly screening large numbers of diverse compounds. One of the key challenges in using DELs is library design, which involves choosing the building blocks that will be combinatorially combined to produce the final library. In this paper we consider the task of protein-protein interaction (PPI) biased DEL design. To this end, we evaluate several machine learning algorithms on the PPI modulation task and use them as a reward for the proposed GFlowNet-based generative approach. We additionally investigate the possibility of using structural information about building blocks to design a hierarchical action space for the GFlowNet. The observed results indicate that GFlowNets are a promising approach for generating diverse combinatorial library candidates.
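The abstract's setup (building blocks combined combinatorially, with a learned model acting as the reward) can be illustrated with a toy sketch. It is not the paper's method: the block names, the `toy_reward` function, and the exhaustive enumeration are all hypothetical placeholders. A GFlowNet is trained so that it samples candidates with probability proportional to their reward; here the library is small enough to enumerate, so that target distribution is sampled directly instead of being learned by a sequential policy.

```python
import itertools
import random

# Hypothetical building blocks for a three-position library (placeholder names).
BLOCKS = [
    ["A1", "A2", "A3"],
    ["B1", "B2"],
    ["C1", "C2", "C3"],
]


def toy_reward(candidate):
    """Stand-in for a learned PPI-modulation proxy; always positive.

    Assumption for illustration only: the score is 1 plus the sum of the
    numeric suffixes of the chosen blocks.
    """
    return 1.0 + sum(int(name[1:]) for name in candidate)


def enumerate_library(blocks):
    """Return every combination of one block per position."""
    return list(itertools.product(*blocks))


def sample_proportional(library, reward_fn, k, seed=0):
    """Draw k candidates with probability proportional to their reward.

    This is the distribution a GFlowNet is trained to approximate; a real
    library is far too large to enumerate like this.
    """
    rng = random.Random(seed)
    weights = [reward_fn(c) for c in library]
    return rng.choices(library, weights=weights, k=k)


library = enumerate_library(BLOCKS)
samples = sample_proportional(library, toy_reward, k=5)
for candidate in samples:
    print(candidate, toy_reward(candidate))
```

Sampling in proportion to reward, rather than taking only the top-scoring candidate, is what yields a diverse set of high-reward library candidates.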
Related papers
- Improving GFlowNets with Monte Carlo Tree Search [6.497027864860203]
Recent studies have revealed strong connections between GFlowNets and entropy-regularized reinforcement learning.
We propose to enhance the planning capabilities of GFlowNets by applying Monte Carlo Tree Search (MCTS).
Our experiments demonstrate that this approach improves the sample efficiency of GFlowNet training and the generation fidelity of pre-trained GFlowNet models.
arXiv Detail & Related papers (2024-06-19T15:58:35Z)
- RGFN: Synthesizable Molecular Generation Using GFlowNets [51.33672611338754]
We propose Reaction-GFlowNet, an extension of the GFlowNet framework that operates directly in the space of chemical reactions.
RGFN allows out-of-the-box synthesizability while maintaining comparable quality of generated candidates.
We demonstrate the effectiveness of the proposed approach across a range of oracle models, including pretrained proxy models and GPU-accelerated docking.
arXiv Detail & Related papers (2024-06-01T13:11:11Z)
- Let the Flows Tell: Solving Graph Combinatorial Optimization Problems with GFlowNets [86.43523688236077]
Combinatorial optimization (CO) problems are often NP-hard and out of reach for exact algorithms.
GFlowNets have emerged as a powerful machinery to efficiently sample from composite unnormalized densities sequentially.
In this paper, we design Markov decision processes (MDPs) for different problems and propose to train conditional GFlowNets to sample from the solution space.
arXiv Detail & Related papers (2023-05-26T15:13:09Z)
- torchgfn: A PyTorch GFlowNet library [56.071033896777784]
torchgfn is a PyTorch library that aims to address this need.
It provides users with a simple API for environments and useful abstractions for samplers and losses.
arXiv Detail & Related papers (2023-05-24T00:20:59Z)
- An efficient graph generative model for navigating ultra-large combinatorial synthesis libraries [1.5495593104596397]
Virtual, make-on-demand chemical libraries have transformed early-stage drug discovery by unlocking vast, synthetically accessible regions of chemical space.
Recent years have witnessed rapid growth in these libraries from millions to trillions of compounds, hiding undiscovered, potent hits for a variety of therapeutic targets.
We propose the Combinatorial Synthesis Library Variational Auto-Encoder (CSLVAE) to overcome these challenges.
arXiv Detail & Related papers (2022-10-19T15:43:13Z)
- GFlowCausal: Generative Flow Networks for Causal Discovery [27.51595081346858]
We propose a novel approach to learning a Directed Acyclic Graph (DAG) from observational data called GFlowCausal.
GFlowCausal aims to learn the best policy to generate high-reward DAGs by sequential actions with probabilities proportional to predefined rewards.
We conduct extensive experiments on both synthetic and real datasets, and the results show that the proposed approach is superior and also performs well in large-scale settings.
arXiv Detail & Related papers (2022-10-15T04:07:39Z)
- Autoregressive Search Engines: Generating Substrings as Document Identifiers [53.0729058170278]
Autoregressive language models are emerging as the de-facto standard for generating answers.
Previous work has explored ways to partition the search space into hierarchical structures.
In this work we propose an alternative that doesn't force any structure in the search space: using all ngrams in a passage as its possible identifiers.
arXiv Detail & Related papers (2022-04-22T10:45:01Z)
- GPflux: A Library for Deep Gaussian Processes [31.207566616050574]
GPflux is a Python library for Bayesian deep learning with a strong emphasis on deep Gaussian processes (DGPs).
It is compatible with and built on top of the Keras deep learning eco-system.
GPflux relies on GPflow for most of its GP objects and operations, which makes it an efficient, modular and extendable library.
arXiv Detail & Related papers (2021-04-12T17:41:18Z)
- Torch-Struct: Deep Structured Prediction Library [138.5262350501951]
We introduce Torch-Struct, a library for structured prediction.
Torch-Struct includes a broad collection of probabilistic structures accessed through a simple and flexible distribution-based API.
arXiv Detail & Related papers (2020-02-03T16:43:02Z)