Related papers: ToDD: Topological Compound Fingerprinting in Computer-Aided Drug Discovery

URL: http://arxiv.org/abs/2211.03808v1
Date: Mon, 7 Nov 2022 19:00:05 GMT
Title: ToDD: Topological Compound Fingerprinting in Computer-Aided Drug Discovery
Authors: Andac Demir, Baris Coskunuzer, Ignacio Segovia-Dominguez, Yuzhou Chen, Yulia Gel, Bulent Kiziltan
Abstract summary: In computer-aided drug discovery (CADD), virtual screening is used for identifying the drug candidates that are most likely to bind to a molecular target in a large library of compounds. To improve virtual screening, we developed a novel method using multiparameter persistence (MP) homology that produces topological fingerprints of the compounds as multidimensional vectors. We show that the margin loss fine-tuning of pretrained Triplet networks attains highly competitive results in differentiating between compounds in the embedding space and ranking their likelihood of becoming effective drug candidates.
Score: 8.620443111346523
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: In computer-aided drug discovery (CADD), virtual screening (VS) is used for identifying the drug candidates that are most likely to bind to a molecular target in a large library of compounds. Most VS methods to date have focused on using canonical compound representations (e.g., SMILES strings, Morgan fingerprints) or generating alternative fingerprints of the compounds by training progressively more complex variational autoencoders (VAEs) and graph neural networks (GNNs). Although VAEs and GNNs led to significant improvements in VS performance, these methods suffer from reduced performance when scaling to large virtual compound datasets. The performance of these methods has shown only incremental improvements in the past few years. To address this problem, we developed a novel method using multiparameter persistence (MP) homology that produces topological fingerprints of the compounds as multidimensional vectors. Our primary contribution is framing the VS process as a new topology-based graph ranking problem by partitioning a compound into chemical substructures informed by the periodic properties of its atoms and extracting their persistent homology features at multiple resolution levels. We show that the margin loss fine-tuning of pretrained Triplet networks attains highly competitive results in differentiating between compounds in the embedding space and ranking their likelihood of becoming effective drug candidates. We further establish theoretical guarantees for the stability properties of our proposed MP signatures, and demonstrate that our models, enhanced by the MP signatures, outperform state-of-the-art methods on benchmark datasets by a wide and highly statistically significant margin (e.g., 93% gain for Cleves-Jain and 54% gain for DUD-E Diverse dataset).
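The margin loss mentioned in the abstract can be illustrated with a minimal sketch. This is the generic triplet margin loss with Euclidean distance, not the authors' implementation; the function name `triplet_margin_loss`, the margin value, and the toy 2-D embeddings are all assumptions made for illustration.

```python
import math

def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    d_pos = math.dist(anchor, positive)  # distance anchor -> same-class sample
    d_neg = math.dist(anchor, negative)  # distance anchor -> other-class sample
    return max(0.0, d_pos - d_neg + margin)

# Toy 2-D embeddings (made-up numbers, not taken from the paper).
anchor = [0.0, 0.0]
close_point = [0.0, 1.0]  # distance 1 from the anchor
far_point = [3.0, 4.0]    # distance 5 from the anchor

# Well-separated triplet: the positive is much closer than the negative.
easy = triplet_margin_loss(anchor, close_point, far_point)
# Violating triplet: the roles are swapped, so the loss is positive.
hard = triplet_margin_loss(anchor, far_point, close_point)
print(easy, hard)
```

In a ranking setting such as virtual screening, minimizing a loss of this form pulls embeddings of compounds with the same label together and pushes the others apart, so that distance in the embedding space can be used to rank candidates.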

Related papers

Heterophily-informed Message Passing [16.73251866177758]
Graph neural networks (GNNs) are known to be vulnerable to oversmoothing due to their implicit homophily assumption. We mitigate this problem with a novel scheme that regulates the aggregation of messages. Our approach relies solely on learnt embeddings, obviating the need for auxiliary labels.
arXiv Detail & Related papers (2025-04-28T13:28:23Z)
Teaching MLPs to Master Heterogeneous Graph-Structured Knowledge for Efficient and Accurate Inference [53.38082028252104]
We introduce HG2M and HG2M+ to combine both HGNNs' superior performance and MLPs' efficient inference. HG2M directly trains students with node features as input and soft labels from teacher HGNNs as targets. HG2Ms demonstrate a 379.24× speedup in inference over HGNNs on the large-scale IGB-3M-19 dataset.
arXiv Detail & Related papers (2024-11-21T11:39:09Z)
Unlocking Potential Binders: Multimodal Pretraining DEL-Fusion for Denoising DNA-Encoded Libraries [51.72836644350993]
Multimodal Pretraining DEL-Fusion model (MPDF). We develop pretraining tasks applying contrastive objectives between different compound representations and their text descriptions. We propose a novel DEL-fusion framework that amalgamates compound information at the atomic, submolecular, and molecular levels.
arXiv Detail & Related papers (2024-09-07T17:32:21Z)
Heterogenous Memory Augmented Neural Networks [84.29338268789684]
We introduce a novel heterogeneous memory augmentation approach for neural networks. By introducing learnable memory tokens with attention mechanism, we can effectively boost performance without huge computational overhead. We show our approach on various image and graph-based tasks under both in-distribution (ID) and out-of-distribution (OOD) conditions.
arXiv Detail & Related papers (2023-10-17T01:05:28Z)
ADMET property prediction through combinations of molecular fingerprints [0.0]
Random forests or support vector machines paired with extended-connectivity fingerprints consistently outperformed recently developed methods. A detailed investigation into regression algorithms and molecular fingerprints revealed gradient-boosted decision trees. We successfully validated our model across 22 Therapeutics Data Commons ADMET benchmarks.
arXiv Detail & Related papers (2023-09-29T22:39:18Z)
Boosting Convolution with Efficient MLP-Permutation for Volumetric Medical Image Segmentation [32.645022002807416]
Multi-layer perceptron (MLP) networks have regained popularity among researchers due to their comparable results to ViT. We propose a novel permutable hybrid network for Vol-MedSeg, named PHNet, which capitalizes on the strengths of both convolutional neural networks (CNNs) and MLPs.
arXiv Detail & Related papers (2023-03-23T08:59:09Z)
Modality-Agnostic Variational Compression of Implicit Neural Representations [96.35492043867104]
We introduce a modality-agnostic neural compression algorithm based on a functional view of data and parameterised as an Implicit Neural Representation (INR). Bridging the gap between latent coding and sparsity, we obtain compact latent representations non-linearly mapped to a soft gating mechanism. After obtaining a dataset of such latent representations, we directly optimise the rate/distortion trade-off in a modality-agnostic space using neural compression.
arXiv Detail & Related papers (2023-01-23T15:22:42Z)
Pharmacoprint -- a combination of pharmacophore fingerprint and artificial intelligence as a tool for computer-aided drug design [6.053347262128918]
We propose a high-resolution, pharmacophore fingerprint called Pharmacoprint. It encodes the presence, types, and relationships between pharmacophore features of a molecule.
arXiv Detail & Related papers (2021-10-04T11:36:39Z)
A Systematic Approach to Featurization for Cancer Drug Sensitivity Predictions with Deep Learning [49.86828302591469]
We train >35,000 neural network models, sweeping over common featurization techniques. We found the RNA-seq to be highly redundant and informative even with subsets larger than 128 features.
arXiv Detail & Related papers (2020-04-30T20:42:17Z)
DeepGS: Deep Representation Learning of Graphs and Sequences for Drug-Target Binding Affinity Prediction [8.292330541203647]
We propose a novel end-to-end learning framework, called DeepGS, which uses deep neural networks to extract the local chemical context from amino acids and SMILES sequences. We have conducted extensive experiments to compare our proposed method with state-of-the-art models including KronRLS, Sim, DeepDTA and DeepCPI.
arXiv Detail & Related papers (2020-03-31T01:35:39Z)
Adversarial Feature Hallucination Networks for Few-Shot Learning [84.31660118264514]
Adversarial Feature Hallucination Networks (AFHN) is based on conditional Wasserstein Generative Adversarial networks (cWGAN). Two novel regularizers are incorporated into AFHN to encourage discriminability and diversity of the synthesized features.
arXiv Detail & Related papers (2020-03-30T02:43:16Z)

This list is automatically generated from the titles and abstracts of the papers in this site.