Related papers: DGL-LifeSci: An Open-Source Toolkit for Deep Learning on Graphs in Life Science

URL: http://arxiv.org/abs/2106.14232v1
Date: Sun, 27 Jun 2021 13:27:47 GMT
Title: DGL-LifeSci: An Open-Source Toolkit for Deep Learning on Graphs in Life Science
Authors: Mufei Li, Jinjing Zhou, Jiajing Hu, Wenxuan Fan, Yangkang Zhang, Yaxin Gu, George Karypis
Abstract summary: We present DGL-LifeSci, an open-source package for deep learning on graphs in life science. DGL-LifeSci is a Python toolkit based on RDKit, PyTorch and Deep Graph Library. It allows GNN-based modeling on custom datasets for molecular property prediction, reaction prediction and molecule generation.
Score: 5.3825788156200565
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Graph neural networks (GNNs) constitute a class of deep learning methods for graph data. They have wide applications in chemistry and biology, such as molecular property prediction, reaction prediction and drug-target interaction prediction. Despite the interest, GNN-based modeling is challenging as it requires graph data pre-processing and modeling in addition to programming and deep learning. Here we present DGL-LifeSci, an open-source package for deep learning on graphs in life science. DGL-LifeSci is a Python toolkit based on RDKit, PyTorch and Deep Graph Library (DGL). DGL-LifeSci allows GNN-based modeling on custom datasets for molecular property prediction, reaction prediction and molecule generation. With its command-line interfaces, users can perform modeling without any background in programming and deep learning. We test the command-line interfaces using standard benchmarks MoleculeNet, USPTO, and ZINC. Compared with previous implementations, DGL-LifeSci achieves a speedup of up to 6x. For modeling flexibility, DGL-LifeSci provides well-optimized modules for various stages of the modeling pipeline. In addition, DGL-LifeSci provides pre-trained models for reproducing the test experiment results and applying models without training. The code is distributed under an Apache-2.0 License and is freely accessible at https://github.com/awslabs/dgl-lifesci.

Related papers

GiGL: Large-Scale Graph Neural Networks at Snapchat [32.1186726452899]
We present GiGL (Gigantic Graph Learning), an open-source library to enable large-scale distributed graph ML. We use GiGL internally at Snapchat to manage the heavy lifting of GNN, including graph data preprocessing from relational DBs. GiGL is used in multiple production settings, and has powered over 35 launches across multiple business domains in the last 2 years.
arXiv Detail & Related papers (2025-02-20T21:29:17Z)
PyG-SSL: A Graph Self-Supervised Learning Toolkit [71.22547762704602]
Graph Self-Supervised Learning (SSL) has emerged as a pivotal area of research in recent years. Despite the remarkable achievements of these graph SSL methods, their current implementation poses significant challenges for beginners. We present a Graph SSL toolkit named PyG-SSL, which is built upon PyTorch and is compatible with various deep learning and scientific computing backends.
arXiv Detail & Related papers (2024-12-30T18:32:05Z)
Continual Learning on Graphs: Challenges, Solutions, and Opportunities [72.7886669278433]
We provide a comprehensive review of existing continual graph learning (CGL) algorithms. We compare methods with traditional continual learning techniques and analyze the applicability of the traditional continual learning techniques to forgetting tasks. We will maintain an up-to-date repository featuring a comprehensive list of accessible algorithms.
arXiv Detail & Related papers (2024-02-18T12:24:45Z)
LasTGL: An Industrial Framework for Large-Scale Temporal Graph Learning [61.4707298969173]
We introduce LasTGL, an industrial framework that integrates unified implementations of common temporal graph learning algorithms. LasTGL provides comprehensive temporal graph datasets, TGNN models and utilities along with well-documented tutorials.
arXiv Detail & Related papers (2023-11-28T08:45:37Z)
Exploring the Potential of Large Language Models (LLMs) in Learning on Graphs [59.74814230246034]
Large Language Models (LLMs) have been proven to possess extensive common knowledge and powerful semantic comprehension abilities. We investigate two possible pipelines: LLMs-as-Enhancers and LLMs-as-Predictors.
arXiv Detail & Related papers (2023-07-07T05:31:31Z)
MolGraph: a Python package for the implementation of molecular graphs and graph neural networks with TensorFlow and Keras [51.92255321684027]
MolGraph is a graph neural network (GNN) package for molecular machine learning (ML). MolGraph implements a chemistry module to accommodate the generation of small molecular graphs, which can be passed to a GNN algorithm to solve a molecular ML problem. GNNs proved useful for molecular identification and improved interpretability of chromatographic retention time data.
arXiv Detail & Related papers (2022-08-21T18:37:41Z)
Graph Generative Model for Benchmarking Graph Neural Networks [73.11514658000547]
We introduce a novel graph generative model that learns and reproduces the distribution of real-world graphs in a privacy-controlled way. Our model can successfully generate privacy-controlled, synthetic substitutes of large-scale real-world graphs that can be effectively used to benchmark GNN models.
arXiv Detail & Related papers (2022-07-10T06:42:02Z)
Learning Large-scale Subsurface Simulations with a Hybrid Graph Network Simulator [57.57321628587564]
We introduce Hybrid Graph Network Simulator (HGNS) for learning reservoir simulations of 3D subsurface fluid flows. HGNS consists of a subsurface graph neural network (SGNN) to model the evolution of fluid flows, and a 3D-U-Net to model the evolution of pressure. Using an industry-standard subsurface flow dataset (SPE-10) with 1.1 million cells, we demonstrate that HGNS is able to reduce the inference time by up to 18 times compared to standard subsurface simulators.
arXiv Detail & Related papers (2022-06-15T17:29:57Z)
Crystal Twins: Self-supervised Learning for Crystalline Material Property Prediction [8.048439531116367]
We introduce Crystal Twins (CT): an SSL method for crystalline materials property prediction. We pre-train a Graph Neural Network (GNN) by applying the redundancy reduction principle to the graph latent embeddings of augmented instances. By sharing the pre-trained weights when fine-tuning the GNN for regression tasks, we significantly improve the performance for 7 challenging material property prediction benchmarks.
arXiv Detail & Related papers (2022-05-04T05:08:46Z)
Physics-Informed Graph Learning: A Survey [25.474725468416118]
We introduce a unified framework of graph learning models, and then examine existing PIGL methods in relation to the unified framework. This survey paper is expected to stimulate innovative research and development activities pertaining to PIGL.
arXiv Detail & Related papers (2022-02-22T05:46:24Z)
AugLiChem: Data Augmentation Library of Chemical Structures for Machine Learning [12.864696894234715]
AugLiChem is the data augmentation library for chemical structures. Augmentation methods for both crystalline systems and molecules are introduced. We show that using our augmentation strategies significantly improves the performance of ML models.
arXiv Detail & Related papers (2021-11-30T04:07:24Z)
DistDGL: Distributed Graph Neural Network Training for Billion-Scale Graphs [22.63888380481248]
DistDGL is a system for training GNNs in a mini-batch fashion on a cluster of machines. It is based on the Deep Graph Library (DGL), a popular GNN development framework. Our results show that DistDGL achieves linear speedup without compromising model accuracy.
arXiv Detail & Related papers (2020-10-11T20:22:26Z)

This list is automatically generated from the titles and abstracts of the papers on this site.