DGL-LifeSci: An Open-Source Toolkit for Deep Learning on Graphs in Life
Science
- URL: http://arxiv.org/abs/2106.14232v1
- Date: Sun, 27 Jun 2021 13:27:47 GMT
- Authors: Mufei Li, Jinjing Zhou, Jiajing Hu, Wenxuan Fan, Yangkang Zhang, Yaxin
Gu, George Karypis
- Abstract summary: We present DGL-LifeSci, an open-source package for deep learning on graphs in life science.
DGL-LifeSci is a Python toolkit based on RDKit, PyTorch and Deep Graph Library.
It allows GNN-based modeling on custom datasets for molecular property prediction, reaction prediction and molecule generation.
- Score: 5.3825788156200565
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph neural networks (GNNs) constitute a class of deep learning methods for
graph data. They have wide applications in chemistry and biology, such as
molecular property prediction, reaction prediction and drug-target interaction
prediction. Despite the interest, GNN-based modeling is challenging as it
requires graph data pre-processing and modeling in addition to programming and
deep learning. Here we present DGL-LifeSci, an open-source package for deep
learning on graphs in life science. DGL-LifeSci is a Python toolkit based on
RDKit, PyTorch and Deep Graph Library (DGL). DGL-LifeSci allows GNN-based
modeling on custom datasets for molecular property prediction, reaction
prediction and molecule generation. With its command-line interfaces, users can
perform modeling without any background in programming and deep learning. We
test the command-line interfaces using standard benchmarks MoleculeNet, USPTO,
and ZINC. Compared with previous implementations, DGL-LifeSci achieves a speedup
of up to 6x. For modeling flexibility, DGL-LifeSci provides well-optimized
modules for various stages of the modeling pipeline. In addition, DGL-LifeSci
provides pre-trained models for reproducing the test experiment results and
applying models without training. The code is distributed under an Apache-2.0
License and is freely accessible at https://github.com/awslabs/dgl-lifesci.
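To make the idea of GNN-based modeling on molecules concrete, the following is a minimal pure-Python sketch of one message-passing step followed by a graph-level readout. It is not the DGL-LifeSci API: the toy molecule (ethanol, heavy atoms only), the one-hot atom features, and the helper names `one_hot`, `build_adjacency`, `message_passing_step` and `readout` are all assumptions made for illustration. In practice the toolkit builds molecular graphs with RDKit and runs the models with PyTorch and DGL.

```python
# Illustrative sketch only; none of these names come from DGL-LifeSci.
# Toy molecular graph for ethanol (SMILES "CCO"), heavy atoms only.
ATOM_TYPES = ["C", "O"]          # assumed feature vocabulary
atoms = ["C", "C", "O"]          # node labels
bonds = [(0, 1), (1, 2)]         # undirected edges between atom indices


def one_hot(symbol):
    """Encode an atom symbol as a one-hot feature vector."""
    return [1.0 if symbol == t else 0.0 for t in ATOM_TYPES]


def build_adjacency(num_atoms, bonds):
    """Map each atom index to the list of its bonded neighbors."""
    neighbors = {i: [] for i in range(num_atoms)}
    for u, v in bonds:
        neighbors[u].append(v)
        neighbors[v].append(u)
    return neighbors


def message_passing_step(features, neighbors):
    """Update each atom by summing its own and its neighbors' features."""
    updated = []
    for i, h in enumerate(features):
        agg = list(h)
        for j in neighbors[i]:
            agg = [a + b for a, b in zip(agg, features[j])]
        updated.append(agg)
    return updated


def readout(features):
    """Sum atom features into a single graph-level vector."""
    return [sum(col) for col in zip(*features)]


features = [one_hot(a) for a in atoms]
neighbors = build_adjacency(len(atoms), bonds)
hidden = message_passing_step(features, neighbors)
graph_vector = readout(hidden)
print(hidden)        # per-atom representations after one step
print(graph_vector)  # graph-level representation
```

A real GNN would apply learned weights and a nonlinearity in each step and feed the graph-level vector to a predictor for the property of interest; the sketch omits these to show only the graph structure and the aggregation pattern.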
Related papers
- Continual Learning on Graphs: Challenges, Solutions, and Opportunities [72.7886669278433]
We provide a comprehensive review of existing continual graph learning (CGL) algorithms.
We compare methods with traditional continual learning techniques and analyze the applicability of the traditional continual learning techniques to forgetting tasks.
We will maintain an up-to-date repository featuring a comprehensive list of accessible algorithms.
arXiv Detail & Related papers (2024-02-18T12:24:45Z)
- LasTGL: An Industrial Framework for Large-Scale Temporal Graph Learning [61.4707298969173]
We introduce LasTGL, an industrial framework that integrates unified implementations of common temporal graph learning algorithms.
LasTGL provides comprehensive temporal graph datasets, TGNN models and utilities along with well-documented tutorials.
arXiv Detail & Related papers (2023-11-28T08:45:37Z)
- Exploring the Potential of Large Language Models (LLMs) in Learning on
Graphs [59.74814230246034]
Large Language Models (LLMs) have been proven to possess extensive common knowledge and powerful semantic comprehension abilities.
We investigate two possible pipelines: LLMs-as-Enhancers and LLMs-as-Predictors.
arXiv Detail & Related papers (2023-07-07T05:31:31Z)
- MolGraph: a Python package for the implementation of molecular graphs
and graph neural networks with TensorFlow and Keras [51.92255321684027]
MolGraph is a graph neural network (GNN) package for molecular machine learning (ML).
MolGraph implements a chemistry module to accommodate the generation of small molecular graphs, which can be passed to a GNN algorithm to solve a molecular ML problem.
GNNs proved useful for molecular identification and improved interpretability of chromatographic retention time data.
arXiv Detail & Related papers (2022-08-21T18:37:41Z)
- Graph Generative Model for Benchmarking Graph Neural Networks [73.11514658000547]
We introduce a novel graph generative model that learns and reproduces the distribution of real-world graphs in a privacy-controlled way.
Our model can successfully generate privacy-controlled, synthetic substitutes of large-scale real-world graphs that can be effectively used to benchmark GNN models.
arXiv Detail & Related papers (2022-07-10T06:42:02Z)
- Learning Large-scale Subsurface Simulations with a Hybrid Graph Network
Simulator [57.57321628587564]
We introduce Hybrid Graph Network Simulator (HGNS) for learning reservoir simulations of 3D subsurface fluid flows.
HGNS consists of a subsurface graph neural network (SGNN) to model the evolution of fluid flows, and a 3D-U-Net to model the evolution of pressure.
Using an industry-standard subsurface flow dataset (SPE-10) with 1.1 million cells, we demonstrate that HGNS is able to reduce the inference time up to 18 times compared to standard subsurface simulators.
arXiv Detail & Related papers (2022-06-15T17:29:57Z)
- Crystal Twins: Self-supervised Learning for Crystalline Material
Property Prediction [8.048439531116367]
We introduce Crystal Twins (CT): an SSL method for crystalline materials property prediction.
We pre-train a Graph Neural Network (GNN) by applying the redundancy reduction principle to the graph latent embeddings of augmented instances.
By sharing the pre-trained weights when fine-tuning the GNN for regression tasks, we significantly improve the performance for 7 challenging material property prediction benchmarks.
arXiv Detail & Related papers (2022-05-04T05:08:46Z)
- Physics-Informed Graph Learning: A Survey [25.474725468416118]
We introduce a unified framework of graph learning models, and then examine existing PIGL methods in relation to the unified framework.
This survey paper is expected to stimulate innovative research and development activities pertaining to PIGL.
arXiv Detail & Related papers (2022-02-22T05:46:24Z)
- AugLiChem: Data Augmentation Library of Chemical Structures for Machine
Learning [12.864696894234715]
AugLiChem is the data augmentation library for chemical structures.
Augmentation methods for both crystalline systems and molecules are introduced.
We show that using our augmentation strategies significantly improves the performance of ML models.
arXiv Detail & Related papers (2021-11-30T04:07:24Z)
- DistDGL: Distributed Graph Neural Network Training for Billion-Scale
Graphs [22.63888380481248]
DistDGL is a system for training GNNs in a mini-batch fashion on a cluster of machines.
It is based on the Deep Graph Library (DGL), a popular GNN development framework.
Our results show that DistDGL achieves linear speedup without compromising model accuracy.
arXiv Detail & Related papers (2020-10-11T20:22:26Z)