A Scalable Tool For Analyzing Genomic Variants Of Humans Using Knowledge Graphs and Machine Learning
- URL: http://arxiv.org/abs/2407.20879v1
- Date: Tue, 30 Jul 2024 14:56:10 GMT
- Title: A Scalable Tool For Analyzing Genomic Variants Of Humans Using Knowledge Graphs and Machine Learning
- Authors: Shivika Prasanna, Ajay Kumar, Deepthi Rao, Eduardo Simoes, Praveen Rao
- Abstract summary: We present a comprehensive approach for leveraging knowledge graphs and graph machine learning to analyze genomic variants.
The proposed method involves extracting variant-level genetic information, annotating the data with additional metadata using SnpEff, and converting the enriched Variant Call Format files into Resource Description Framework triples.
The resulting knowledge graph is further enhanced with patient metadata and stored in a graph database, facilitating efficient querying and indexing.
- Score: 7.928994572633366
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The integration of knowledge graphs and graph machine learning (GML) in genomic data analysis offers several opportunities for understanding complex genetic relationships, especially at the RNA level. We present a comprehensive approach for leveraging these technologies to analyze genomic variants, specifically in the context of RNA sequencing (RNA-seq) data from COVID-19 patient samples. The proposed method involves extracting variant-level genetic information, annotating the data with additional metadata using SnpEff, and converting the enriched Variant Call Format (VCF) files into Resource Description Framework (RDF) triples. The resulting knowledge graph is further enhanced with patient metadata and stored in a graph database, facilitating efficient querying and indexing. We utilize the Deep Graph Library (DGL) to perform graph machine learning tasks, including node classification with GraphSAGE and Graph Convolutional Networks (GCNs). Our approach demonstrates significant utility using our proposed tool, VariantKG, in three key scenarios: enriching graphs with new VCF data, creating subgraphs based on user-defined features, and conducting graph machine learning for node classification.
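The VCF-to-RDF conversion step described in the abstract can be illustrated with a minimal sketch. It is not VariantKG's implementation: the `http://example.org/variant/` namespace, the predicate names, and the function `vcf_line_to_triples` are hypothetical, and the paper's actual ontology is not given here. The sketch assumes a tab-separated VCF data line whose INFO column holds semicolon-separated `KEY=VALUE` pairs (SnpEff writes its annotations there under the `ANN` key), and it skips literal escaping and multi-allelic handling.

```python
from typing import Iterator, Tuple

Triple = Tuple[str, str, str]

# Hypothetical namespace, used only for illustration.
BASE = "http://example.org/variant/"


def vcf_line_to_triples(line: str) -> Iterator[Triple]:
    """Yield (subject, predicate, object) triples for one VCF data line."""
    # The first eight VCF columns are fixed: CHROM POS ID REF ALT QUAL FILTER INFO.
    chrom, pos, _vid, ref, alt, _qual, _flt, info = line.rstrip("\n").split("\t")[:8]

    # One subject node per variant, keyed by position and alleles.
    subject = f"<{BASE}{chrom}_{pos}_{ref}_{alt}>"
    yield subject, f"<{BASE}chromosome>", f'"{chrom}"'
    yield subject, f"<{BASE}position>", f'"{pos}"'
    yield subject, f"<{BASE}ref>", f'"{ref}"'
    yield subject, f"<{BASE}alt>", f'"{alt}"'

    # INFO entries are KEY=VALUE pairs or bare flags; only pairs are emitted.
    for field in info.split(";"):
        key, sep, value = field.partition("=")
        if sep:
            yield subject, f"<{BASE}info/{key}>", f'"{value}"'


if __name__ == "__main__":
    example = "1\t12345\t.\tA\tG\t50\tPASS\tDP=10;ANN=G|missense_variant"
    for s, p, o in vcf_line_to_triples(example):
        print(s, p, o, ".")
```

Keeping each variant as a single subject node means that patient metadata or further annotations can later be attached as additional triples on the same node, which is the kind of enrichment the abstract describes.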
Related papers
- Semi-supervised Instruction Tuning for Large Language Models on Text-Attributed Graphs [62.544129365882014]
We propose a novel Semi-supervised Instruction Tuning pipeline for Graph Learning, named SIT-Graph. SIT-Graph is model-agnostic and can be seamlessly integrated into any graph instruction tuning method that utilizes LLMs as the predictor. Extensive experiments demonstrate that when incorporated into state-of-the-art graph instruction tuning methods, SIT-Graph significantly enhances their performance on text-attributed graph benchmarks.
arXiv Detail & Related papers (2026-01-19T08:10:53Z) - Improving Graph Embeddings in Machine Learning Using Knowledge Completion with Validation in a Case Study on COVID-19 Spread [1.0308647202215706]
Graph embeddings (GEs) map features from Knowledge Graphs (KGs) into vector spaces, enabling tasks like node classification and link prediction. We propose a GML pipeline that integrates a Knowledge Completion phase to uncover latent dataset semantics before embedding generation. Experiments show that our GML pipeline significantly alters the embedding space geometry, demonstrating that its introduction is not just a simple enrichment but a transformative step that redefines graph representation quality.
arXiv Detail & Related papers (2025-11-15T07:24:00Z) - Graph Alignment for Benchmarking Graph Neural Networks and Learning Positional Encodings [4.343110120255532]
We propose a novel benchmarking methodology for graph neural networks (GNNs) based on the graph alignment problem. We frame this problem as a self-supervised learning task and present several methods to generate graph alignment datasets. Our experiments indicate that anisotropic graph neural networks outperform standard convolutional architectures.
arXiv Detail & Related papers (2025-05-19T13:22:17Z) - Revisiting Graph Neural Networks on Graph-level Tasks: Comprehensive Experiments, Analysis, and Improvements [54.006506479865344]
We propose a unified evaluation framework for graph-level Graph Neural Networks (GNNs).
This framework provides a standardized setting to evaluate GNNs across diverse datasets.
We also propose a novel GNN model with enhanced expressivity and generalization capabilities.
arXiv Detail & Related papers (2025-01-01T08:48:53Z) - Learning From Graph-Structured Data: Addressing Design Issues and Exploring Practical Applications in Graph Representation Learning [2.492884361833709]
We present an exhaustive review of the latest advancements in graph representation learning and Graph Neural Networks (GNNs).
GNNs, tailored to handle graph-structured data, excel in deriving insights and predictions from intricate relational information.
Our work delves into the capabilities of GNNs, examining their foundational designs and their application in addressing real-world challenges.
arXiv Detail & Related papers (2024-11-09T19:10:33Z) - RAGraph: A General Retrieval-Augmented Graph Learning Framework [35.25522856244149]
We introduce a novel framework called General Retrieval-Augmented Graph Learning (RAGraph).
RAGraph brings external graph data into the general graph foundation model to improve model generalization on unseen scenarios.
During inference, the RAGraph adeptly retrieves similar toy graphs based on key similarities in downstream tasks.
arXiv Detail & Related papers (2024-10-31T12:05:21Z) - Greener GRASS: Enhancing GNNs with Encoding, Rewiring, and Attention [12.409982249220812]
We introduce Graph Attention with Structures (GRASS), a novel GNN architecture, to enhance graph relative attention.
GRASS rewires the input graph by superimposing a random regular graph to achieve long-range information propagation.
It also employs a novel additive attention mechanism tailored for graph-structured data.
arXiv Detail & Related papers (2024-07-08T06:21:56Z) - Scalable Knowledge Graph Construction and Inference on Human Genome Variants [2.8523023316864413]
Real-world knowledge can be represented as a graph consisting of entities and relationships between them.
In this work, variant-level information extracted from the RNA sequences of vaccine-naïve COVID-19 patients has been represented as a unified, large knowledge graph.
arXiv Detail & Related papers (2023-12-07T16:48:32Z) - GraphGLOW: Universal and Generalizable Structure Learning for Graph Neural Networks [72.01829954658889]
This paper introduces the mathematical definition of this novel problem setting.
We devise a general framework that coordinates a single graph-shared structure learner and multiple graph-specific GNNs.
The well-trained structure learner can directly produce adaptive structures for unseen target graphs without any fine-tuning.
arXiv Detail & Related papers (2023-06-20T03:33:22Z) - Graph Mixture of Experts: Learning on Large-Scale Graphs with Explicit Diversity Modeling [60.0185734837814]
Graph neural networks (GNNs) have found extensive applications in learning from graph data.
To bolster the generalization capacity of GNNs, it has become customary to augment training graph structures with techniques like graph augmentations.
This study introduces the concept of Mixture-of-Experts (MoE) to GNNs, with the aim of augmenting their capacity to adapt to a diverse range of training graph structures.
arXiv Detail & Related papers (2023-04-06T01:09:36Z) - SHGNN: Structure-Aware Heterogeneous Graph Neural Network [77.78459918119536]
This paper proposes a novel Structure-Aware Heterogeneous Graph Neural Network (SHGNN) to address the above limitations.
We first utilize a feature propagation module to capture the local structure information of intermediate nodes in the meta-path.
Next, we use a tree-attention aggregator to incorporate the graph structure information into the aggregation module on the meta-path.
Finally, we leverage a meta-path aggregator to fuse the information aggregated from different meta-paths.
arXiv Detail & Related papers (2021-12-12T14:18:18Z) - A Robust and Generalized Framework for Adversarial Graph Embedding [73.37228022428663]
We propose a robust framework for adversarial graph embedding, named AGE.
AGE generates the fake neighbor nodes as the enhanced negative samples from the implicit distribution.
Based on this framework, we propose three models to handle three types of graph data.
arXiv Detail & Related papers (2021-05-22T07:05:48Z) - Robust Optimization as Data Augmentation for Large-scale Graphs [117.2376815614148]
We propose FLAG (Free Large-scale Adversarial Augmentation on Graphs), which iteratively augments node features with gradient-based adversarial perturbations during training.
FLAG is a general-purpose approach for graph data, which universally works in node classification, link prediction, and graph classification tasks.
arXiv Detail & Related papers (2020-10-19T21:51:47Z) - GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training [62.73470368851127]
Graph representation learning has emerged as a powerful technique for addressing real-world problems.
We design Graph Contrastive Coding -- a self-supervised graph neural network pre-training framework.
We conduct experiments on three graph learning tasks and ten graph datasets.
arXiv Detail & Related papers (2020-06-17T16:18:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.