Identifying Stress Responsive Genes using Overlapping Communities in
Co-expression Networks
- URL: http://arxiv.org/abs/2011.03526v2
- Date: Sat, 9 Apr 2022 17:22:53 GMT
- Title: Identifying Stress Responsive Genes using Overlapping Communities in
Co-expression Networks
- Authors: Camila Riccio, Jorge Finke, Camilo Rocha
- Abstract summary: The paper proposes a workflow to identify genes that respond to specific treatments in plants.
The workflow is applied to rice (Oryza sativa), a major food source known to be highly sensitive to salt stress.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper proposes a workflow to identify genes that respond to specific
treatments in plants. The workflow takes as input the RNA sequencing read
counts and phenotypical data of different genotypes, measured under control and
treatment conditions. It outputs a reduced group of genes marked as relevant
for treatment response. Technically, the proposed approach is both a
generalization and an extension of WGCNA. It aims to identify specific modules
of overlapping communities underlying the co-expression network of genes.
Module detection is achieved by using Hierarchical Link Clustering. The
overlapping nature of the systems' regulatory domains that generate
co-expression can be identified by such modules. LASSO regression is employed
to analyze phenotypic responses of modules to treatment.
Results. The workflow is applied to rice (Oryza sativa), a major food source
known to be highly sensitive to salt stress. The workflow identifies 19 rice
genes that seem relevant in the response to salt stress. They are distributed
across 6 modules: 3 modules, each grouping together 3 genes, are associated with
shoot K content; 2 modules of 3 genes are associated with shoot biomass; and 1
module of 4 genes is associated with root biomass. These genes represent target
genes for the improvement of salinity tolerance in rice.
Conclusion. A more effective framework to reduce the search-space for target
genes that respond to a specific treatment is introduced. It facilitates
experimental validation by restraining efforts to a smaller subset of genes of
high potential relevance.
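As a rough illustration of the module-detection idea summarized in the abstract, the sketch below builds a small thresholded co-expression network and computes the edge-to-edge similarities on which link clustering operates. It is not the authors' implementation: the function names (`pearson`, `coexpression_edges`, `link_similarity`), the hard correlation cutoff of 0.8, and the use of the unweighted Jaccard similarity between inclusive neighborhoods are all assumptions made for illustration; the paper's workflow may use a weighted network and a different similarity.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_edges(expr, threshold=0.8):
    """Gene pairs whose absolute correlation reaches the (assumed) threshold."""
    return {
        frozenset((g, h))
        for g, h in combinations(sorted(expr), 2)
        if abs(pearson(expr[g], expr[h])) >= threshold
    }


def link_similarity(edges):
    """Jaccard similarity between every pair of edges that share exactly one gene.

    For edges (i, k) and (j, k), the similarity compares the inclusive
    neighborhoods of the two unshared genes i and j.
    """
    neighbors = {}
    for edge in edges:
        g, h = tuple(edge)
        neighbors.setdefault(g, {g}).add(h)
        neighbors.setdefault(h, {h}).add(g)

    sims = {}
    for e1, e2 in combinations(edges, 2):
        shared = e1 & e2
        if len(shared) != 1:
            continue  # edges without a common gene are not compared
        (i,), (j,) = e1 - shared, e2 - shared
        union = neighbors[i] | neighbors[j]
        sims[frozenset((e1, e2))] = len(neighbors[i] & neighbors[j]) / len(union)
    return sims


if __name__ == "__main__":
    # Toy expression profiles (hypothetical values, four samples per gene).
    expr = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 3, 2, 1], "D": [1, -1, 1, -1]}
    edges = coexpression_edges(expr)
    for pair, s in link_similarity(edges).items():
        print(sorted(sorted(e) for e in pair), round(s, 2))
```

Clustering these edge similarities hierarchically groups edges rather than genes, so a gene whose edges fall into different groups belongs to several modules at once, which is what makes the resulting communities overlapping.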
Related papers
- Beyond Independent Genes: Learning Module-Inductive Representations for Gene Perturbation Prediction [48.80217316452559]
scBIG is a module-inductive prediction framework that explicitly models coordinated gene programs.
scBIG consistently outperforms state-of-the-art methods, particularly on unseen and perturbation settings.
arXiv Detail & Related papers (2026-02-03T16:43:40Z)
- Tensor Network based Gene Regulatory Network Inference for Single-Cell Transcriptomic Data [0.0]
This study introduces a quantum-inspired framework leveraging tensor networks (TNs) to optimally map expression data.
We quantify gene dependencies and establish statistical significance via permutation testing.
By merging quantum physics inspired techniques with computational biology, our method provides novel insights into gene regulation.
arXiv Detail & Related papers (2025-09-08T17:11:12Z)
- GRAPE: Heterogeneous Graph Representation Learning for Genetic Perturbation with Coding and Non-Coding Biotype [51.58774936662233]
Building gene regulatory networks (GRN) is essential to understand and predict the effects of genetic perturbations.
In this work, we leverage a pre-trained large language model and a DNA sequence model to extract features from gene descriptions and DNA sequence data.
We introduce gene biotype information for the first time in genetic perturbation, simulating the distinct roles of genes with different biotypes in regulating cellular processes.
arXiv Detail & Related papers (2025-05-06T03:35:24Z)
- Regulatory DNA sequence Design with Reinforcement Learning [56.20290878358356]
We propose a generative approach that leverages reinforcement learning to fine-tune a pre-trained autoregressive model.
We evaluate our method on promoter design tasks in two yeast media conditions and enhancer design tasks for three human cell types.
arXiv Detail & Related papers (2025-03-11T02:33:33Z)
- Learning to Discover Regulatory Elements for Gene Expression Prediction [59.470991831978516]
Seq2Exp is a Sequence to Expression network designed to discover and extract regulatory elements that drive target gene expression.
Our approach captures the causal relationship between epigenomic signals, DNA sequences and their associated regulatory elements.
arXiv Detail & Related papers (2025-02-19T03:25:49Z)
- Survey and Improvement Strategies for Gene Prioritization with Large Language Models [61.24568051916653]
Large language models (LLMs) have performed well in medical exams, but their effectiveness in diagnosing rare genetic diseases has not been assessed.
We used multi-agent and Human Phenotype Ontology (HPO) classification to categorize patients based on phenotypes and solvability levels.
At baseline, GPT-4 outperformed other LLMs, achieving near 30% accuracy in ranking causal genes correctly.
arXiv Detail & Related papers (2025-01-30T23:03:03Z)
- Gene Regulatory Network Inference in the Presence of Selection Bias and Latent Confounders [14.626706466908386]
Gene Regulatory Network Inference (GRNI) aims to identify causal relationships among genes using gene expression data.
Gene expression is influenced by latent confounders, such as non-coding RNAs, which add complexity to GRNI.
We propose GISL (Gene Regulatory Network Inference in the presence of Selection bias and Latent confounders) to infer true regulatory relationships in the presence of selection and confounding issues.
arXiv Detail & Related papers (2025-01-17T11:27:58Z)
- Cross-Attention Graph Neural Networks for Inferring Gene Regulatory Networks with Skewed Degree Distribution [9.919024883502322]
Cross-Attention Complex Dual Graph Embedding Model (XATGRN)
Our model consistently outperforms existing state-of-the-art methods across various datasets.
arXiv Detail & Related papers (2024-12-18T10:56:40Z)
- Predicting Genetic Mutation from Whole Slide Images via Biomedical-Linguistic Knowledge Enhanced Multi-label Classification [119.13058298388101]
We develop a Biological-knowledge enhanced PathGenomic multi-label Transformer to improve genetic mutation prediction performances.
BPGT first establishes a novel gene encoder that constructs gene priors by two carefully designed modules.
BPGT then designs a label decoder that finally performs genetic mutation prediction by two tailored modules.
arXiv Detail & Related papers (2024-06-05T06:42:27Z)
- VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling [60.91599380893732]
VQDNA is a general-purpose framework that renovates genome tokenization from the perspective of genome vocabulary learning.
By leveraging vector-quantized codebooks as learnable vocabulary, VQDNA can adaptively tokenize genomes into pattern-aware embeddings.
arXiv Detail & Related papers (2024-05-13T20:15:03Z)
- Single-Cell Deep Clustering Method Assisted by Exogenous Gene Information: A Novel Approach to Identifying Cell Types [50.55583697209676]
We develop an attention-enhanced graph autoencoder, which is designed to efficiently capture the topological features between cells.
During the clustering process, we integrated both sets of information and reconstructed the features of both cells and genes to generate a discriminative representation.
This research offers enhanced insights into the characteristics and distribution of cells, thereby laying the groundwork for early diagnosis and treatment of diseases.
arXiv Detail & Related papers (2023-11-28T09:14:55Z)
- Machine Learning Methods for Cancer Classification Using Gene Expression Data: A Review [77.34726150561087]
Cancer is the second major cause of death after cardiovascular diseases.
Gene expression can play a fundamental role in the early detection of cancer.
This study reviews recent progress in gene expression analysis for cancer classification using machine learning methods.
arXiv Detail & Related papers (2023-01-28T15:03:03Z)
- Gene Teams are on the Field: Evaluation of Variants in Gene-Networks Using High Dimensional Modelling [0.0]
In medical genetics, each genetic variant is evaluated as an independent entity regarding its clinical importance.
In most complex diseases, variant combinations in specific gene networks, rather than the presence of a particular single variant, predominate.
We propose a high dimensional modelling based method to analyse all the variants in a gene network together.
arXiv Detail & Related papers (2023-01-27T15:02:23Z)
- Unsupervised ensemble-based phenotyping helps enhance the discoverability of genes related to heart morphology [57.25098075813054]
We propose a new framework for gene discovery entitled Unsupervised Phenotype Ensembles.
It builds a redundant yet highly expressive representation by pooling a set of phenotypes learned in an unsupervised manner.
These phenotypes are then analyzed via genome-wide association studies (GWAS), retaining only highly confident and stable associations.
arXiv Detail & Related papers (2023-01-07T18:36:44Z)
- A single-cell gene expression language model [2.9112649816695213]
We propose a machine learning system to learn context dependencies between genes.
Our model, Exceiver, is trained across a diversity of cell types using a self-supervised task.
We found agreement between the similarity profiles of latent sample representations and learned gene embeddings with respect to biological annotations.
arXiv Detail & Related papers (2022-10-25T20:52:19Z)
- Gene Function Prediction with Gene Interaction Networks: A Context Graph Kernel Approach [24.234645183601998]
We propose to use a gene's context graph, i.e., the gene interaction network associated with the focal gene, to infer its functions.
In a kernel-based machine-learning framework, we design a context graph kernel to capture the information in context graphs.
arXiv Detail & Related papers (2022-04-22T02:54:01Z)
- SemanticCAP: Chromatin Accessibility Prediction Enhanced by Features Learning from a Language Model [3.0643865202019698]
We propose a new solution named SemanticCAP to identify accessible regions of the genome.
It introduces a gene language model which models the context of gene sequences, thus being able to provide an effective representation of gene sequences.
Compared with other systems under public benchmarks, our model proved to have better performance.
arXiv Detail & Related papers (2022-04-05T11:47:58Z)
- Multi-modal Self-supervised Pre-training for Regulatory Genome Across Cell Types [75.65676405302105]
We propose a simple yet effective approach for pre-training genome data in a multi-modal and self-supervised manner, which we call GeneBERT.
We pre-train our model on the ATAC-seq dataset with 17 million genome sequences.
arXiv Detail & Related papers (2021-10-11T12:48:44Z)
- SimpleChrome: Encoding of Combinatorial Effects for Predicting Gene Expression [8.326669256957352]
We present SimpleChrome, a deep learning model that learns the histone modification representations of genes.
The features learned from the model allow us to better understand the latent effects of cross-gene interactions and direct gene regulation on the target gene expression.
arXiv Detail & Related papers (2020-12-15T23:30:36Z)