Related papers: Machine and deep learning methods for predicting 3D genome organization

URL: http://arxiv.org/abs/2403.03231v1
Date: Mon, 4 Mar 2024 19:04:41 GMT
Title: Machine and deep learning methods for predicting 3D genome organization
Authors: Brydon P. G. Wall, My Nguyen, J. Chuck Harrell, Mikhail G. Dozmorov
Abstract summary: Three-Dimensional (3D) chromatin interactions play critical roles in a wide range of cellular processes by regulating gene expression. Machine learning methods have emerged as an alternative to obtain missing 3D interactions and/or improve resolution. In this review, we discuss computational tools for predicting three types of 3D interactions (EPIs, chromatin interactions, TAD boundaries) and analyze their pros and cons.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Three-Dimensional (3D) chromatin interactions, such as enhancer-promoter interactions (EPIs), loops, Topologically Associating Domains (TADs), and A/B compartments, play critical roles in a wide range of cellular processes by regulating gene expression. Recent development of chromatin conformation capture technologies has enabled genome-wide profiling of various 3D structures, even with single cells. However, current catalogs of 3D structures remain incomplete and unreliable due to differences in technology, tools, and low data resolution. Machine learning methods have emerged as an alternative to obtain missing 3D interactions and/or improve resolution. Such methods frequently use genome annotation data (ChIP-seq, DNase-seq, etc.), DNA sequencing information (k-mers, Transcription Factor Binding Site (TFBS) motifs), and other genomic properties to learn the associations between genomic features and chromatin interactions. In this review, we discuss computational tools for predicting three types of 3D interactions (EPIs, chromatin interactions, TAD boundaries) and analyze their pros and cons. We also point out obstacles to the computational prediction of 3D interactions and suggest future research directions.

Related papers

Multimodal 3D Genome Pre-training [19.251471971427687]
We propose MIX-HIC, the first multimodal foundation model of 3D genome that integrates both 3D genome structure and epigenomic tracks. For accurate heterogeneous semantic fusion, we design the cross-modal interaction and mapping blocks for robust unified representation. We introduce the first large-scale dataset comprising over 1 million pairwise samples of Hi-C contact maps and epigenomic tracks for high-quality pre-training.
arXiv Detail & Related papers (2025-04-12T03:31:03Z)
GENERator: A Long-Context Generative Genomic Foundation Model [66.46537421135996]
We present GENERator, a generative genomic foundation model featuring a context length of 98k base pairs (bp) and 1.2B parameters. Trained on an expansive dataset comprising 386B bp of DNA, the GENERator demonstrates state-of-the-art performance across both established and newly proposed benchmarks. It also shows significant promise in sequence optimization, particularly through the prompt-responsive generation of enhancer sequences with specific activity profiles.
arXiv Detail & Related papers (2025-02-11T05:39:49Z)
Boundary-Guided Learning for Gene Expression Prediction in Spatial Transcriptomics [7.763803040383128]
We propose a framework named BG-TRIPLEX, which leverages boundary information extracted from pathological images as guiding features to enhance gene expression prediction. Our framework consistently outperforms existing methods in terms of Pearson Correlation Coefficient (PCC). This method highlights the crucial role of boundary features in understanding the complex interactions between WSI and gene expression.
arXiv Detail & Related papers (2024-12-05T11:09:11Z)
Stacked ensemble-based mutagenicity prediction model using multiple modalities with graph attention network [0.9736758288065405]
Mutagenicity is a concern due to its association with genetic mutations which can result in a variety of negative consequences. In this work, we introduce a novel stacked ensemble based mutagenicity prediction model.
arXiv Detail & Related papers (2024-09-03T09:14:21Z)
Protein binding affinity prediction under multiple substitutions applying eGNNs on Residue and Atomic graphs combined with Language model information: eGRAL [1.840390797252648]
Deep learning is increasingly recognized as a powerful tool capable of bridging the gap between in-silico predictions and in-vitro observations. We propose eGRAL, a novel graph neural network architecture designed for predicting binding affinity changes from amino acid substitutions in protein complexes. eGRAL leverages residue, atomic and evolutionary scales, thanks to features extracted from protein large language models.
arXiv Detail & Related papers (2024-05-03T10:33:19Z)
MuSe-GNN: Learning Unified Gene Representation From Multimodal Biological Graph Data [22.938437500266847]
We introduce a novel model called Multimodal Similarity Learning Graph Neural Network. It combines Multimodal Machine Learning and Deep Graph Neural Networks to learn gene representations from single-cell sequencing and spatial transcriptomic data. Our model efficiently produces unified gene representations for the analysis of gene functions, tissue functions, diseases, and species evolution.
arXiv Detail & Related papers (2023-09-29T13:33:53Z)
Genetic InfoMax: Exploring Mutual Information Maximization in High-Dimensional Imaging Genetics Studies [50.11449968854487]
Genome-wide association studies (GWAS) are used to identify relationships between genetic variations and specific traits. Representation learning for imaging genetics is largely under-explored due to the unique challenges posed by GWAS. We introduce a trans-modal learning framework Genetic InfoMax (GIM) to address the specific challenges of GWAS.
arXiv Detail & Related papers (2023-09-26T03:59:21Z)
Automated 3D Pre-Training for Molecular Property Prediction [54.15788181794094]
We propose a novel 3D pre-training framework (dubbed 3D PGT). It pre-trains a model on 3D molecular graphs, and then fine-tunes it on molecular graphs without 3D structures. Extensive experiments on 2D molecular graphs are conducted to demonstrate the accuracy, efficiency and generalization ability of the proposed 3D PGT.
arXiv Detail & Related papers (2023-06-13T14:43:13Z)
UNADON: Transformer-based model to predict genome-wide chromosome spatial position [2.3980064191633232]
We develop a new transformer-based deep learning model called UNADON. It predicts the genome-wide cytological distance to a specific type of nuclear body. It reveals potential sequence and epigenomic factors that affect large-scale compartmentalization to nuclear bodies.
arXiv Detail & Related papers (2023-04-26T01:30:50Z)
Unsupervised ensemble-based phenotyping helps enhance the discoverability of genes related to heart morphology [57.25098075813054]
We propose a new framework for gene discovery entitled Unsupervised Phenotype Ensembles. It builds a redundant yet highly expressive representation by pooling a set of phenotypes learned in an unsupervised manner. These phenotypes are then analyzed via genome-wide association studies (GWAS), retaining only highly confident and stable associations.
arXiv Detail & Related papers (2023-01-07T18:36:44Z)
SemanticCAP: Chromatin Accessibility Prediction Enhanced by Features Learning from a Language Model [3.0643865202019698]
We propose a new solution named SemanticCAP to identify accessible regions of the genome. It introduces a gene language model which models the context of gene sequences, thus being able to provide an effective representation of gene sequences. Compared with other systems under public benchmarks, our model proved to have better performance.
arXiv Detail & Related papers (2022-04-05T11:47:58Z)
GeoMol: Torsional Geometric Generation of Molecular 3D Conformer Ensembles [60.12186997181117]
Prediction of a molecule's 3D conformer ensemble from the molecular graph holds a key role in areas of cheminformatics and drug discovery. Existing generative models have several drawbacks including lack of modeling important molecular geometry elements. We propose GeoMol, an end-to-end, non-autoregressive and SE(3)-invariant machine learning approach to generate 3D conformers.
arXiv Detail & Related papers (2021-06-08T14:17:59Z)
TSGCNet: Discriminative Geometric Feature Learning with Two-Stream GraphConvolutional Network for 3D Dental Model Segmentation [141.2690520327948]
We propose a two-stream graph convolutional network (TSGCNet) to learn multi-view information from different geometric attributes. We evaluate our proposed TSGCNet on a real-patient dataset of dental models acquired by 3D intraoral scanners.
arXiv Detail & Related papers (2020-12-26T08:02:56Z)
ATOM3D: Tasks On Molecules in Three Dimensions [91.72138447636769]
Deep neural networks have recently gained significant attention. In this work we present ATOM3D, a collection of both novel and existing datasets spanning several key classes of biomolecules. We develop three-dimensional molecular learning networks for each of these tasks, finding that they consistently improve performance.
arXiv Detail & Related papers (2020-12-07T20:18:23Z)

This list is automatically generated from the titles and abstracts of the papers on this site.