Deep Contextual Learners for Protein Networks
- URL: http://arxiv.org/abs/2106.02246v1
- Date: Fri, 4 Jun 2021 04:26:27 GMT
- Title: Deep Contextual Learners for Protein Networks
- Authors: Michelle M. Li, Marinka Zitnik
- Abstract summary: We introduce AWARE, a graph neural message passing approach to inject cellular and tissue context into protein embeddings.
AWARE learns protein, cell type, and tissue embeddings that uphold cell type and tissue hierarchies.
We demonstrate AWARE on the novel task of predicting whether a gene is associated with a disease and where it most likely manifests in the human body.
- Score: 16.599890339599586
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Spatial context is central to understanding health and disease. Yet reference
protein interaction networks lack such contextualization, thereby limiting the
study of where protein interactions likely occur in the human body.
Contextualized protein interactions could better characterize genes with
disease-specific interactions and elucidate diseases' manifestation in specific
cell types. Here, we introduce AWARE, a graph neural message passing approach
to inject cellular and tissue context into protein embeddings. AWARE optimizes
for a multi-scale embedding space, whose structure reflects the topology of
cell type specific networks. We construct a multi-scale network of the Human
Cell Atlas and apply AWARE to learn protein, cell type, and tissue embeddings
that uphold cell type and tissue hierarchies. We demonstrate AWARE on the novel
task of predicting whether a gene is associated with a disease and where it
most likely manifests in the human body. AWARE embeddings outperform global
embeddings by at least 12.5%, highlighting the importance of contextual
learners for protein networks.
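The context-injection idea described in the abstract can be pictured as one round of graph message passing in which each protein aggregates its neighbors and a cell-type context vector is added to the update. The sketch below is a minimal illustration only, not the actual AWARE model: the adjacency matrix, context vector, weight matrices, and tanh update are all assumptions for demonstration.

```python
import numpy as np

def context_aware_message_passing(H, A, context, W_self, W_nbr, W_ctx):
    """One round of neighborhood aggregation with an added context vector,
    loosely in the spirit of injecting cell-type context into protein
    embeddings (illustrative only; not the AWARE objective)."""
    deg = A.sum(axis=1, keepdims=True).clip(min=1)    # avoid division by zero
    nbr = (A @ H) / deg                               # mean over neighbors
    out = H @ W_self + nbr @ W_nbr + context @ W_ctx  # inject context signal
    return np.tanh(out)

# toy example: 4 proteins, 8-dim embeddings, one cell-type context vector
rng = np.random.default_rng(0)
H = rng.normal(size=(4, 8))
A = np.array([[0, 1, 1, 0], [1, 0, 0, 1], [1, 0, 0, 1], [0, 1, 1, 0]], float)
ctx = rng.normal(size=(1, 8))                         # broadcast to all proteins
Ws = [rng.normal(scale=0.1, size=(8, 8)) for _ in range(3)]
H_next = context_aware_message_passing(H, A, ctx, *Ws)
```

Stacking such layers per cell-type-specific network, and tying the context vectors to a cell-type/tissue hierarchy, is the kind of multi-scale structure the abstract alludes to.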
Related papers
- Long-context Protein Language Model [76.95505296417866]
Self-supervised training of language models (LMs) has seen great success for protein sequences in learning meaningful representations and for generative drug design.
Most protein LMs are based on the Transformer architecture trained on individual proteins with short context lengths.
We propose LC-PLM, based on an alternative protein LM architecture, BiMamba-S, built on selective structured state-space models.
We also introduce its graph-contextual variant, LC-PLM-G, which contextualizes protein-protein interaction graphs for a second stage of training.
arXiv Detail & Related papers (2024-10-29T16:43:28Z) - ProLLM: Protein Chain-of-Thoughts Enhanced LLM for Protein-Protein Interaction Prediction [54.132290875513405]
The prediction of protein-protein interactions (PPIs) is crucial for understanding biological functions and diseases.
Previous machine learning approaches to PPI prediction mainly focus on direct physical interactions.
We propose a novel framework ProLLM that employs an LLM tailored for PPI for the first time.
arXiv Detail & Related papers (2024-03-30T05:32:42Z) - NaNa and MiGu: Semantic Data Augmentation Techniques to Enhance Protein Classification in Graph Neural Networks [60.48306899271866]
We propose novel semantic data augmentation methods to incorporate backbone chemical and side-chain biophysical information into protein classification tasks.
Specifically, we leverage molecular biophysical, secondary structure, chemical bond, and ionic features of proteins to facilitate classification tasks.
arXiv Detail & Related papers (2024-03-21T13:27:57Z) - Single-Cell Deep Clustering Method Assisted by Exogenous Gene Information: A Novel Approach to Identifying Cell Types [50.55583697209676]
We develop an attention-enhanced graph autoencoder, which is designed to efficiently capture the topological features between cells.
During the clustering process, we integrate both sets of information and reconstruct the features of both cells and genes to generate a discriminative representation.
This research offers enhanced insights into the characteristics and distribution of cells, thereby laying the groundwork for early diagnosis and treatment of diseases.
arXiv Detail & Related papers (2023-11-28T09:14:55Z) - Zyxin is all you need: machine learning adherent cell mechanics [0.0]
We develop a data-driven biophysical modeling approach to learn the mechanical behavior of adherent cells.
We first train neural networks to predict forces generated by adherent cells from images of cytoskeletal proteins.
We next develop two approaches - one explicitly constrained by physics, the other more agnostic - that help construct data-driven models of cellular forces.
arXiv Detail & Related papers (2023-03-01T02:08:40Z) - Integration of Pre-trained Protein Language Models into Geometric Deep Learning Networks [68.90692290665648]
We integrate knowledge learned by protein language models into several state-of-the-art geometric networks.
Our findings show an overall improvement of 20% over baselines.
Strong evidence indicates that the incorporation of protein language models' knowledge enhances geometric networks' capacity by a significant margin.
arXiv Detail & Related papers (2022-12-07T04:04:04Z) - Subcellular Protein Localisation in the Human Protein Atlas using Ensembles of Diverse Deep Architectures [11.41081495236219]
Automated visual localisation of subcellular proteins can accelerate our understanding of cell function in health and disease.
We show how this gap can be narrowed by addressing three key aspects: (i) automated improvement of cell annotation quality, (ii) new Convolutional Neural Network (CNN) architectures supporting unbalanced and noisy data, and (iii) informed selection and fusion of multiple & diverse machine learning models.
arXiv Detail & Related papers (2022-05-19T20:28:56Z) - Global Mapping of Gene/Protein Interactions in PubMed Abstracts: A Framework and an Experiment with P53 Interactions [7.361249273831739]
The large body of biomedical literature is an important source of gene/protein interaction information.
Recent advances in text mining tools have made it possible to automatically extract such documented interactions from free-text literature.
We propose a comprehensive framework for constructing and analyzing large-scale gene functional networks based on the gene/protein interactions extracted from biomedical literature repositories.
arXiv Detail & Related papers (2022-04-22T03:04:19Z) - A multitask transfer learning framework for the prediction of virus-human protein-protein interactions [0.30586855806896046]
We develop a transfer learning approach that exploits the information of around 24 million protein sequences and the interaction patterns from the human interactome.
We employ an additional objective which aims to maximize the probability of observing human protein-protein interactions.
Experimental results show that our proposed model works effectively for both virus-human and bacteria-human protein-protein interaction prediction tasks.
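The additional objective mentioned above - maximizing the probability of observed human protein-protein interactions - can be pictured as a weighted auxiliary loss added to the main virus-human PPI loss. The sketch below is an assumption-laden illustration, not the paper's actual objective: binary cross-entropy for both tasks and the mixing weight are hypothetical choices.

```python
import numpy as np

def bce(p, y, eps=1e-9):
    """Binary cross-entropy between predicted probabilities p and labels y."""
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def multitask_loss(p_virus, y_virus, p_human, y_human, weight=0.5):
    """Main virus-human PPI loss plus an auxiliary human-human PPI term
    (weight is a hypothetical mixing coefficient, not from the paper)."""
    return bce(p_virus, y_virus) + weight * bce(p_human, y_human)

# toy usage: two predictions per task
p_v = np.array([0.9, 0.2]); y_v = np.array([1.0, 0.0])
p_h = np.array([0.8, 0.1]); y_h = np.array([1.0, 0.0])
loss = multitask_loss(p_v, y_v, p_h, y_h)
```

The auxiliary term lets abundant human-interactome data regularize the scarcer virus-human task, which is the usual rationale for this kind of multitask transfer setup.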
arXiv Detail & Related papers (2021-11-26T07:53:51Z) - Bio-JOIE: Joint Representation Learning of Biological Knowledge Bases [38.9571812880758]
We show that Bio-JOIE can accurately identify PPIs between the SARS-CoV-2 proteins and human proteins.
By leveraging only structured knowledge, Bio-JOIE significantly outperforms existing state-of-the-art methods in PPI type prediction on multiple species.
arXiv Detail & Related papers (2021-03-07T07:06:53Z) - BERTology Meets Biology: Interpreting Attention in Protein Language Models [124.8966298974842]
We demonstrate methods for analyzing protein Transformer models through the lens of attention.
We show that attention captures the folding structure of proteins, connecting amino acids that are far apart in the underlying sequence, but spatially close in the three-dimensional structure.
We also present a three-dimensional visualization of the interaction between attention and protein structure.
arXiv Detail & Related papers (2020-06-26T21:50:17Z)
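One simple way to quantify the attention-structure connection described in the last entry is to check how often the most strongly attended residue pairs coincide with contacts in the 3D structure. The function and toy matrices below are illustrative assumptions, not the paper's analysis method.

```python
import numpy as np

def attention_contact_agreement(attn, contacts, top_k=10):
    """Fraction of the top-k attended residue pairs (i != j) that are also
    contacts in 3D; a crude proxy for attention-vs-structure analysis.
    Both inputs are L x L matrices; `contacts` is binary."""
    L = attn.shape[0]
    a = attn.copy()
    np.fill_diagonal(a, -np.inf)                  # ignore self-attention
    flat = np.argsort(a, axis=None)[::-1][:top_k] # strongest pairs first
    pairs = np.unravel_index(flat, (L, L))
    return contacts[pairs].mean()

# toy data: attention concentrated on a truly contacting pair
L = 6
contacts = np.zeros((L, L)); contacts[0, 3] = contacts[3, 0] = 1
attn = np.full((L, L), 0.01); attn[0, 3] = attn[3, 0] = 0.9
score = attention_contact_agreement(attn, contacts, top_k=2)  # -> 1.0
```

A score well above the density of the contact map would indicate that attention heads preferentially link residues that are distant in sequence but close in space, as the paper reports.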
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.