MutFormer: A context-dependent transformer-based model to predict
pathogenic missense mutations
- URL: http://arxiv.org/abs/2110.14746v1
- Date: Wed, 27 Oct 2021 20:17:35 GMT
- Title: MutFormer: A context-dependent transformer-based model to predict
pathogenic missense mutations
- Authors: Theodore Jiang, Li Fang, Kai Wang
- Abstract summary: Missense mutations account for approximately half of the known variants responsible for human inherited diseases.
Recent advances in deep learning show that transformer models are particularly powerful at modeling sequences.
We introduce MutFormer, a transformer-based model for prediction of pathogenic missense mutations.
- Score: 5.153619184788929
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A missense mutation is a point mutation that results in a substitution of an
amino acid in a protein sequence. Currently, missense mutations account for
approximately half of the known variants responsible for human inherited
diseases, but accurate prediction of the pathogenicity of missense variants is
still challenging. Recent advances in deep learning show that transformer
models are particularly powerful at modeling sequences. In this study, we
introduce MutFormer, a transformer-based model for prediction of pathogenic
missense mutations. We pre-trained MutFormer on reference protein sequences and
alternative protein sequences resulting from common genetic variants. We tested
different fine-tuning methods for pathogenicity prediction. Our results show
that MutFormer outperforms a variety of existing tools. MutFormer and
pre-computed variant scores are publicly available on GitHub at
https://github.com/WGLab/mutformer.
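The abstract describes scoring a substitution by how a sequence model trained on reference and alternative protein sequences judges each amino acid in context. A common way to turn such a model into a pathogenicity-style score is a log-likelihood ratio between the reference and alternate residues at the mutated position. The sketch below is a minimal illustration of that idea with made-up probabilities; it is not MutFormer's actual API or scoring function.

```python
import math

def missense_score(position_probs, ref_aa, alt_aa):
    """Log-likelihood ratio of reference vs. alternate amino acid.

    position_probs is a hypothetical per-position distribution over amino
    acids, as a protein language model might assign given the surrounding
    sequence context. Higher scores mean the model finds the substitution
    less plausible, which is often used as a proxy for pathogenicity.
    """
    return math.log(position_probs[ref_aa] / position_probs[alt_aa])

# Toy distribution at one protein position (illustrative values only).
probs = {"L": 0.60, "V": 0.25, "P": 0.01}

benign_like = missense_score(probs, "L", "V")    # conservative change
damaging_like = missense_score(probs, "L", "P")  # disruptive change
print(round(benign_like, 3), round(damaging_like, 3))  # -> 0.875 4.094
```

Real models emit a distribution over all 20 amino acids at every position, and published tools typically calibrate or fine-tune such raw scores on labeled pathogenic/benign variants rather than using the ratio directly.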
Related papers
- MutaPLM: Protein Language Modeling for Mutation Explanation and Engineering [12.738902517872509]
MutaPLM is a unified framework for interpreting and navigating protein mutations with protein language models.
MutaPLM introduces a protein delta network that captures explicit protein mutation representations within a unified feature space.
MutaPLM excels at providing human-understandable explanations for mutational effects and prioritizing novel mutations with desirable properties.
arXiv Detail & Related papers (2024-10-30T12:05:51Z) - Predicting Genetic Mutation from Whole Slide Images via Biomedical-Linguistic Knowledge Enhanced Multi-label Classification [119.13058298388101]
We develop a Biological-knowledge enhanced PathGenomic multi-label Transformer to improve genetic mutation prediction performances.
BPGT first establishes a novel gene encoder that constructs gene priors by two carefully designed modules.
BPGT then designs a label decoder that finally performs genetic mutation prediction by two tailored modules.
arXiv Detail & Related papers (2024-06-05T06:42:27Z) - Learning to Predict Mutation Effects of Protein-Protein Interactions by Microenvironment-aware Hierarchical Prompt Learning [78.38442423223832]
We develop a novel codebook pre-training task, namely masked microenvironment modeling.
We demonstrate superior performance and training efficiency over state-of-the-art pre-training-based methods in mutation effect prediction.
arXiv Detail & Related papers (2024-05-16T03:53:21Z) - An Empirical Evaluation of Manually Created Equivalent Mutants [54.02049952279685]
Fewer than 10% of manually created mutants are equivalent.
Surprisingly, our findings indicate that a significant portion of developers struggle to accurately identify equivalent mutants.
arXiv Detail & Related papers (2024-04-14T13:04:10Z) - Predicting loss-of-function impact of genetic mutations: a machine
learning approach [0.0]
This paper aims to train machine learning models on the attributes of a genetic mutation to predict LoFtool scores.
These attributes included, but were not limited to, the position of a mutation on a chromosome, changes in amino acids, and changes in codons caused by the mutation.
Models were evaluated using five-fold cross-validated averages of r-squared, mean squared error, root mean squared error, mean absolute error, and explained variance.
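The evaluation protocol above (per-fold regression metrics averaged over five folds) can be sketched in plain Python. The fold data below is invented for illustration, and the metric definitions follow the standard formulas; this is not the paper's code.

```python
import math
from statistics import mean, pvariance

def regression_metrics(y_true, y_pred):
    """The five metrics the paper averages across folds."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(r * r for r in residuals) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(r) for r in residuals) / n
    var_y = pvariance(y_true)
    r2 = 1 - mse / var_y                          # coefficient of determination
    ev = 1 - pvariance(residuals) / var_y         # explained variance
    return {"r2": r2, "mse": mse, "rmse": rmse, "mae": mae, "ev": ev}

# Two toy folds of (true LoFtool scores, model predictions); a real run
# would have five folds produced by cross-validation splitting.
folds = [
    ([0.1, 0.4, 0.9], [0.2, 0.5, 0.7]),
    ([0.3, 0.6, 0.8], [0.3, 0.5, 0.9]),
]
avg_rmse = mean(regression_metrics(t, p)["rmse"] for t, p in folds)
print(round(avg_rmse, 4))  # -> 0.1115
```

Averaging per-fold metrics (rather than pooling all predictions first) is the usual reading of "cross-validated averages" and keeps each fold's contribution equal regardless of fold size.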
arXiv Detail & Related papers (2024-01-26T19:27:38Z) - Efficiently Predicting Protein Stability Changes Upon Single-point
Mutation with Large Language Models [51.57843608615827]
The ability to precisely predict protein thermostability is pivotal for various subfields and applications in biochemistry.
We introduce an ESM-assisted efficient approach that integrates protein sequence and structural features to predict thermostability changes in proteins upon single-point mutations.
arXiv Detail & Related papers (2023-12-07T03:25:49Z) - Diversity-Measurable Anomaly Detection [106.07413438216416]
We propose Diversity-Measurable Anomaly Detection (DMAD) framework to enhance reconstruction diversity.
The Pyramid Deformation Module (PDM) decouples deformation from embedding and makes the final anomaly score more reliable.
arXiv Detail & Related papers (2023-03-09T05:52:42Z) - InForecaster: Forecasting Influenza Hemagglutinin Mutations Through the
Lens of Anomaly Detection [3.5213888068272197]
Anomaly detection (AD) is a well-established field in machine learning (ML).
We propose to tackle this challenge through AD.
We conduct a large number of experiments on four publicly available datasets.
arXiv Detail & Related papers (2022-10-25T02:08:09Z) - rfPhen2Gen: A machine learning based association study of brain imaging
phenotypes to genotypes [71.1144397510333]
We trained machine learning models to predict SNPs using 56 brain imaging QTs.
SNPs within the known Alzheimer disease (AD) risk gene APOE had lowest RMSE for lasso and random forest.
Random forests identified additional SNPs that were not prioritized by the linear models but are known to be associated with brain-related disorders.
arXiv Detail & Related papers (2022-03-31T20:15:22Z) - PhyloTransformer: A Discriminative Model for Mutation Prediction Based
on a Multi-head Self-attention Mechanism [10.468453827172477]
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused an ongoing pandemic infecting 219 million people as of 10/19/21, with a 3.6% mortality rate.
Here we developed PhyloTransformer, a Transformer-based discriminative model that engages a multi-head self-attention mechanism to model genetic mutations that may lead to viral reproductive advantage.
arXiv Detail & Related papers (2021-11-03T01:30:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.