A Supervised Machine Learning Approach for Sequence Based
Protein-protein Interaction (PPI) Prediction
- URL: http://arxiv.org/abs/2203.12659v1
- Date: Wed, 23 Mar 2022 18:27:25 GMT
- Title: A Supervised Machine Learning Approach for Sequence Based
Protein-protein Interaction (PPI) Prediction
- Authors: Soumyadeep Debnath and Ayatullah Faruk Mollah
- Abstract summary: Computational protein-protein interaction (PPI) prediction techniques can contribute greatly to reducing time, cost, and false-positive interactions.
We describe our submitted solution together with the results of the SeqPIP competition.
- Score: 4.916874464940376
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Computational protein-protein interaction (PPI) prediction techniques can
contribute greatly to reducing time, cost, and false-positive interactions
compared to experimental approaches. Sequence is one of the key and primary
pieces of information about a protein and plays a crucial role in PPI prediction.
Several machine learning approaches have been applied to exploit the
characteristics of PPI datasets. However, these datasets greatly influence the
performance of predictive models, so care should be taken in both dataset
curation and the design of predictive models. Here, we describe our submitted
solution together with the results of the SeqPIP competition, whose objective
was to develop comprehensive PPI predictive models from sequence information
with high-quality, bias-free interaction datasets. We were given a training set
of 2000 positive and 2000 negative interactions with sequences. Our method was
evaluated on three independent high-quality interaction test datasets and
against other competitors' solutions.
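The abstract does not say which features or classifier the authors used, so the following is only a minimal sketch of how a sequence pair might be turned into a fixed-length input for a supervised model. The amino acid composition features, the `composition` and `pair_features` helpers, and the toy sequences are all assumptions made for illustration; none of them come from the paper or the SeqPIP data.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Return the 20-dimensional amino acid composition of a sequence.

    Non-standard symbols (e.g. 'X') are ignored when computing fractions.
    """
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def pair_features(seq_a, seq_b):
    """Build an order-independent feature vector for a protein pair.

    Concatenates the element-wise sum and absolute difference of the two
    composition vectors, so (A, B) and (B, A) map to the same features.
    """
    ca, cb = composition(seq_a), composition(seq_b)
    summed = [x + y for x, y in zip(ca, cb)]
    diff = [abs(x - y) for x, y in zip(ca, cb)]
    return summed + diff


# Toy, made-up sequences and labels for illustration only.
pairs = [
    ("MKTAYIAKQR", "MSDNELQKLA", 1),  # hypothetical positive pair
    ("GGGGSGGGGS", "MKTAYIAKQR", 0),  # hypothetical negative pair
]
X = [pair_features(a, b) for a, b, _ in pairs]
y = [label for _, _, label in pairs]
```

Here `X` and `y` have the shape a standard binary classifier expects. With a balanced set such as the 2000 positive and 2000 negative pairs mentioned in the abstract, plain accuracy is a reasonable first metric.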
Related papers
- Self-Training with Pseudo-Label Scorer for Aspect Sentiment Quad Prediction [54.23208041792073]
Aspect Sentiment Quad Prediction (ASQP) aims to predict all quads (aspect term, aspect category, opinion term, sentiment polarity) for a given review.
A key challenge in the ASQP task is the scarcity of labeled data, which limits the performance of existing methods.
We propose a self-training framework with a pseudo-label scorer, wherein a scorer assesses the match between reviews and their pseudo-labels.
arXiv Detail & Related papers (2024-06-26T05:30:21Z)
- PSC-CPI: Multi-Scale Protein Sequence-Structure Contrasting for Efficient and Generalizable Compound-Protein Interaction Prediction [63.50967073653953]
Compound-Protein Interaction prediction aims to predict the pattern and strength of compound-protein interactions for rational drug discovery.
Existing deep learning-based methods utilize only the single modality of protein sequences or structures.
We propose a novel multi-scale Protein Sequence-structure Contrasting framework for CPI prediction.
arXiv Detail & Related papers (2024-02-13T03:51:10Z)
- Measuring and Improving Attentiveness to Partial Inputs with Counterfactuals [91.59906995214209]
We propose a new evaluation method, Counterfactual Attentiveness Test (CAT).
CAT uses counterfactuals by replacing part of the input with its counterpart from a different example, expecting an attentive model to change its prediction.
We show that GPT3 becomes less attentive with an increased number of demonstrations, while its accuracy on the test data improves.
arXiv Detail & Related papers (2023-11-16T06:27:35Z)
- Improved K-mer Based Prediction of Protein-Protein Interactions With Chaos Game Representation, Deep Learning and Reduced Representation Bias [0.0]
We present a method for extracting unique pairs from an interaction dataset, generating non-redundant paired data for unbiased machine learning.
We develop a convolutional neural network model capable of learning and predicting interactions from Chaos Game Representations of proteins' coding genes.
arXiv Detail & Related papers (2023-10-23T10:02:23Z)
- SemiGNN-PPI: Self-Ensembling Multi-Graph Neural Network for Efficient and Generalizable Protein-Protein Interaction Prediction [16.203794286288815]
Protein-protein interactions (PPIs) are crucial in various biological processes and their study has significant implications for drug development and disease diagnosis.
Existing deep learning methods suffer from significant performance degradation under complex real-world scenarios.
We propose a self-ensembling multigraph neural network (SemiGNN-PPI) that can effectively predict PPIs while being both efficient and generalizable.
arXiv Detail & Related papers (2023-05-15T03:06:44Z)
- Prediction-Powered Inference [68.97619568620709]
Prediction-powered inference is a framework for performing valid statistical inference when an experimental dataset is supplemented with predictions from a machine-learning system.
The framework yields simple algorithms for computing provably valid confidence intervals for quantities such as means, quantiles, and linear and logistic regression coefficients.
Prediction-powered inference could enable researchers to draw valid and more data-efficient conclusions using machine learning.
arXiv Detail & Related papers (2023-01-23T18:59:28Z)
- Insights into performance evaluation of compound-protein interaction prediction methods [0.0]
Machine learning based prediction of compound-protein interactions (CPIs) is important for drug design, screening and repurposing studies.
We have observed a number of fundamental issues in experiment design that lead to overoptimistic estimates of model performance.
arXiv Detail & Related papers (2022-01-28T20:07:19Z)
- Learning Unknown from Correlations: Graph Neural Network for Inter-novel-protein Interaction Prediction [7.860159889216291]
Existing methods suffer from significant performance degradation when tested on unseen datasets.
We propose a graph neural network based method (GNN-PPI) for better inter-novel-protein interaction prediction.
arXiv Detail & Related papers (2021-05-14T08:42:55Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- HINT: Hierarchical Interaction Network for Trial Outcome Prediction Leveraging Web Data [56.53715632642495]
Clinical trials face uncertain outcomes due to issues with efficacy, safety, or problems with patient recruitment.
In this paper, we propose Hierarchical INteraction Network (HINT) for more general, clinical trial outcome predictions.
arXiv Detail & Related papers (2021-02-08T15:09:07Z)
- Bayesian neural network with pretrained protein embedding enhances prediction accuracy of drug-protein interaction [3.499870393443268]
Deep learning approaches can predict drug-protein interactions without trial-and-error by humans.
We propose two methods to construct a deep learning framework that exhibits superior performance with a small labeled dataset.
arXiv Detail & Related papers (2020-12-15T10:24:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.