PRING: Rethinking Protein-Protein Interaction Prediction from Pairs to Graphs
- URL: http://arxiv.org/abs/2507.05101v1
- Date: Mon, 07 Jul 2025 15:21:05 GMT
- Title: PRING: Rethinking Protein-Protein Interaction Prediction from Pairs to Graphs
- Authors: Xinzhe Zheng, Hao Du, Fanding Xu, Jinzhe Li, Zhiyuan Liu, Wenkang Wang, Tao Chen, Wanli Ouyang, Stan Z. Li, Yan Lu, Nanqing Dong, Yang Zhang,
- Abstract summary: PRING is the first benchmark that evaluates protein-protein interaction prediction from a graph-level perspective.<n> PRING curates a high-quality, multi-species PPI network dataset comprising 21,484 proteins and 186,818 interactions.
- Score: 80.08310253195144
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning-based computational methods have achieved promising results in predicting protein-protein interactions (PPIs). However, existing benchmarks predominantly focus on isolated pairwise evaluations, overlooking a model's capability to reconstruct biologically meaningful PPI networks, which is crucial for biology research. To address this gap, we introduce PRING, the first comprehensive benchmark that evaluates protein-protein interaction prediction from a graph-level perspective. PRING curates a high-quality, multi-species PPI network dataset comprising 21,484 proteins and 186,818 interactions, with well-designed strategies to address both data redundancy and leakage. Building on this golden-standard dataset, we establish two complementary evaluation paradigms: (1) topology-oriented tasks, which assess intra and cross-species PPI network construction, and (2) function-oriented tasks, including protein complex pathway prediction, GO module analysis, and essential protein justification. These evaluations not only reflect the model's capability to understand the network topology but also facilitate protein function annotation, biological module detection, and even disease mechanism analysis. Extensive experiments on four representative model categories, consisting of sequence similarity-based, naive sequence-based, protein language model-based, and structure-based approaches, demonstrate that current PPI models have potential limitations in recovering both structural and functional properties of PPI networks, highlighting the gap in supporting real-world biological applications. We believe PRING provides a reliable platform to guide the development of more effective PPI prediction models for the community. The dataset and source code of PRING are available at https://github.com/SophieSarceau/PRING.
Related papers
- Hierarchical Multi-Label Contrastive Learning for Protein-Protein Interaction Prediction Across Organisms [2.399426243085768]
We propose HIPPO, a hierarchical contrastive framework for protein-protein interaction prediction.<n>The proposed approach incorporates hierarchical contrastive loss functions that emulate the structured relationship among functional classes of proteins.<n> Experiments on benchmark datasets demonstrate that HIPPO achieves state-of-the-art performance, outperforming existing methods and showing robustness in low-data regimes.
arXiv Detail & Related papers (2025-07-03T15:41:04Z) - DISPROTBENCH: A Disorder-Aware, Task-Rich Benchmark for Evaluating Protein Structure Prediction in Realistic Biological Contexts [76.59606029593085]
DisProtBench is a benchmark for evaluating protein structure prediction models (PSPMs) under structural disorder and complex biological conditions.<n>DisProtBench spans three key axes: data complexity, task diversity, and Interpretability.<n>Results reveal significant variability in model robustness under disorder, with low-confidence regions linked to functional prediction failures.
arXiv Detail & Related papers (2025-06-18T23:58:22Z) - KEPLA: A Knowledge-Enhanced Deep Learning Framework for Accurate Protein-Ligand Binding Affinity Prediction [60.23701115249195]
KEPLA is a novel deep learning framework that integrates prior knowledge from Gene Ontology and ligand properties to enhance prediction performance.<n> Experiments on two benchmark datasets demonstrate that KEPLA consistently outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2025-06-16T08:02:42Z) - Beyond Simple Concatenation: Fairly Assessing PLM Architectures for Multi-Chain Protein-Protein Interactions Prediction [0.2509487459755192]
Protein-protein interactions (PPIs) are fundamental to numerous cellular processes.<n>PLMs have demonstrated remarkable success in predicting protein structure and function.<n>Their application to sequence-based PPI binding affinity prediction remains relatively underexplored.
arXiv Detail & Related papers (2025-05-26T14:23:08Z) - Joint Masked Reconstruction and Contrastive Learning for Mining Interactions Between Proteins [4.254824555546419]
Protein-protein interaction (PPI) prediction is an instrumental means in elucidating the mechanisms underlying cellular operations.<n>This paper introduces a novel PPI prediction method jointing masked reconstruction and contrastive learning, termed JmcPPI.<n>Extensive experiments conducted on three widely utilized PPI datasets demonstrate that JmcPPI surpasses existing optimal baseline models.
arXiv Detail & Related papers (2025-03-06T17:39:12Z) - The Signed Two-Space Proximity Model for Learning Representations in Protein-Protein Interaction Networks [16.396309363020908]
Accurately predicting complex protein-protein interactions (PPIs) is crucial for decoding biological processes.<n>We present the Signed Two-Space Proximity Model (S2-SPM) for signed PPI networks, which explicitly incorporates both types of interactions.<n>Our approach also enables the identification of archetypes representing extreme protein profiles.
arXiv Detail & Related papers (2025-03-05T21:08:58Z) - MeToken: Uniform Micro-environment Token Boosts Post-Translational Modification Prediction [65.33218256339151]
Post-translational modifications (PTMs) profoundly expand the complexity and functionality of the proteome.
Existing computational approaches predominantly focus on protein sequences to predict PTM sites, driven by the recognition of sequence-dependent motifs.
We introduce the MeToken model, which tokenizes the micro-environment of each acid, integrating both sequence and structural information into unified discrete tokens.
arXiv Detail & Related papers (2024-11-04T07:14:28Z) - SFM-Protein: Integrative Co-evolutionary Pre-training for Advanced Protein Sequence Representation [97.99658944212675]
We introduce a novel pre-training strategy for protein foundation models.
It emphasizes the interactions among amino acid residues to enhance the extraction of both short-range and long-range co-evolutionary features.
Trained on a large-scale protein sequence dataset, our model demonstrates superior generalization ability.
arXiv Detail & Related papers (2024-10-31T15:22:03Z) - CoGANPPIS: A Coevolution-enhanced Global Attention Neural Network for
Protein-Protein Interaction Site Prediction [0.9217021281095907]
We propose a coevolution-enhanced global attention neural network, a sequence-based deep learning model for PPIs prediction.
CoGANPPIS utilizes three layers in parallel for feature extraction.
Our proposed model achieves the state-of-the-art performance.
arXiv Detail & Related papers (2023-03-13T09:27:34Z) - State-specific protein-ligand complex structure prediction with a
multi-scale deep generative model [68.28309982199902]
We present NeuralPLexer, a computational approach that can directly predict protein-ligand complex structures.
Our study suggests that a data-driven approach can capture the structural cooperativity between proteins and small molecules, showing promise in accelerating the design of enzymes, drug molecules, and beyond.
arXiv Detail & Related papers (2022-09-30T01:46:38Z) - Structure-aware Interactive Graph Neural Networks for the Prediction of
Protein-Ligand Binding Affinity [52.67037774136973]
Drug discovery often relies on the successful prediction of protein-ligand binding affinity.
Recent advances have shown great promise in applying graph neural networks (GNNs) for better affinity prediction by learning the representations of protein-ligand complexes.
We propose a structure-aware interactive graph neural network (SIGN) which consists of two components: polar-inspired graph attention layers (PGAL) and pairwise interactive pooling (PiPool)
arXiv Detail & Related papers (2021-07-21T03:34:09Z) - PersGNN: Applying Topological Data Analysis and Geometric Deep Learning
to Structure-Based Protein Function Prediction [0.07340017786387766]
In this work, we isolate protein structure to make functional annotations for proteins in the Protein Data Bank.
We present PersGNN - an end-to-end trainable deep learning model that combines graph representation learning with topological data analysis.
arXiv Detail & Related papers (2020-10-30T02:24:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.