Pseudodata-guided Invariant Representation Learning Boosts the Out-of-Distribution Generalization in Enzymatic Kinetic Parameter Prediction
- URL: http://arxiv.org/abs/2601.07261v1
- Date: Mon, 12 Jan 2026 07:03:07 GMT
- Title: Pseudodata-guided Invariant Representation Learning Boosts the Out-of-Distribution Generalization in Enzymatic Kinetic Parameter Prediction
- Authors: Haomin Wu, Zhiwei Nie, Hongyu Zhang, Zhixiang Ren
- Abstract summary: O$^2$DENet is a lightweight, plug-and-play module that enhances OOD generalization. O$^2$DENet introduces enzyme-substrate perturbations and enforces consistency between original and augmented enzyme-substrate-pair representations.
- Score: 12.238915133864046
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate prediction of enzyme kinetic parameters is essential for understanding catalytic mechanisms and guiding enzyme engineering. However, existing deep learning-based enzyme-substrate interaction (ESI) predictors often exhibit performance degradation on sequence-divergent, out-of-distribution (OOD) cases, limiting robustness under biologically relevant perturbations. We propose O$^2$DENet, a lightweight, plug-and-play module that enhances OOD generalization via biologically and chemically informed perturbation augmentation and invariant representation learning. O$^2$DENet introduces enzyme-substrate perturbations and enforces consistency between original and augmented enzyme-substrate-pair representations to encourage invariance to distributional shifts. When integrated with representative ESI models, O$^2$DENet consistently improves predictive performance for both $k_{cat}$ and $K_m$ across stringent sequence-identity-based OOD benchmarks, achieving state-of-the-art results among the evaluated methods in terms of accuracy and robustness metrics. Overall, O$^2$DENet provides a general and effective strategy to enhance the stability and deployability of data-driven enzyme kinetics predictors for real-world enzyme engineering applications.
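The perturb-and-enforce-consistency idea described in the abstract can be sketched as a simple auxiliary loss. This is a minimal illustration, not the paper's implementation: the loss form, the normalization, the function names, and the trade-off weight `lam` are all assumptions.

```python
import numpy as np

def consistency_loss(z_orig, z_aug):
    """Mean squared distance between L2-normalized original and
    perturbation-augmented enzyme-substrate pair representations.
    A generic invariance penalty; the paper's exact loss may differ."""
    z_o = z_orig / np.linalg.norm(z_orig, axis=1, keepdims=True)
    z_a = z_aug / np.linalg.norm(z_aug, axis=1, keepdims=True)
    return float(np.mean(np.sum((z_o - z_a) ** 2, axis=1)))

def total_loss(pred, target, z_orig, z_aug, lam=0.1):
    """Task regression loss (e.g. MSE on log kcat or log Km) plus a
    weighted consistency term; lam is a hypothetical trade-off weight."""
    task = float(np.mean((pred - target) ** 2))
    return task + lam * consistency_loss(z_orig, z_aug)
```

Pushing `consistency_loss` toward zero encourages the encoder to map a pair and its perturbed variant to the same point on the unit sphere, which is one standard way to obtain representations that are invariant to the injected distributional shifts.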
Related papers
- Persistent Sheaf Laplacian Analysis of Protein Stability and Solubility Changes upon Mutation [8.065100433006615]
SheafLapNet is a unified predictive framework grounded in the mathematical theory of Topological Deep Learning (TDL) and Persistent Sheaf Laplacian (PSL). SheafLapNet synergizes sheaf-theoretic invariants with advanced protein transformer features and auxiliary physical descriptors to capture intrinsic molecular interactions in a multiscale and mechanistic manner.
arXiv Detail & Related papers (2026-01-18T01:45:12Z)
- A Generalized Adaptive Joint Learning Framework for High-Dimensional Time-Varying Models [0.8594140167290097]
This article introduces Adaptive Joint Learning (AJL), a regularization framework designed to simultaneously perform functional variable selection and structural changepoint detection. The analysis uncovers synchronized phase transitions in disease progression and identifies a parsimonious set of time-varying prognostic markers.
arXiv Detail & Related papers (2026-01-08T02:07:49Z)
- Multimodal Regression for Enzyme Turnover Rates Prediction [57.60697333734054]
We propose a framework for predicting the enzyme turnover rate by integrating enzyme sequences, substrate structures, and environmental factors. Our model combines a pre-trained language model and a convolutional neural network to extract features from protein sequences. We leverage symbolic regression via Kolmogorov-Arnold Networks to explicitly learn mathematical formulas that govern the enzyme turnover rate.
arXiv Detail & Related papers (2025-09-15T11:07:26Z)
- OmniESI: A unified framework for enzyme-substrate interaction prediction with progressive conditional deep learning [46.402707495664174]
We introduce a two-stage progressive framework, OmniESI, for enzyme-substrate interaction prediction through conditional deep learning. We show that OmniESI consistently delivered superior performance than state-of-the-art specialized methods. Overall, OmniESI represents a unified predictive approach for enzyme-substrate interactions.
arXiv Detail & Related papers (2025-06-22T09:40:40Z)
- DISPROTBENCH: A Disorder-Aware, Task-Rich Benchmark for Evaluating Protein Structure Prediction in Realistic Biological Contexts [76.59606029593085]
DisProtBench is a benchmark for evaluating protein structure prediction models (PSPMs) under structural disorder and complex biological conditions. DisProtBench spans three key axes: data complexity, task diversity, and interpretability. Results reveal significant variability in model robustness under disorder, with low-confidence regions linked to functional prediction failures.
arXiv Detail & Related papers (2025-06-18T23:58:22Z)
- Exploring the Generalization Capabilities of AID-based Bi-level Optimization [50.3142765099442]
We present two types of bi-level optimization methods: approximate implicit differentiation (AID)-based and iterative differentiation (ITD)-based approaches.
AID-based methods cannot be easily transformed into a single-level problem but must stay in the two-level structure.
We demonstrate the effectiveness and potential applications of these methods on real-world tasks.
arXiv Detail & Related papers (2024-11-25T04:22:17Z)
- Source-Free Domain Adaptive Object Detection with Semantics Compensation [54.00183496587841]
We introduce Weak-to-strong Semantics Compensation (WSCo) for strong data augmentation. WSCo compensates for the class-relevant semantics that may be lost during strong augmentation on the fly. WSCo can be implemented as a generic plug-in, easily integrable with any existing SFOD pipelines.
arXiv Detail & Related papers (2024-10-07T23:32:06Z)
- Adjoint Sensitivity Analysis on Multi-Scale Bioprocess Stochastic Reaction Network [2.6130735302655554]
We introduce an adjoint sensitivity approach to expedite the learning of mechanistic model parameters.
In this paper, we consider sensitivity analysis (SA) of an enzymatic stochastic reaction network representing a multi-scale bioprocess mechanistic model.
arXiv Detail & Related papers (2024-05-07T05:06:45Z)
- CASTLE: Regularization via Auxiliary Causal Graph Discovery [89.74800176981842]
We introduce Causal Structure Learning (CASTLE) regularization and propose to regularize a neural network by jointly learning the causal relationships between variables.
CASTLE efficiently reconstructs only the features in the causal DAG that have a causal neighbor, whereas reconstruction-based regularizers suboptimally reconstruct all input features.
arXiv Detail & Related papers (2020-09-28T09:49:38Z)
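The selective-reconstruction idea behind CASTLE, reconstructing only features that have a causal neighbor in the learned DAG, can be sketched as a mask over an adjacency matrix. The function name and the binary-matrix encoding below are illustrative assumptions, not CASTLE's actual code.

```python
import numpy as np

def reconstruction_targets(adjacency):
    """Return indices of features with at least one causal neighbor
    (parent or child) in a learned DAG, encoded as a binary adjacency
    matrix. Per the CASTLE idea, only these features are reconstructed
    by the auxiliary regularizer; isolated features are skipped."""
    neighbors = (adjacency + adjacency.T) > 0   # symmetrize: parent or child
    np.fill_diagonal(neighbors, False)          # ignore self-loops
    return np.flatnonzero(neighbors.any(axis=1))
```

For example, with edges only from feature 0 to feature 1, features 0 and 1 are selected and the isolated feature 2 is excluded, whereas a plain reconstruction-based regularizer would reconstruct all three.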
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.