Pseudodata-guided Invariant Representation Learning Boosts the Out-of-Distribution Generalization in Enzymatic Kinetic Parameter Prediction
- URL: http://arxiv.org/abs/2601.07261v1
- Date: Mon, 12 Jan 2026 07:03:07 GMT
- Title: Pseudodata-guided Invariant Representation Learning Boosts the Out-of-Distribution Generalization in Enzymatic Kinetic Parameter Prediction
- Authors: Haomin Wu, Zhiwei Nie, Hongyu Zhang, Zhixiang Ren
- Abstract summary: O$^2$DENet is a lightweight, plug-and-play module that enhances OOD generalization. O$^2$DENet introduces enzyme-substrate perturbations and enforces consistency between original and augmented enzyme-substrate-pair representations.
- Score: 12.238915133864046
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate prediction of enzyme kinetic parameters is essential for understanding catalytic mechanisms and guiding enzyme engineering. However, existing deep learning-based enzyme-substrate interaction (ESI) predictors often exhibit performance degradation on sequence-divergent, out-of-distribution (OOD) cases, limiting robustness under biologically relevant perturbations. We propose O$^2$DENet, a lightweight, plug-and-play module that enhances OOD generalization via biologically and chemically informed perturbation augmentation and invariant representation learning. O$^2$DENet introduces enzyme-substrate perturbations and enforces consistency between original and augmented enzyme-substrate-pair representations to encourage invariance to distributional shifts. When integrated with representative ESI models, O$^2$DENet consistently improves predictive performance for both $k_{cat}$ and $K_m$ across stringent sequence-identity-based OOD benchmarks, achieving state-of-the-art results among the evaluated methods in terms of accuracy and robustness metrics. Overall, O$^2$DENet provides a general and effective strategy to enhance the stability and deployability of data-driven enzyme kinetics predictors for real-world enzyme engineering applications.
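The perturb-and-enforce-consistency idea described in the abstract can be sketched as a simple auxiliary loss. This is a minimal illustration, not the paper's implementation: the loss form, the normalization, the function names, and the trade-off weight `lam` are all assumptions.

```python
import numpy as np

def consistency_loss(z_orig, z_aug):
    """Mean squared distance between L2-normalized original and
    perturbation-augmented enzyme-substrate pair representations.
    A generic invariance penalty; the paper's exact loss may differ."""
    z_o = z_orig / np.linalg.norm(z_orig, axis=1, keepdims=True)
    z_a = z_aug / np.linalg.norm(z_aug, axis=1, keepdims=True)
    return float(np.mean(np.sum((z_o - z_a) ** 2, axis=1)))

def total_loss(pred, target, z_orig, z_aug, lam=0.1):
    """Task regression loss (e.g. MSE on log kcat or log Km) plus a
    weighted consistency term; lam is a hypothetical trade-off weight."""
    task = float(np.mean((pred - target) ** 2))
    return task + lam * consistency_loss(z_orig, z_aug)
```

Pushing `consistency_loss` toward zero encourages the encoder to map a pair and its perturbed variant to the same point on the unit sphere, which is one standard way to obtain representations that are invariant to the injected distributional shifts.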
Related papers
- Persistent Sheaf Laplacian Analysis of Protein Stability and Solubility Changes upon Mutation [8.065100433006615]
SheafLapNet is a unified predictive framework grounded in the mathematical theory of Topological Deep Learning (TDL) and Persistent Sheaf Laplacian (PSL). SheafLapNet synergizes sheaf-theoretic invariants with advanced protein transformer features and auxiliary physical descriptors to capture intrinsic molecular interactions in a multiscale and mechanistic manner.
arXiv Detail & Related papers (2026-01-18T01:45:12Z)
- A Generalized Adaptive Joint Learning Framework for High-Dimensional Time-Varying Models [0.8594140167290097]
This article introduces Adaptive Joint Learning (AJL), a regularization framework designed to simultaneously perform functional variable selection and structural changepoint detection. The analysis uncovers synchronized phase transitions in disease progression and identifies a parsimonious set of time-varying prognostic markers.
arXiv Detail & Related papers (2026-01-08T02:07:49Z)
- Multimodal Regression for Enzyme Turnover Rates Prediction [57.60697333734054]
We propose a framework for predicting the enzyme turnover rate by integrating enzyme sequences, substrate structures, and environmental factors. Our model combines a pre-trained language model and a convolutional neural network to extract features from protein sequences. We leverage symbolic regression via Kolmogorov-Arnold Networks to explicitly learn mathematical formulas that govern the enzyme turnover rate.
arXiv Detail & Related papers (2025-09-15T11:07:26Z)
- OmniESI: A unified framework for enzyme-substrate interaction prediction with progressive conditional deep learning [46.402707495664174]
We introduce a two-stage progressive framework, OmniESI, for enzyme-substrate interaction prediction through conditional deep learning. We show that OmniESI consistently delivered superior performance than state-of-the-art specialized methods. Overall, OmniESI represents a unified predictive approach for enzyme-substrate interactions.
arXiv Detail & Related papers (2025-06-22T09:40:40Z)
- DISPROTBENCH: A Disorder-Aware, Task-Rich Benchmark for Evaluating Protein Structure Prediction in Realistic Biological Contexts [76.59606029593085]
DisProtBench is a benchmark for evaluating protein structure prediction models (PSPMs) under structural disorder and complex biological conditions. DisProtBench spans three key axes: data complexity, task diversity, and interpretability. Results reveal significant variability in model robustness under disorder, with low-confidence regions linked to functional prediction failures.
arXiv Detail & Related papers (2025-06-18T23:58:22Z)
- Exploring the Generalization Capabilities of AID-based Bi-level Optimization [50.3142765099442]
We present two types of bi-level optimization methods: approximate implicit differentiation (AID)-based and iterative differentiation (ITD)-based approaches.
AID-based methods cannot be easily transformed into a single-level problem but must stay in the two-level structure.
We demonstrate the effectiveness and potential applications of these methods on real-world tasks.
arXiv Detail & Related papers (2024-11-25T04:22:17Z)
- Source-Free Domain Adaptive Object Detection with Semantics Compensation [54.00183496587841]
We introduce Weak-to-strong Semantics Compensation (WSCo) for strong data augmentation. WSCo compensates for the class-relevant semantics that may be lost during strong augmentation on the fly. WSCo can be implemented as a generic plug-in, easily integrable with any existing SFOD pipelines.
arXiv Detail & Related papers (2024-10-07T23:32:06Z)
- Adjoint Sensitivity Analysis on Multi-Scale Bioprocess Stochastic Reaction Network [2.6130735302655554]
We introduce an adjoint sensitivity approach to expedite the learning of mechanistic model parameters.
In this paper, we consider sensitivity analysis (SA) of an enzymatic stochastic reaction network representing a multi-scale bioprocess mechanistic model.
arXiv Detail & Related papers (2024-05-07T05:06:45Z)
- CASTLE: Regularization via Auxiliary Causal Graph Discovery [89.74800176981842]
We introduce Causal Structure Learning (CASTLE) regularization and propose to regularize a neural network by jointly learning the causal relationships between variables.
CASTLE efficiently reconstructs only the features in the causal DAG that have a causal neighbor, whereas reconstruction-based regularizers suboptimally reconstruct all input features.
arXiv Detail & Related papers (2020-09-28T09:49:38Z)
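The selective-reconstruction idea behind CASTLE, reconstructing only features that have a causal neighbor in the learned DAG, can be sketched as a mask over an adjacency matrix. The function name and the binary-matrix encoding below are illustrative assumptions, not CASTLE's actual code.

```python
import numpy as np

def reconstruction_targets(adjacency):
    """Return indices of features with at least one causal neighbor
    (parent or child) in a learned DAG, encoded as a binary adjacency
    matrix. Per the CASTLE idea, only these features are reconstructed
    by the auxiliary regularizer; isolated features are skipped."""
    neighbors = (adjacency + adjacency.T) > 0   # symmetrize: parent or child
    np.fill_diagonal(neighbors, False)          # ignore self-loops
    return np.flatnonzero(neighbors.any(axis=1))
```

For example, with edges only from feature 0 to feature 1, features 0 and 1 are selected and the isolated feature 2 is excluded, whereas a plain reconstruction-based regularizer would reconstruct all three.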
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.