Fugu-MT Paper Translation (Abstract): ZeroFold: Protein-RNA Binding Affinity Predictions from Pre-Structural Embeddings

Paper abstract: ZeroFold: Protein-RNA Binding Affinity Predictions from Pre-Structural Embeddings

arxiv url: http://arxiv.org/abs/2603.23583v1
Date: Tue, 24 Mar 2026 15:14:44 GMT
Status: Translation complete
Last updated in system: 2026-03-26 21:06:10.95958
Title: ZeroFold: Protein-RNA Binding Affinity Predictions from Pre-Structural Embeddings
Authors: Josef Hanke, Sebastian Pujalte Ojeda, Shengyu Zhang, Werngard Czechtizky, Leonardo De Maria, Michele Vendruscolo
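Abstract summary: The accurate prediction of protein-RNA binding affinity remains an unsolved problem in structural biology. Here, we show that this obstacle can be addressed by extracting pre-structural embeddings. We build ZeroFold, a transformer-based model that combines pre-structural embeddings from Boltz-2 for both protein and RNA molecules.
Reference score (site's own attention metric): 7.1857665261026575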
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The accurate prediction of protein-RNA binding affinity remains an unsolved problem in structural biology, limiting opportunities in understanding gene regulation and designing RNA-targeting therapeutics. A central obstacle is the structural flexibility of RNA, as, unlike proteins, RNA molecules exist as dynamic conformational ensembles. Thus, committing to a single predicted structure discards information relevant to binding. Here, we show that this obstacle can be addressed by extracting pre-structural embeddings, which are intermediate representations from a biomolecular foundation model captured before the structure decoding step. Pre-structural embeddings implicitly encode conformational ensemble information without requiring predicted structures. We build ZeroFold, a transformer-based model that combines pre-structural embeddings from Boltz-2 for both protein and RNA molecules through a cross-modal attention mechanism to predict binding affinity directly from sequence. To support training and evaluation, we construct PRADB, a curated dataset of 2,621 unique protein-RNA pairs with experimentally measured affinities drawn from four complementary databases. On a held-out test set constructed with 40% sequence identity thresholds, ZeroFold achieves a Spearman correlation of 0.65, a value approaching the ceiling imposed by experimental measurement noise. Under progressively fairer evaluation conditions that control for training-set overlap, ZeroFold compares favourably with respect to leading structure-based and leading sequence-based predictors, with the performance gap widening as sequence similarity to competitor training data is reduced. These results illustrate how pre-structural embeddings offer a representation strategy for flexible biomolecules, opening a route to affinity prediction for protein-RNA pairs for which no structural data exist.
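The abstract reports test-set performance as a Spearman correlation between predicted and measured affinities. As a minimal sketch of how that metric is computed (this is not the authors' evaluation code, and the function names and toy values below are hypothetical, not data from the paper), the coefficient can be obtained by ranking both lists and taking the Pearson correlation of the ranks:

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


def spearman(predicted, measured):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(predicted), average_ranks(measured))


if __name__ == "__main__":
    # Made-up affinities for illustration only (not from PRADB).
    predicted = [-7.2, -9.1, -6.5, -8.0, -10.3]
    measured = [-7.0, -8.5, -6.9, -8.8, -10.0]
    print(f"Spearman rho = {spearman(predicted, measured):.2f}")
```

Because the metric depends only on rank order, it rewards a model for ordering pairs correctly by affinity even when the predicted values are offset or rescaled relative to the measurements.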
