Fugu-MT Paper Translation (Abstract): Protein Thoughts: Interpretable Reasoning with Tree of Thoughts and Embedding-Space Flow Matching for Protein-Protein Interaction Discovery

Paper overview: Protein Thoughts: Interpretable Reasoning with Tree of Thoughts and Embedding-Space Flow Matching for Protein-Protein Interaction Discovery

arxiv url: http://arxiv.org/abs/2605.21522v1
Date: Tue, 19 May 2026 04:14:06 GMT
Status: Translation complete
System update date: 2026-05-22 16:35:41.925487
Title: Protein Thoughts: Interpretable Reasoning with Tree of Thoughts and Embedding-Space Flow Matching for Protein-Protein Interaction Discovery
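Title (reference translation): Protein Thoughts: Interpretable Reasoning with Tree of Thoughts and Embedding-Space Flow Matching for Protein-Protein Interaction Discovery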
Authors: Kingsley Yeon, Xuefeng Liu, Promit Ghosal,
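Abstract summary: We present Protein Thoughts, a framework that reformulates PPI discovery as an interpretable search problem with explicit reasoning. Protein Thoughts achieves a mean best-binder rank of 11.2 versus 47.7 for an entropic tree search baseline, a 76% improvement, and for binding prediction the trained value function reaches $91.08 \pm 0.19$ Micro-F1.
Reference score (independently computed attention score): 9.70057752523548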
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Protein-protein interactions (PPIs) govern nearly all cellular processes, yet computational methods for identifying binding partners typically produce ranked predictions without mechanistic justification. This creates a fundamental barrier to adoption because biologists cannot assess whether predictions reflect genuine biochemical insight or spurious correlations. We present Protein Thoughts, a framework that reformulates PPI discovery as an interpretable search problem with explicit reasoning. The system decomposes binding evidence into four biologically meaningful signals: sequence similarity reflecting evolutionary relationships, structural complementarity capturing geometric fit, interface balance, and chemical compatibility encoding residue-level interactions. Rather than collapsing these signals into an opaque score, we preserve their individual contributions through a transparent value function that enables both ranking and auditing. To navigate large candidate spaces efficiently, we introduce hypothesis-guided entropy-regularized Tree-of-Thoughts search. A fine-tuned language model generates search directives from embedding-derived features, classifying candidates as high-priority, exploratory, or skippable. These directives condition a Boltzmann policy that balances exploitation with entropy-driven exploration, while hypothesis-aware pruning prevents premature abandonment of promising candidates. For candidates exhibiting score disagreement, hypothesis-conditioned embedding-space flow matching transports protein embeddings toward the binder manifold. On the SHS148k benchmark, Protein Thoughts achieves a mean best-binder rank of 11.2 versus 47.7 for an entropic tree search baseline, a 76% improvement, and for binding prediction the trained value function achieves $91.08 \pm 0.19$ Micro-F1, outperforming existing PPI methods on the same dataset.
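Abstract (reference translation): Protein-protein interactions (PPIs) govern almost all cellular processes, but computational methods for identifying binding partners usually generate ranked predictions without mechanistic justification. This is a fundamental barrier to adoption, because biologists cannot judge whether a prediction reflects genuine biochemical insight or a spurious correlation. This paper proposes Protein Thoughts, a framework that redefines PPI discovery as an interpretable search problem with explicit reasoning. The system decomposes binding evidence into four biologically meaningful signals: sequence similarity, which reflects evolutionary relationships; structural complementarity, which captures geometric fit; interface balance; and chemical compatibility, which encodes residue-level interactions. Rather than collapsing these signals into an opaque score, the framework preserves their individual contributions through a transparent value function that supports both ranking and auditing. To navigate large candidate spaces efficiently, the authors introduce hypothesis-guided, entropy-regularized Tree-of-Thoughts search. A fine-tuned language model generates search directives from embedding-derived features and classifies candidates as high-priority, exploratory, or skippable. These directives condition a Boltzmann policy that balances exploitation against entropy-driven exploration, while hypothesis-aware pruning prevents promising candidates from being abandoned prematurely. For candidates whose scores disagree, hypothesis-conditioned embedding-space flow matching transports protein embeddings toward the binder manifold. On the SHS148k benchmark, Protein Thoughts achieves a mean best-binder rank of 11.2 versus 47.7 for an entropic tree search baseline, a 76% improvement, and for binding prediction the trained value function achieves $91.08 \pm 0.19$ Micro-F1, outperforming existing PPI methods on the same dataset.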
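
The abstract names two mechanisms that are easy to illustrate in miniature: a value function that keeps per-signal contributions visible, and a Boltzmann policy over candidates. The following Python sketch is only an illustration under assumptions, not the authors' implementation: the linear weighting, the weight values, the temperature parameter, and all function names are hypothetical, since the abstract does not specify them. It uses the standard fact that the softmax distribution proportional to exp(V / temperature) maximizes expected value plus temperature times entropy. The last helper recomputes the reported rank improvement from the two mean ranks given in the abstract.

```python
import math

# The four signal names come from the abstract; everything else here
# (linear form, weights, temperature) is an assumption for illustration.
SIGNALS = (
    "sequence_similarity",
    "structural_complementarity",
    "interface_balance",
    "chemical_compatibility",
)


def value_with_breakdown(signals, weights):
    """Return (total score, per-signal contributions) so a score can be audited."""
    contributions = {name: weights[name] * signals[name] for name in SIGNALS}
    return sum(contributions.values()), contributions


def boltzmann_policy(values, temperature=1.0):
    """Softmax over candidate values; higher temperature means more exploration."""
    top = max(values)  # subtract the max for numerical stability
    exps = [math.exp((v - top) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def relative_rank_improvement(baseline_rank, new_rank):
    """Fractional reduction in mean rank relative to the baseline."""
    return (baseline_rank - new_rank) / baseline_rank


if __name__ == "__main__":
    weights = {name: 0.25 for name in SIGNALS}  # hypothetical equal weights
    candidate = {
        "sequence_similarity": 0.9,
        "structural_complementarity": 0.4,
        "interface_balance": 0.6,
        "chemical_compatibility": 0.7,
    }
    score, parts = value_with_breakdown(candidate, weights)
    print("score:", score, "contributions:", parts)

    values = [0.65, 0.30, 0.10]  # hypothetical candidate scores
    for temperature in (0.1, 1.0):
        probs = boltzmann_policy(values, temperature)
        print("temperature", temperature, "entropy", entropy(probs))

    # Mean ranks 47.7 (baseline) and 11.2 (Protein Thoughts) are from the abstract.
    print("relative improvement:", relative_rank_improvement(47.7, 11.2))
```

With the ranks from the abstract, the last line evaluates (47.7 - 11.2) / 47.7, roughly 0.765, which is consistent with the reported 76% improvement.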
