Fugu-MT Paper Translation (Abstract): Beyond Expression Similarity: Contrastive Learning Recovers Functional Gene Associations from Protein Interaction Structure

Paper abstract: Beyond Expression Similarity: Contrastive Learning Recovers Functional Gene Associations from Protein Interaction Structure

arxiv url: http://arxiv.org/abs/2603.20955v1
Date: Sat, 21 Mar 2026 21:36:23 GMT
Status: Translation complete
Last updated in system: 2026-03-24 19:11:39.166697
Title: Beyond Expression Similarity: Contrastive Learning Recovers Functional Gene Associations from Protein Interaction Structure
Authors: Jason Dury
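Abstract summary: The Predictive Associative Memory (PAM) framework posits that useful relationships often connect items that co-occur in shared contexts. The paper tests whether this principle transfers to molecular biology, where protein-protein interactions provide functional associations.
Reference score (independently computed attention metric): 0.0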
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The Predictive Associative Memory (PAM) framework posits that useful relationships often connect items that co-occur in shared contexts rather than items that appear similar in embedding space. A contrastive MLP trained on co-occurrence annotations--Contrastive Association Learning (CAL)--has improved multi-hop passage retrieval and discovered narrative function at corpus scale in text. We test whether this principle transfers to molecular biology, where protein-protein interactions provide functional associations distinct from gene expression similarity. Four experiments across two biological domains map the operating envelope. On gene perturbation data (Replogle K562 CRISPRi, 2,285 genes), CAL trained on STRING protein interactions achieves cross-boundary AUC of 0.908 where expression similarity scores 0.518. A second gene dataset (DepMap, 17,725 genes) confirms the result after negative sampling correction, reaching cross-boundary AUC of 0.947. Two drug sensitivity experiments produce informative negatives that sharpen boundary conditions. Three cross-domain findings emerge: (1) inductive transfer succeeds in biology--a node-disjoint split with unseen genes yields AUC 0.826 (Delta +0.127)--where it fails in text (+/-0.10), suggesting physically grounded associations are more transferable than contingent co-occurrences; (2) CAL scores anti-correlate with interaction degree (Spearman r = -0.590), with gains concentrating on understudied genes with focused interaction profiles; (3) tighter association quality outperforms larger but noisier training sets, reversing the text pattern. Results are stable across training seeds (SD < 0.001) and cross-boundary threshold choices.
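The abstract compares two ways of scoring gene pairs (a learned association score and expression similarity) by AUC. As a minimal sketch of that kind of comparison, the Python below computes a plain ROC AUC over scored pairs. It is an illustration under assumptions, not the paper's evaluation code: the abstract does not define "cross-boundary" AUC, so none is implemented here, and the function name `roc_auc` and all toy scores and labels are made up for the example.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC AUC as the probability that a random positive pair
    outscores a random negative pair (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative pair")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
# label 1 = the gene pair is an annotated association, 0 = it is not.
labels = [1, 1, 1, 0, 0, 0]
learned_scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]     # hypothetical learned association scores
similarity_scores = [0.3, 0.6, 0.2, 0.5, 0.4, 0.1]  # hypothetical expression-similarity scores

auc_learned = roc_auc(learned_scores, labels)
auc_similarity = roc_auc(similarity_scores, labels)
print(f"learned: {auc_learned:.3f}, similarity: {auc_similarity:.3f}")
```

The pairwise form is quadratic in the number of pairs, which is fine for a toy example; for datasets of the size the abstract mentions, a rank-based implementation would be the usual choice.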
