Fugu-MT Paper Translation (Abstract): Conditional generation of antibody sequences with classifier-guided germline-absorbing discrete diffusion

Paper overview: Conditional generation of antibody sequences with classifier-guided germline-absorbing discrete diffusion

arxiv url: http://arxiv.org/abs/2605.06720v1
Date: Thu, 07 May 2026 06:57:42 GMT
Status: Translation complete
System update date: 2026-05-11 19:43:38.486894
Title: Conditional generation of antibody sequences with classifier-guided germline-absorbing discrete diffusion
Title (reference translation): Conditional generation of antibody sequences with classifier-guided germline-absorbing discrete diffusion
Authors: Justin Sanders, Luca Giancardo, Lan Guo, Yue Zhao, Kemal Sonmez, Nina Cheng, Melih Yilmaz
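Abstract summary: This paper introduces germline absorbing diffusion, a novel modification of the discrete diffusion noise process. Germline diffusion improves non-germline residue prediction accuracy from 26% to 46%. The authors then demonstrate the utility of the germline diffusion model on two conditional generation tasks: sampling antibodies with improved hydrophobicity and with improved predicted binding affinity.
Reference score (independently computed attention score): 4.068045326287419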
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Antibody therapeutics are among the most successful modern medicines, yet computationally designing antibodies with desirable binding and developability properties remains challenging. While protein language models (pLMs) have emerged as powerful tools for antibody sequence design, existing approaches largely suffer from two key limitations: they predominantly memorize germline sequences rather than modeling biologically meaningful somatic variation, and they offer limited support for flexible classifier-guided conditional generation. We address these challenges through two primary contributions. First, we demonstrate that discrete diffusion fine-tuning achieves strong language modeling performance on antibody sequences while allowing for generation conditioned on any off-the-shelf classifier. Second, we introduce germline absorbing diffusion, a novel modification of the discrete diffusion noise process in which the germline sequence - rather than a masked sequence - serves as the absorbing state. This biologically motivated inductive bias restricts the model to learning the trajectory from germline to observed sequence, effectively excluding genetic variation and V(D)J recombination statistics from the learned distribution and dramatically mitigating germline bias. We show that germline diffusion improves non-germline residue prediction accuracy from 26 percent to 46 percent, approaching the theoretical upper bound set by true biological variability. We then demonstrate the utility of our germline diffusion model on the conditional generation tasks of sampling antibodies with improved hydrophobicity and predicted binding affinity. On both tasks our model shows an improved tradeoff between class adherence and sample quality, significantly outperforming EvoProtGrad, a popular strategy to sample from pLMs with gradient-based discrete Markov Chain Monte Carlo.
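Abstract (reference translation): Antibody therapeutics are among the most successful modern medicines, but computationally designing antibodies with desirable binding and developability properties remains difficult. Protein language models (pLMs) have emerged as powerful tools for antibody sequence design, yet existing approaches largely share two key limitations: they mainly memorize germline sequences rather than modeling biologically meaningful somatic variation, and they offer only limited support for flexible classifier-guided conditional generation. The authors address these challenges through two main contributions. First, they show that discrete diffusion fine-tuning achieves strong language modeling performance on antibody sequences while allowing generation conditioned on any off-the-shelf classifier. Second, they introduce germline absorbing diffusion, a novel modification of the discrete diffusion noise process in which the germline sequence, rather than a masked sequence, serves as the absorbing state. This biologically motivated inductive bias restricts the model to learning the trajectory from the germline to the observed sequence, which effectively excludes genetic variation and V(D)J recombination statistics from the learned distribution and dramatically mitigates germline bias. Germline diffusion improves non-germline residue prediction accuracy from 26% to 46%, approaching the theoretical upper bound set by true biological variability. The authors then demonstrate the utility of the germline diffusion model on the conditional generation tasks of sampling antibodies with improved hydrophobicity and predicted binding affinity. On both tasks the model shows an improved tradeoff between class adherence and sample quality, significantly outperforming EvoProtGrad, a popular strategy for sampling from pLMs with gradient-based discrete Markov Chain Monte Carlo.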
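
The germline absorbing noise process described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes that the observed sequence and its germline are already aligned to equal length, that each position is absorbed into its germline residue independently with probability `t`, and that "non-germline residue prediction accuracy" means accuracy restricted to positions where the observed sequence differs from the germline. The function names, the noise level `t`, and both definitions are hypothetical choices made for illustration; the abstract does not specify the noise schedule or the exact metric.

```python
import random


def germline_absorbing_noise(observed, germline, t, rng=None):
    """Corrupt `observed` toward `germline` at noise level t in [0, 1].

    Assumption (not from the paper): each position is replaced by its
    germline residue independently with probability t. At t = 0 the
    sequence is unchanged; at t = 1 it is fully absorbed into the germline.
    """
    if len(observed) != len(germline):
        raise ValueError("sequences must be aligned to the same length")
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    rng = rng or random.Random()
    return "".join(
        g if rng.random() < t else o for o, g in zip(observed, germline)
    )


def non_germline_accuracy(predicted, observed, germline):
    """Accuracy over positions where `observed` differs from `germline`.

    Assumed definition for illustration. Returns None when the observed
    sequence has no non-germline positions.
    """
    if not (len(predicted) == len(observed) == len(germline)):
        raise ValueError("sequences must be aligned to the same length")
    positions = [
        i for i, (o, g) in enumerate(zip(observed, germline)) if o != g
    ]
    if not positions:
        return None
    correct = sum(predicted[i] == observed[i] for i in positions)
    return correct / len(positions)
```

Under these assumptions, a model that simply copies the germline scores 0 on `non_germline_accuracy`, which is why a metric of this kind separates memorizing germline sequences from modeling somatic variation. With a conventional masked absorbing state the fully noised sequence carries no information, whereas here the fully noised sequence is the germline itself, so the reverse process only has to learn the path from germline to observed sequence.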


Related papers