Fugu-MT Paper Translation (Abstract): Distilled Protein Backbone Generation

Paper summary: Distilled Protein Backbone Generation

arxiv url: http://arxiv.org/abs/2510.03095v1
Date: Fri, 03 Oct 2025 15:25:08 GMT
Status: Translation complete
Last updated in system: 2025-10-06 16:35:52.449787
Title: Distilled Protein Backbone Generation
Authors: Liyang Xie, Haoran Zhang, Zhendong Wang, Wesley Tansey, Mingyuan Zhou
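Abstract summary: Diffusion- and flow-based generative models offer unprecedented capabilities for de novo protein design. These models are limited by their generation speed, often requiring hundreds of iterative steps in the reverse-diffusion process. This work shows how to use Score identity Distillation (SiD) to train few-step protein backbone generators. The distilled few-step generators improve sampling speed by more than 20-fold while achieving designability, diversity, and novelty comparable to the Proteina teacher model.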
Reference score (independently computed attention score): 59.63474232035653
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Diffusion- and flow-based generative models have recently demonstrated strong performance in protein backbone generation tasks, offering unprecedented capabilities for de novo protein design. However, while achieving notable performance in generation quality, these models are limited by their generation speed, often requiring hundreds of iterative steps in the reverse-diffusion process. This computational bottleneck limits their practical utility in large-scale protein discovery, where thousands to millions of candidate structures are needed. To address this challenge, we explore score distillation techniques, which have shown great success in reducing the number of sampling steps in the vision domain while maintaining high generation quality. However, a straightforward adaptation of these methods results in unacceptably low designability. Through extensive study, we have identified how to appropriately adapt Score identity Distillation (SiD), a state-of-the-art score distillation strategy, to train few-step protein backbone generators that significantly reduce sampling time while maintaining performance comparable to their pretrained teacher model. In particular, multistep generation combined with inference-time noise modulation is key to this success. We demonstrate that our distilled few-step generators achieve more than a 20-fold improvement in sampling speed, while achieving similar levels of designability, diversity, and novelty as the Proteina teacher model. This reduction in inference cost enables large-scale in silico protein design, thereby bringing diffusion-based models closer to real-world protein engineering applications.

List of related papers