Fugu-MT paper translation (abstract): Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design

Paper overview: Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design

arxiv url: http://arxiv.org/abs/2410.13643v1
Date: Thu, 17 Oct 2024 15:10:13 GMT
Status: translation complete
Last updated in system: 2024-11-28 17:07:36.982751
Title: Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design
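Title (reference translation): Fine-Tuning Discrete Diffusion Models via Reward Optimization, with Applications to DNA and Protein Design
Authors: Chenyu Wang, Masatoshi Uehara, Yichun He, Amy Wang, Tommaso Biancalani, Avantika Lal, Tommi Jaakkola, Sergey Levine, Hanchen Wang, Aviv Regev
Abstract summary: We propose an algorithm that enables direct backpropagation of rewards through entire trajectories generated by diffusion models. DRAKES can generate sequences that are both natural-like and yield high rewards.
Reference score (independently computed attention measure): 56.957070405026194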
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent studies have demonstrated the strong empirical performance of diffusion models on discrete sequences across domains from natural language to biological sequence generation. For example, in the protein inverse folding task, conditional diffusion models have achieved impressive results in generating natural-like sequences that fold back into the original structure. However, practical design tasks often require not only modeling a conditional distribution but also optimizing specific task objectives. For instance, we may prefer protein sequences with high stability. To address this, we consider the scenario where we have pre-trained discrete diffusion models that can generate natural-like sequences, as well as reward models that map sequences to task objectives. We then formulate the reward maximization problem within discrete diffusion models, analogous to reinforcement learning (RL), while minimizing the KL divergence against pretrained diffusion models to preserve naturalness. To solve this RL problem, we propose a novel algorithm, DRAKES, that enables direct backpropagation of rewards through entire trajectories generated by diffusion models, by making the originally non-differentiable trajectories differentiable using the Gumbel-Softmax trick. Our theoretical analysis indicates that our approach can generate sequences that are both natural-like and yield high rewards. While similar tasks have been recently explored in diffusion models for continuous domains, our work addresses unique algorithmic and theoretical challenges specific to discrete diffusion models, which arise from their foundation in continuous-time Markov chains rather than Brownian motion. Finally, we demonstrate the effectiveness of DRAKES in generating DNA and protein sequences that optimize enhancer activity and protein stability, respectively, important tasks for gene therapies and protein-based therapeutics.
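Abstract (reference translation): Recent studies have demonstrated the strong empirical performance of diffusion models on discrete sequences in domains ranging from natural language to biological sequence generation. For example, in the protein inverse folding task, conditional diffusion models have achieved impressive results in generating natural-like sequences that fold back into the original structure. However, practical design tasks often require not only modeling a conditional distribution but also optimizing specific task objectives. For example, one may prefer protein sequences with high stability. We therefore consider the scenario in which both a discrete diffusion model that generates natural-like sequences and a reward model that maps sequences to task objectives have been trained in advance. We then formulate the reward maximization problem for discrete diffusion models, in a manner analogous to reinforcement learning (RL), while minimizing the KL divergence from the pretrained diffusion model to preserve naturalness. To solve this RL problem, we propose a new algorithm, DRAKES, which uses the Gumbel-Softmax trick to make the originally non-differentiable trajectories differentiable and thereby enables direct backpropagation of rewards through entire trajectories generated by the diffusion model. Our theoretical analysis indicates that the method can generate sequences that are both natural-like and yield high rewards. Similar tasks have recently been studied for diffusion models in continuous domains, but our work addresses algorithmic and theoretical challenges specific to discrete diffusion models, which arise from their foundation in continuous-time Markov chains rather than Brownian motion. Finally, we demonstrate the effectiveness of DRAKES in generating DNA and protein sequences that optimize enhancer activity and protein stability, respectively, which are important tasks for gene therapies and protein-based therapeutics.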
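
The abstract names the Gumbel-Softmax trick as the step that makes discrete trajectories differentiable. The following is a minimal sketch of that relaxation alone, not the DRAKES implementation: the function names, the toy logits over a four-letter DNA alphabet, the temperature value, and the linear toy reward are all assumptions made for illustration. It is written in plain Python, so no gradients are computed here; in practice the same computation would be expressed in an automatic-differentiation framework so that a reward could be backpropagated through the relaxed samples.

```python
import math
import random


def gumbel_softmax(logits, temperature, rng):
    """Draw one relaxed (soft) one-hot sample from categorical logits."""
    perturbed = []
    for logit in logits:
        # Clamp the uniform draw so both logarithms below stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))  # Gumbel(0, 1) sample
        perturbed.append((logit + gumbel_noise) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(perturbed)
    exps = [math.exp(value - peak) for value in perturbed]
    total = sum(exps)
    return [value / total for value in exps]


def soft_reward(relaxed_tokens, per_token_reward):
    """Toy reward (hypothetical): linear in the relaxed token probabilities."""
    return sum(
        prob * reward
        for probs in relaxed_tokens
        for prob, reward in zip(probs, per_token_reward)
    )


rng = random.Random(0)
# Hypothetical logits over (A, C, G, T) at three sequence positions.
logits = [
    [2.0, 0.1, 0.1, 0.1],
    [0.1, 0.1, 2.0, 0.1],
    [0.5, 0.5, 0.5, 0.5],
]
relaxed = [gumbel_softmax(row, temperature=0.5, rng=rng) for row in logits]
reward = soft_reward(relaxed, per_token_reward=[1.0, 0.0, 0.0, 0.0])
print(relaxed)
print(reward)
```

Each relaxed sample is a point on the probability simplex rather than a hard token. Lowering the temperature pushes samples toward one-hot vectors, while raising it makes them smoother. The toy reward is defined on these soft samples, which is what lets a reward signal depend continuously on the logits.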

Related papers