Fugu-MT paper translation (abstract): Retrosynthesis Planning via Worst-path Policy Optimisation in Tree-structured MDPs

Paper summary: Retrosynthesis Planning via Worst-path Policy Optimisation in Tree-structured MDPs

arxiv url: http://arxiv.org/abs/2509.10504v1
Date: Mon, 01 Sep 2025 21:44:14 GMT
Status: translation complete
Last updated in system: 2025-09-21 06:05:45.760424
Title: Retrosynthesis Planning via Worst-path Policy Optimisation in Tree-structured MDPs
Authors: Mianchu Wang, Giovanni Montana
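Abstract summary: Retrosynthesis planning aims to decompose target molecules into available building blocks, forming a tree in which each internal node represents an intermediate compound. Existing methods often optimise for average performance across branches and fail to account for worst-case sensitivity. The paper introduces Interactive Retrosynthesis Planning (InterRetro), which interacts with the tree MDP and learns a value function for worst-path outcomes. InterRetro achieves state-of-the-art results, solving 100% of targets on the Retro*-190 benchmark, shortening synthetic routes by 4.9%, and achieving promising performance using only 10% of the training data.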
Reference score (independently computed attention metric): 8.914988066868927
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Retrosynthesis planning aims to decompose target molecules into available building blocks, forming a synthesis tree where each internal node represents an intermediate compound and each leaf ideally corresponds to a purchasable reactant. However, this tree becomes invalid if any leaf node is not a valid building block, making the planning process vulnerable to the "weakest link" in the synthetic route. Existing methods often optimise for average performance across branches, failing to account for this worst-case sensitivity. In this paper, we reframe retrosynthesis as a worst-path optimisation problem within tree-structured Markov Decision Processes (MDPs). We prove that this formulation admits a unique optimal solution and offers monotonic improvement guarantees. Building on this insight, we introduce Interactive Retrosynthesis Planning (InterRetro), a method that interacts with the tree MDP, learns a value function for worst-path outcomes, and improves its policy through self-imitation, preferentially reinforcing past decisions with high estimated advantage. Empirically, InterRetro achieves state-of-the-art results, solving 100% of targets on the Retro*-190 benchmark, shortening synthetic routes by 4.9%, and achieving promising performance using only 10% of the training data - representing a significant advance in computational retrosynthesis planning.
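The contrast the abstract draws between averaging across branches and scoring a route by its weakest branch can be illustrated with a small sketch. This is not the paper's implementation: the `Node` class, the leaf reward of 1.0 for a purchasable molecule and 0.0 otherwise, the per-step discount `gamma`, and the helper `example_tree` are all assumptions made here for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A molecule in a synthesis tree (hypothetical structure)."""
    name: str
    purchasable: bool = False  # only meaningful for leaves
    children: List["Node"] = field(default_factory=list)


def leaf_score(node: Node) -> float:
    # Assumed reward: 1.0 for a purchasable building block, else 0.0.
    return 1.0 if node.purchasable else 0.0


def worst_path_value(node: Node, gamma: float = 0.9) -> float:
    """Score a tree by its weakest branch (min over children)."""
    if not node.children:
        return leaf_score(node)
    return gamma * min(worst_path_value(c, gamma) for c in node.children)


def mean_value(node: Node, gamma: float = 0.9) -> float:
    """Score a tree by the average over its branches, for comparison."""
    if not node.children:
        return leaf_score(node)
    values = [mean_value(c, gamma) for c in node.children]
    return gamma * sum(values) / len(values)


def example_tree(d_purchasable: bool) -> Node:
    # target -> {A, B}, B -> {C, D}; only leaf D varies.
    c = Node("C", purchasable=True)
    d = Node("D", purchasable=d_purchasable)
    b = Node("B", children=[c, d])
    a = Node("A", purchasable=True)
    return Node("target", children=[a, b])


if __name__ == "__main__":
    broken = example_tree(d_purchasable=False)
    valid = example_tree(d_purchasable=True)
    print("broken route:", worst_path_value(broken), mean_value(broken))
    print("valid route: ", worst_path_value(valid), mean_value(valid))
```

With one non-purchasable leaf, the averaged score stays positive even though the route cannot be carried out, while the worst-path score drops to zero. That is the "weakest link" sensitivity the abstract describes; how InterRetro actually defines rewards and learns its value function is not specified in this summary.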

Related papers