Fugu-MT Paper Translation (Abstract): Active Flow Expansion for Out-of-Distribution Discovery: from Theory to Molecules

Paper overview: Active Flow Expansion for Out-of-Distribution Discovery: from Theory to Molecules

arxiv url: http://arxiv.org/abs/2606.08802v1
Date: Sun, 07 Jun 2026 19:43:22 GMT
Status: Translation complete
System update date: 2026-06-09 14:42:06.458204
Title: Active Flow Expansion for Out-of-Distribution Discovery: from Theory to Molecules
Title (reference translation): Active Flow Expansion for Out-of-Distribution Discovery: From Theory to Molecules
Authors: Riccardo De Santi, Bruce Lee, Cristian Perez Jensen, Kimon Protopapas, Sophia Tang, Cheng-Hao Liu, Pranam Chatterjee, Yisong Yue, Andreas Krause
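Abstract summary: This paper proposes a new learning principle for out-of-distribution flow modeling: enlarging a model's generable set to increase coverage of the valid design space. It introduces Active Flow Expansion (ActFlow), a continued pre-training method that uses verifier feedback to expand a pre-trained model over new valid regions. Results show that ActFlow expands valid coverage far beyond the region modeled by the initial pre-trained model, significantly outperforming widely adopted synthetic flow pre-training methods.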
Reference score (independently computed attention score): 62.81692030818376
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Standard flow and diffusion pre-training matches the distribution of available data (e.g., molecules), which often covers only a small fraction of the valid design space. In generative discovery, however, one aims to sample valid new-to-nature designs, which are assigned negligible probability under, and are thus inaccessible to, standard models fitted to the observed data. To overcome this limitation, we depart from data distribution matching and view a generative model through its generable set: the region it covers with non-negligible probability. This allows us to introduce a new learning principle for out-of-distribution flow modeling: enlarging a model's generable set to increase coverage of the valid design space. We propose Active Flow Expansion (ActFlow), a continued pre-training method that employs verifier feedback to expand a pre-trained model over new valid regions by iteratively adapting to synthetic data generated through active exploration in the learned flow representation. Theoretically, we establish, to our knowledge, first-of-their-kind statistical learning guarantees for out-of-distribution flow modeling, analyzing generable set expansion as a local-to-global reachability process over a learned representation. Empirically, we assess ActFlow with suitable out-of-distribution generative modeling metrics across small organic molecules, mid-sized drug-like molecules, therapeutic peptides, and protein sequence design tasks. Results show that ActFlow expands valid coverage far beyond the region modeled by the initial pre-trained model, significantly outperforming widely adopted synthetic flow pre-training methods.
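Abstract (reference translation): Standard flow and diffusion pre-training matches the distribution of the available data (e.g., molecules). In generative discovery, however, the aim is to sample valid new-to-nature designs, which are assigned negligible probability under standard models fitted to the observed data. To overcome this limitation, we depart from data distribution matching and view a generative model through its generable set. This makes it possible to introduce a new learning principle for out-of-distribution flow modeling. We propose Active Flow Expansion (ActFlow), a continued pre-training method that uses verifier feedback to expand a pre-trained model over new valid regions by iteratively adapting to synthetic data generated through active exploration in the learned flow representation. Theoretically, we analyze generable set expansion as a local-to-global reachability process over a learned representation and establish first-of-their-kind statistical learning guarantees for out-of-distribution flow modeling. Empirically, we assess ActFlow with suitable out-of-distribution generative modeling metrics across small organic molecules, mid-sized drug-like molecules, therapeutic peptides, and protein sequence design tasks. Results show that ActFlow expands valid coverage far beyond the region modeled by the initial pre-trained model, significantly outperforming widely adopted synthetic flow pre-training methods.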
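
The generable-set idea in the abstract can be illustrated with a toy loop. This is only a rough sketch under loud assumptions, not the authors' ActFlow method: there is no flow model, exploration is plain Gaussian noise in a 1-D data space rather than in a learned flow representation, "refitting" just means growing a sample pool, and every name here (`verifier`, `coverage`, `expand`) is hypothetical.

```python
import random


def verifier(x):
    """Hypothetical validity oracle: designs in [0, 10] count as valid."""
    return 0.0 <= x <= 10.0


def coverage(pool, n_bins=20, lo=0.0, hi=10.0):
    """Fraction of equal-width bins over the valid interval hit by a pool point."""
    width = (hi - lo) / n_bins
    hit = {min(int((x - lo) / width), n_bins - 1) for x in pool if lo <= x <= hi}
    return len(hit) / n_bins


def expand(pool, rounds=30, samples_per_round=50, step=0.5, seed=0):
    """Toy expansion loop: propose near the pool, verify, keep valid proposals."""
    rng = random.Random(seed)
    pool = list(pool)
    history = [coverage(pool)]
    for _ in range(rounds):
        # Stand-in for sampling and exploring: perturb a randomly chosen pool point.
        proposals = [
            rng.choice(pool) + rng.gauss(0.0, step)
            for _ in range(samples_per_round)
        ]
        # Verifier feedback: only accepted proposals become new synthetic data.
        pool.extend(x for x in proposals if verifier(x))
        history.append(coverage(pool))
    return pool, history


# Assumed setup: the observed data covers only [0, 1] of the valid interval [0, 10].
initial = [0.1 * i for i in range(11)]
pool, history = expand(initial)
print(f"coverage before: {history[0]:.2f}, after: {history[-1]:.2f}")
```

Because the pool only grows and every added point passes the verifier, coverage of the valid interval can never decrease from one round to the next. Each round can only reach points near the current pool, which is a loose analogue of the local-to-global expansion the abstract describes.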

Related papers