Fugu-MT Paper Translation (Summary): Toward Better Geometric Representations for Molecule Generative Models

Paper summary: Toward Better Geometric Representations for Molecule Generative Models

arxiv url: http://arxiv.org/abs/2605.07693v1
Date: Fri, 08 May 2026 13:02:58 GMT
Status: Translation complete
Last updated in system: 2026-05-11 19:43:39.056133
Title: Toward Better Geometric Representations for Molecule Generative Models
Title (reference translation): Toward Improved Geometric Representations for Molecule Generative Models
Authors: Shaoheng Yan, Zian Li, Cai Zhou, Qiaojing Huang, Kai Liu, Muhan Zhang
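Abstract summary: LENSEs is a framework that better exploits the potential of molecule representations in representation-conditioned generation methods. The effectiveness of its improvements is demonstrated on molecule generation tasks.
Reference score (independently computed attention score): 34.04020604759628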
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Geometric representation-conditioned molecule generation provides an effective paradigm that decouples molecule representation modeling from structure generation. By decoupling molecule generation into two stages (first generating a meaningful molecule representation, then generating a 3D molecule conditioned on this representation), the efficiency and quality of the generation process can be significantly enhanced. However, its effectiveness is fundamentally limited by the quality of the representation space: pretrained molecular encoders, such as UniMol, produce representations that are non-smooth and not fully exploited during the generative training process. In this work, we propose LENSEs, a framework that better exploits the potential of molecule representations in representation-conditioned generation methods. In particular, LENSEs introduces three complementary mechanisms: (1) a representation head, trained simultaneously with the generative task, that extracts multi-level representations from the pretrained encoder; (2) a molecule perceptual loss that optimizes the generator in a semantically informative representation space; and (3) a node-level representation alignment (REPA) loss that explicitly aligns the generator's hidden states with encoder representations, reducing the semantic gap between pretraining and generation. We demonstrate the effectiveness of these improvements through extensive molecule generation tasks. Specifically, on the challenging molecule generation dataset GEOM-DRUG, LENSEs achieves 97.28% validity and 98.51% molecule stability, surpassing existing advanced methods. Further analyses through Lipschitz constant reduction (4.6x) and QM9 probing tasks also show that the refined representations are smoother and more informative, establishing generative training with alignment objectives as a potential pretraining paradigm for molecular encoders.
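Abstract (reference translation): Geometric representation-conditioned molecule generation provides an effective paradigm that separates molecular representation modeling from structure generation. By splitting molecule generation into two stages, first generating a meaningful molecular representation and then generating a 3D molecule conditioned on that representation, the efficiency and quality of the generation process can be improved significantly. However, its effectiveness is fundamentally limited by the quality of the representation space: pretrained molecular encoders such as UniMol produce representations that are non-smooth and not fully exploited during generative training. In this work, we propose LENSEs, a framework that better exploits the potential of molecular representations in representation-conditioned generation methods. In particular, LENSEs introduces three complementary mechanisms: (1) a representation head that extracts multi-level representations from the pretrained encoder; (2) a molecule perceptual loss that optimizes the generator in a semantically informative representation space; and (3) a node-level representation alignment (REPA) loss that explicitly aligns the generator's hidden states with the encoder representations, narrowing the semantic gap between pretraining and generation. We demonstrate the effectiveness of these improvements on molecule generation tasks. Specifically, on the challenging molecule generation dataset GEOM-DRUG, LENSEs achieves 97.28% validity and 98.51% molecule stability, surpassing existing advanced methods. Further analyses through Lipschitz constant reduction (4.6x) and QM9 probing tasks show smoother and more informative representations, establishing generative training with alignment objectives as a potential pretraining paradigm for molecular encoders.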
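
Illustrative sketch (not the authors' implementation): the abstract names a node-level alignment loss between generator hidden states and encoder representations, and a Lipschitz-constant analysis of representation smoothness, but gives no formulas. The snippet below shows one plausible reading of each idea under stated assumptions. The alignment loss is assumed to be the mean of (1 - cosine similarity) over atoms, with both sides already projected to the same dimension. The smoothness measure is assumed to be the largest output-to-input distance ratio over sampled pairs. All function names here are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for this sketch: a zero vector has no direction
    return dot / (norm_u * norm_v)


def node_alignment_loss(hidden_states, encoder_reps):
    """Mean (1 - cosine similarity) over nodes (atoms).

    ASSUMPTION: hidden_states[i] and encoder_reps[i] describe the same atom
    and share one dimension; the paper's actual REPA loss may differ.
    """
    if not hidden_states or len(hidden_states) != len(encoder_reps):
        raise ValueError("need the same non-zero number of nodes on both sides")
    losses = [1.0 - cosine_similarity(h, r)
              for h, r in zip(hidden_states, encoder_reps)]
    return sum(losses) / len(losses)


def empirical_lipschitz(inputs, outputs):
    """Largest ratio ||f(x_i) - f(x_j)|| / ||x_i - x_j|| over all sample pairs.

    ASSUMPTION: this is a lower-bound estimate from finite samples, used here
    only to illustrate what a smoother representation map means.
    """
    best = 0.0
    for i in range(len(inputs)):
        for j in range(i + 1, len(inputs)):
            dx = euclidean(inputs[i], inputs[j])
            if dx == 0.0:
                continue  # identical inputs give no information about the slope
            best = max(best, euclidean(outputs[i], outputs[j]) / dx)
    return best
```

Under these assumptions, an alignment loss of 0 means every hidden state points in the same direction as its encoder representation, and a smaller empirical Lipschitz ratio means nearby inputs map to nearby representations.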

Related papers