Fugu-MT Paper Translation (Abstract): Bolek: A Multimodal Language Model for Molecular Reasoning

Paper overview: Bolek: A Multimodal Language Model for Molecular Reasoning

arxiv url: http://arxiv.org/abs/2605.02745v1
Date: Mon, 04 May 2026 15:46:39 GMT
Status: Translation complete
Last updated in system: 2026-05-05 20:33:50.387966
Title: Bolek: A Multimodal Language Model for Molecular Reasoning
Title (reference translation): Bolek: A Multimodal Language Model for Molecular Reasoning
Authors: Frederic Grabowski, Jacek Szczerbiński, Maciej Jaśkowski, Kalina Jasińska-Kobus, Paweł Dąbrowski-Tumański, Tomasz Jetka, Bartosz Topolski
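Abstract summary: This paper introduces Bolek, a compact multimodal language model that grounds natural-language reasoning in molecular structure. Bolek is fine-tuned on molecular alignment tasks such as molecule description, RDKit descriptor prediction, and substructure detection. It also outperforms TxGemma-9B-Chat on 13 of 15 binary classification tasks despite being less than half its size.
Reference score (attention score computed independently by this site): 0.39089069256361736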
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Molecular property models increasingly support high-stakes drug-discovery decisions, but their outputs are often difficult to audit: classical predictors return scores without rationale, while language models can produce fluent explanations weakly grounded in the input molecule. We introduce Bolek, a compact multimodal language model that grounds natural-language reasoning in molecular structure by injecting a Morgan fingerprint embedding into an instruction-tuned text decoder. Bolek is fine-tuned on molecular alignment tasks, including molecule description, RDKit descriptor prediction, and substructure detection, and on downstream reasoning over 15 TDC binary classification tasks using synthetic chains-of-thought anchored in concrete molecular features. Across these tasks, Bolek outperforms its Qwen3-4B-Instruct base on all endpoints in yes/no mode and on 13 of 15 in chain-of-thought mode, raising mean ROC/PR AUC from 0.55 to 0.76. It also outperforms TxGemma-9B-Chat on 13 of 15 binary classification tasks despite being less than half its size. Bolek's explanations are more grounded than those of the baseline LLMs: it cites numerical descriptors 10-100x more often per chain-of-thought, and the cited values agree strongly with RDKit for key descriptors such as TPSA, MolLogP, and MolWt (Spearman rho = 0.87-0.91). Generalisation extends beyond the training panel: on 15 unseen TDC classification endpoints, Bolek matches TxGemma on five, and it produces non-trivial rank correlations on three held-out regression endpoints despite never seeing downstream regression during training. These results suggest that targeted modality injection and reasoning supervision tied to verifiable molecular features can yield compact, auditable molecular reasoning models.
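Abstract (reference translation): Molecular property models increasingly support drug-discovery decisions, but their outputs are often hard to audit: classical predictors return scores without a rationale, while language models can produce fluent explanations that are only weakly grounded in the input molecule. We introduce Bolek, a compact multimodal language model that grounds natural-language reasoning in molecular structure by injecting a Morgan fingerprint embedding into an instruction-tuned text decoder. Bolek is fine-tuned on molecular alignment tasks, including molecule description, RDKit descriptor prediction, and substructure detection, and on downstream reasoning over 15 TDC binary classification tasks using synthetic chains-of-thought anchored in concrete molecular features. Across these tasks, Bolek outperforms its Qwen3-4B-Instruct base on all endpoints in yes/no mode and on 13 of 15 in chain-of-thought mode, raising mean ROC/PR AUC from 0.55 to 0.76. It also outperforms TxGemma-9B-Chat on 13 of 15 binary classification tasks despite being less than half its size. Bolek's explanations are more grounded than those of the baseline LLMs: it cites numerical descriptors 10-100x more often per chain-of-thought, and the cited values agree strongly with RDKit for key descriptors such as TPSA, MolLogP, and MolWt (Spearman rho = 0.87-0.91). On 15 unseen TDC classification endpoints, Bolek matches TxGemma on five, and it produces non-trivial rank correlations on three held-out regression endpoints despite never seeing downstream regression during training. These results suggest that targeted modality injection and reasoning supervision tied to verifiable molecular features can yield compact, auditable molecular reasoning models.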
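The abstract reports that descriptor values cited in Bolek's chains-of-thought agree with RDKit at Spearman rho = 0.87-0.91. The sketch below shows one way such an agreement check could be computed. It is an illustration under stated assumptions, not the authors' evaluation code. The explanation strings, the reference values, and the helper names `cited_value`, `average_ranks`, and `spearman_rho` are all hypothetical, and the reference numbers are placeholders where a real check would use RDKit output.

```python
import re
from statistics import mean


def cited_value(text, name):
    """Return the first number cited for descriptor `name` in `text`, or None.

    Assumes simple phrasings such as "TPSA = 63.6" or "MolWt of 180.2".
    """
    pattern = rf"{re.escape(name)}\s*(?:=|:|is|of)?\s*(-?\d+(?:\.\d+)?)"
    match = re.search(pattern, text)
    return float(match.group(1)) if match else None


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical chain-of-thought snippets (invented for illustration).
explanations = [
    "The molecule has TPSA = 63.6 and MolWt of 180.2, so absorption is likely.",
    "TPSA = 20.2 suggests good membrane permeability.",
    "With TPSA = 95.0 the compound is fairly polar.",
    "TPSA = 40.5 is moderate.",
]
# Placeholder reference values standing in for RDKit-computed TPSA.
reference_tpsa = [60.0, 22.1, 110.3, 70.0]

cited_tpsa = [cited_value(text, "TPSA") for text in explanations]
rho = spearman_rho(cited_tpsa, reference_tpsa)
print(f"Spearman rho on toy data: {rho:.2f}")  # 0.80 for this toy data
```

Rank correlation is a natural fit here because it rewards a model for ordering molecules correctly by a descriptor even when the cited numbers are slightly off in absolute terms.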

Related papers