Fugu-MT Paper Translation (Abstract): Knowledge-Augmented Long-CoT Generation for Complex Biomolecular Reasoning

Paper overview: Knowledge-Augmented Long-CoT Generation for Complex Biomolecular Reasoning

arxiv url: http://arxiv.org/abs/2511.08024v1
Date: Wed, 12 Nov 2025 01:34:45 GMT
Status: Translation complete
Last updated in system: 2025-11-12 20:17:03.603904
Title: Knowledge-Augmented Long-CoT Generation for Complex Biomolecular Reasoning
Authors: Tianwen Lyu, Xiang Zhuang, Keyan Ding, Xinzhe Cao, Lei Liang, Wei Zhao, Qiang Zhang, Huajun Chen
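Abstract summary: Understanding biomolecular mechanisms requires multi-step reasoning across molecular interactions, signaling cascades, and metabolic pathways. Existing approaches often exacerbate the logical inconsistencies and weak domain grounding of LLMs: reasoning steps may deviate from biological facts or fail to capture long mechanistic dependencies. This paper proposes a Long-CoT reasoning framework that integrates LLMs with knowledge graph-based multi-hop reasoning chains.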
Reference score (independently computed attention score): 51.673503054645415
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Understanding complex biomolecular mechanisms requires multi-step reasoning across molecular interactions, signaling cascades, and metabolic pathways. While large language models (LLMs) show promise in such tasks, their application to biomolecular problems is hindered by logical inconsistencies and the lack of grounding in domain knowledge. Existing approaches often exacerbate these issues: reasoning steps may deviate from biological facts or fail to capture long mechanistic dependencies. To address these challenges, we propose a Knowledge-Augmented Long-CoT Reasoning framework that integrates LLMs with knowledge graph-based multi-hop reasoning chains. The framework constructs mechanistic chains via guided multi-hop traversal and pruning on the knowledge graph; these chains are then incorporated into supervised fine-tuning to improve factual grounding and further refined with reinforcement learning to enhance reasoning reliability and consistency. Furthermore, to overcome the shortcomings of existing benchmarks, which are often restricted in scale and scope and lack annotations for deep reasoning chains, we introduce PrimeKGQA, a comprehensive benchmark for biomolecular question answering. Experimental results on both PrimeKGQA and existing datasets show that although larger closed-source models still perform well on relatively simple tasks, our method has clear advantages as reasoning depth increases, achieving state-of-the-art performance on multi-hop tasks that demand traversal of structured biological knowledge. These findings highlight the effectiveness of combining structured knowledge with advanced reasoning strategies for reliable and interpretable biomolecular reasoning.
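The abstract describes building mechanistic chains by guided multi-hop traversal and pruning on a knowledge graph, but gives no procedure. The following is only a minimal sketch of that general idea, assuming a beam-style pruning step; the toy graph, the placeholder entity names, the `hint_score` heuristic, and the `beam_width` parameter are all hypothetical and are not taken from the paper or from PrimeKGQA.

```python
from typing import Callable

# Toy knowledge graph: head entity -> list of (relation, tail entity).
# Every entity and relation name here is a placeholder, not real biology.
Graph = dict[str, list[tuple[str, str]]]

TOY_KG: Graph = {
    "drug_A": [("targets", "protein_B"), ("targets", "protein_X")],
    "protein_B": [("participates_in", "pathway_C")],
    "protein_X": [("participates_in", "pathway_Y")],
    "pathway_C": [("associated_with", "disease_D")],
    "pathway_Y": [],
}


def hint_score(hints: set[str]) -> Callable[[list[str]], float]:
    """Hypothetical guidance signal: count path nodes that appear in `hints`."""
    def score(path: list[str]) -> float:
        # Paths alternate node, relation, node, ...; nodes sit at even indices.
        return float(sum(node in hints for node in path[::2]))
    return score


def guided_chains(
    graph: Graph,
    start: str,
    target: str,
    max_hops: int,
    beam_width: int,
    score: Callable[[list[str]], float],
) -> list[list[str]]:
    """Collect chains from start to target using at most max_hops edges.

    After each hop, only the beam_width best-scoring partial paths are kept
    (the assumed pruning step); the rest are discarded.
    """
    frontier: list[list[str]] = [[start]]
    complete: list[list[str]] = []
    for _ in range(max_hops):
        candidates: list[list[str]] = []
        for path in frontier:
            visited = set(path[::2])
            for relation, tail in graph.get(path[-1], []):
                if tail in visited:
                    continue  # skip cycles
                extended = path + [relation, tail]
                if tail == target:
                    complete.append(extended)
                else:
                    candidates.append(extended)
        # Prune: keep only the highest-scoring partial paths for the next hop.
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam_width]
    return complete


chains = guided_chains(
    TOY_KG, "drug_A", "disease_D",
    max_hops=3, beam_width=1,
    score=hint_score({"protein_B", "pathway_C"}),
)
```

In this sketch, a narrow beam drops the `protein_X` branch after the first hop, so only the chain through `protein_B` and `pathway_C` reaches `disease_D`. How the actual framework guides and prunes traversal is not stated in the abstract.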


Related papers