Fugu-MT Paper Translation (Abstract): MedKGent: A Large Language Model Agent Framework for Constructing Temporally Evolving Medical Knowledge Graph

Paper overview: MedKGent: A Large Language Model Agent Framework for Constructing Temporally Evolving Medical Knowledge Graph

arxiv url: http://arxiv.org/abs/2508.12393v2
Date: Tue, 19 Aug 2025 05:18:31 GMT
Status: Translation complete
Last updated in system: 2025-08-20 13:30:22.886147
Title: MedKGent: A Large Language Model Agent Framework for Constructing Temporally Evolving Medical Knowledge Graph
Title (reference translation): MedKGent: A Large Language Model Agent Framework for Constructing Temporally Evolving Medical Knowledge Graph
Authors: Duzhen Zhang, Zixiao Wang, Zhong-Zhi Li, Yahan Yu, Shuncheng Jia, Jiahua Dong, Haotian Xu, Xing Wu, Yingying Zhang, Tielin Zhang, Jie Yang, Xiuying Chen, Le Song
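Abstract summary: We introduce MedKGent, a framework for constructing temporally evolving medical knowledge graphs. It simulates the emergence of biomedical knowledge through a fine-grained daily time series. The resulting KG contains 156,275 entities and 2,971,384 relational triples.
Reference score (independently computed attention score): 57.54231831309079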
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The rapid expansion of medical literature presents growing challenges for structuring and integrating domain knowledge at scale. Knowledge Graphs (KGs) offer a promising solution by enabling efficient retrieval, automated reasoning, and knowledge discovery. However, current KG construction methods often rely on supervised pipelines with limited generalizability or naively aggregate outputs from Large Language Models (LLMs), treating biomedical corpora as static and ignoring the temporal dynamics and contextual uncertainty of evolving knowledge. To address these limitations, we introduce MedKGent, an LLM agent framework for constructing temporally evolving medical KGs. Leveraging over 10 million PubMed abstracts published between 1975 and 2023, we simulate the emergence of biomedical knowledge via a fine-grained daily time series. MedKGent incrementally builds the KG in a day-by-day manner using two specialized agents powered by the Qwen2.5-32B-Instruct model. The Extractor Agent identifies knowledge triples and assigns confidence scores via sampling-based estimation, which are used to filter low-confidence extractions and inform downstream processing. The Constructor Agent incrementally integrates the retained triples into a temporally evolving graph, guided by confidence scores and timestamps to reinforce recurring knowledge and resolve conflicts. The resulting KG contains 156,275 entities and 2,971,384 relational triples. Quality assessments by two SOTA LLMs and three domain experts demonstrate an accuracy approaching 90%, with strong inter-rater agreement. To evaluate downstream utility, we conduct RAG across seven medical question answering benchmarks using five leading LLMs, consistently observing significant improvements over non-augmented baselines. Case studies further demonstrate the KG's value in literature-based drug repurposing via confidence-aware causal inference.
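Abstract (reference translation): The rapid expansion of the medical literature poses growing challenges for structuring and integrating domain knowledge at scale. Knowledge graphs (KGs) offer a promising solution by enabling efficient retrieval, automated reasoning, and knowledge discovery. However, current KG construction methods often rely on supervised pipelines with limited generalizability, or naively aggregate the outputs of large language models (LLMs), treating biomedical corpora as static and ignoring the temporal dynamics and contextual uncertainty of evolving knowledge. To address these limitations, we introduce MedKGent, an LLM agent framework for constructing temporally evolving medical KGs. Drawing on more than 10 million PubMed abstracts published between 1975 and 2023, we simulate the emergence of biomedical knowledge through a fine-grained daily time series. MedKGent builds the KG incrementally, day by day, using two specialized agents powered by the Qwen2.5-32B-Instruct model. The Extractor Agent identifies knowledge triples and assigns confidence scores via sampling-based estimation; these scores are used to filter out low-confidence extractions and to inform downstream processing. The Constructor Agent incrementally integrates the retained triples into a temporally evolving graph, guided by confidence scores and timestamps, reinforcing recurring knowledge and resolving conflicts. The resulting KG contains 156,275 entities and 2,971,384 relational triples. Quality assessments by two SOTA LLMs and three domain experts show accuracy approaching 90%, with strong inter-rater agreement. To evaluate downstream utility, we run RAG on seven medical question answering benchmarks using five leading LLMs and consistently observe significant improvements over non-augmented baselines. Case studies further demonstrate the KG's value in literature-based drug repurposing.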
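
The abstract describes the two agents only at a high level, so the following is a minimal Python sketch of the general idea and not the authors' implementation. It assumes that "sampling-based estimation" can be approximated by the fraction of repeated extraction runs in which a triple appears, that recurring triples are reinforced with a noisy-OR update, and that a conflicting relation between the same entity pair replaces the stored one only when its confidence is at least as high. The names `estimate_confidence`, `integrate`, `Edge`, and the `threshold` value of 0.6 are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date

Triple = tuple[str, str, str]  # (head, relation, tail)


def estimate_confidence(samples: list[list[Triple]]) -> dict[Triple, float]:
    """Score each triple by the fraction of sampled runs that produced it.

    Assumption: each inner list is one independent LLM extraction run
    over the same abstract.
    """
    if not samples:
        return {}
    counts = Counter(t for run in samples for t in set(run))
    return {t: c / len(samples) for t, c in counts.items()}


@dataclass
class Edge:
    relation: str
    confidence: float
    first_seen: date
    last_seen: date


def integrate(
    graph: dict[tuple[str, str], Edge],
    triple: Triple,
    conf: float,
    day: date,
    threshold: float = 0.6,  # hypothetical cut-off, not from the paper
) -> None:
    """Merge one scored triple into the graph, processed in date order."""
    if conf < threshold:
        return  # filter low-confidence extractions
    head, rel, tail = triple
    key = (head, tail)
    edge = graph.get(key)
    if edge is None:
        graph[key] = Edge(rel, conf, day, day)
    elif edge.relation == rel:
        # Reinforce recurring knowledge (noisy-OR, an assumed rule).
        edge.confidence = 1 - (1 - edge.confidence) * (1 - conf)
        edge.last_seen = day
    elif conf >= edge.confidence:
        # Conflict: the newer relation wins only if at least as confident.
        graph[key] = Edge(rel, conf, day, day)
```

Keying the graph on the entity pair keeps at most one relation per pair, which makes the conflict rule easy to see; a real system would likely need a richer schema and a more careful resolution policy.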

Related papers