Fugu-MT Paper Translation (Abstract): PVminerLLM: Structured Extraction of Patient Voice from Patient-Generated Text using Large Language Models

Paper summary: PVminerLLM: Structured Extraction of Patient Voice from Patient-Generated Text using Large Language Models

arxiv url: http://arxiv.org/abs/2603.05776v1
Date: Fri, 06 Mar 2026 00:16:05 GMT
Status: Translation complete
Last updated in system: 2026-03-09 13:17:44.781419
Title: PVminerLLM: Structured Extraction of Patient Voice from Patient-Generated Text using Large Language Models
Title (reference translation): PVminerLLM: Structured Extraction of Patient Voice from Patient-Generated Text using Large Language Models
Authors: Samah Fodeh, Linhai Ma, Ganesh Puthiaraju, Srivani Talakokkul, Afshan Khan, Ashley Hagaman, Sarah Lowe, Aimee Roundtree
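Abstract summary: Patient-generated text contains critical information about patients' lived experiences, social circumstances, and engagement in care. These patient voice signals are rarely available in structured form, which limits their use in patient-centered research and clinical quality improvement. The paper introduces PVminer, a benchmark for structured extraction of patient voice, and proposes PVminerLLM, a supervised fine-tuned large language model.
Reference score (attention score computed independently by this site): 0.0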
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Motivation: Patient-generated text contains critical information about patients' lived experiences, social circumstances, and engagement in care, including factors that strongly influence adherence, care coordination, and health equity. However, these patient voice signals are rarely available in structured form, limiting their use in patient-centered outcomes research and clinical quality improvement. Reliable extraction of such information is therefore essential for understanding and addressing non-clinical drivers of health outcomes at scale. Results: We introduce PVminer, a benchmark for structured extraction of patient voice, and propose PVminerLLM, a supervised fine-tuned large language model tailored to this task. Across multiple datasets and model sizes, PVminerLLM substantially outperforms prompt-based baselines, achieving up to 83.82% F1 for Code prediction, 80.74% F1 for Sub-code prediction, and 87.03% F1 for evidence Span extraction. Notably, strong performance is achieved even with smaller models, demonstrating that reliable patient voice extraction is feasible without extreme model scale. These results enable scalable analysis of social and experiential signals embedded in patient-generated text. Availability and Implementation: Code, evaluation scripts, and trained LLMs will be released publicly. Annotated datasets will be made available upon request for research use. Keywords: Large Language Models, Supervised Fine-Tuning, Medical Annotation, Patient-Generated Text, Clinical NLP
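Abstract (reference translation): Motivation: Patient-generated text contains critical information about patients' lived experiences, social circumstances, and engagement in care. However, these patient voice signals are rarely available in structured form, which limits their use in patient-centered research and clinical quality improvement. Reliable extraction of such information is therefore essential for understanding and addressing non-clinical drivers of health outcomes at scale. Results: We introduce PVminer, a benchmark for structured extraction of patient voice, and propose PVminerLLM, a supervised fine-tuned large language model suited to this task. Across multiple datasets and model sizes, PVminerLLM substantially outperforms prompt-based baselines, reaching up to 83.82% F1 for Code prediction, 80.74% F1 for Sub-code prediction, and 87.03% F1 for evidence Span extraction. Notably, strong performance is obtained even with smaller models, showing that reliable patient voice extraction is feasible without extreme model scale. These results enable scalable analysis of the social and experiential signals embedded in patient-generated text. Availability and Implementation: Code, evaluation scripts, and trained LLMs will be released publicly. Annotated datasets will be made available upon request for research use. Keywords: Large Language Models, Supervised Fine-Tuning, Medical Annotation, Patient-Generated Text, Clinical NLP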


Related papers