Fugu-MT Paper Translation (Abstract): LLMCARE: Alzheimer's Detection via Transformer Models Enhanced by LLM-Generated Synthetic Data

Paper overview: LLMCARE: Alzheimer's Detection via Transformer Models Enhanced by LLM-Generated Synthetic Data

arxiv url: http://arxiv.org/abs/2508.10027v2
Date: Sun, 17 Aug 2025 16:27:13 GMT
Status: Translation complete
Last updated in system: 2025-08-19 14:49:10.243619
Title: LLMCARE: Alzheimer's Detection via Transformer Models Enhanced by LLM-Generated Synthetic Data
Authors: Ali Zolnour, Hossein Azadmaleki, Yasaman Haghbin, Fatemeh Taherinezhad, Mohamad Javad Momeni Nezhad, Sina Rashidi, Masoud Khani, AmirSajjad Taleban, Samin Mahdizadeh Sani, Maryam Dadkhah, James M. Noble, Suzanne Bakken, Yadollah Yaghoobzadeh, Abdol-Hossein Vahabie, Masoud Rouhizadeh, Maryam Zolnoori
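Abstract summary: Alzheimer's disease and related dementias affect approximately five million older adults in the U.S. Speech-based natural language processing (NLP) offers a promising, scalable approach to detecting early cognitive decline. This study developed a screening pipeline that fuses transformer embeddings with handcrafted linguistic features.
Reference score (attention score computed independently by this site): 33.0105898172763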
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Alzheimer's disease and related dementias (ADRD) affect approximately five million older adults in the U.S., yet over half remain undiagnosed. Speech-based natural language processing (NLP) offers a promising, scalable approach to detecting early cognitive decline through linguistic markers. This study develops and evaluates a screening pipeline that (i) fuses transformer embeddings with handcrafted linguistic features, (ii) tests data augmentation using synthetic speech generated by large language models (LLMs), and (iii) benchmarks unimodal and multimodal LLM classifiers for ADRD detection. Transcripts from the DementiaBank "cookie-theft" task (n = 237) were used. Ten transformer models were evaluated under three fine-tuning strategies. A fusion model combined embeddings from the top-performing transformer with 110 lexically derived linguistic features. Five LLMs (LLaMA-8B/70B, MedAlpaca-7B, Ministral-8B, GPT-4o) were fine-tuned to generate label-conditioned synthetic speech, which was used to augment the training data. Three multimodal models (GPT-4o, Qwen-Omni, Phi-4) were tested for speech-text classification in zero-shot and fine-tuned settings. The fusion model achieved F1 = 83.3 (AUC = 89.5), outperforming the linguistic-only and transformer-only baselines. Augmenting the training data with 2x MedAlpaca-7B synthetic speech increased F1 to 85.7. Fine-tuning significantly improved unimodal LLM classifiers (e.g., MedAlpaca: F1 = 47.3 -> 78.5). Current multimodal models performed worse (GPT-4o: F1 = 70.2; Qwen: F1 = 66.0). Performance gains aligned with the distributional similarity between synthetic and real speech. Integrating transformer embeddings with linguistic features enhances ADRD detection from speech. Clinically tuned LLMs effectively support both classification and data augmentation, while multimodal modeling needs further advancement.
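Illustrative sketch (not the authors' code): the abstract describes fusing transformer embeddings with handcrafted linguistic features and augmenting the training set with "2x" synthetic transcripts, but gives no implementation details. The Python below shows one plausible reading under stated assumptions: fusion is taken to be simple concatenation after standardizing the linguistic features, and "2x" is taken to mean synthetic examples numbering up to twice the real training set. The function names `zscore_columns`, `fuse`, `augment`, and `f1_score` are hypothetical.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column to zero mean and unit variance.

    Constant columns (zero standard deviation) are mapped to 0.0.
    """
    cols = list(zip(*rows))
    stats = [(mean(c), pstdev(c) or 1.0) for c in cols]
    return [[(x - m) / s for x, (m, s) in zip(row, stats)] for row in rows]


def fuse(embeddings, linguistic):
    """Assumed fusion: concatenate each transcript's embedding with its
    standardized handcrafted linguistic features."""
    if len(embeddings) != len(linguistic):
        raise ValueError("one embedding and one feature row per transcript")
    scaled = zscore_columns(linguistic)
    return [list(e) + f for e, f in zip(embeddings, scaled)]


def augment(real, synthetic, ratio=2):
    """Assumed reading of '2x': append at most ratio * len(real)
    synthetic examples to the real training examples."""
    return list(real) + list(synthetic)[: ratio * len(real)]


def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

In this sketch the fused vectors would be passed to any standard classifier, and synthetic examples would be added to the training split only, so that reported F1 reflects real transcripts.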

Related papers