Fugu-MT Paper Translation (Abstract): CXR-LLAVA: a multimodal large language model for interpreting chest X-ray images

Paper overview: CXR-LLAVA: a multimodal large language model for interpreting chest X-ray images

arxiv url: http://arxiv.org/abs/2310.18341v3
Date: Sun, 14 Jan 2024 13:29:15 GMT
Status: Translation complete
Last updated in system: 2024-01-18 00:57:03.031496
Title: CXR-LLAVA: a multimodal large language model for interpreting chest X-ray images
Title (reference translation): CXR-LLAVA: a multimodal large language model for interpreting chest X-ray images
Authors: Seowoo Lee, Jiwon Youn, Hyungjin Kim, Mansu Kim, Soon Ho Yoon
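Abstract summary: This study aims to develop an open-source multimodal large language model (CXR-LLAVA) for interpreting chest X-ray images (CXRs). For training, 592,580 CXRs were collected, of which 374,881 had labels for radiographic abnormalities. The model's diagnostic performance for major pathological findings and the acceptability of its radiologic reports to human radiologists were evaluated.
Reference score (attention metric computed independently by this site): 3.0757789554622597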
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Purpose: This study aimed to develop an open-source multimodal large language model (CXR-LLAVA) for interpreting chest X-ray images (CXRs), leveraging recent advances in large language models (LLMs) to potentially replicate the image interpretation skills of human radiologists. Materials and Methods: For training, we collected 592,580 publicly available CXRs, of which 374,881 had labels for certain radiographic abnormalities (Dataset 1) and 217,699 provided free-text radiology reports (Dataset 2). After pre-training a vision transformer with Dataset 1, we integrated it with an LLM influenced by the LLAVA network. Then, the model was fine-tuned, primarily using Dataset 2. The model's diagnostic performance for major pathological findings was evaluated, along with the acceptability of radiologic reports by human radiologists, to gauge its potential for autonomous reporting. Results: The model demonstrated impressive performance in test sets, achieving an average F1 score of 0.81 for six major pathological findings in the MIMIC internal test set and 0.62 for seven major pathological findings in the external test set. The model's F1 scores surpassed those of GPT-4-vision and Gemini-Pro-Vision in both test sets. In human radiologist evaluations of the external test set, the model achieved a 72.7% success rate in autonomous reporting, slightly below the 84.0% rate of ground truth reports. Conclusion: This study highlights the significant potential of multimodal LLMs for CXR interpretation, while also acknowledging the performance limitations. Despite these challenges, we believe that making our model open-source will catalyze further research, expanding its effectiveness and applicability in various clinical contexts. CXR-LLAVA is available at https://github.com/ECOFRI/CXR_LLAVA.
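Abstract (reference translation): Purpose: This study aimed to develop an open-source multimodal large language model (CXR-LLAVA) for interpreting chest X-ray images (CXRs), leveraging recent advances in large language models (LLMs) to potentially replicate the image interpretation skills of human radiologists. Materials and Methods: For training, we collected 592,580 publicly available CXRs, of which 374,881 had labels for certain radiographic abnormalities (Dataset 1) and 217,699 provided free-text radiology reports (Dataset 2). After a vision transformer was pre-trained on Dataset 1, it was integrated with an LLM influenced by the LLAVA network. The model was then fine-tuned, mainly on Dataset 2. The model's diagnostic performance for major pathological findings was evaluated together with the acceptability of its radiologic reports to human radiologists. Results: The model achieved an average F1 score of 0.81 for six major pathological findings in the MIMIC internal test set and 0.62 for seven major pathological findings in the external test set. Its F1 scores exceeded those of GPT-4-vision and Gemini-Pro-Vision on both test sets. In human radiologist evaluation of the external test set, the model achieved a 72.7% success rate in autonomous reporting, slightly below the 84.0% rate of the ground truth reports. Conclusion: This study shows the significant potential of multimodal LLMs for CXR interpretation while also acknowledging their performance limitations. Despite these challenges, we believe that making our model open-source will catalyze further research and broaden its effectiveness and applicability in various clinical contexts. CXR-LLAVA is available at https://github.com/ECOFRI/CXR_LLAVA.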
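Note on the metric: the abstract reports an "average F1 score" over several pathological findings but does not say how the average is taken. The sketch below assumes an unweighted mean of per-finding binary F1 scores (macro averaging); the function names, finding names, and toy labels are hypothetical and are not taken from the paper or its repository.

```python
from typing import Dict, List, Tuple


def f1_score(y_true: List[int], y_pred: List[int]) -> float:
    """Binary F1 for one finding: 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # Convention chosen here: F1 is 0.0 when there are no positives at all.
    return 2 * tp / denom if denom else 0.0


def average_f1(per_finding: Dict[str, Tuple[List[int], List[int]]]) -> float:
    """Unweighted mean of per-finding F1 scores (assumed macro average)."""
    scores = [f1_score(y_true, y_pred) for y_true, y_pred in per_finding.values()]
    return sum(scores) / len(scores)


# Toy data only: (ground-truth labels, predicted labels) per hypothetical finding.
toy = {
    "finding_a": ([1, 1, 0, 0], [1, 0, 0, 0]),  # TP=1, FP=0, FN=1 -> F1 = 2/3
    "finding_b": ([1, 0, 1, 0], [1, 1, 1, 0]),  # TP=2, FP=1, FN=0 -> F1 = 4/5
}
print(round(average_f1(toy), 3))  # 0.733
```

Under this assumption every finding counts equally regardless of how often it occurs, so a rare finding with a low F1 pulls the average down as much as a common one.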

Related papers

Multimodal Human-AI Synergy for Medical Imaging Quality Control: A Hybrid Intelligence Framework with Adaptive Dataset Curation and Closed-Loop Evaluation [16.19033330311087]
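We evaluate large language models (LLMs) for image quality assessment and report standardization. Gemini 2.0-Flash achieved a Macro F1 score of 90 on CXR tasks, showing strong generalization, but its fine-grained performance was limited. DeepSeek-R1 achieved a recall of 62.23% on CT, outperforming the other models.
Paper reference translation (metadata) (2025-03-10T08:16:18Z)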
Fast-staged CNN Model for Accurate pulmonary diseases and Lung cancer detection [0.0]
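This study examines a deep learning model for detecting lung cancer, particularly pulmonary nodules, and eight other lung pathologies from chest radiographs. A two-stage classification system using ensemble methods and transfer learning first triages images as normal or abnormal. The best-performing model reaches an accuracy of 77%, a sensitivity of 0.713, a specificity of 0.776, and an AUC of 0.888.
Paper reference translation (metadata) (2024-12-16T11:47:07Z)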
MGH Radiology Llama: A Llama 3 70B Model for Radiology [50.42811030970618]
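This paper presents MGH Radiology Llama, an advanced large language model focused on radiology. It was developed using the Llama 3 70B model and builds on earlier domain-specific models such as Radiology-GPT and Radiology-Llama2. In evaluations that combine traditional metrics with GPT-4-based assessment, it outperforms general-purpose LLMs.
Paper reference translation (metadata) (2024-08-13T01:30:03Z)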
Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
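We train open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology. For training, we assemble a large dataset of over 697 thousand image-text pairs. For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality. LlaVA-Rad inference is fast and can run on a single V100 GPU in private settings.
Paper reference translation (metadata) (2024-03-12T18:12:02Z)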
MAIRA-1: A specialised large multimodal model for radiology report generation [41.69727330319648]
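We present a radiology-specific multimodal model for generating radiological reports from chest X-rays (CXRs). Our work builds on the idea that large language models can be given multimodal capabilities through alignment with pre-trained vision encoders. The proposed model (MAIRA-1) uses a CXR-specific image encoder together with a fine-tuned large language model based on Vicuna-7B to generate reports of state-of-the-art quality.
Paper reference translation (metadata) (2023-11-22T19:45:40Z)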
ChatRadio-Valuer: A Chat Large Language Model for Generalizable Radiology Report Generation Based on Multi-institution and Multi-system Data [115.0747462486285]
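ChatRadio-Valuer is a tailored model for automatic radiology report generation that learns generalizable representations. The clinical dataset used in this study has a remarkable total size of 332,673. ChatRadio-Valuer consistently outperforms state-of-the-art models, especially ChatGPT (GPT-3.5-Turbo) and GPT-4.
Paper reference translation (metadata) (2023-10-08T17:23:17Z)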
Radiology-Llama2: Best-in-Class Large Language Model for Radiology [71.27700230067168]
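This paper introduces Radiology-Llama2, a large language model specialized for radiology. Quantitative evaluation using ROUGE metrics on the MIMIC-CXR and OpenI datasets shows that Radiology-Llama2 achieves state-of-the-art performance.
Paper reference translation (metadata) (2023-08-29T17:44:28Z)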
Longitudinal Data and a Semantic Similarity Reward for Chest X-Ray Report Generation [7.586632627817609]
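Radiologists face high burnout rates because of the growing volume of chest X-rays (CXRs) that require interpretation and reporting. The proposed CXR report generator integrates elements of the workflow and introduces a new reward for reinforcement learning. The results indicate that the proposed model generates reports that are better aligned with radiologists' reports than those of state-of-the-art models.
Paper reference translation (metadata) (2023-07-19T05:41:14Z)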
Medical Image Captioning via Generative Pretrained Transformers [57.308920993032274]
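We combine two language models, Show-Attend-Tell and GPT-3, to generate comprehensive and descriptive radiology records. The proposed model is validated on two medical datasets, Open-I and MIMIC-CXR, and on the general-purpose MS-COCO.
Paper reference translation (metadata) (2022-09-28T10:27:10Z)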
Open-radiomics: A Collection of Standardized Datasets and a Technical Protocol for Reproducible Radiomics Machine Learning Pipelines [0.0]
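We introduce open-radiomics, a set of radiomics datasets, and a comprehensive radiomics pipeline. Experiments were conducted on the BraTS 2020 open-source magnetic resonance imaging (MRI) dataset. Unlike binWidth and image normalization, tumor subregion and imaging sequence significantly affected model performance.
Paper reference translation (metadata) (2022-07-29T16:37:46Z)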
Event-based clinical findings extraction from radiology reports with pre-trained language model [0.22940141855172028]
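We report a new corpus of radiology reports annotated with clinical findings. The gold standard corpus contained a total of 500 annotated CT reports. Trigger and argument entities were extracted using two state-of-the-art deep learning architectures, including BERT.
Paper reference translation (metadata) (2021-12-27T05:03:10Z)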
Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
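Deep learning has become the most powerful computer-aided diagnosis technique for improving disease identification performance. In chest radiography, annotating large-scale data requires specialized domain knowledge and is time-consuming. This paper proposes many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve disease identification performance in a single model.
Paper reference translation (metadata) (2021-02-26T02:29:30Z)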
Chest x-ray automated triage: a semiologic approach designed for clinical implementation, exploiting different types of labels through a combination of four Deep Learning architectures [83.48996461770017]
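This study proposes a deep learning method based on the late fusion of different convolutional architectures. Four training datasets were built by combining public chest X-ray images with an institutional archive. We trained four different deep learning architectures and combined their outputs with a late fusion strategy to obtain a unified tool.
Paper reference translation (metadata) (2020-12-23T14:38:35Z)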
Exploration of Interpretability Techniques for Deep COVID-19 Classification using Chest X-ray Images [10.01138352319106]
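Five deep learning models (ResNet18, ResNet34, InceptionV3, InceptionResNetV2, and DenseNet161) and their ensemble were used to classify COVID-19, pneumonia, and healthy subjects from chest X-ray images. The mean Micro-F1 score for COVID-19 classification ranged from 0.66 to 0.875 across the network models, and the ensemble reached 0.89.
Paper reference translation (metadata) (2020-06-03T22:55:53Z)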

The related paper list is generated automatically from the titles and abstracts of papers on this site.

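This page shows information on the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any results arising from the use of this site (including all information and translations).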