Fugu-MT paper translation (abstract): Can open source large language models be used for tumor documentation in Germany? -- An evaluation on urological doctors' notes

Paper overview: Can open source large language models be used for tumor documentation in Germany? -- An evaluation on urological doctors' notes

arxiv url: http://arxiv.org/abs/2501.12106v1
Date: Tue, 21 Jan 2025 12:56:47 GMT
Status: translation complete
Last updated in system: 2025-01-22 19:37:19.659493
Title: Can open source large language models be used for tumor documentation in Germany? -- An evaluation on urological doctors' notes
Title (reference translation): Can open source large language models be used for tumor documentation in Germany? An evaluation on urologists' notes
Authors: Stefan Lenz, Arsenij Ustjanzew, Marco Jeray, Torsten Panholzer,
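Abstract summary: This evaluation tests eleven different open source large language models (LLMs) on three basic tasks of the tumor documentation process. The models Llama 3.1 8B, Mistral 7B, and Mistral NeMo 12B performed comparably well in the tasks.
Reference score (attention score calculated independently by this site): 0.13234804008819082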
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Tumor documentation in Germany is largely done manually, requiring reading patient records and entering data into structured databases. Large language models (LLMs) could potentially enhance this process by improving efficiency and reliability. This evaluation tests eleven different open source LLMs with sizes ranging from 1-70 billion model parameters on three basic tasks of the tumor documentation process: identifying tumor diagnoses, assigning ICD-10 codes, and extracting the date of first diagnosis. For evaluating the LLMs on these tasks, a dataset of annotated text snippets based on anonymized doctors' notes from urology was prepared. Different prompting strategies were used to investigate the effect of the number of examples in few-shot prompting and to explore the capabilities of the LLMs in general. The models Llama 3.1 8B, Mistral 7B, and Mistral NeMo 12B performed comparably well in the tasks. Models with less extensive training data or having fewer than 7 billion parameters showed notably lower performance, while larger models did not display performance gains. Examples from a different medical domain than urology could also improve the outcome in few-shot prompting, which demonstrates the ability of LLMs to handle tasks needed for tumor documentation. Open source LLMs show a strong potential for automating tumor documentation. Models with 7-12 billion parameters could offer an optimal balance between performance and resource efficiency. With tailored fine-tuning and well-designed prompting, these models might become important tools for clinical documentation in the future. The code for the evaluation is available from https://github.com/stefan-m-lenz/UroLlmEval. We also release the dataset as a new valuable resource that addresses the shortage of authentic and easily accessible benchmarks in German-language medical NLP.
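Abstract (reference translation): Tumor documentation in Germany is done mainly by hand: staff read patient records and enter the data into structured databases. Large language models (LLMs) could enhance this process by improving efficiency and reliability. This evaluation tests eleven open source LLMs, ranging from 1 to 70 billion model parameters, on three basic tasks: identifying tumor diagnoses, assigning ICD-10 codes, and extracting the date of first diagnosis. To evaluate the LLMs on these tasks, a dataset of annotated text snippets based on anonymized urology doctors' notes was prepared. Different prompting strategies were used to investigate the effect of the number of examples in few-shot prompting and to explore the general capabilities of the LLMs. The models Llama 3.1 8B, Mistral 7B, and Mistral NeMo 12B performed comparably well in the tasks. Models with less extensive training data or fewer than 7 billion parameters performed notably worse, while larger models showed no performance gains. Examples from a medical domain other than urology could also improve the outcome in few-shot prompting, which demonstrates the ability of LLMs to handle the tasks needed for tumor documentation. Open source LLMs show strong potential for automating tumor documentation. Models with 7 to 12 billion parameters could offer an optimal balance between performance and resource efficiency. With tailored fine-tuning and well-designed prompting, these models might become important tools for clinical documentation in the future. The code for the evaluation is available from https://github.com/stefan-m-lenz/UroLlmEval. The dataset is also released as a new, valuable resource that addresses the shortage of authentic and easily accessible benchmarks in German-language medical NLP.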
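The abstract mentions varying the number of examples in few-shot prompting for tasks such as extracting the date of first diagnosis. The following is a minimal sketch of how such a setup could look, not the authors' implementation (their code is in the UroLlmEval repository linked above). The names `Example`, `build_few_shot_prompt`, and `accuracy`, the prompt wording, the `YYYY-MM` date format, and the sample notes are all hypothetical and invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Example:
    snippet: str                    # note text (invented here, not from the dataset)
    first_diagnosis: Optional[str]  # assumed label format "YYYY-MM", or None if absent


def build_few_shot_prompt(examples: List[Example], query: str, k: int) -> str:
    """Join an instruction, the first k solved examples, and the open query."""
    parts = ["Extract the date of first tumor diagnosis (YYYY-MM) or answer 'unknown'."]
    for ex in examples[:k]:
        parts.append(f"Note: {ex.snippet}\nDate: {ex.first_diagnosis or 'unknown'}")
    parts.append(f"Note: {query}\nDate:")
    return "\n\n".join(parts)


def accuracy(model: Callable[[str], str], examples: List[Example],
             test_set: List[Example], k: int) -> float:
    """Fraction of test items where the model's answer exactly matches the label."""
    correct = 0
    for item in test_set:
        prompt = build_few_shot_prompt(examples, item.snippet, k)
        answer = model(prompt).strip()
        expected = item.first_diagnosis or "unknown"
        correct += answer == expected
    return correct / len(test_set)
```

Here `model` is any callable that maps a prompt string to a completion string, so a real LLM client or a stub can be plugged in. Calling `accuracy` with different values of `k` gives a simple way to compare zero-shot and few-shot settings under these assumptions.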

Related papers

ELM: Ensemble of Language Models for Predicting Tumor Group from Pathology Reports [2.0447192404937353]
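Population-based cancer registries (PBCRs) face a significant bottleneck in manually extracting data from unstructured pathology reports. We introduce ELM, a novel ensemble-based approach that leverages both small language models (SLMs) and large language models (LLMs). ELM achieves an average precision and recall of 0.94, outperforming single-model and ensemble-free approaches.
Paper reference translation (metadata) (2025-03-24T19:21:53Z)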
Enhancing Code Generation for Low-Resource Languages: No Silver Bullet [55.39571645315926]
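Large language models (LLMs) rely on large and diverse datasets to learn the syntax, semantics, and usage patterns of programming languages. For low-resource languages, the limited availability of such data hampers the models' ability to generalize effectively. This paper presents an empirical study of the effectiveness of several approaches for improving LLM performance on low-resource languages.
Paper reference translation (metadata) (2025-01-31T12:23:28Z)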
Less is More: Making Smaller Language Models Competent Subgraph Retrievers for Multi-hop KGQA [51.3033125256716]
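This work models the subgraph retrieval task as a conditional generation task handled by small language models. A basic generative subgraph retrieval model with 220 million parameters achieves retrieval performance competitive with state-of-the-art models. The largest 3B model, connected to an LLM reader, sets new SOTA end-to-end performance on both the WebQSP and CWQ benchmarks.
Paper reference translation (metadata) (2024-10-08T15:22:36Z)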
BiomedRAG: A Retrieval Augmented Large Language Model for Biomedicine [19.861178160437827]
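Large language models (LLMs) have rapidly emerged as an important resource for a variety of applications in the biomedical and healthcare domains. BiomedRAG achieves superior performance across five biomedical NLP tasks. BiomedRAG outperforms other triple extraction systems, with micro-F1 scores of 81.42 and 88.83 on the GIT and ChemProt corpora, respectively.
Paper reference translation (metadata) (2024-05-01T12:01:39Z)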
Development and Testing of Retrieval Augmented Generation in Large Language Models -- A Case Study Report [2.523433459887027]
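Retrieval Augmented Generation (RAG) is emerging as a promising approach for customizing domain knowledge in large language models (LLMs). An LLM-RAG model was developed using 35 preoperative guidelines and tested against human-generated responses. The model generated answers in an average of 15 to 20 seconds, much faster than the 10 minutes that humans require.
Paper reference translation (metadata) (2024-01-29T06:49:53Z)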
GlotLID: Language Identification for Low-Resource Languages [51.38634652914054]
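GlotLID-M is an LID model that satisfies the desiderata of wide coverage, reliability, and efficiency. It identifies 1665 languages, a large increase in coverage compared to prior work.
Paper reference translation (metadata) (2023-10-24T23:45:57Z)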
Local Large Language Models for Complex Structured Medical Tasks [0.0]
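This paper proposes an approach that combines the language reasoning capabilities of large language models with the benefits of local training to tackle complex, domain-specific tasks. Specifically, the approach is demonstrated by extracting structured condition codes from pathology reports.
Paper reference translation (metadata) (2023-08-03T12:36:13Z)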
Interpretable Medical Diagnostics with Structured Data Extraction by Large Language Models [59.89454513692417]
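Tabular data is often hidden in text, particularly in medical diagnostic reports. This paper proposes a method called TEMED-LLM for extracting structured tabular data from textual medical reports. The method is shown to outperform state-of-the-art text classification models in medical diagnostics.
Paper reference translation (metadata) (2023-06-08T09:12:28Z)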
An Iterative Optimizing Framework for Radiology Report Summarization with ChatGPT [80.33783969507458]
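The "Impression" section of a radiology report is a critical basis for communication between radiologists and other physicians. Recent studies have achieved promising results in automatic impression generation using large-scale medical text data. However, these models often require substantial amounts of medical text data and generalize poorly.
Paper reference translation (metadata) (2023-04-17T17:13:42Z)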
Characterizing Attribution and Fluency Tradeoffs for Retrieval-Augmented Large Language Models [6.425088990363101]
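This study examines the relationship between fluency and attribution in large language models. Larger models are shown to tend to produce better results in both fluency and attribution. The study proposes a recipe that lets smaller models close the gap with larger models while preserving the benefits of top-k retrieval.
Paper reference translation (metadata) (2023-02-11T02:43:34Z)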
Augmenting Interpretable Models with LLMs during Training [73.40079895413861]
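This paper proposes Augmented Interpretable Models (Aug-imodels) for building efficient and interpretable models. Aug-imodels use LLMs during fitting but not during inference, allowing complete transparency. Two instantiations of Aug-imodels in natural language processing are explored: (i) Aug-GAM and (ii) Aug-Tree, which augments a decision tree with LLM feature expansions.
Paper reference translation (metadata) (2022-09-23T18:36:01Z)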
Few-Shot Cross-lingual Transfer for Coarse-grained De-identification of Code-Mixed Clinical Texts [56.72488923420374]
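Pretrained language models (LMs) have shown great potential for cross-lingual transfer in low-resource settings. This work demonstrates the cross-lingual transfer property of LMs for named entity recognition (NER), applied to a low-resource, real-world challenge: code-mixed (Spanish-Catalan) clinical notes in the stroke domain.
Paper reference translation (metadata) (2022-04-10T21:46:52Z)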

The related papers list is generated automatically from the titles and abstracts of papers on this site.

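This page shows information on the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any results arising from the use of this site (including all information and translations).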