Fugu-MT Paper Translation (Abstract): Benchmarking Foundation Models for Renal Lesion Stratification in CT

Paper overview: Benchmarking Foundation Models for Renal Lesion Stratification in CT

arxiv url: http://arxiv.org/abs/2605.07749v1
Date: Fri, 08 May 2026 13:56:11 GMT
Status: Translation complete
Last updated in system: 2026-05-11 19:43:39.085346
Title: Benchmarking Foundation Models for Renal Lesion Stratification in CT
Title (reference translation): Benchmarking foundation models for renal lesion stratification in CT
Authors: Hartmut Häntze, Sarah de Boer, Myrthe Buser, Alessa Hering, Bram van Ginneken, Mathias Prokop, Jawed Nawabi, Sebastian Ziegelmayer, Lisa Adams, Keno Bressem
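Abstract summary: Open-source medical foundation models (FMs) can be used to train medical imaging models. Here, FMs are compared against radiomics and a 3D ResNet-50 trained from scratch. FMs did not surpass established models for renal lesion stratification, leaving radiomics as the current state of the art.
Reference score (independently computed attention score): 2.051956285018868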
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The rapid proliferation of open-source medical foundation models (FMs) raises a practical question: how well do their pre-trained representations transfer to clinically relevant but data-scarce classification tasks? Particularly in CT-based renal lesion classification, a push toward greater generalizability would be meaningful, as the field is constrained by inherently limited training data. We addressed this through a benchmark of three medical FMs on this specific task. This six-class problem spans common entities like cysts and clear cell renal cell carcinoma, alongside rare subtypes. Using a frozen feature-probing protocol, we compared FM embeddings against a handcrafted radiomics classifier and a 3D ResNet-50 trained from scratch. Models were trained on a composite dataset of 2,854 lesions and evaluated on an external test set of 234 lesions from The Cancer Imaging Archive. Our results reveal two key findings. First, FM performance (AUC 0.70-0.77) matched the from-scratch ResNet (AUC 0.72) while drastically reducing hardware demand, requiring only seconds on a CPU after feature extraction. However, the conventional radiomics baseline significantly outperformed all deep learning approaches, achieving an AUC of 0.88 (all p $\leq$ 0.002). This suggests that current generalist FM embeddings do not yet capture the fine-grained texture and shape heterogeneity driving histological subtype discrimination. Despite their potential in data-scarce settings, medical FMs did not surpass established models for renal lesion stratification, leaving radiomics as the current state-of-the-art.
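Abstract (reference translation): The rapid proliferation of open-source medical foundation models (FMs) raises a practical question: how well do their pre-trained representations transfer to clinically relevant but data-scarce classification tasks? Particularly in CT-based renal lesion classification, greater generalizability would be meaningful, because the field is constrained by inherently limited training data. We addressed this question through a benchmark of three medical FMs on this specific task. This six-class problem spans common entities such as cysts and clear cell renal cell carcinoma, alongside rare subtypes. Using a frozen feature-probing protocol, we compared FM embeddings against a handcrafted radiomics classifier and a 3D ResNet-50 trained from scratch. Models were trained on a composite dataset of 2,854 lesions and evaluated on an external test set of 234 lesions from The Cancer Imaging Archive. The results yielded two key findings. First, FM performance (AUC 0.70-0.77) matched the from-scratch ResNet (AUC 0.72) while substantially reducing hardware demand. However, the conventional radiomics baseline significantly outperformed the deep learning approaches, reaching an AUC of 0.88 (all p $\leq$ 0.002). This suggests that current generalist FM embeddings do not yet capture the fine-grained texture and shape heterogeneity that drives histological subtype discrimination. Despite their potential in data-scarce settings, medical FMs did not surpass established models for renal lesion stratification, leaving radiomics as the current state of the art.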
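
The frozen feature-probing protocol mentioned in the abstract can be illustrated with a small sketch: embeddings are treated as fixed inputs, a lightweight classifier is fit on top of them, and a multi-class AUC is computed on held-out data. Everything below is an assumption made for illustration. The abstract does not say which probe classifier or which AUC averaging the authors used, so a nearest-centroid probe and a macro-averaged one-vs-rest AUC stand in here, and the embeddings are synthetic Gaussian vectors rather than outputs of any real foundation model. All function names are hypothetical.

```python
import random

# Six classes, matching the six-class problem in the abstract; labels are placeholders.
CLASSES = list(range(6))


def make_embeddings(n_per_class, dim, rng):
    """Synthetic stand-in for frozen FM embeddings (assumption, not real data)."""
    embeddings, labels = [], []
    for c in CLASSES:
        for _ in range(n_per_class):
            # Class c is shifted along dimension c so the classes are separable.
            vec = [rng.gauss(3.0 if d == c else 0.0, 1.0) for d in range(dim)]
            embeddings.append(vec)
            labels.append(c)
    return embeddings, labels


def fit_centroids(embeddings, labels):
    """Probe 'training': one mean vector per class; the encoder is never updated."""
    centroids = {}
    for c in CLASSES:
        members = [e for e, y in zip(embeddings, labels) if y == c]
        dim = len(members[0])
        centroids[c] = [sum(m[d] for m in members) / len(members) for d in range(dim)]
    return centroids


def class_scores(embedding, centroids):
    """Score per class: negative squared distance to the centroid (higher is closer)."""
    return {
        c: -sum((a - b) ** 2 for a, b in zip(embedding, mu))
        for c, mu in centroids.items()
    }


def ovr_auc(scores, labels, positive):
    """One-vs-rest AUC via the Mann-Whitney statistic; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auc(score_rows, labels):
    """Unweighted mean of the per-class one-vs-rest AUCs."""
    per_class = [
        ovr_auc([row[c] for row in score_rows], labels, c) for c in CLASSES
    ]
    return sum(per_class) / len(per_class)


rng = random.Random(0)
train_x, train_y = make_embeddings(n_per_class=40, dim=8, rng=rng)
test_x, test_y = make_embeddings(n_per_class=20, dim=8, rng=rng)

centroids = fit_centroids(train_x, train_y)
auc = macro_auc([class_scores(x, centroids) for x in test_x], test_y)
print(f"macro one-vs-rest AUC on synthetic embeddings: {auc:.3f}")
```

Because the probe only sees precomputed vectors, fitting it is cheap, which is consistent with the abstract's remark that the FM approach needs only seconds on a CPU after feature extraction. The number printed here reflects the synthetic data only and says nothing about the paper's results.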
