Fugu-MT Paper Translation (Abstract): Automatic Report Generation for Histopathology images using pre-trained Vision Transformers and BERT

Paper summary: Automatic Report Generation for Histopathology images using pre-trained Vision Transformers and BERT

arxiv url: http://arxiv.org/abs/2312.01435v1
Date: Sun, 3 Dec 2023 15:56:09 GMT
Status: Translation complete
Last updated in system: 2023-12-05 17:19:17.846300
Title: Automatic Report Generation for Histopathology images using pre-trained Vision Transformers and BERT
Title (reference translation): Automatic report generation for histopathology images using pre-trained Vision Transformers and BERT
Authors: Saurav Sengupta, Donald E. Brown
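Abstract summary: We show that an existing pre-trained Vision Transformer can first be used to encode 4096x4096 patches of the WSI and then serve as the encoder, paired with a pre-trained BERT model as the decoder, for report generation. The method supports not only generating and evaluating captions that describe the image, but also classifying the image's tissue type and the patient's sex.
Reference score (attention score computed by this site): 1.2781698000674653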
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Deep learning for histopathology has been successfully used for disease classification, image segmentation and more. However, combining image and text modalities using current state-of-the-art methods has been a challenge due to the high resolution of histopathology images. Automatic report generation for histopathology images is one such challenge. In this work, we show that using an existing pre-trained Vision Transformer in a two-step process of first using it to encode 4096x4096 sized patches of the Whole Slide Image (WSI) and then using it as the encoder and a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model for language modeling-based decoder for report generation, we can build a fairly performant and portable report generation mechanism that takes into account the whole of the high resolution image, instead of just the patches. Our method allows us to not only generate and evaluate captions that describe the image, but also helps us classify the image into tissue types and the gender of the patient as well. Our best performing model achieves a 79.98% accuracy in Tissue Type classification and 66.36% accuracy in classifying the sex of the patient the tissue came from, with a BLEU-4 score of 0.5818 in our caption generation task.
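The first step described in the abstract, encoding 4096x4096 patches of the WSI, presupposes that the slide is cut into regions of that size. A minimal sketch of such a tiling is shown below. The function name `region_grid`, the row-major order, and the clipping of partial regions at the slide edge are assumptions for illustration; the abstract does not say how the authors tile the slide or handle edges.

```python
from typing import Iterator, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def region_grid(width: int, height: int, region: int = 4096) -> Iterator[Box]:
    """Yield boxes that tile a width x height slide in row-major order.

    Regions at the right and bottom edges are clipped to the slide bounds.
    Clipping is an assumption of this sketch, not something the abstract states.
    """
    for top in range(0, height, region):
        for left in range(0, width, region):
            right = min(left + region, width)
            bottom = min(top + region, height)
            yield (left, top, right, bottom)


# Hypothetical slide size, chosen only to show the edge handling.
boxes = list(region_grid(10000, 5000))
print(len(boxes))   # number of regions
print(boxes[0])     # first full-size region
print(boxes[-1])    # last region, clipped on both axes
```

Each box could then be read from the slide and passed to the pre-trained encoder; that part depends on the slide reader and model weights and is left out here.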
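Abstract (reference translation): Deep learning for histopathology has been used effectively for disease classification, image segmentation, and more. However, the high resolution of histopathology images makes it difficult to combine image and text modalities with state-of-the-art methods. Automatic report generation for histopathology images is one such challenge. In this work, we show that an existing pre-trained Vision Transformer can be used in a two-step process: first to encode 4096x4096 patches of the Whole Slide Image (WSI), and then as the encoder, paired with a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model as a language-modeling-based decoder, for report generation. This yields a fairly performant and portable report generation mechanism that takes the whole high-resolution image into account rather than just the patches. The method not only generates and evaluates captions describing the image, but also helps classify the image by tissue type and by the sex of the patient. Our best-performing model achieves 79.98% accuracy in tissue type classification and 66.36% accuracy in classifying the sex of the patient the tissue came from, with a BLEU-4 score of 0.5818 on our caption generation task.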
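The abstract reports caption quality as a BLEU-4 score. As a rough illustration of what that metric measures, the sketch below computes sentence-level BLEU-4 for one candidate against one reference: the geometric mean of clipped 1- to 4-gram precisions, multiplied by a brevity penalty. The function names, whitespace tokenization, lack of smoothing, and single-reference setup are all assumptions; the abstract does not state which BLEU implementation or settings the authors used, and the example sentences are made up.

```python
import math
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate: List[str], reference: List[str]) -> float:
    """Unsmoothed sentence-level BLEU-4 against a single reference."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, 5):
        cand = ngrams(candidate, n)
        ref = ngrams(reference, n)
        total = sum(cand.values())
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or matched == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU 0
        log_precisions.append(math.log(matched / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(candidate) >= len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(sum(log_precisions) / 4)


# Made-up captions, for illustration only.
reference = "the sample shows skin tissue from a male patient".split()
candidate = "the sample shows skin tissue from a female patient".split()
print(round(bleu4(candidate, reference), 4))
```

Published BLEU scores are usually computed at corpus level with a specific tokenizer, so values from this sketch are not directly comparable to the 0.5818 reported above.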

Related papers

GNN-ViTCap: GNN-Enhanced Multiple Instance Learning with Vision Transformers for Whole Slide Image Classification and Captioning [1.25828876338076]
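WSI classification and captioning have become important tasks in computer-aided pathology. We propose a novel GNN-ViTCap framework for classification and caption generation from histopathology images. GNN-ViTCap achieves an F1 score of 0.934, an AUC of 0.963, a BLEU-4 score of 0.811, and a METEOR score of 0.569.
Paper reference translation (metadata) (2025-07-09T16:35:21Z)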
PixCell: A generative foundation model for digital histopathology images [49.00921097924924]
PixCell is the first diffusion-based generative foundation model for histopathology. We trained PixCell on PanCan-30M.
Paper reference translation (metadata) (2025-06-05T15:14:32Z)
Automated Report Generation for Lung Cytological Images Using a CNN Vision Classifier and Multiple-Transformer Text Decoders: Preliminary Study [0.0]
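Sensitivity and specificity were 100% and 96.4%, respectively. The grammar and style of the generated text were confirmed to be correct, in closer agreement with the gold standard.
Paper reference translation (metadata) (2024-03-26T23:32:29Z)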
WsiCaption: Multiple Instance Generation of Pathology Reports for Gigapixel Whole-Slide Images [5.960501267687475]
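We study how to generate pathology reports from whole-slide images. We curated the largest WSI-text dataset (PathText). On the model side, we propose a multiple instance generative model (MI-Gen).
Paper reference translation (metadata) (2023-11-27T05:05:41Z)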
Automatic Report Generation for Histopathology images using pre-trained Vision Transformers [1.2781698000674653]
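We show that an existing pre-trained Vision Transformer can be used to encode 4096x4096 patches of the Whole Slide Image (WSI) and then serve as the encoder, with an LSTM decoder, for report generation. We also show that representations from an existing, powerful pre-trained hierarchical Vision Transformer are useful not only for zero-shot classification but also for report generation.
Paper reference translation (metadata) (2023-11-10T16:48:24Z)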
PathLDM: Text conditioned Latent Diffusion Model for Histopathology [62.970593674481414]
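We introduce PathLDM, a text-conditioned latent diffusion model for generating high-quality histopathology images. The proposed method fuses image and text data to enhance the generation process. We achieve a SoTA FID score of 7.64 for text-to-image generation on the TCGA-BRCA dataset, significantly outperforming the closest text-conditioned competitor, which has an FID of 30.1.
Paper reference translation (metadata) (2023-09-01T22:08:32Z)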
Customizing General-Purpose Foundation Models for Medical Report Generation [64.31265734687182]
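The scarcity of labeled medical image-report pairs is a major challenge in developing deep and large-scale neural networks. We propose customizing off-the-shelf, general-purpose, large-scale pre-trained models, i.e., foundation models (FMs), from computer vision and natural language processing.
Paper reference translation (metadata) (2023-06-09T03:02:36Z)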
Cross-modulated Few-shot Image Generation for Colorectal Tissue Classification [58.147396879490124]
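The few-shot generation method, named XM-GAN, takes one base and a pair of reference tissue images as input and generates high-quality, diverse images. To the best of our knowledge, we are the first to investigate few-shot generation of colorectal tissue images.
Paper reference translation (metadata) (2023-04-04T17:50:30Z)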
AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel Images [53.29794593104923]
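We propose a novel concept of shared-context processing for whole-slide histopathology images. AMIGO uses the cellular graph within the tissue to provide a single representation for a patient. Our model shows strong robustness to missing information, to the extent that it can achieve the same performance with 20% or less of the data.
Paper reference translation (metadata) (2023-03-01T23:37:45Z)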
StraIT: Non-autoregressive Generation with Stratified Image Transformer [63.158996766036736]
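Stratified Image Transformer (StraIT) is a pure non-autoregressive (NAR) generative model. Experiments show that StraIT significantly improves NAR generation and outperforms existing DM and AR methods.
Paper reference translation (metadata) (2023-03-01T18:59:33Z)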
DEPAS: De-novo Pathology Semantic Masks using a Generative Model [0.0]
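We introduce a scalable generative model called DEPAS that captures tissue structure and generates high-resolution semantic masks with state-of-the-art quality. We demonstrate DEPAS's ability to generate realistic semantic maps of tissue for three organ types: skin, prostate, and lung.
Paper reference translation (metadata) (2023-02-13T16:48:33Z)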
Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing [53.89917396428747]
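Self-supervised learning in vision-language processing exploits semantic alignment between the image and text modalities. We explicitly account for prior images and reports when they are available, during both training and fine-tuning. Our approach, called BioViL-T, uses a CNN-Transformer hybrid multi-image encoder trained jointly with a text model.
Paper reference translation (metadata) (2023-01-11T16:35:33Z)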
Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology [5.164102666113966]
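We search for good representations in pathology by training a variety of self-supervised models and validating them on a range of weakly supervised and patch-level tasks. Our key finding is that Vision Transformers using DINO-based knowledge distillation can learn data-efficient and interpretable features in histology images.
Paper reference translation (metadata) (2022-03-01T16:14:41Z)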

The related papers list is generated automatically from the titles and abstracts of papers on this site.

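This page shows information on the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any results arising from its use.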