In-context learning enables multimodal large language models to classify
cancer pathology images
- URL: http://arxiv.org/abs/2403.07407v1
- Date: Tue, 12 Mar 2024 08:34:34 GMT
- Title: In-context learning enables multimodal large language models to classify
cancer pathology images
- Authors: Dyke Ferber, Georg Wölflein, Isabella C. Wiest, Marta Ligero,
Srividhya Sainath, Narmin Ghaffari Laleh, Omar S.M. El Nahhas, Gustav
Müller-Franzes, Dirk Jäger, Daniel Truhn, Jakob Nikolas Kather
- Abstract summary: In language processing, in-context learning provides an alternative, where models learn from within prompts, bypassing the need for parameter updates.
Here, we systematically evaluate the model Generative Pretrained Transformer 4 with Vision capabilities (GPT-4V) on cancer image processing with in-context learning.
Our results show that in-context learning is sufficient to match or even outperform specialized neural networks trained for particular tasks, while only requiring a minimal number of samples.
- Score: 0.7085801706650957
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Medical image classification requires labeled, task-specific datasets which
are used to train deep learning networks de novo, or to fine-tune foundation
models. However, this process is computationally and technically demanding. In
language processing, in-context learning provides an alternative, where models
learn from within prompts, bypassing the need for parameter updates. Yet,
in-context learning remains underexplored in medical image analysis. Here, we
systematically evaluate the model Generative Pretrained Transformer 4 with
Vision capabilities (GPT-4V) on cancer image processing with in-context
learning on three cancer histopathology tasks of high importance:
Classification of tissue subtypes in colorectal cancer, colon polyp subtyping
and breast tumor detection in lymph node sections. Our results show that
in-context learning is sufficient to match or even outperform specialized
neural networks trained for particular tasks, while only requiring a minimal
number of samples. In summary, this study demonstrates that large vision
language models trained on non-domain-specific data can be applied out of the
box to solve medical image-processing tasks in histopathology. This
democratizes access to generalist AI models for medical experts without a
technical background, especially in areas where annotated data is scarce.
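The few-shot setup described above can be made concrete with a small sketch: demonstrations (image, label) are interleaved into the prompt as user/assistant turns, followed by the unlabeled query image. This is a minimal illustration of assembling such an in-context prompt, not the authors' code; the message schema, the `build_icl_prompt` function, and the `image_part` helper are assumptions modeled on common vision-LLM chat APIs, whose exact field names may differ.

```python
import base64

def build_icl_prompt(examples, query_image_bytes, labels):
    """Assemble a few-shot (in-context learning) prompt for a vision LLM.

    examples: list of (image_bytes, label) demonstration pairs.
    query_image_bytes: the unlabeled image to classify.
    labels: allowed class names the model must choose from.
    Returns a list of chat messages in a generic role/content format.
    """
    def image_part(raw):
        # Images are typically sent base64-encoded inside the prompt.
        b64 = base64.b64encode(raw).decode("ascii")
        return {"type": "image", "data": b64}

    messages = [{
        "role": "system",
        "content": ("You are a pathology assistant. Classify each image as "
                    f"one of: {', '.join(labels)}. Answer with the label only."),
    }]
    # Demonstrations: each labeled image is a user turn (the image)
    # followed by an assistant turn (its ground-truth label).
    for raw, label in examples:
        messages.append({"role": "user", "content": [image_part(raw)]})
        messages.append({"role": "assistant", "content": label})
    # Finally the query image, left for the model to label.
    messages.append({"role": "user", "content": [image_part(query_image_bytes)]})
    return messages
```

Because the model only reads these demonstrations at inference time, no parameters are updated; swapping the example set re-specializes the same model to a new task.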
Related papers
- FPN-IAIA-BL: A Multi-Scale Interpretable Deep Learning Model for Classification of Mass Margins in Digital Mammography [17.788748860485438]
Uninterpretable deep learning models are unsuitable in high-stakes environments.
Recent work in interpretable computer vision provides transparency to these formerly black boxes.
This paper proposes a novel multi-scale interpretable deep learning model for mammographic mass margin classification.
arXiv Detail & Related papers (2024-06-10T15:44:41Z)
- Medical Vision-Language Pre-Training for Brain Abnormalities [96.1408455065347]
We show how to automatically collect medical image-text aligned data for pretraining from public resources such as PubMed.
In particular, we present a pipeline that streamlines the pre-training process by initially collecting a large brain image-text dataset.
We also investigate the unique challenge of mapping subfigures to subcaptions in the medical domain.
arXiv Detail & Related papers (2024-04-27T05:03:42Z)
- Towards a Visual-Language Foundation Model for Computational Pathology [5.72536252929528]
We introduce CONtrastive learning from Captions for Histopathology (CONCH)
CONCH is a visual-language foundation model developed using diverse sources of histopathology images, biomedical text, and task-agnostic pretraining.
It is evaluated on a suite of 13 diverse benchmarks, achieving state-of-the-art performance on histology image classification, segmentation, captioning, text-to-image and image-to-text retrieval.
arXiv Detail & Related papers (2023-07-24T16:13:43Z)
- LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z)
- Vision-Language Modelling For Radiological Imaging and Reports In The Low Data Regime [70.04389979779195]
This paper explores training medical vision-language models (VLMs) where the visual and language inputs are embedded into a common space.
We explore several candidate methods to improve low-data performance, including adapting generic pre-trained models to novel image and text domains.
Using text-to-image retrieval as a benchmark, we evaluate the performance of these methods with variable sized training datasets of paired chest X-rays and radiological reports.
arXiv Detail & Related papers (2023-03-30T18:20:00Z)
- PathologyBERT -- Pre-trained Vs. A New Transformer Language Model for Pathology Domain [2.3628956573813498]
Successful text mining of a large pathology database can play a critical role to advance 'big data' cancer research.
No pathology-specific language space exists to support rapid data-mining development in the pathology domain.
PathologyBERT is a pre-trained masked language model which was trained on 347,173 histopathology specimen reports.
arXiv Detail & Related papers (2022-05-13T20:42:07Z)
- Intelligent Masking: Deep Q-Learning for Context Encoding in Medical Image Analysis [48.02011627390706]
We develop a novel self-supervised approach that occludes targeted regions to improve the pre-training procedure.
We show that training the agent against the prediction model can significantly improve the semantic features extracted for downstream classification tasks.
arXiv Detail & Related papers (2022-03-25T19:05:06Z)
- IAIA-BL: A Case-based Interpretable Deep Learning Model for Classification of Mass Lesions in Digital Mammography [20.665935997959025]
Interpretability in machine learning models is important in high-stakes decisions.
We present a framework for interpretable machine learning-based mammography.
arXiv Detail & Related papers (2021-03-23T05:00:21Z)
- Domain Generalization for Medical Imaging Classification with Linear-Dependency Regularization [59.5104563755095]
We introduce a simple but effective approach to improve the generalization capability of deep neural networks in the field of medical imaging classification.
Motivated by the observation that the domain variability of the medical images is to some extent compact, we propose to learn a representative feature space through variational encoding.
arXiv Detail & Related papers (2020-09-27T12:30:30Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, the Prototypical Network, a simple yet effective meta-learning method for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
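The Prototypical Network meta-learner underlying the entry above classifies a query by distance to per-class mean embeddings. The following is a minimal sketch of that core step under the assumption that embeddings are already computed; the `prototype_classify` function name and the use of Euclidean distance on raw arrays are illustrative, not the paper's implementation.

```python
import numpy as np

def prototype_classify(support_x, support_y, query_x):
    """Nearest-prototype classification in the style of Prototypical Networks.

    support_x: (n, d) embeddings of the labeled few-shot examples.
    support_y: (n,) integer class labels for the support set.
    query_x:   (m, d) embeddings of the queries to classify.
    Returns an (m,) array of predicted labels.
    """
    classes = np.unique(support_y)
    # Each class prototype is the mean embedding of its support examples.
    prototypes = np.stack(
        [support_x[support_y == c].mean(axis=0) for c in classes]
    )
    # Assign each query to the class whose prototype is nearest (Euclidean).
    dists = np.linalg.norm(query_x[:, None, :] - prototypes[None, :, :], axis=-1)
    return classes[np.argmin(dists, axis=1)]
```

Because prototypes are simple class means, the classifier adapts to a new few-shot task with no gradient updates, which is what makes the approach attractive for scarce-label settings such as disease subtype prediction.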
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.