Generating customized prompts for Zero-Shot Rare Event Medical Image Classification using LLM
- URL: http://arxiv.org/abs/2501.16481v1
- Date: Mon, 27 Jan 2025 20:28:01 GMT
- Title: Generating customized prompts for Zero-Shot Rare Event Medical Image Classification using LLM
- Authors: Payal Kamboj, Ayan Banerjee, Bin Xu, Sandeep Gupta
- Abstract summary: This paper introduces a simple yet effective method for generating highly accurate and contextually descriptive prompts.
We propose a novel approach that uses domain-specific expert knowledge on rare events to generate customized and contextually relevant prompts.
Our method enhances rare event classification without additional training, outperforming state-of-the-art techniques.
- Score: 7.133750565011626
- Abstract: Rare events, due to their infrequent occurrences, do not have much data, and hence deep learning techniques fail in estimating the distribution for such data. Open-vocabulary models represent an innovative approach to image classification. Unlike traditional models, these models classify images into any set of categories specified with natural language prompts during inference. These prompts usually comprise manually crafted templates (e.g., 'a photo of a {}') that are filled in with the names of each category. This paper introduces a simple yet effective method for generating highly accurate and contextually descriptive prompts containing discriminative characteristics. Rare event detection, especially in medicine, is more challenging due to low inter-class and high intra-class variability. To address these, we propose a novel approach that uses domain-specific expert knowledge on rare events to generate customized and contextually relevant prompts, which are then used by large language models for image classification. Our zero-shot, privacy-preserving method enhances rare event classification without additional training, outperforming state-of-the-art techniques.
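The prompt-template mechanism described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: `toy_embed` is a hypothetical stand-in for a real text encoder, and the class names and description sentences are placeholders invented for the example. Only the template 'a photo of a {}' comes from the abstract. The sketch builds one generic prompt per class, optionally adds customized descriptive prompts, averages the prompt embeddings per class, and picks the class whose averaged embedding has the highest cosine similarity with the image embedding.

```python
import hashlib
import math

TEMPLATE = "a photo of a {}"  # generic template quoted in the abstract


def normalize(vec):
    """Scale a vector to unit length (left unchanged if its norm is zero)."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def toy_embed(text, dim=16):
    """Hypothetical stand-in encoder: a deterministic pseudo-embedding
    derived from a hash. A real system would use a vision-language model."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return normalize([b / 255.0 - 0.5 for b in digest[:dim]])


def build_prompts(class_names, descriptions=None):
    """Return {class: [prompts]}: the generic template plus any custom prompts."""
    descriptions = descriptions or {}
    return {
        name: [TEMPLATE.format(name)] + list(descriptions.get(name, []))
        for name in class_names
    }


def class_embedding(prompts, embed=toy_embed):
    """Average the embeddings of all prompts for one class, then normalize."""
    vecs = [embed(p) for p in prompts]
    return normalize([sum(col) / len(vecs) for col in zip(*vecs)])


def classify(image_vec, prompts_by_class, embed=toy_embed):
    """Pick the class whose prompt embedding is most cosine-similar to the image."""
    image_vec = normalize(image_vec)
    scores = {
        name: sum(a * b for a, b in zip(image_vec, class_embedding(ps, embed)))
        for name, ps in prompts_by_class.items()
    }
    return max(scores, key=scores.get), scores


# Placeholder classes and descriptions, for illustration only.
prompts = build_prompts(
    ["lesion", "normal tissue"],
    {"lesion": ["a scan showing a lesion with irregular borders"]},
)
# Pretend the image encoder produced a vector matching the "lesion" prompts.
fake_image_vec = class_embedding(prompts["lesion"])
label, scores = classify(fake_image_vec, prompts)
print(label)
```

Averaging several prompts per class is one common way to combine a generic template with more descriptive sentences; the paper may combine them differently.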
Related papers
- Natural Language Induced Adversarial Images [14.415478695871604]
We propose a natural language induced adversarial image attack method.
The core idea is to leverage a text-to-image model to generate adversarial images given input prompts.
In experiments, we found that some high-frequency semantic information such as "foggy", "humid", "stretching" can easily cause errors.
arXiv Detail & Related papers (2024-10-11T08:36:07Z)
- Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z)
- Improving face generation quality and prompt following with synthetic captions [57.47448046728439]
We introduce a training-free pipeline designed to generate accurate appearance descriptions from images of people.
We then use these synthetic captions to fine-tune a text-to-image diffusion model.
Our results demonstrate that this approach significantly improves the model's ability to generate high-quality, realistic human faces.
arXiv Detail & Related papers (2024-05-17T15:50:53Z)
- Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation [60.943159830780154]
We introduce Bounded Attention, a training-free method for bounding the information flow in the sampling process.
We demonstrate that our method empowers the generation of multiple subjects that better align with given prompts and layouts.
arXiv Detail & Related papers (2024-03-25T17:52:07Z)
- SUR-adapter: Enhancing Text-to-Image Pre-trained Diffusion Models with Large Language Models [56.88192537044364]
We propose a simple-yet-effective parameter-efficient fine-tuning approach called the Semantic Understanding and Reasoning adapter (SUR-adapter) for pre-trained diffusion models.
Our approach can make text-to-image diffusion models easier to use with better user experience.
arXiv Detail & Related papers (2023-05-09T05:48:38Z)
- Discriminative Class Tokens for Text-to-Image Diffusion Models [102.88033622546251]
We propose a non-invasive fine-tuning technique that capitalizes on the expressive potential of free-form text.
Our method is fast compared to prior fine-tuning methods and does not require a collection of in-class images.
We evaluate our method extensively, showing that the generated images are: (i) more accurate and of higher quality than standard diffusion models, (ii) can be used to augment training data in a low-resource setting, and (iii) reveal information about the data used to train the guiding classifier.
arXiv Detail & Related papers (2023-03-30T05:25:20Z)
- What does a platypus look like? Generating customized prompts for zero-shot image classification [52.92839995002636]
This work introduces a simple method to generate higher accuracy prompts without relying on any explicit knowledge of the task domain.
We leverage the knowledge contained in large language models (LLMs) to generate many descriptive sentences that contain important discriminating characteristics of the image categories.
This approach improves accuracy on a range of zero-shot image classification benchmarks, including over one percentage point gain on ImageNet.
arXiv Detail & Related papers (2022-09-07T17:27:08Z)
- Few-Shot Hyperspectral Image Classification With Unknown Classes Using Multitask Deep Learning [24.02524697784525]
Current hyperspectral image classification assumes that a predefined classification system is closed and complete.
We propose a deep learning method that simultaneously conducts classification and reconstruction in the open world.
Our method achieved more accurate hyperspectral image classification, especially under the few-shot context.
arXiv Detail & Related papers (2020-09-08T03:53:10Z)