OFF-CLIP: Improving Normal Detection Confidence in Radiology CLIP with Simple Off-Diagonal Term Auto-Adjustment
- URL: http://arxiv.org/abs/2503.01794v1
- Date: Mon, 03 Mar 2025 18:24:11 GMT
- Title: OFF-CLIP: Improving Normal Detection Confidence in Radiology CLIP with Simple Off-Diagonal Term Auto-Adjustment
- Authors: Junhyun Park, Chanyu Moon, Donghwan Lee, Kyungsu Kim, Minho Hwang
- Abstract summary: We propose OFF-CLIP, a contrastive learning refinement that improves normal detection. OFF-CLIP can be applied to radiology CLIP models without requiring any architectural modifications.
- Score: 6.085134938844728
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contrastive Language-Image Pre-Training (CLIP) has enabled zero-shot classification in radiology, reducing reliance on manual annotations. However, conventional contrastive learning struggles with normal case detection due to its strict intra-sample alignment, which disrupts normal sample clustering and leads to high false positives (FPs) and false negatives (FNs). To address these issues, we propose OFF-CLIP, a contrastive learning refinement that improves normal detection by introducing an off-diagonal term loss to enhance normal sample clustering and applying sentence-level text filtering to mitigate FNs by removing misaligned normal statements from abnormal reports. OFF-CLIP can be applied to radiology CLIP models without requiring any architectural modifications. Experimental results show that OFF-CLIP significantly improves normal classification, achieving a 0.61 area-under-the-curve (AUC) increase on VinDr-CXR over CARZero, the state-of-the-art zero-shot classification baseline, while maintaining or improving abnormal classification performance. Additionally, OFF-CLIP enhances zero-shot grounding by improving pointing game accuracy, confirming better anomaly localization. These results demonstrate OFF-CLIP's effectiveness as a robust and efficient enhancement for medical vision-language models.
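The abstract does not give the loss in closed form, but the core idea, keeping off-diagonal similarities between normal samples from being pushed apart, can be sketched. In the hypothetical PyTorch snippet below, off-diagonal image-report pairs in which both studies are labeled normal are treated as additional soft positives; the `is_normal` labels, the soft-target formulation, and the temperature value are assumptions for illustration, and the sentence-level text filtering step is omitted entirely.
```python
import torch
import torch.nn.functional as F

def off_diagonal_clip_loss(img_emb, txt_emb, is_normal, temperature=0.07):
    """Hedged sketch of a CLIP loss with an off-diagonal adjustment.

    Vanilla CLIP treats only the diagonal (matched image-report pairs) as
    positives and pushes every off-diagonal pair apart, which disperses
    normal samples. Here, off-diagonal entries where BOTH samples are
    normal become extra soft positives, encouraging normal clustering.
    This may differ from the actual OFF-CLIP formulation.
    """
    img_emb = F.normalize(img_emb, dim=-1)        # (B, D)
    txt_emb = F.normalize(txt_emb, dim=-1)        # (B, D)
    logits = img_emb @ txt_emb.t() / temperature  # (B, B) similarities

    b = logits.size(0)
    eye = torch.eye(b, device=logits.device)
    # 1 where both sample i and sample j are normal (assumed per-sample labels).
    normal_pairs = is_normal.float().unsqueeze(0) * is_normal.float().unsqueeze(1)
    targets = torch.clamp(eye + normal_pairs, max=1.0)
    targets = targets / targets.sum(dim=1, keepdim=True)  # row-stochastic

    # Soft-target cross-entropy, symmetrized over both directions.
    # (targets is symmetric, so one matrix serves image->text and text->image.)
    loss_i2t = -(targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
    loss_t2i = -(targets * F.log_softmax(logits.t(), dim=1)).sum(dim=1).mean()
    return 0.5 * (loss_i2t + loss_t2i)
```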
Related papers
- Crane: Context-Guided Prompt Learning and Attention Refinement for Zero-Shot Anomaly Detections [50.343419243749054]
Anomaly Detection (AD) involves identifying deviations from normal data distributions.
We propose a novel approach that conditions the prompts of the text encoder based on image context extracted from the vision encoder.
Our method achieves state-of-the-art performance, improving results by 2% to 29% across different metrics on 14 datasets.
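The entry gives no architectural details, so the sketch below is purely illustrative: learnable prompt tokens are shifted by a linear projection of the vision encoder's global image feature before being handed to the text encoder. The module name, dimensions, and additive conditioning are all assumptions, not Crane's actual design.
```python
import torch
import torch.nn as nn

class ImageConditionedPrompt(nn.Module):
    """Hypothetical sketch of image-conditioned prompt learning: the text
    prompts are not static but depend on context extracted from the image."""

    def __init__(self, n_tokens=8, prompt_dim=512, img_dim=768):
        super().__init__()
        # Learnable base prompt tokens, shared across all images.
        self.prompt = nn.Parameter(torch.randn(n_tokens, prompt_dim) * 0.02)
        # Projects the global image feature into the prompt space.
        self.proj = nn.Linear(img_dim, prompt_dim)

    def forward(self, img_feat):                  # img_feat: (B, img_dim)
        ctx = self.proj(img_feat).unsqueeze(1)    # (B, 1, prompt_dim)
        # Broadcast-add image context onto every prompt token.
        return self.prompt.unsqueeze(0) + ctx     # (B, n_tokens, prompt_dim)
```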
arXiv Detail & Related papers (2025-04-15T10:42:25Z)
- AA-CLIP: Enhancing Zero-shot Anomaly Detection via Anomaly-Aware CLIP [33.213400694016]
Anomaly detection (AD) identifies outliers for applications like defect and lesion detection.
We propose Anomaly-Aware CLIP (AA-CLIP), which enhances CLIP's anomaly discrimination ability in both text and visual spaces.
AA-CLIP is achieved through a straightforward yet effective two-stage approach.
arXiv Detail & Related papers (2025-03-09T15:22:52Z)
- Fine-grained Abnormality Prompt Learning for Zero-shot Anomaly Detection [88.34095233600719]
FAPrompt is a novel framework designed to learn Fine-grained Abnormality Prompts for more accurate zero-shot anomaly detection (ZSAD).
It substantially outperforms state-of-the-art methods by at least 3%-5% AUC/AP in both image- and pixel-level ZSAD tasks.
arXiv Detail & Related papers (2024-10-14T08:41:31Z)
- Robust Calibration of Large Vision-Language Adapters [17.583536041845402]
This paper addresses the critical issue of miscalibration in CLIP-based model adaptation.
We empirically demonstrate that popular CLIP adaptation approaches, such as Adapters, Prompt Learning, and Test-Time Adaptation, substantially degrade the calibration capabilities of the zero-shot baseline.
Motivated by these observations, we present a simple and model-agnostic solution to mitigate miscalibration, by scaling the logit range of each sample to match its zero-shot prediction logits.
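A minimal sketch of such a fix, assuming it amounts to rescaling each sample's adapted logits so they span the same range as that sample's zero-shot logits (the published method's exact normalization may differ):
```python
import torch

def scale_to_zeroshot_range(adapted_logits, zeroshot_logits):
    """Per-sample post-hoc rescaling: match the range (max - min) of the
    adapted logits to the range of the zero-shot logits. Both tensors are
    (B, num_classes); this is an illustrative assumption, not the paper's
    verbatim procedure."""
    a_rng = (adapted_logits.max(dim=1, keepdim=True).values
             - adapted_logits.min(dim=1, keepdim=True).values)
    z_rng = (zeroshot_logits.max(dim=1, keepdim=True).values
             - zeroshot_logits.min(dim=1, keepdim=True).values)
    return adapted_logits * (z_rng / a_rng.clamp_min(1e-8))
```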
arXiv Detail & Related papers (2024-07-18T15:27:56Z)
- On Temperature Scaling and Conformal Prediction of Deep Classifiers [9.975341265604577]
Conformal Prediction (CP) produces a prediction set of candidate labels that contains the true label with a user-specified probability.
In practice, both types of indications are desirable, yet so far the interplay between them has not been investigated.
We show that while Temperature Scaling (TS) calibration improves the class-conditional coverage of adaptive CP methods, surprisingly, it negatively affects their prediction set sizes.
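To make the interplay concrete, here is a hedged sketch pairing temperature scaling with an APS-style conformal prediction set; the paper's exact CP methods, calibration data, and quantile computation are not given here, so the snippet below is illustrative only:
```python
import torch
import torch.nn.functional as F

def temperature_scale(logits, T):
    """Temperature scaling: divide logits by a scalar T fitted on held-out
    calibration data (T > 1 softens the softmax, T < 1 sharpens it)."""
    return logits / T

def aps_prediction_set(probs, qhat):
    """APS-style set construction: add labels in decreasing-probability
    order until their cumulative mass reaches the conformal threshold
    qhat. Changing T reshapes probs and hence the resulting set sizes --
    the interaction the paper investigates."""
    sorted_p, order = probs.sort(dim=1, descending=True)
    cum = sorted_p.cumsum(dim=1)
    # A label is kept if the mass accumulated BEFORE it is still below qhat.
    keep = (cum - sorted_p) < qhat
    return [order[i][keep[i]].tolist() for i in range(probs.size(0))]

# Usage sketch with made-up values: a higher T spreads probability mass,
# which typically enlarges the prediction sets.
logits = torch.randn(4, 10)
for T in (0.5, 1.0, 2.0):
    probs = F.softmax(temperature_scale(logits, T), dim=1)
    print(T, [len(s) for s in aps_prediction_set(probs, qhat=0.9)])
```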
arXiv Detail & Related papers (2024-02-08T16:45:12Z)
- Bootstrap Fine-Grained Vision-Language Alignment for Unified Zero-Shot Anomaly Localization [63.61093388441298]
Contrastive Language-Image Pre-training models have shown promising performance on zero-shot visual recognition tasks.
In this work, we propose AnoCLIP for zero-shot anomaly localization.
arXiv Detail & Related papers (2023-08-30T10:35:36Z)
- Enabling Calibration In The Zero-Shot Inference of Large Vision-Language Models [58.720142291102135]
We measure calibration across relevant variables like prompt, dataset, and architecture, and find that zero-shot inference with CLIP is miscalibrated.
For each specific CLIP model, a single learned temperature generalizes across inference datasets and prompt choices.
arXiv Detail & Related papers (2023-03-11T17:14:04Z)
- A Benchmark for Weakly Semi-Supervised Abnormality Localization in Chest X-Rays [42.1336336144291]
We propose to train the CXR abnormality localization framework via a weakly semi-supervised strategy, termed Point Beyond Class.
The core idea behind our PBC is to learn a robust and accurate mapping from the point annotations to the bounding boxes.
Experimental results on RSNA and VinDr-CXR datasets justify the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-09-05T14:36:07Z)
- Hierarchical Semi-Supervised Contrastive Learning for Contamination-Resistant Anomaly Detection [81.07346419422605]
Anomaly detection aims at identifying deviant samples from the normal data distribution.
Contrastive learning has provided a successful way to learn sample representations that enable effective discrimination of anomalies.
We propose a novel hierarchical semi-supervised contrastive learning framework for contamination-resistant anomaly detection.
arXiv Detail & Related papers (2022-07-24T18:49:26Z)
- Simple Adaptive Projection with Pretrained Features for Anomaly Detection [0.0]
We propose a novel adaptation framework including simple linear transformation and self-attention.
Our simple adaptive projection with pretrained features (SAP2) yields a novel anomaly detection criterion.
arXiv Detail & Related papers (2021-12-05T15:29:59Z)
- CASTLE: Regularization via Auxiliary Causal Graph Discovery [89.74800176981842]
We introduce Causal Structure Learning (CASTLE) regularization and propose to regularize a neural network by jointly learning the causal relationships between variables.
CASTLE efficiently reconstructs only the features in the causal DAG that have a causal neighbor, whereas reconstruction-based regularizers suboptimally reconstruct all input features.
arXiv Detail & Related papers (2020-09-28T09:49:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.