AFiRe: Anatomy-Driven Self-Supervised Learning for Fine-Grained Representation in Radiographic Images
- URL: http://arxiv.org/abs/2504.10972v2
- Date: Tue, 22 Apr 2025 08:38:33 GMT
- Title: AFiRe: Anatomy-Driven Self-Supervised Learning for Fine-Grained Representation in Radiographic Images
- Authors: Yihang Liu, Lianghua He, Ying Wen, Longzhen Yang, Hongzhou Chen
- Abstract summary: We propose an Anatomy-driven self-supervised framework for enhancing Fine-grained Representation in radiographic image analysis (AFiRe). The core idea of AFiRe is to align the anatomical consistency with the unique token-processing characteristics of Vision Transformer. Experimental results show that AFiRe provides robust anatomical discrimination, achieving more cohesive feature clusters compared to state-of-the-art contrastive learning methods.
- Score: 7.647881928269929
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current self-supervised methods, such as contrastive learning, predominantly focus on global discrimination, neglecting the critical fine-grained anatomical details required for accurate radiographic analysis. To address this challenge, we propose an Anatomy-driven self-supervised framework for enhancing Fine-grained Representation in radiographic image analysis (AFiRe). The core idea of AFiRe is to align the anatomical consistency with the unique token-processing characteristics of Vision Transformer. Specifically, AFiRe synergistically performs two self-supervised schemes: (i) Token-wise anatomy-guided contrastive learning, which aligns image tokens based on structural and categorical consistency, thereby enhancing fine-grained spatial-anatomical discrimination; (ii) Pixel-level anomaly-removal restoration, which particularly focuses on local anomalies, thereby refining the learned discrimination with detailed geometrical information. Additionally, we propose Synthetic Lesion Mask to enhance anatomical diversity while preserving intra-consistency, which is typically corrupted by traditional data augmentations, such as Cropping and Affine transformations. Experimental results show that AFiRe: (i) provides robust anatomical discrimination, achieving more cohesive feature clusters compared to state-of-the-art contrastive learning methods; (ii) demonstrates superior generalization, surpassing 7 radiography-specific self-supervised methods in multi-label classification tasks with limited labeling; and (iii) integrates fine-grained information, enabling precise anomaly detection using only image-level annotations.
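As a rough illustration of the token-wise contrastive idea described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that token i of one view and token i of a second view of the same radiograph form a positive pair, and that all other tokens of the second view act as negatives; the abstract does not state the exact pairing rule or loss. The function names (`token_info_nce`, `make_views`), the temperature value, and the random "token embeddings" are hypothetical and chosen only for demonstration. The loss is a standard InfoNCE objective, swapped in as a generic stand-in.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length (returns a copy if the norm is zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def token_info_nce(tokens_a, tokens_b, temperature=0.1):
    """Mean InfoNCE loss over tokens.

    Assumption (not from the paper): token i in view A is the positive
    for token i in view B; every other token in view B is a negative.
    """
    if len(tokens_a) != len(tokens_b):
        raise ValueError("both views must contain the same number of tokens")
    a = [l2_normalize(t) for t in tokens_a]
    b = [l2_normalize(t) for t in tokens_b]
    losses = []
    for i, anchor in enumerate(a):
        # Cosine similarity of the anchor to every token in the other view.
        logits = [sum(x * y for x, y in zip(anchor, other)) / temperature
                  for other in b]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


def make_views(num_tokens=16, dim=8, noise=0.01, seed=0):
    """Hypothetical data: view B is view A plus small Gaussian noise."""
    rng = random.Random(seed)
    view_a = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_tokens)]
    view_b = [[x + rng.gauss(0, noise) for x in tok] for tok in view_a]
    return view_a, view_b


view_a, view_b = make_views()
aligned = token_info_nce(view_a, view_b)
# Rotating view B by one position breaks the token-to-token correspondence.
misaligned = token_info_nce(view_a, view_b[1:] + view_b[:1])
print(f"aligned loss: {aligned:.3f}, misaligned loss: {misaligned:.3f}")
```

In this toy setup the loss is low when corresponding tokens stay at the same index and rises when the correspondence is broken. This mirrors, in a very simplified way, why augmentations that shift token positions (such as cropping or affine transformations) would interfere with a position-based token alignment.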
Related papers
- A Semi-Supervised Approach with Error Reflection for Echocardiography Segmentation [21.72866654935505]
We propose an error reflection strategy for a semi-supervised echocardiography segmentation architecture.
The strategy triggers the model to reflect on inaccuracies in unlabeled image segmentation, thereby enhancing the robustness of pseudo-label generation.
We also introduce an effective data augmentation strategy, termed the multi-scale mixing-up strategy, to minimize the empirical distribution gap between labeled and unlabeled images.
arXiv Detail & Related papers (2024-12-01T07:35:09Z) - Orthogonal Subspace Decomposition for Generalizable AI-Generated Image Detection [58.87142367781417]
A naively trained detector tends to overfit to the limited and monotonous fake patterns, causing the feature space to become highly constrained and low-ranked. One potential remedy is incorporating the pre-trained knowledge within the vision foundation models to expand the feature space. By freezing the principal components and adapting only the remaining components, we preserve the pre-trained knowledge while learning forgery-related patterns.
arXiv Detail & Related papers (2024-11-23T19:10:32Z) - Advancing Medical Image Segmentation: Morphology-Driven Learning with Diffusion Transformer [4.672688418357066]
We propose a novel Transformer Diffusion (DTS) model for robust segmentation in the presence of noise.
Our model, which analyzes the morphological representation of images, shows better results than the previous models in various medical imaging modalities.
arXiv Detail & Related papers (2024-08-01T07:35:54Z) - Introducing Shape Prior Module in Diffusion Model for Medical Image Segmentation [7.7545714516743045]
We propose an end-to-end framework called VerseDiff-UNet, which leverages the denoising diffusion probabilistic model (DDPM).
Our approach integrates the diffusion model into a standard U-shaped architecture.
We evaluate our method on a single dataset of spine images acquired through X-ray imaging.
arXiv Detail & Related papers (2023-09-12T03:05:00Z) - Employing similarity to highlight differences: On the impact of anatomical assumptions in chest X-ray registration methods [2.080328156648695]
We develop an anatomically penalized convolutional multi-stage solution on the National Institutes of Health (NIH) data set.
Our method proves to be a natural way to limit the folding percentage of the warp field to 1/6 of the state of the art.
We statistically evaluate the benefits of our method and highlight the limits of currently used metrics for registration of chest X-rays.
arXiv Detail & Related papers (2023-01-23T09:42:49Z) - GraVIS: Grouping Augmented Views from Independent Sources for Dermatology Analysis [52.04899592688968]
We propose GraVIS, which is specifically optimized for learning self-supervised features from dermatology images.
GraVIS significantly outperforms its transfer learning and self-supervised learning counterparts in both lesion segmentation and disease classification tasks.
arXiv Detail & Related papers (2023-01-11T11:38:37Z) - Improving Radiology Summarization with Radiograph and Anatomy Prompts [60.30659124918211]
We propose a novel anatomy-enhanced multimodal model to promote impression generation.
In detail, we first construct a set of rules to extract anatomies and put these prompts into each sentence to highlight anatomy characteristics.
We utilize a contrastive learning module to align these two representations at the overall level and use a co-attention to fuse them at the sentence level.
arXiv Detail & Related papers (2022-10-15T14:05:03Z) - Mine yOur owN Anatomy: Revisiting Medical Image Segmentation with Extremely Limited Labels [54.58539616385138]
We introduce a novel semi-supervised 2D medical image segmentation framework termed Mine yOur owN Anatomy (MONA).
First, prior work argues that every pixel equally matters to the model training; we observe empirically that this alone is unlikely to define meaningful anatomical features.
Second, we construct a set of objectives that encourage the model to be capable of decomposing medical images into a collection of anatomical features.
arXiv Detail & Related papers (2022-09-27T15:50:31Z) - SQUID: Deep Feature In-Painting for Unsupervised Anomaly Detection [76.01333073259677]
We propose the use of Space-aware Memory Queues for In-painting and Detecting anomalies from radiography images (abbreviated as SQUID).
We show that SQUID can taxonomize the ingrained anatomical structures into recurrent patterns; and in the inference, it can identify anomalies (unseen/modified patterns) in the image.
arXiv Detail & Related papers (2021-11-26T13:47:34Z) - Cross Chest Graph for Disease Diagnosis with Structural Relational Reasoning [2.7148274921314615]
Locating lesions is important in the computer-aided diagnosis of X-ray images.
General weakly-supervised methods have failed to consider the characteristics of X-ray images.
We propose the Cross-chest Graph (CCG), which improves the performance of automatic lesion detection.
arXiv Detail & Related papers (2021-01-22T08:24:04Z) - Few-shot Medical Image Segmentation using a Global Correlation Network with Discriminative Embedding [60.89561661441736]
We propose a novel method for few-shot medical image segmentation.
We construct our few-shot image segmentor using a deep convolutional network trained episodically.
We enhance discriminability of deep embedding to encourage clustering of the feature domains of the same class.
arXiv Detail & Related papers (2020-12-10T04:01:07Z) - Multi-label Thoracic Disease Image Classification with Cross-Attention Networks [65.37531731899837]
We propose a novel scheme of Cross-Attention Networks (CAN) for automated thoracic disease classification from chest x-ray images.
We also design a new loss function that goes beyond cross-entropy loss to aid the cross-attention process and to overcome both the imbalance between classes and the dominance of easy samples within each class.
arXiv Detail & Related papers (2020-07-21T14:37:00Z) - Learning to Segment Anatomical Structures Accurately from One Exemplar [34.287877547953194]
Methods that can produce accurate anatomical structure segmentation without a large amount of fully annotated training images are highly desirable.
We propose Contour Transformer Network (CTN), a one-shot anatomy segmentor including a naturally built-in human-in-the-loop mechanism.
We demonstrate that our one-shot learning method significantly outperforms non-learning-based methods and performs competitively to the state-of-the-art fully supervised deep learning approaches.
arXiv Detail & Related papers (2020-07-06T20:27:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.