CheXLearner: Text-Guided Fine-Grained Representation Learning for Progression Detection
- URL: http://arxiv.org/abs/2505.06903v1
- Date: Sun, 11 May 2025 08:51:38 GMT
- Title: CheXLearner: Text-Guided Fine-Grained Representation Learning for Progression Detection
- Authors: Yuanzhuo Wang, Junwen Duan, Xinyu Li, Jianxin Wang
- Abstract summary: We present CheXLearner, the first end-to-end framework that unifies anatomical region detection, structure alignment, and semantic guidance. Our proposed Med-Manifold Alignment Module (Med-MAM) leverages hyperbolic geometry to robustly align anatomical structures. Our model attains a 91.52% average AUC score in downstream disease classification, validating its superior feature representation.
- Score: 14.414457048968439
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Temporal medical image analysis is essential for clinical decision-making, yet existing methods either align images and text at a coarse level - causing potential semantic mismatches - or depend solely on visual information, lacking medical semantic integration. We present CheXLearner, the first end-to-end framework that unifies anatomical region detection, Riemannian manifold-based structure alignment, and fine-grained regional semantic guidance. Our proposed Med-Manifold Alignment Module (Med-MAM) leverages hyperbolic geometry to robustly align anatomical structures and capture pathologically meaningful discrepancies across temporal chest X-rays. By introducing regional progression descriptions as supervision, CheXLearner achieves enhanced cross-modal representation learning and supports dynamic low-level feature optimization. Experiments show that CheXLearner achieves 81.12% (+17.2%) average accuracy and 80.32% (+11.05%) F1-score on anatomical region progression detection - substantially outperforming state-of-the-art baselines, especially in structurally complex regions. Additionally, our model attains a 91.52% average AUC score in downstream disease classification, validating its superior feature representation.
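As a rough illustration of the manifold-alignment idea, the sketch below computes a geodesic distance on the Poincaré ball (a standard model of hyperbolic space) and uses it as an alignment loss between temporal anatomical-region embeddings. The function names and the loss form are illustrative assumptions, not the paper's actual Med-MAM implementation.

```python
import torch

def poincare_distance(x: torch.Tensor, y: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Geodesic distance on the Poincare ball, a standard model of hyperbolic space."""
    sq_x = x.pow(2).sum(dim=-1).clamp(max=1 - eps)   # keep points strictly inside the unit ball
    sq_y = y.pow(2).sum(dim=-1).clamp(max=1 - eps)
    sq_diff = (x - y).pow(2).sum(dim=-1)
    arg = 1 + 2 * sq_diff / ((1 - sq_x) * (1 - sq_y))
    return torch.acosh(arg.clamp(min=1 + eps))

def region_alignment_loss(prior_emb: torch.Tensor, current_emb: torch.Tensor) -> torch.Tensor:
    """Mean hyperbolic distance between matched anatomical-region embeddings
    from the prior and current chest X-ray (hypothetical loss form)."""
    return poincare_distance(prior_emb, current_emb).mean()
```

Working in hyperbolic rather than Euclidean space is typically motivated by its ability to embed hierarchical structure (such as anatomy-to-subregion relationships) with low distortion.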
Related papers
- RegionMed-CLIP: A Region-Aware Multimodal Contrastive Learning Pre-trained Model for Medical Image Understanding [0.0]
RegionMed-CLIP is a multimodal contrastive learning framework that incorporates localized pathological signals along with holistic semantic representations.
We construct MedRegion-500k, a comprehensive medical image-text corpus that features extensive regional annotations and multilevel clinical descriptions.
Our results highlight the critical importance of region-aware contrastive pre-training and position RegionMed-CLIP as a robust foundation for advancing multimodal medical image understanding.
arXiv Detail & Related papers (2025-08-07T10:32:03Z)
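RegionMed-CLIP's abstract does not give its training objective, but region-aware contrastive pre-training is commonly implemented as a symmetric InfoNCE loss over matched region-text pairs; the sketch below shows that generic pattern, with all names being illustrative.

```python
import torch
import torch.nn.functional as F

def region_text_infonce(region_emb: torch.Tensor, text_emb: torch.Tensor,
                        temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE over (N, D) region/text embeddings, where row i of each
    tensor describes the same image region (a generic sketch, not the paper's code)."""
    region_emb = F.normalize(region_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = region_emb @ text_emb.t() / temperature        # (N, N) similarity matrix
    targets = torch.arange(logits.size(0), device=logits.device)
    # Contrast regions against texts and texts against regions.
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))
```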
- GRASPing Anatomy to Improve Pathology Segmentation [67.98147643529309]
We introduce GRASP, a modular plug-and-play framework that enhances pathology segmentation models.
We evaluate GRASP on two PET/CT datasets, conduct systematic ablation studies, and investigate the framework's inner workings.
arXiv Detail & Related papers (2025-08-05T12:26:36Z)
- RadFabric: Agentic AI System with Reasoning Capability for Radiology [61.25593938175618]
RadFabric is a multi-agent, multimodal reasoning framework that unifies visual and textual analysis for comprehensive CXR interpretation.
The system employs specialized CXR agents for pathology detection, an Anatomical Interpretation Agent to map visual findings to precise anatomical structures, and a Reasoning Agent powered by large multimodal reasoning models to synthesize visual, anatomical, and clinical data into transparent, evidence-based diagnoses.
arXiv Detail & Related papers (2025-06-17T03:10:33Z)
- CXReasonBench: A Benchmark for Evaluating Structured Diagnostic Reasoning in Chest X-rays [9.051771615770075]
We present CheXStruct and CXReasonBench, a structured pipeline and benchmark built on the publicly available MIMIC-CXR-JPG dataset.
CheXStruct automatically derives a sequence of intermediate reasoning steps directly from chest X-rays.
CXReasonBench leverages this pipeline to evaluate whether models can perform clinically valid reasoning steps.
arXiv Detail & Related papers (2025-05-23T16:44:21Z)
- Myocardial Region-guided Feature Aggregation Net for Automatic Coronary artery Segmentation and Stenosis Assessment using Coronary Computed Tomography Angiography [13.885760158090692]
Myocardial Region-guided Feature Aggregation Net is a novel U-shaped dual-encoder architecture that integrates anatomical prior knowledge to enhance robustness in coronary artery segmentation.
Our framework incorporates three key innovations: (1) a Myocardial Region-guided Module that directs attention to coronary regions via bridging expansion and multi-scale feature fusion, (2) a Residual Feature Extraction Module that combines parallel spatial-channel attention with residual blocks to enhance local-global feature discrimination, and (3) a Multi-scale Feature Fusion Module for adaptive aggregation of hierarchical vascular features.
arXiv Detail & Related papers (2025-04-27T16:43:52Z)
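A common way to realize this kind of region guidance is to gate the encoder's feature maps with a mask derived from the anatomical prior; the module below is a minimal, hypothetical sketch of that pattern, not the paper's actual Myocardial Region-guided Module.

```python
import torch
import torch.nn as nn

class RegionGuidedGate(nn.Module):
    """Gate feature maps with an anatomical prior mask (e.g., a myocardial
    segmentation); a generic sketch of region-guided attention."""
    def __init__(self, channels: int):
        super().__init__()
        self.proj = nn.Conv2d(1, channels, kernel_size=1)   # lift mask to feature depth

    def forward(self, feats: torch.Tensor, region_prior: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W); region_prior: (B, 1, H, W) with values in [0, 1]
        gate = torch.sigmoid(self.proj(region_prior))
        return feats * gate + feats                         # residual gating
```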
- Multi-Class Segmentation of Aortic Branches and Zones in Computed Tomography Angiography: The AortaSeg24 Challenge [55.252714550918824]
The AortaSeg24 MICCAI Challenge introduced the first dataset of 100 CTA volumes annotated for 23 clinically relevant aortic branches and zones.
This paper presents the challenge design, dataset details, evaluation metrics, and an in-depth analysis of the top-performing algorithms.
arXiv Detail & Related papers (2025-02-07T21:09:05Z)
- Bridging Classification and Segmentation in Osteosarcoma Assessment via Foundation and Discrete Diffusion Models [3.2090645669282045]
We introduce FDDM, a novel framework bridging the gap between patch classification and region-based segmentation.
FDDM operates in two stages: patch-based classification followed by region-based refinement, enabling cross-patch information integration.
This framework sets a new benchmark in osteosarcoma assessment, highlighting the potential of foundation models and diffusion-based refinements.
arXiv Detail & Related papers (2025-01-03T18:06:18Z)
- MLVICX: Multi-Level Variance-Covariance Exploration for Chest X-ray Self-Supervised Representation Learning [6.4136876268620115]
MLVICX is an approach to capture rich representations in the form of embeddings from chest X-ray images.
We demonstrate the performance of MLVICX in advancing self-supervised chest X-ray representation learning.
arXiv Detail & Related papers (2024-03-18T06:19:37Z)
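MLVICX's abstract does not spell out its loss, but variance-covariance objectives in self-supervised learning typically follow the VICReg pattern: keep each embedding dimension's variance above a floor and decorrelate dimensions. The sketch below shows that generic regularizer; the exact multi-level formulation in MLVICX may differ.

```python
import torch

def variance_covariance_penalty(z: torch.Tensor, gamma: float = 1.0,
                                eps: float = 1e-4) -> torch.Tensor:
    """VICReg-style regularizer on a batch of embeddings z of shape (N, D):
    hinge the per-dimension std above gamma and penalize off-diagonal covariance."""
    z = z - z.mean(dim=0)
    std = torch.sqrt(z.var(dim=0) + eps)
    var_loss = torch.relu(gamma - std).mean()       # keep each dimension informative
    n, d = z.shape
    cov = (z.t() @ z) / (n - 1)
    off_diag = cov - torch.diag(torch.diag(cov))
    cov_loss = off_diag.pow(2).sum() / d            # decorrelate dimensions
    return var_loss + cov_loss
```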
- Class Attention to Regions of Lesion for Imbalanced Medical Image Recognition [59.28732531600606]
We propose a framework named Class Attention to REgions of the lesion (CARE) to handle data imbalance issues.
The CARE framework needs bounding boxes to represent the lesion regions of rare diseases.
Results show that the CARE variants with automated bounding box generation are comparable to the original CARE framework.
arXiv Detail & Related papers (2023-07-19T15:19:02Z)
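One simple way to turn lesion bounding boxes into attention, as CARE's summary suggests, is to upweight the feature-map cells that fall inside each box; the helper below is a hypothetical sketch of that idea, not CARE's actual mechanism.

```python
import torch

def bbox_attention(feats: torch.Tensor, boxes, boost: float = 2.0) -> torch.Tensor:
    """Upweight feature-map locations inside lesion bounding boxes.

    feats: (B, C, H, W) feature maps
    boxes: one (x1, y1, x2, y2) box per sample, in feature-map coordinates
    """
    mask = torch.ones(feats.size(0), 1, feats.size(2), feats.size(3), device=feats.device)
    for i, (x1, y1, x2, y2) in enumerate(boxes):
        mask[i, :, y1:y2, x1:x2] = boost            # emphasize the lesion region
    return feats * mask
```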
- XVertNet: Unsupervised Contrast Enhancement of Vertebral Structures with Dynamic Self-Tuning Guidance and Multi-Stage Analysis [1.3584858315758948]
Chest X-rays remain the primary diagnostic tool in emergency medicine, yet their limited ability to capture fine anatomical details can result in missed or delayed diagnoses.
We introduce XVertNet, a novel deep-learning framework designed to significantly enhance vertebral structure visualization in X-ray images.
arXiv Detail & Related papers (2023-06-06T19:36:11Z)
- Orientation-Shared Convolution Representation for CT Metal Artifact Learning [63.67718355820655]
During X-ray computed tomography (CT) scanning, metallic implants carried by patients often lead to adverse artifacts.
Existing deep-learning-based methods have achieved promising reconstruction performance.
We propose an orientation-shared convolution representation strategy to adapt to the physical prior structures of artifacts.
arXiv Detail & Related papers (2022-12-26T13:56:12Z)
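A minimal way to share one convolution kernel across orientations, loosely in the spirit of this strategy, is to apply the same kernel at several rotations and pool the responses. The module below sketches this with 90-degree rotations, which is a simplification of the paper's actual filter parameterization.

```python
import torch
import torch.nn.functional as F

class OrientationSharedConv(torch.nn.Module):
    """Apply one shared kernel at four 90-degree rotations and max-pool the
    responses (a simplified sketch of orientation-shared convolution)."""
    def __init__(self, in_ch: int, out_ch: int, k: int = 3):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        pad = self.weight.shape[-1] // 2
        outs = [F.conv2d(x, torch.rot90(self.weight, r, dims=(-2, -1)), padding=pad)
                for r in range(4)]
        return torch.stack(outs, dim=0).max(dim=0).values   # orientation pooling
```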
- Improving Classification Model Performance on Chest X-Rays through Lung Segmentation [63.45024974079371]
We propose a deep learning approach that enhances abnormal chest X-ray (CXR) identification performance through lung segmentation.
Our approach is cascaded, with two modules: a deep neural network with criss-cross attention (XLSor) that localizes the lung regions in CXR images, and a CXR classification model whose backbone is a self-supervised momentum contrast (MoCo) model pre-trained on large-scale CXR datasets.
arXiv Detail & Related papers (2022-02-22T15:24:06Z)
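The two-module cascade can be pictured as: segment the lungs, restrict the image to the predicted lung region, then classify the masked image with the MoCo-pre-trained model. The sketch below assumes both models are pre-trained callables; the names and the masking step are illustrative, not the paper's exact pipeline.

```python
import torch

def cascaded_cxr_classify(image: torch.Tensor, lung_segmenter, classifier,
                          threshold: float = 0.5) -> torch.Tensor:
    """Two-stage cascade: mask the CXR to its predicted lung region, then classify."""
    with torch.no_grad():
        lung_prob = lung_segmenter(image)           # (B, 1, H, W) lung probability map
    lung_mask = (lung_prob > threshold).float()
    masked = image * lung_mask                      # keep only lung pixels
    return classifier(masked)                       # abnormality logits
```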
- SQUID: Deep Feature In-Painting for Unsupervised Anomaly Detection [76.01333073259677]
We propose the use of Space-aware Memory Queues for In-painting and Detecting anomalies from radiography images (abbreviated as SQUID).
We show that SQUID can taxonomize ingrained anatomical structures into recurrent patterns, and at inference it can identify anomalies (unseen or modified patterns) in the image.
arXiv Detail & Related papers (2021-11-26T13:47:34Z)
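In-painting-based anomaly detection generally scores an image by how poorly a model trained on normal data can reconstruct it; the sketch below shows that generic scoring step only, not SQUID's space-aware memory-queue machinery.

```python
import torch

def inpainting_anomaly_score(image: torch.Tensor, inpainting_model) -> torch.Tensor:
    """Per-image anomaly score as the reconstruction error of an in-painting model
    trained only on normal radiographs (generic sketch)."""
    with torch.no_grad():
        recon = inpainting_model(image)             # in-paints from learned normal patterns
    return (image - recon).pow(2).mean(dim=(1, 2, 3))  # high error => likely anomalous
```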
- Anatomy X-Net: A Semi-Supervised Anatomy Aware Convolutional Neural Network for Thoracic Disease Classification [3.888080947524813]
This work proposes an anatomy-aware attention-based architecture named Anatomy X-Net.
It prioritizes spatial features guided by pre-identified anatomy regions.
Our proposed method sets new state-of-the-art performance on the official NIH test set with an AUC score of 0.8439.
arXiv Detail & Related papers (2021-06-10T17:01:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.