Related papers: Self-Supervised Multiview Xray Matching

Self-Supervised Multiview Xray Matching

URL: http://arxiv.org/abs/2507.00287v1
Date: Mon, 30 Jun 2025 21:56:14 GMT
Title: Self-Supervised Multiview Xray Matching
Authors: Mohamad Dabboussi, Malo Huard, Yann Gousseau, Pietro Gori,
Abstract summary: Current methods often struggle to establish robust correspondences between different X-ray views.<n>We present a novel self-supervised pipeline that eliminates the need for manual annotation.<n>Our approach incorporates a transformer-based training phase to accurately predict correspondences across two or more X-ray views.
Score: 4.033064933995391
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Accurate interpretation of multi-view radiographs is crucial for diagnosing fractures, muscular injuries, and other anomalies. While significant advances have been made in AI-based analysis of single images, current methods often struggle to establish robust correspondences between different X-ray views, an essential capability for precise clinical evaluations. In this work, we present a novel self-supervised pipeline that eliminates the need for manual annotation by automatically generating a many-to-many correspondence matrix between synthetic X-ray views. This is achieved using digitally reconstructed radiographs (DRR), which are automatically derived from unannotated CT volumes. Our approach incorporates a transformer-based training phase to accurately predict correspondences across two or more X-ray views. Furthermore, we demonstrate that learning correspondences among synthetic X-ray views can be leveraged as a pretraining strategy to enhance automatic multi-view fracture detection on real data. Extensive evaluations on both synthetic and real X-ray datasets show that incorporating correspondences improves performance in multi-view fracture classification.

Related papers

Enhancing Radiographic Disease Detection with MetaCheX, a Context-Aware Multimodal Model [0.0]
Existing deep learning models for chest radiology often neglect patient metadata, limiting diagnostic accuracy and fairness.<n>We introduce MetaCheX, a novel framework that integrates chest X-ray images with structured patient metadata to replicate clinical decision-making.
arXiv Detail & Related papers (2025-09-15T00:44:44Z)
RadFabric: Agentic AI System with Reasoning Capability for Radiology [61.25593938175618]
RadFabric is a multi agent, multimodal reasoning framework that unifies visual and textual analysis for comprehensive CXR interpretation.<n>System employs specialized CXR agents for pathology detection, an Anatomical Interpretation Agent to map visual findings to precise anatomical structures, and a Reasoning Agent powered by large multimodal reasoning models to synthesize visual, anatomical, and clinical data into transparent and evidence based diagnoses.
arXiv Detail & Related papers (2025-06-17T03:10:33Z)
Automatic Multi-View X-Ray/CT Registration Using Bone Substructure Contours [0.5949599220326207]
We propose a novel multi-view X-ray/CT registration method for intraoperative bone registration.<n>Method consists of a multi-view, contour-based iterative closest point (ICP) optimization.<n>Method consistently achieves sub-millimeter accuracy with a mRPD 0.67mm compared to 5.35mm by a commercial solution.
arXiv Detail & Related papers (2025-06-16T09:33:37Z)
Enhancing Prohibited Item Detection through X-ray-Specific Augmentation and Contextual Feature Integration [81.11400642272976]
X-ray prohibited item detection faces challenges due to the long-tail distribution and unique characteristics of X-ray imaging.<n>Traditional data augmentation strategies, such as copy-paste and mixup, are ineffective at improving the detection of rare items.<n>We propose the X-ray Imaging-driven Detection Network (XIDNet) to address these challenges.
arXiv Detail & Related papers (2024-11-27T06:13:56Z)
Structural Entities Extraction and Patient Indications Incorporation for Chest X-ray Report Generation [10.46031380503486]
We introduce a novel method, textbfStructural textbfEntities extraction and patient indications textbfIncorporation (SEI) for chest X-ray report generation. We employ a structural entities extraction (SEE) approach to eliminate presentation-style vocabulary in reports. We propose a cross-modal fusion network to integrate information from X-ray images, similar historical cases, and patient-specific indications.
arXiv Detail & Related papers (2024-05-23T01:29:47Z)
Multi-view X-ray Image Synthesis with Multiple Domain Disentanglement from CT Scans [10.72672892416061]
Over-dosed X-rays superimpose potential risks to human health to some extent. Data-driven algorithms from volume scans to X-ray images are restricted by the scarcity of paired X-ray and volume data. We propose CT2X-GAN to synthesize the X-ray images in an end-to-end manner using the content and style disentanglement from three different image domains.
arXiv Detail & Related papers (2024-04-18T04:25:56Z)
Eye-gaze Guided Multi-modal Alignment for Medical Representation Learning [65.54680361074882]
Eye-gaze Guided Multi-modal Alignment (EGMA) framework harnesses eye-gaze data for better alignment of medical visual and textual features. We conduct downstream tasks of image classification and image-text retrieval on four medical datasets.
arXiv Detail & Related papers (2024-03-19T03:59:14Z)
Radiology Report Generation Using Transformers Conditioned with Non-imaging Data [55.17268696112258]
This paper proposes a novel multi-modal transformer network that integrates chest x-ray (CXR) images and associated patient demographic information. The proposed network uses a convolutional neural network to extract visual features from CXRs and a transformer-based encoder-decoder network that combines the visual features with semantic text embeddings of patient demographic information.
arXiv Detail & Related papers (2023-11-18T14:52:26Z)
Beyond Images: An Integrative Multi-modal Approach to Chest X-Ray Report Generation [47.250147322130545]
Image-to-text radiology report generation aims to automatically produce radiology reports that describe the findings in medical images. Most existing methods focus solely on the image data, disregarding the other patient information accessible to radiologists. We present a novel multi-modal deep neural network framework for generating chest X-rays reports by integrating structured patient data, such as vital signs and symptoms, alongside unstructured clinical notes.
arXiv Detail & Related papers (2023-11-18T14:37:53Z)
An Empirical Analysis for Zero-Shot Multi-Label Classification on COVID-19 CT Scans and Uncurated Reports [0.5527944417831603]
pandemic resulted in vast repositories of unstructured data, including radiology reports, due to increased medical examinations. Previous research on automated diagnosis of COVID-19 primarily focuses on X-ray images, despite their lower precision compared to computed tomography (CT) scans. In this work, we leverage unstructured data from a hospital and harness the fine-grained details offered by CT scans to perform zero-shot multi-label classification based on contrastive visual language learning.
arXiv Detail & Related papers (2023-09-04T17:58:01Z)
Cross-Modal Contrastive Learning for Abnormality Classification and Localization in Chest X-rays with Radiomics using a Feedback Loop [63.81818077092879]
We propose an end-to-end semi-supervised cross-modal contrastive learning framework for medical images. We first apply an image encoder to classify the chest X-rays and to generate the image features. The radiomic features are then passed through another dedicated encoder to act as the positive sample for the image features generated from the same chest X-ray.
arXiv Detail & Related papers (2021-04-11T09:16:29Z)
Deep Co-Attention Network for Multi-View Subspace Learning [73.3450258002607]
We propose a deep co-attention network for multi-view subspace learning. It aims to extract both the common information and the complementary information in an adversarial setting. In particular, it uses a novel cross reconstruction loss and leverages the label information to guide the construction of the latent representation.
arXiv Detail & Related papers (2021-02-15T18:46:44Z)
XraySyn: Realistic View Synthesis From a Single Radiograph Through CT Priors [118.27130593216096]
A radiograph visualizes the internal anatomy of a patient through the use of X-ray, which projects 3D information onto a 2D plane. To the best of our knowledge, this is the first work on radiograph view synthesis. We show that by gaining an understanding of radiography in 3D space, our method can be applied to radiograph bone extraction and suppression without groundtruth bone labels.
arXiv Detail & Related papers (2020-12-04T05:08:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.