Self-Supervised Multiview Xray Matching
- URL: http://arxiv.org/abs/2507.00287v1
- Date: Mon, 30 Jun 2025 21:56:14 GMT
- Title: Self-Supervised Multiview Xray Matching
- Authors: Mohamad Dabboussi, Malo Huard, Yann Gousseau, Pietro Gori,
- Abstract summary: Current methods often struggle to establish robust correspondences between different X-ray views.<n>We present a novel self-supervised pipeline that eliminates the need for manual annotation.<n>Our approach incorporates a transformer-based training phase to accurately predict correspondences across two or more X-ray views.
- Score: 4.033064933995391
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate interpretation of multi-view radiographs is crucial for diagnosing fractures, muscular injuries, and other anomalies. While significant advances have been made in AI-based analysis of single images, current methods often struggle to establish robust correspondences between different X-ray views, an essential capability for precise clinical evaluations. In this work, we present a novel self-supervised pipeline that eliminates the need for manual annotation by automatically generating a many-to-many correspondence matrix between synthetic X-ray views. This is achieved using digitally reconstructed radiographs (DRR), which are automatically derived from unannotated CT volumes. Our approach incorporates a transformer-based training phase to accurately predict correspondences across two or more X-ray views. Furthermore, we demonstrate that learning correspondences among synthetic X-ray views can be leveraged as a pretraining strategy to enhance automatic multi-view fracture detection on real data. Extensive evaluations on both synthetic and real X-ray datasets show that incorporating correspondences improves performance in multi-view fracture classification.
Related papers
- RadFabric: Agentic AI System with Reasoning Capability for Radiology [61.25593938175618]
RadFabric is a multi agent, multimodal reasoning framework that unifies visual and textual analysis for comprehensive CXR interpretation.<n>System employs specialized CXR agents for pathology detection, an Anatomical Interpretation Agent to map visual findings to precise anatomical structures, and a Reasoning Agent powered by large multimodal reasoning models to synthesize visual, anatomical, and clinical data into transparent and evidence based diagnoses.
arXiv Detail & Related papers (2025-06-17T03:10:33Z) - Automatic Multi-View X-Ray/CT Registration Using Bone Substructure Contours [0.5949599220326207]
We propose a novel multi-view X-ray/CT registration method for intraoperative bone registration.<n>Method consists of a multi-view, contour-based iterative closest point (ICP) optimization.<n>Method consistently achieves sub-millimeter accuracy with a mRPD 0.67mm compared to 5.35mm by a commercial solution.
arXiv Detail & Related papers (2025-06-16T09:33:37Z) - Enhancing Prohibited Item Detection through X-ray-Specific Augmentation and Contextual Feature Integration [81.11400642272976]
X-ray prohibited item detection faces challenges due to the long-tail distribution and unique characteristics of X-ray imaging.<n>Traditional data augmentation strategies, such as copy-paste and mixup, are ineffective at improving the detection of rare items.<n>We propose the X-ray Imaging-driven Detection Network (XIDNet) to address these challenges.
arXiv Detail & Related papers (2024-11-27T06:13:56Z) - Structural Entities Extraction and Patient Indications Incorporation for Chest X-ray Report Generation [10.46031380503486]
We introduce a novel method, textbfStructural textbfEntities extraction and patient indications textbfIncorporation (SEI) for chest X-ray report generation.
We employ a structural entities extraction (SEE) approach to eliminate presentation-style vocabulary in reports.
We propose a cross-modal fusion network to integrate information from X-ray images, similar historical cases, and patient-specific indications.
arXiv Detail & Related papers (2024-05-23T01:29:47Z) - Multi-view X-ray Image Synthesis with Multiple Domain Disentanglement from CT Scans [10.72672892416061]
Over-dosed X-rays superimpose potential risks to human health to some extent.
Data-driven algorithms from volume scans to X-ray images are restricted by the scarcity of paired X-ray and volume data.
We propose CT2X-GAN to synthesize the X-ray images in an end-to-end manner using the content and style disentanglement from three different image domains.
arXiv Detail & Related papers (2024-04-18T04:25:56Z) - Eye-gaze Guided Multi-modal Alignment for Medical Representation Learning [65.54680361074882]
Eye-gaze Guided Multi-modal Alignment (EGMA) framework harnesses eye-gaze data for better alignment of medical visual and textual features.
We conduct downstream tasks of image classification and image-text retrieval on four medical datasets.
arXiv Detail & Related papers (2024-03-19T03:59:14Z) - Radiology Report Generation Using Transformers Conditioned with
Non-imaging Data [55.17268696112258]
This paper proposes a novel multi-modal transformer network that integrates chest x-ray (CXR) images and associated patient demographic information.
The proposed network uses a convolutional neural network to extract visual features from CXRs and a transformer-based encoder-decoder network that combines the visual features with semantic text embeddings of patient demographic information.
arXiv Detail & Related papers (2023-11-18T14:52:26Z) - Beyond Images: An Integrative Multi-modal Approach to Chest X-Ray Report
Generation [47.250147322130545]
Image-to-text radiology report generation aims to automatically produce radiology reports that describe the findings in medical images.
Most existing methods focus solely on the image data, disregarding the other patient information accessible to radiologists.
We present a novel multi-modal deep neural network framework for generating chest X-rays reports by integrating structured patient data, such as vital signs and symptoms, alongside unstructured clinical notes.
arXiv Detail & Related papers (2023-11-18T14:37:53Z) - An Empirical Analysis for Zero-Shot Multi-Label Classification on
COVID-19 CT Scans and Uncurated Reports [0.5527944417831603]
pandemic resulted in vast repositories of unstructured data, including radiology reports, due to increased medical examinations.
Previous research on automated diagnosis of COVID-19 primarily focuses on X-ray images, despite their lower precision compared to computed tomography (CT) scans.
In this work, we leverage unstructured data from a hospital and harness the fine-grained details offered by CT scans to perform zero-shot multi-label classification based on contrastive visual language learning.
arXiv Detail & Related papers (2023-09-04T17:58:01Z) - Cross-Modal Contrastive Learning for Abnormality Classification and
Localization in Chest X-rays with Radiomics using a Feedback Loop [63.81818077092879]
We propose an end-to-end semi-supervised cross-modal contrastive learning framework for medical images.
We first apply an image encoder to classify the chest X-rays and to generate the image features.
The radiomic features are then passed through another dedicated encoder to act as the positive sample for the image features generated from the same chest X-ray.
arXiv Detail & Related papers (2021-04-11T09:16:29Z) - XraySyn: Realistic View Synthesis From a Single Radiograph Through CT
Priors [118.27130593216096]
A radiograph visualizes the internal anatomy of a patient through the use of X-ray, which projects 3D information onto a 2D plane.
To the best of our knowledge, this is the first work on radiograph view synthesis.
We show that by gaining an understanding of radiography in 3D space, our method can be applied to radiograph bone extraction and suppression without groundtruth bone labels.
arXiv Detail & Related papers (2020-12-04T05:08:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.