Self-supervised 3D Patient Modeling with Multi-modal Attentive Fusion
- URL: http://arxiv.org/abs/2403.03217v1
- Date: Tue, 5 Mar 2024 18:58:55 GMT
- Title: Self-supervised 3D Patient Modeling with Multi-modal Attentive Fusion
- Authors: Meng Zheng, Benjamin Planche, Xuan Gong, Fan Yang, Terrence Chen,
Ziyan Wu
- Abstract summary: 3D patient body modeling is critical to the success of automated patient positioning for smart medical scanning and operating rooms.
Existing CNN-based end-to-end patient modeling solutions typically require customized network designs demanding large amounts of relevant training data.
We propose a generic modularized 3D patient modeling method consisting of (a) a multi-modal keypoint detection module with attentive fusion for 2D patient joint localization, and (b) a self-supervised 3D mesh regression module.
We demonstrate the efficacy of the proposed method by extensive patient positioning experiments on both public and clinical data.
- Score: 32.71972792352939
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: 3D patient body modeling is critical to the success of automated patient
positioning for smart medical scanning and operating rooms. Existing CNN-based
end-to-end patient modeling solutions typically require a) customized network
designs demanding large amounts of relevant training data, covering extensive
realistic clinical scenarios (e.g., patient covered by sheets), which leads to
suboptimal generalizability in practical deployment, b) expensive 3D human
model annotations, i.e., requiring a huge amount of manual effort, resulting in
systems that scale poorly. To address these issues, we propose a generic
modularized 3D patient modeling method consisting of (a) a multi-modal keypoint
detection module with attentive fusion for 2D patient joint localization, to
learn complementary cross-modality patient body information, leading to
improved keypoint localization robustness and generalizability in a wide
variety of imaging modalities (e.g., CT, MRI) and clinical scenarios (e.g., heavy
occlusions); and (b) a self-supervised 3D mesh regression module which does not
require expensive 3D mesh parameter annotations to train, bringing immediate
cost benefits for clinical deployment. We demonstrate the efficacy of the
proposed method by extensive patient positioning experiments on both public and
clinical data. Our evaluation results achieve superior patient positioning
performance across various imaging modalities in real clinical scenarios.
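The attentive fusion idea in module (a) can be illustrated with a minimal numpy sketch: per-pixel attention logits are computed for each modality's feature map, a softmax across modalities turns them into fusion weights, and the fused map is the weighted sum. All names, shapes, and the use of simple projection vectors (standing in for learned 1x1 convolutions) are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=0):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attentive_fusion(rgb_feat, depth_feat, w_rgb, w_depth):
    """Fuse two modality feature maps with per-pixel attention.

    rgb_feat, depth_feat : (C, H, W) feature maps from each modality branch.
    w_rgb, w_depth       : (C,) projection vectors producing one scalar
                           attention logit per pixel (hypothetical stand-ins
                           for learned 1x1 convolutions).
    """
    # One scalar attention logit per pixel, per modality.
    logit_rgb = np.tensordot(w_rgb, rgb_feat, axes=1)        # (H, W)
    logit_depth = np.tensordot(w_depth, depth_feat, axes=1)  # (H, W)
    # Softmax across modalities so the weights sum to 1 at every pixel.
    weights = softmax(np.stack([logit_rgb, logit_depth]), axis=0)
    # Weighted sum of the two modality feature maps.
    return weights[0] * rgb_feat + weights[1] * depth_feat   # (C, H, W)

rng = np.random.default_rng(0)
C, H, W = 8, 4, 4
fused = attentive_fusion(rng.normal(size=(C, H, W)),
                         rng.normal(size=(C, H, W)),
                         rng.normal(size=C), rng.normal(size=C))
print(fused.shape)  # (8, 4, 4)
```

Because the weights are normalized per pixel, a modality that is uninformative at a given location (e.g., RGB under sheet occlusion) can be down-weighted there while the other modality dominates.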
Related papers
- Inter-slice Super-resolution of Magnetic Resonance Images by Pre-training and Self-supervised Fine-tuning [49.197385954021456]
In clinical practice, 2D magnetic resonance (MR) sequences are widely adopted. While individual 2D slices can be stacked to form a 3D volume, the relatively large slice spacing can pose challenges for visualization and subsequent analysis tasks.
To reduce slice spacing, deep-learning-based super-resolution techniques are widely investigated.
Most current solutions require a substantial number of paired high-resolution and low-resolution images for supervised training, which are typically unavailable in real-world scenarios.
arXiv Detail & Related papers (2024-06-10T02:20:26Z)
- Deep Learning-Based Brain Image Segmentation for Automated Tumour Detection [0.0]
The objective is to leverage state-of-the-art convolutional neural networks (CNNs) on a large dataset of brain MRI scans for segmentation.
The proposed methodology applies pre-processing techniques for enhanced performance and generalizability.
arXiv Detail & Related papers (2024-04-06T15:09:49Z)
- Three-dimensional Bone Image Synthesis with Generative Adversarial Networks [2.499907423888049]
This work demonstrates that three-dimensional generative adversarial networks (GANs) can be efficiently trained to generate high-resolution medical volumes.
GAN inversion is successfully implemented for the three-dimensional setting and used for extensive research on model interpretability.
arXiv Detail & Related papers (2023-10-26T08:08:17Z)
- On the Localization of Ultrasound Image Slices within Point Distribution Models [84.27083443424408]
Thyroid disorders are most commonly diagnosed using high-resolution Ultrasound (US).
Longitudinal tracking is a pivotal diagnostic protocol for monitoring changes in pathological thyroid morphology.
We present a framework for automated US image slice localization within a 3D shape representation.
arXiv Detail & Related papers (2023-09-01T10:10:46Z)
- BOSS: Bones, Organs and Skin Shape Model [10.50175010474078]
We propose a deformable human shape and pose model that combines skin, internal organs, and bones, learned from CT images.
By modeling the statistical variations in a pose-normalized space using probabilistic PCA, our approach offers a holistic representation of the body.
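The probabilistic PCA model mentioned here has a simple generative form: a shape vector is the mean plus a linear combination of principal axes plus isotropic noise. A minimal numpy sketch, with all dimensions and parameter values chosen arbitrarily for illustration (not taken from the BOSS paper):

```python
import numpy as np

rng = np.random.default_rng(1)
D, d = 30, 3            # hypothetical: 10 surface points (x,y,z), 3 latent dims
mu = rng.normal(size=D)           # mean shape in pose-normalized space
W = rng.normal(size=(D, d))       # principal axes (learned in practice)
sigma2 = 0.01                     # isotropic residual variance

def sample_shape(rng):
    """One draw from the probabilistic PCA generative model:
    x = W z + mu + eps, with z ~ N(0, I) and eps ~ N(0, sigma2 * I)."""
    z = rng.normal(size=d)
    eps = rng.normal(scale=np.sqrt(sigma2), size=D)
    return W @ z + mu + eps

# The empirical mean over many samples recovers the mean shape mu.
shapes = np.stack([sample_shape(rng) for _ in range(1000)])
print(shapes.shape)  # (1000, 30)
```

Working in a pose-normalized space, as the abstract describes, means these statistical variations capture shape differences between subjects rather than articulation.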
arXiv Detail & Related papers (2023-03-08T22:31:24Z)
- DeepAD: A Robust Deep Learning Model of Alzheimer's Disease Progression for Real-World Clinical Applications [0.9999629695552196]
We propose a novel multi-task deep learning model to predict Alzheimer's disease progression.
Our model integrates high dimensional MRI features from a 3D convolutional neural network with other data modalities.
arXiv Detail & Related papers (2022-03-17T05:42:00Z)
- Automated Model Design and Benchmarking of 3D Deep Learning Models for COVID-19 Detection with Chest CT Scans [72.04652116817238]
We propose a differentiable neural architecture search (DNAS) framework to automatically search for the 3D DL models for 3D chest CT scans classification.
We also exploit the Class Activation Mapping (CAM) technique on our models to provide the interpretability of the results.
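Class Activation Mapping, as cited here, highlights the spatial regions that drive a classification: the final convolutional feature maps are weighted by the classifier weights of the target class and summed over channels. A minimal 2D numpy sketch (for the 3D CT case the feature maps would carry an extra depth axis; all shapes below are illustrative assumptions):

```python
import numpy as np

def class_activation_map(feature_maps, fc_weights, class_idx):
    """Class Activation Mapping (CAM).

    feature_maps : (C, H, W) activations before global average pooling.
    fc_weights   : (num_classes, C) weights of the final linear layer.
    Returns a (H, W) heatmap normalized to [0, 1].
    """
    # Channel-wise weighted sum using the target class's classifier weights.
    cam = np.tensordot(fc_weights[class_idx], feature_maps, axes=1)  # (H, W)
    cam = np.maximum(cam, 0)          # keep only positive evidence
    if cam.max() > 0:
        cam = cam / cam.max()         # normalize for visualization
    return cam

rng = np.random.default_rng(2)
cam = class_activation_map(rng.normal(size=(16, 7, 7)),
                           rng.normal(size=(2, 16)), class_idx=1)
print(cam.shape)  # (7, 7)
```

Upsampled to the input resolution, such a heatmap indicates which lung regions contributed most to the COVID-19 prediction, which is the interpretability claim the abstract makes.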
arXiv Detail & Related papers (2021-01-14T03:45:01Z)
- Adversarial Sample Enhanced Domain Adaptation: A Case Study on Predictive Modeling with Electronic Health Records [57.75125067744978]
We propose a data augmentation method to facilitate domain adaptation, in which adversarially generated samples are used during adaptation.
Results confirm the effectiveness of our method and its generality across different tasks.
arXiv Detail & Related papers (2021-01-13T03:20:20Z)
- Hierarchical Amortized Training for Memory-efficient High Resolution 3D GAN [52.851990439671475]
We propose a novel end-to-end GAN architecture that can generate high-resolution 3D images.
We achieve this goal by using different configurations between training and inference.
Experiments on 3D thorax CT and brain MRI demonstrate that our approach outperforms the state of the art in image generation.
arXiv Detail & Related papers (2020-08-05T02:33:04Z)
- Modelling the Distribution of 3D Brain MRI using a 2D Slice VAE [66.63629641650572]
We propose a method to model 3D MR brain volumes distribution by combining a 2D slice VAE with a Gaussian model that captures the relationships between slices.
We also introduce a novel evaluation method for generated volumes that quantifies how well their segmentations match those of true brain anatomy.
arXiv Detail & Related papers (2020-07-09T13:23:15Z)
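The combination described in this last entry, a 2D slice VAE plus a Gaussian capturing relationships between slices, can be sketched as follows: given per-slice latent codes from a trained 2D VAE, fit one joint Gaussian over each volume's flattened latent sequence, then sample new sequences from it and decode them slice by slice. Everything below (sizes, the synthetic stand-in latents, the single joint Gaussian) is an illustrative assumption, not the paper's exact model.

```python
import numpy as np

rng = np.random.default_rng(3)
n_vols, n_slices, d = 200, 16, 4   # hypothetical dataset and latent sizes

# Stand-in for per-slice latent codes from a trained 2D VAE, with smooth
# variation along the slice axis (real codes would come from the encoder).
t = np.linspace(0, 1, n_slices)
Z = (np.sin(2 * np.pi * (t[None, :, None] + rng.random((n_vols, 1, d))))
     + 0.05 * rng.normal(size=(n_vols, n_slices, d)))

# Fit one joint Gaussian over the flattened per-volume latent sequence;
# its covariance captures correlations between neighboring slices.
flat = Z.reshape(n_vols, n_slices * d)
mu = flat.mean(axis=0)
cov = np.cov(flat, rowvar=False) + 1e-4 * np.eye(n_slices * d)

# Generating a volume = one draw of a full latent sequence, after which
# each of the n_slices latents would be decoded by the 2D VAE decoder.
sample = rng.multivariate_normal(mu, cov).reshape(n_slices, d)
print(sample.shape)  # (16, 4)
```

Sampling whole latent sequences jointly, rather than each slice independently, is what keeps the decoded slices anatomically consistent along the volume.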
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.