DINOMotion: advanced robust tissue motion tracking with DINOv2 in 2D-Cine MRI-guided radiotherapy
- URL: http://arxiv.org/abs/2508.10260v1
- Date: Thu, 14 Aug 2025 01:02:26 GMT
- Title: DINOMotion: advanced robust tissue motion tracking with DINOv2 in 2D-Cine MRI-guided radiotherapy
- Authors: Soorena Salari, Catherine Spino, Laurie-Anne Pharand, Fabienne Lathuiliere, Hassan Rivaz, Silvain Beriault, Yiming Xiao
- Abstract summary: We introduce DINOMotion, a novel deep learning framework for robust, efficient, and interpretable motion tracking. DINOMotion automatically detects corresponding landmarks to derive optimal image registration, enhancing interpretability. Our experiments on volunteer and patient datasets demonstrate its effectiveness in estimating both linear and nonlinear transformations.
- Score: 1.9458647657637413
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate tissue motion tracking is critical to ensure treatment outcome and safety in 2D-Cine MRI-guided radiotherapy. This is typically achieved by registration of sequential images, but existing methods often face challenges with large misalignments and lack of interpretability. In this paper, we introduce DINOMotion, a novel deep learning framework based on DINOv2 with Low-Rank Adaptation (LoRA) layers for robust, efficient, and interpretable motion tracking. DINOMotion automatically detects corresponding landmarks to derive optimal image registration, enhancing interpretability by providing explicit visual correspondences between sequential images. The integration of LoRA layers reduces trainable parameters, improving training efficiency, while DINOv2's powerful feature representations offer robustness against large misalignments. Unlike iterative optimization-based methods, DINOMotion directly computes image registration at test time. Our experiments on volunteer and patient datasets demonstrate its effectiveness in estimating both linear and nonlinear transformations, achieving Dice scores of 92.07% for the kidney, 90.90% for the liver, and 95.23% for the lung, with corresponding Hausdorff distances of 5.47 mm, 8.31 mm, and 6.72 mm, respectively. DINOMotion processes each scan in approximately 30ms and consistently outperforms state-of-the-art methods, particularly in handling large misalignments. These results highlight its potential as a robust and interpretable solution for real-time motion tracking in 2D-Cine MRI-guided radiotherapy.
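The abstract states that DINOMotion derives registration directly from automatically detected landmark correspondences. As a minimal illustration of that general idea (not the paper's actual pipeline), the linear component of such a registration can be recovered from matched landmark pairs with an ordinary least-squares fit; the function name and the synthetic data below are assumptions for the sketch:

```python
import numpy as np

def fit_affine_2d(src, dst):
    """Least-squares 2D affine transform mapping src landmarks to dst.

    src, dst: (N, 2) arrays of corresponding landmark coordinates.
    Returns a 2x3 matrix A such that dst ~ [src | 1] @ A.T.
    """
    n = src.shape[0]
    src_h = np.hstack([src, np.ones((n, 1))])        # (N, 3) homogeneous coords
    A, *_ = np.linalg.lstsq(src_h, dst, rcond=None)  # solves src_h @ A = dst
    return A.T                                       # (2, 3)

# Synthetic check: a known rotation + translation is recovered from 8 landmarks.
rng = np.random.default_rng(0)
src = rng.uniform(0, 100, size=(8, 2))
theta = np.deg2rad(10.0)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
t = np.array([5.0, -3.0])
dst = src @ R.T + t

A = fit_affine_2d(src, dst)
recovered = np.hstack([src, np.ones((8, 1))]) @ A.T
assert np.allclose(recovered, dst, atol=1e-8)
```

In practice a robust estimator (e.g. RANSAC over the matched landmarks) would be used to tolerate outlier correspondences, and a nonlinear (deformable) component would be fitted on top of the linear one.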
Related papers
- DRBD-Mamba for Robust and Efficient Brain Tumor Segmentation with Analytical Insights [54.87947751720332]
Accurate brain tumor segmentation is significant for clinical diagnosis and treatment. Mamba-based State Space Models have demonstrated promising performance. We propose a dual-resolution bi-directional Mamba that captures multi-scale long-range dependencies with minimal computational overhead.
arXiv Detail & Related papers (2025-10-16T07:31:21Z)
- The Brain Resection Multimodal Image Registration (ReMIND2Reg) 2025 Challenge [42.51640997446028]
The ReMIND2Reg 2025 Challenge provides the largest public benchmark for this task, built upon the ReMIND dataset. It offers 99 training cases, 5 validation cases, and 10 private test cases comprising paired 3D ceT1 MRI, T2 MRI, and post-resection 3D iUS volumes. Data are provided without annotations for training, while validation and test performance are evaluated on manually annotated anatomical landmarks.
arXiv Detail & Related papers (2025-08-13T09:31:06Z)
- Modality-agnostic, patient-specific digital twins modeling temporally varying digestive motion [8.884066499888718]
Clinical implementation of deformable image registration (DIR) requires voxel-based spatial accuracy metrics. Patient-specific digital twins (DTs) modeling temporally varying motion were created to assess the accuracy of DIR methods.
arXiv Detail & Related papers (2025-07-02T17:22:47Z)
- Latent Diffusion Autoencoders: Toward Efficient and Meaningful Unsupervised Representation Learning in Medical Imaging [41.446379453352534]
Latent Diffusion Autoencoder (LDAE) is a novel encoder-decoder diffusion-based framework for efficient and meaningful unsupervised learning in medical imaging. This study focuses on Alzheimer's disease (AD) using brain MRI from the ADNI database as a case study.
arXiv Detail & Related papers (2025-04-11T15:37:46Z)
- A Novel Automatic Real-time Motion Tracking Method in MRI-guided Radiotherapy Using Enhanced Tracking-Learning-Detection Framework with Automatic Segmentation [9.332679162161428]
Accurate motion tracking in MRI-guided Radiotherapy (MRIgRT) is essential for effective treatment delivery. This study aimed to enhance motion tracking precision in MRIgRT through an automatic real-time markerless tracking method.
arXiv Detail & Related papers (2024-11-12T03:01:39Z)
- Deep Regression 2D-3D Ultrasound Registration for Liver Motion Correction in Focal Tumor Thermal Ablation [5.585625844344932]
Liver tumor ablation procedures require accurate placement of the needle applicator at the tumor centroid.
Image registration techniques can aid in interpreting anatomical details and identifying tumors, but their clinical application has been hindered by the tradeoff between alignment accuracy and runtime performance.
We propose a 2D-3D US registration approach to enable intra-procedural alignment that mitigates errors caused by liver motion.
arXiv Detail & Related papers (2024-10-03T15:24:45Z)
- A self-attention model for robust rigid slice-to-volume registration of functional MRI [4.615338063719135]
Head motion during fMRI scans can result in distortion, biased analyses, and increased costs.
We introduce an end-to-end SVR model for aligning 2D fMRI slices with a 3D reference volume.
Our model achieves competitive performance in terms of alignment accuracy compared to state-of-the-art deep learning-based methods.
arXiv Detail & Related papers (2024-04-06T08:02:18Z)
- Rotational Augmented Noise2Inverse for Low-dose Computed Tomography Reconstruction [83.73429628413773]
Supervised deep learning methods have shown the ability to remove noise in images but require accurate ground truth.
We propose a novel self-supervised framework for LDCT in which ground truth is not required for training the convolutional neural network (CNN).
Numerical and experimental results show that the reconstruction accuracy of N2I with sparse views degrades, while the proposed rotational augmented Noise2Inverse (RAN2I) method maintains better image quality over a range of sampling angles.
arXiv Detail & Related papers (2023-12-19T22:40:51Z)
- Weakly supervised segmentation of intracranial aneurysms using a novel 3D focal modulation UNet [0.5106162890866905]
We propose FocalSegNet, a novel 3D focal modulation UNet, to detect an aneurysm and offer an initial, coarse segmentation of it from time-of-flight MRA image patches.
We trained and evaluated our model on a public dataset, and in terms of UIA detection, our model showed a low false-positive rate of 0.21 and a high sensitivity of 0.80.
arXiv Detail & Related papers (2023-08-06T03:28:08Z)
- 3-Dimensional Deep Learning with Spatial Erasing for Unsupervised Anomaly Segmentation in Brain MRI [55.97060983868787]
We investigate whether using increased spatial context by using MRI volumes combined with spatial erasing leads to improved unsupervised anomaly segmentation performance.
We compare 2D variational autoencoders (VAEs) to their 3D counterparts, propose 3D input erasing, and systematically study the impact of dataset size on performance.
Our best-performing 3D VAE with input erasing achieves an average Dice score of 31.40%, compared to 25.76% for the 2D VAE.
arXiv Detail & Related papers (2021-09-14T09:17:27Z)
- Revisiting 3D Context Modeling with Supervised Pre-training for Universal Lesion Detection in CT Slices [48.85784310158493]
We propose a Modified Pseudo-3D Feature Pyramid Network (MP3D FPN) to efficiently extract 3D context enhanced 2D features for universal lesion detection in CT slices.
With the novel pre-training method, the proposed MP3D FPN achieves state-of-the-art detection performance on the DeepLesion dataset.
The proposed 3D pre-trained weights can potentially be used to boost the performance of other 3D medical image analysis tasks.
arXiv Detail & Related papers (2020-12-16T07:11:16Z)
- Volumetric Attention for 3D Medical Image Segmentation and Detection [53.041572035020344]
A volumetric attention(VA) module for 3D medical image segmentation and detection is proposed.
VA attention, inspired by recent advances in video processing, enables 2.5D networks to leverage context information along the z direction.
Its integration in the Mask R-CNN is shown to enable state-of-the-art performance on the Liver Tumor Segmentation (LiTS) Challenge.
arXiv Detail & Related papers (2020-04-04T18:55:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.