Beyond the Desktop: XR-Driven Segmentation with Meta Quest 3 and MX Ink
- URL: http://arxiv.org/abs/2506.04858v1
- Date: Thu, 05 Jun 2025 10:25:46 GMT
- Title: Beyond the Desktop: XR-Driven Segmentation with Meta Quest 3 and MX Ink
- Authors: Lisle Faray de Paiva, Gijs Luijten, Ana Sofia Ferreira Santos, Moon Kim, Behrus Puladi, Jens Kleesiek, Jan Egger
- Abstract summary: This study implements and evaluates the usability and clinical applicability of an extended reality (XR)-based segmentation tool for anatomical CT scans. We develop an immersive interface enabling real-time interaction with 2D and 3D medical imaging data in a customizable workspace. A user study with a public craniofacial CT dataset demonstrated the tool's viability, achieving a System Usability Scale (SUS) score of 66, within the expected range for medical applications.
- Score: 1.7758191385707351
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Medical imaging segmentation is essential in clinical settings for diagnosing diseases, planning surgeries, and other procedures. However, manual annotation is a cumbersome and labor-intensive task. To address this, this study implements and evaluates the usability and clinical applicability of an extended reality (XR)-based segmentation tool for anatomical CT scans, using the Meta Quest 3 headset and Logitech MX Ink stylus. We develop an immersive interface enabling real-time interaction with 2D and 3D medical imaging data in a customizable workspace designed to mitigate the workflow fragmentation and cognitive demands inherent to conventional manual segmentation tools. The platform combines stylus-driven annotation, mirroring traditional pen-on-paper workflows, with instant 3D volumetric rendering. A user study with a public craniofacial CT dataset demonstrated the tool's foundational viability, achieving a System Usability Scale (SUS) score of 66, within the expected range for medical applications. Participants highlighted the system's intuitive controls (scoring 4.1/5 for self-descriptiveness on ISONORM metrics) and spatial interaction design, with qualitative feedback emphasizing strengths in hybrid 2D/3D navigation and realistic stylus ergonomics. While users identified opportunities to enhance task-specific precision and error management, the platform's core workflow enabled dynamic slice adjustment, reducing cognitive load compared to desktop tools. These results position the XR-stylus paradigm as a promising foundation for immersive segmentation tools, with iterative refinements targeting haptic feedback calibration and workflow personalization to advance adoption in preoperative planning.
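For context, the reported SUS score of 66 follows the standard scoring rule for the ten-item questionnaire: odd (positively worded) items contribute response minus 1, even (negatively worded) items contribute 5 minus response, and the sum is scaled by 2.5 onto a 0-100 range. A minimal illustrative sketch of that computation (not from the paper; the example responses are invented):

```python
def sus_score(responses: list[int]) -> float:
    """Compute a SUS score (0-100) from ten 1-5 Likert responses.

    Odd-numbered items are positively worded (contribution = response - 1);
    even-numbered items are negatively worded (contribution = 5 - response).
    The summed contributions are scaled by 2.5 to the 0-100 range.
    """
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS expects ten responses on a 1-5 scale")
    total = sum(
        (r - 1) if i % 2 == 0 else (5 - r)  # items 1,3,5,... sit at even indices
        for i, r in enumerate(responses)
    )
    return total * 2.5

# One participant's (invented) responses; a study-level SUS such as the 66
# above is the mean of the per-participant scores.
print(sus_score([4, 2, 4, 2, 4, 3, 4, 2, 4, 3]))  # -> 70.0
```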
Related papers
- ClipGS: Clippable Gaussian Splatting for Interactive Cinematic Visualization of Volumetric Medical Data [51.095474325541794]
We introduce ClipGS, an innovative Gaussian splatting framework with clipping-plane support, for interactive cinematic visualization of medical data. We validate our method on five volumetric medical datasets, achieving an average rendering quality of 36.635 PSNR at 156 FPS with a 16.1 MB model size.
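The PSNR figure is the standard peak signal-to-noise ratio between rendered and reference images, in decibels. A minimal sketch of the metric (illustrative only; the paper's evaluation pipeline is not reproduced here):

```python
import numpy as np

def psnr(rendered: np.ndarray, reference: np.ndarray, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio (dB) between a rendered and a reference image,
    both arrays of the same shape with intensities in [0, max_val]."""
    mse = np.mean((rendered.astype(np.float64) - reference.astype(np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# Example with synthetic images; higher is better, and ~36 dB indicates
# close agreement between rendering and reference.
rng = np.random.default_rng(0)
ref = rng.random((64, 64))
print(psnr(np.clip(ref + rng.normal(0, 0.01, ref.shape), 0, 1), ref))
```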
arXiv Detail & Related papers (2025-07-09T08:24:28Z)
- nnLandmark: A Self-Configuring Method for 3D Medical Landmark Detection [35.41030755599218]
This work introduces nnLandmark, a self-configuring deep learning framework for 3D medical landmark detection. nnLandmark eliminates the need for manual parameter tuning, offering out-of-the-box usability. It achieves state-of-the-art accuracy across two public datasets, with a mean radial error (MRE) of 1.5 mm on the Mandibular Molar Landmark (MML) dental CT dataset and 1.2 mm for anatomical fiducials on a brain MRI dataset (AFIDs). nnLandmark establishes a reliable baseline for 3D landmark detection, supporting research in anatomical localization and
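The mean radial error reported above is simply the mean Euclidean distance between predicted and ground-truth landmark positions. A minimal sketch (illustrative; coordinates are invented):

```python
import numpy as np

def mean_radial_error(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean Euclidean (radial) distance between predicted and ground-truth
    3D landmarks, both given as (N, 3) arrays in millimetres."""
    return float(np.linalg.norm(pred - gt, axis=1).mean())

# Example: two landmarks, each off by 1 mm along one axis -> MRE = 1.0 mm.
pred = np.array([[10.0, 5.0, 3.0], [22.0, 14.0, 8.0]])
gt   = np.array([[11.0, 5.0, 3.0], [22.0, 13.0, 8.0]])
print(mean_radial_error(pred, gt))  # 1.0
```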
arXiv Detail & Related papers (2025-04-09T09:53:39Z)
- Advanced XR-Based 6-DOF Catheter Tracking System for Immersive Cardiac Intervention Training [37.69303106863453]
This paper presents a novel system for real-time 3D tracking and visualization of intracardiac echocardiography (ICE) catheters.
A custom 3D-printed setup captures biplane video of the catheter, while a specialized computer vision algorithm reconstructs its 3D trajectory.
The system's data is integrated into an interactive Unity-based environment, rendered through the Meta Quest 3 XR headset.
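The paper's reconstruction code is not reproduced here, but a common building block for recovering a 3D point from two calibrated views, such as a biplane capture, is linear (DLT) triangulation. A minimal sketch, assuming known 3x4 projection matrices for both cameras:

```python
import numpy as np

def triangulate(P1: np.ndarray, P2: np.ndarray,
                uv1: np.ndarray, uv2: np.ndarray) -> np.ndarray:
    """Recover a 3D point from its pixel coordinates in two calibrated views
    via the linear (DLT) method. P1, P2 are 3x4 projection matrices; uv1, uv2
    are (u, v) pixel coordinates of the same physical point."""
    A = np.stack([
        uv1[0] * P1[2] - P1[0],
        uv1[1] * P1[2] - P1[1],
        uv2[0] * P2[2] - P2[0],
        uv2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]  # dehomogenise

# Synthetic check: project a known point with two cameras, then recover it.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])  # shifted camera
X_true = np.array([0.2, -0.1, 4.0, 1.0])
x1, x2 = P1 @ X_true, P2 @ X_true
print(triangulate(P1, P2, x1[:2] / x1[2], x2[:2] / x2[2]))  # ~ [0.2, -0.1, 4.0]
```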
arXiv Detail & Related papers (2024-11-04T21:05:40Z)
- Multi-Layer Gaussian Splatting for Immersive Anatomy Visualization [1.0580610673031074]
In medical image visualization, path tracing of volumetric medical data like CT scans produces lifelike visualizations.
We propose a novel approach utilizing GS to create an efficient but static intermediate representation of CT scans.
Our approach achieves interactive frame rates while preserving anatomical structures, with quality adjustable to the target hardware.
arXiv Detail & Related papers (2024-10-22T12:56:58Z)
- Clairvoyance: A Pipeline Toolkit for Medical Time Series [95.22483029602921]
Time-series learning is the bread and butter of data-driven clinical decision support.
Clairvoyance proposes a unified, end-to-end, autoML-friendly pipeline that serves as a software toolkit.
Clairvoyance is the first to demonstrate the viability of a comprehensive and automatable pipeline for clinical time-series ML.
arXiv Detail & Related papers (2023-10-28T12:08:03Z)
- Multi-task Learning with 3D-Aware Regularization [55.97507478913053]
We propose a structured 3D-aware regularizer which interfaces multiple tasks through the projection of features extracted from an image encoder to a shared 3D feature space.
We show that the proposed method is architecture agnostic and can be plugged into various prior multi-task backbones to improve their performance.
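Loosely, the idea is to map each task's features into one shared space and penalize disagreement there, keeping the task heads geometrically consistent. The sketch below is an invented toy illustration of such a consistency term, not the paper's architecture (whose projection is learned and geometry-aware); all names and shapes are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
feat_seg   = rng.normal(size=(64,))   # features from a segmentation branch (toy)
feat_depth = rng.normal(size=(64,))   # features from a depth branch (toy)
W_seg   = rng.normal(size=(32, 64))   # per-task projections into a shared space
W_depth = rng.normal(size=(32, 64))

shared_seg   = W_seg   @ feat_seg
shared_depth = W_depth @ feat_depth

# Consistency regulariser: mean squared distance between projected features.
reg_loss = float(np.mean((shared_seg - shared_depth) ** 2))
print(f"cross-task consistency term: {reg_loss:.3f}")
```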
arXiv Detail & Related papers (2023-10-02T08:49:56Z)
- DEEPBEAS3D: Deep Learning and B-Spline Explicit Active Surfaces [3.560949684583438]
We propose a novel 3D extension of an interactive segmentation framework that represents a segmentation from a convolutional neural network (CNN) as a B-spline explicit active surface (BEAS).
BEAS ensures segmentations are smooth in 3D space, increasing anatomical plausibility, while allowing the user to precisely edit the 3D surface.
Experimental results show that: 1) the proposed framework gives the user explicit control of the surface contour; 2) the perceived workload, calculated via the NASA-TLX index, was reduced by 30% compared to VOCAL; and 3) it required 70% (170 seconds) less user time than VOCAL.
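For reference, NASA-TLX aggregates six subscale ratings (mental, physical, and temporal demand, performance, effort, frustration), either as a plain mean (raw TLX) or weighted by 15 pairwise comparisons. A minimal sketch of the standard scoring (illustrative; the paper's exact protocol may differ, and the example ratings are invented):

```python
# NASA-TLX aggregates six 0-100 subscale ratings. Raw TLX is their mean;
# the weighted variant uses tallies from 15 pairwise subscale comparisons.
SUBSCALES = ("mental", "physical", "temporal", "performance", "effort", "frustration")

def raw_tlx(ratings: dict[str, float]) -> float:
    """Unweighted (raw) NASA-TLX: the mean of the six subscale ratings."""
    return sum(ratings[s] for s in SUBSCALES) / len(SUBSCALES)

def weighted_tlx(ratings: dict[str, float], tally: dict[str, int]) -> float:
    """Weighted NASA-TLX: each weight is how often that subscale was chosen
    across the 15 pairwise comparisons (weights sum to 15)."""
    assert sum(tally[s] for s in SUBSCALES) == 15, "expected 15 comparisons"
    return sum(ratings[s] * tally[s] for s in SUBSCALES) / 15.0

# Invented example ratings for one participant.
r = {"mental": 60, "physical": 20, "temporal": 40,
     "performance": 30, "effort": 50, "frustration": 25}
print(raw_tlx(r))  # 37.5
```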
arXiv Detail & Related papers (2023-09-05T15:54:35Z)
- Next-generation Surgical Navigation: Marker-less Multi-view 6DoF Pose Estimation of Surgical Instruments [64.59698930334012]
First, we present a multi-camera capture setup consisting of static and head-mounted cameras. Second, we publish a multi-view RGB-D video dataset of ex-vivo spine surgeries, captured in a surgical wet lab and a real operating theatre. Third, we evaluate three state-of-the-art single-view and multi-view methods for the task of 6DoF pose estimation of surgical instruments.
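A widely used way to score 6DoF pose estimates in this literature is the average-distance (ADD) metric: transform the instrument's model points by the estimated and by the ground-truth pose, then average the point-wise distances. The sketch below is illustrative; the paper's exact evaluation protocol may differ:

```python
import numpy as np

def add_metric(R_est: np.ndarray, t_est: np.ndarray,
               R_gt: np.ndarray, t_gt: np.ndarray,
               model_pts: np.ndarray) -> float:
    """Average distance between an (N, 3) model point cloud transformed by
    the estimated pose (R_est, t_est) and by the ground truth (R_gt, t_gt)."""
    est = model_pts @ R_est.T + t_est
    gt = model_pts @ R_gt.T + t_gt
    return float(np.linalg.norm(est - gt, axis=1).mean())

# Example: identical rotation, 2 mm translation offset -> ADD = 2.0.
pts = np.random.default_rng(0).normal(size=(100, 3))
I = np.eye(3)
print(add_metric(I, np.array([2.0, 0.0, 0.0]), I, np.zeros(3), pts))
```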
arXiv Detail & Related papers (2023-05-05T13:42:19Z)
- Attentive Symmetric Autoencoder for Brain MRI Segmentation [56.02577247523737]
We propose a novel Attentive Symmetric Auto-encoder based on Vision Transformer (ViT) for 3D brain MRI segmentation tasks.
In the pre-training stage, the proposed auto-encoder pays more attention to reconstructing informative patches, as ranked by gradient metrics.
Experimental results show that our proposed attentive symmetric auto-encoder outperforms the state-of-the-art self-supervised learning methods and medical image segmentation models.
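The gradient-metric idea can be pictured as scoring image patches by their mean gradient magnitude and prioritizing the highest-scoring ones for reconstruction. An invented toy sketch of such a scoring step (not the paper's code; 2D instead of 3D for brevity):

```python
import numpy as np

def patch_gradient_scores(image: np.ndarray, patch: int = 16) -> np.ndarray:
    """Mean gradient magnitude per non-overlapping patch of a 2D slice;
    higher scores mark more informative (edge-rich) patches."""
    gy, gx = np.gradient(image.astype(np.float64))
    mag = np.hypot(gx, gy)
    h, w = mag.shape
    mag = mag[: h - h % patch, : w - w % patch]  # drop ragged borders
    blocks = mag.reshape(h // patch, patch, w // patch, patch)
    return blocks.mean(axis=(1, 3))  # one informativeness score per patch

slice_ = np.random.default_rng(0).normal(size=(64, 64))
scores = patch_gradient_scores(slice_)
top = np.argsort(scores, axis=None)[::-1][:4]
print([tuple(map(int, np.unravel_index(i, scores.shape))) for i in top])
```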
arXiv Detail & Related papers (2022-09-19T09:43:19Z)
- HMD-EgoPose: Head-Mounted Display-Based Egocentric Marker-Less Tool and Hand Pose Estimation for Augmented Surgical Guidance [0.0]
We present HMD-EgoPose, a single-shot learning-based approach to hand and object pose estimation.
We demonstrate state-of-the-art performance on a benchmark dataset for marker-less hand and surgical instrument pose tracking.
arXiv Detail & Related papers (2022-02-24T04:07:34Z)
- Multimodal Semantic Scene Graphs for Holistic Modeling of Surgical Procedures [70.69948035469467]
We take advantage of the latest computer vision methodologies for generating 3D graphs from camera views.
We then introduce the Multimodal Semantic Scene Graph (MSSG), which aims to provide a unified symbolic and semantic representation of surgical procedures.
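Concretely, a semantic scene graph stores typed entities as nodes and their relations as labelled edges. The sketch below is an invented minimal data structure for illustration; the node and relation names are hypothetical, not the MSSG schema:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    node_type: str  # e.g. "instrument", "anatomy", "staff" (invented types)

@dataclass
class SceneGraph:
    nodes: dict[str, Node] = field(default_factory=dict)
    edges: list[tuple[str, str, str]] = field(default_factory=list)  # (subject, relation, object)

    def add(self, node: Node) -> None:
        self.nodes[node.name] = node

    def relate(self, subj: str, relation: str, obj: str) -> None:
        self.edges.append((subj, relation, obj))

g = SceneGraph()
g.add(Node("drill", "instrument"))
g.add(Node("vertebra_L4", "anatomy"))
g.relate("drill", "in_contact_with", "vertebra_L4")
print(g.edges)  # [('drill', 'in_contact_with', 'vertebra_L4')]
```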
arXiv Detail & Related papers (2021-06-09T14:35:44Z)