Motion-Guided Dual-Camera Tracker for Low-Cost Skill Evaluation of Gastric Endoscopy
- URL: http://arxiv.org/abs/2403.05146v2
- Date: Sun, 21 Apr 2024 02:44:55 GMT
- Title: Motion-Guided Dual-Camera Tracker for Low-Cost Skill Evaluation of Gastric Endoscopy
- Authors: Yuelin Zhang, Wanquan Yan, Kim Yan, Chun Ping Lam, Yufu Qiu, Pengyu Zheng, Raymond Shing-Yan Tang, Shing Shin Cheng
- Abstract summary: A motion-guided dual-camera tracker is proposed to provide reliable endoscope tip position feedback inside a mechanical simulator for endoscopy skill evaluation.
The proposed tracker achieves SOTA performance with robust and consistent tracking on dual cameras.
- Score: 3.7742691394718078
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Gastric simulators with objective educational feedback have been proven useful for endoscopy training. Existing electronic simulators with feedback are however not commonly adopted due to their high cost. In this work, a motion-guided dual-camera tracker is proposed to provide reliable endoscope tip position feedback at a low cost inside a mechanical simulator for endoscopy skill evaluation, tackling several unique challenges. To address the issue of significant appearance variation of the endoscope tip while keeping dual-camera tracking consistency, the cross-camera mutual template strategy (CMT) is proposed to introduce dynamic transient mutual templates to dual-camera tracking. To alleviate disturbance from large occlusion and distortion by the light source from the endoscope tip, the Mamba-based motion-guided prediction head (MMH) is presented to aggregate historical motion with visual tracking. It is the first application of Mamba for object tracking. The proposed tracker was evaluated on datasets captured by low-cost camera pairs during endoscopy procedures performed inside the mechanical simulator. The tracker achieves SOTA performance with robust and consistent tracking on dual cameras. Further downstream evaluation proves that the 3D tip position determined by the proposed tracker enables reliable skill differentiation. The code and dataset are available at https://github.com/PieceZhang/MotionDCTrack
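The abstract states that the tracker's consistent dual-camera tracks are combined into a 3D tip position for downstream skill evaluation. As a rough illustration of how two calibrated views yield a 3D point, here is a minimal direct linear transform (DLT) triangulation sketch; the intrinsics, baseline, and tip coordinates are invented for the example and do not reflect the authors' calibration or pipeline.

```python
import numpy as np

def triangulate_dlt(P1, P2, uv1, uv2):
    """Recover a 3D point from its pixel coordinates in two calibrated views.

    P1, P2 : (3, 4) camera projection matrices.
    uv1, uv2 : (u, v) pixel coordinates of the same point in each view.
    Solves A X = 0 for the homogeneous point X via SVD (direct linear transform).
    """
    A = np.stack([
        uv1[0] * P1[2] - P1[0],
        uv1[1] * P1[2] - P1[1],
        uv2[0] * P2[2] - P2[0],
        uv2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                # null vector = homogeneous 3D point
    return X[:3] / X[3]

# Hypothetical setup: identical intrinsics, second camera shifted 10 cm along x.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.1], [0.0], [0.0]])])

tip = np.array([0.02, 0.01, 0.5])                   # made-up 3D tip position (metres)
uv = [P @ np.append(tip, 1.0) for P in (P1, P2)]    # project into both views
uv1, uv2 = [p[:2] / p[2] for p in uv]
print(triangulate_dlt(P1, P2, uv1, uv2))            # recovers ~[0.02, 0.01, 0.5]
```

In practice the per-frame 2D tip tracks from the two low-cost cameras would feed a step like this to produce the 3D trajectory used for skill differentiation.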
Related papers
- ConTrack: Contextual Transformer for Device Tracking in X-ray [13.788670026481324]
ConTrack is a transformer-based network that uses both spatial and temporal contextual information for accurate device detection and tracking.
Our method achieves 45% or higher accuracy in detection and tracking compared to state-of-the-art tracking models.
arXiv Detail & Related papers (2023-07-14T14:20:09Z)
- MotionTrack: Learning Motion Predictor for Multiple Object Tracking [68.68339102749358]
We introduce a novel motion-based tracker, MotionTrack, centered around a learnable motion predictor.
Our experimental results demonstrate that MotionTrack yields state-of-the-art performance on datasets such as Dancetrack and SportsMOT.
arXiv Detail & Related papers (2023-06-05T04:24:11Z)
- An Effective Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds [50.19288542498838]
3D single object tracking in LiDAR point clouds (LiDAR SOT) plays a crucial role in autonomous driving.
Current approaches all follow the Siamese paradigm based on appearance matching.
We introduce a motion-centric paradigm to handle LiDAR SOT from a new perspective.
arXiv Detail & Related papers (2023-03-21T17:28:44Z)
- QuestSim: Human Motion Tracking from Sparse Sensors with Simulated Avatars [80.05743236282564]
Real-time tracking of human body motion is crucial for immersive experiences in AR/VR.
We present a reinforcement learning framework that takes in sparse signals from an HMD and two controllers.
We show that a single policy can be robust to diverse locomotion styles, different body sizes, and novel environments.
arXiv Detail & Related papers (2022-09-20T00:25:54Z)
- CAMO-MOT: Combined Appearance-Motion Optimization for 3D Multi-Object Tracking with Camera-LiDAR Fusion [34.42289908350286]
3D Multi-object tracking (MOT) ensures consistency during continuous dynamic detection.
It can be challenging to accurately track the irregular motion of objects for LiDAR-based methods.
We propose a novel camera-LiDAR fusion 3D MOT framework based on the Combined Appearance-Motion Optimization (CAMO-MOT)
arXiv Detail & Related papers (2022-09-06T14:41:38Z)
- A Lightweight and Detector-free 3D Single Object Tracker on Point Clouds [50.54083964183614]
It is non-trivial to perform accurate target-specific detection since the point cloud of objects in raw LiDAR scans is usually sparse and incomplete.
We propose DMT, a Detector-free Motion prediction based 3D Tracking network that totally removes the usage of complicated 3D detectors.
arXiv Detail & Related papers (2022-03-08T17:49:07Z)
- Unsupervised Landmark Detection Based Spatiotemporal Motion Estimation for 4D Dynamic Medical Images [16.759486905827433]
We provide a novel motion estimation framework of Dense-Sparse-Dense (DSD), which comprises two stages.
In the first stage, we process the raw dense image to extract sparse landmarks to represent the target organ anatomical topology.
In the second stage, we derive the sparse motion displacement from the extracted sparse landmarks of two images of different time points.
arXiv Detail & Related papers (2021-09-30T02:06:02Z)
- Occlusion-robust Visual Markerless Bone Tracking for Computer-Assisted Orthopaedic Surgery [41.681134859412246]
We propose a RGB-D sensing-based markerless tracking method that is robust against occlusion.
By using a high-quality commercial RGB-D camera, our proposed visual tracking method achieves an accuracy of 1-2 degrees and 2-4 mm on a model knee.
arXiv Detail & Related papers (2021-08-24T09:49:08Z)
- Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z)
- Human Leg Motion Tracking by Fusing IMUs and RGB Camera Data Using Extended Kalman Filter [4.189643331553922]
IMU-based systems, as well as marker-based motion tracking systems, are among the most popular methods for tracking movement due to their low implementation cost and light weight.
This paper proposes a quaternion-based Extended Kalman filter approach to recover the motion of human leg segments from a set of IMU sensors fused with camera-marker system data.
arXiv Detail & Related papers (2020-11-01T17:54:53Z)
- A Novel Approach for Correcting Multiple Discrete Rigid In-Plane Motions Artefacts in MRI Scans [63.28835187934139]
We propose a novel method for removing motion artefacts using a deep neural network with two input branches.
The proposed method can be applied to artefacts generated by multiple movements of the patient.
arXiv Detail & Related papers (2020-06-24T15:25:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.