High-Quality Virtual Single-Viewpoint Surgical Video: Geometric Autocalibration of Multiple Cameras in Surgical Lights
- URL: http://arxiv.org/abs/2503.03558v1
- Date: Wed, 05 Mar 2025 14:45:32 GMT
- Title: High-Quality Virtual Single-Viewpoint Surgical Video: Geometric Autocalibration of Multiple Cameras in Surgical Lights
- Authors: Yuna Kato, Mariko Isogawa, Shohei Mori, Hideo Saito, Hiroki Kajita, Yoshifumi Takatsume
- Abstract summary: Occlusion-free video generation is challenging due to surgeons' obstructions in the camera field of view. Prior work has addressed this issue by installing multiple cameras on a surgical light. This paper proposes an algorithm to automate the image alignment that is otherwise required whenever the light is moved.
- Score: 9.993966376446744
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Occlusion-free video generation is challenging due to surgeons' obstructions in the camera field of view. Prior work has addressed this issue by installing multiple cameras on a surgical light, hoping some cameras will observe the surgical field with less occlusion. However, this special camera setup poses a new imaging challenge since camera configurations can change every time surgeons move the light, and manual image alignment is required. This paper proposes an algorithm to automate this alignment task. The proposed method detects frames where the lighting system moves, realigns them, and selects the camera with the least occlusion. This algorithm results in a stabilized video with less occlusion. Quantitative results show that our method outperforms conventional approaches. A user study involving medical doctors also confirmed the superiority of our method.
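The abstract does not spell out how the realignment or the occlusion scoring is implemented. The Python sketch below shows one plausible reading of the pipeline, using an inter-frame difference test for light movement, ORB + RANSAC homographies for realignment, and a crude color heuristic as a stand-in occlusion measure; all function names and thresholds are illustrative assumptions, not the authors' code.

```python
# Minimal sketch of the pipeline described in the abstract, not the authors' code.
# Assumptions: light movement is detected by a simple inter-frame difference threshold,
# realignment uses an ORB + RANSAC homography to a reference view, and occlusion is
# scored by a crude color heuristic standing in for the paper's unspecified measure.
import cv2
import numpy as np

def light_moved(prev_gray, cur_gray, thresh=12.0):
    """Flag frames where the lighting system (and thus the cameras) has moved."""
    return float(np.mean(cv2.absdiff(prev_gray, cur_gray))) > thresh

def realign(src_gray, ref_gray, src_color):
    """Warp one camera's frame onto the reference view via an ORB/RANSAC homography."""
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(src_gray, None)
    k2, d2 = orb.detectAndCompute(ref_gray, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    p1 = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    p2 = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(p1, p2, cv2.RANSAC, 5.0)
    h, w = ref_gray.shape
    return cv2.warpPerspective(src_color, H, (w, h))

def field_visibility(frame_bgr):
    """Placeholder score: fraction of reddish pixels as a proxy for visible surgical field."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, (0, 70, 50), (10, 255, 255))
    return float(np.count_nonzero(mask)) / mask.size

def select_least_occluded(aligned_frames):
    """Among realigned camera views, pick the one showing the most surgical field."""
    return max(aligned_frames, key=field_visibility)
```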
Related papers
- TSP-OCS: A Time-Series Prediction for Optimal Camera Selection in Multi-Viewpoint Surgical Video Analysis [19.40791972868592]
We propose a fully supervised learning-based time series prediction method to choose the best shot sequences from multiple simultaneously recorded video streams.
Our method achieves competitive accuracy compared to traditional supervised methods, even when predicting over longer time horizons.
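As a rough illustration of supervised time-series camera selection, here is a minimal PyTorch sketch; the feature dimensions, LSTM, and loss are assumptions for illustration, not the TSP-OCS architecture.

```python
# Minimal PyTorch sketch of supervised time-series camera selection; the feature
# dimensions, LSTM, and loss here are illustrative assumptions, not the TSP-OCS design.
import torch
import torch.nn as nn

class CameraSelector(nn.Module):
    def __init__(self, n_cameras, feat_dim=512, hidden=256):
        super().__init__()
        # Fuse per-camera frame features, then model temporal context with an LSTM.
        self.fuse = nn.Linear(n_cameras * feat_dim, hidden)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_cameras)   # logits: which camera to show

    def forward(self, feats):                      # feats: (B, T, n_cameras, feat_dim)
        B, T, C, D = feats.shape
        x = torch.relu(self.fuse(feats.reshape(B, T, C * D)))
        x, _ = self.lstm(x)
        return self.head(x)                        # (B, T, n_cameras)

# Fully supervised training step: labels give the best camera at every time step.
model = CameraSelector(n_cameras=5)
feats = torch.randn(2, 30, 5, 512)                 # dummy per-camera features, 30 steps
labels = torch.randint(0, 5, (2, 30))
loss = nn.CrossEntropyLoss()(model(feats).reshape(-1, 5), labels.reshape(-1))
loss.backward()
```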
arXiv Detail & Related papers (2025-04-09T02:07:49Z) - AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers [66.29824750770389]
We analyze camera motion from a first-principles perspective, uncovering insights that enable precise 3D camera manipulation. We compound these findings to design the Advanced 3D Camera Control (AC3D) architecture.
arXiv Detail & Related papers (2024-11-27T18:49:13Z) - Redundancy-Aware Camera Selection for Indoor Scene Neural Rendering [54.468355408388675]
We build a similarity matrix that incorporates both the spatial diversity of the cameras and the semantic variation of the images.
We apply a diversity-based sampling algorithm to optimize the camera selection.
We also develop a new dataset, IndoorTraj, which includes long and complex camera movements captured by humans in virtual indoor environments.
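A minimal sketch of this kind of diversity-based selection follows, assuming precomputed camera positions and per-image semantic features; the similarity weighting and the greedy rule are illustrative assumptions, not the paper's exact algorithm.

```python
# Minimal sketch of diversity-based camera selection over a combined similarity matrix,
# assuming precomputed camera positions and image features; the weighting and greedy
# rule are illustrative assumptions, not the paper's exact algorithm.
import numpy as np

def combined_similarity(cam_positions, image_feats, alpha=0.5):
    """S[i, j] mixes spatial closeness of cameras and semantic similarity of their images."""
    d = np.linalg.norm(cam_positions[:, None] - cam_positions[None, :], axis=-1)
    spatial = np.exp(-d / (d.mean() + 1e-8))                 # nearby cameras -> similar
    f = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    semantic = f @ f.T                                       # cosine similarity of features
    return alpha * spatial + (1 - alpha) * semantic

def greedy_diverse_subset(S, k):
    """Greedily pick k cameras, each time adding the one least similar to those chosen."""
    chosen = [int(np.argmin(S.sum(axis=1)))]                 # start from the most unusual view
    while len(chosen) < k:
        rest = [i for i in range(len(S)) if i not in chosen]
        chosen.append(min(rest, key=lambda i: S[i, chosen].max()))
    return chosen
```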
arXiv Detail & Related papers (2024-09-11T08:36:49Z) - WS-SfMLearner: Self-supervised Monocular Depth and Ego-motion Estimation on Surgical Videos with Unknown Camera Parameters [0.0]
Accurate and robust self-supervised depth and camera ego-motion estimation is gaining increasing attention in the computer vision community.
In this work, we aimed to build a self-supervised depth and ego-motion estimation system which can predict not only accurate depth maps and camera pose, but also camera intrinsic parameters.
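For context, the self-supervised signal in SfMLearner-style systems is a photometric reprojection loss; the sketch below shows that loss with the intrinsics K treated as a predicted quantity, as the summary suggests. Network architectures are omitted and all tensor names are assumptions.

```python
# Minimal sketch of the self-supervised photometric objective used by SfMLearner-style
# methods, with the intrinsics K treated as a predicted quantity as the summary suggests;
# the networks producing depth, pose, and K are omitted, and all names are assumptions.
import torch
import torch.nn.functional as F

def photometric_loss(target, source, depth, K, T):
    """Warp `source` into the target view using predicted depth, pose, and intrinsics,
    then compare with `target` (L1). Shapes: images (B,3,H,W), depth (B,1,H,W),
    K (B,3,3), T (B,4,4) mapping target camera to source camera."""
    B, _, H, W = target.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], 0).float()          # (3,H,W)
    pix = pix.view(1, 3, -1).expand(B, -1, -1).to(target.device)         # (B,3,HW)
    cam = torch.linalg.inv(K) @ pix * depth.view(B, 1, -1)               # back-project
    cam_h = torch.cat([cam, torch.ones(B, 1, H * W, device=cam.device)], 1)
    proj = K @ (T @ cam_h)[:, :3]                                        # into source view
    uv = proj[:, :2] / proj[:, 2:].clamp(min=1e-6)                       # (B,2,HW)
    grid = torch.stack([2 * uv[:, 0] / (W - 1) - 1,                      # normalize to [-1,1]
                        2 * uv[:, 1] / (H - 1) - 1], -1).view(B, H, W, 2)
    warped = F.grid_sample(source, grid, align_corners=True)
    return (warped - target).abs().mean()
```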
arXiv Detail & Related papers (2023-08-22T20:35:24Z) - Next-generation Surgical Navigation: Marker-less Multi-view 6DoF Pose Estimation of Surgical Instruments [66.74633676595889]
First, we present a multi-camera capture setup consisting of static and head-mounted cameras.
Second, we publish a multi-view RGB-D video dataset of ex-vivo spine surgeries, captured in a surgical wet lab and a real operating theatre.
Third, we evaluate three state-of-the-art single-view and multi-view methods for the task of 6DoF pose estimation of surgical instruments.
arXiv Detail & Related papers (2023-05-05T13:42:19Z) - Deep Selection: A Fully Supervised Camera Selection Network for Surgery Recordings [9.242157746114113]
We use a recording system in which multiple cameras are embedded in the surgical lamp.
As the embedded cameras obtain multiple video sequences, we address the task of selecting the camera with the best view of the surgery.
Unlike the conventional method, which selects the camera based on the area size of the surgery field, we propose a deep neural network that predicts the camera selection probability from multiple video sequences.
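A minimal sketch of a selection network in this spirit follows, with a shared per-camera encoder and a softmax over cameras; the backbone and input sizes are assumptions, not the Deep Selection architecture.

```python
# Minimal sketch of a network that scores each embedded camera's view and outputs a
# selection probability; the backbone and input sizes are illustrative assumptions,
# not the Deep Selection architecture.
import torch
import torch.nn as nn

class ViewScorer(nn.Module):
    """Shared per-camera encoder; softmax across cameras gives selection probabilities."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1))

    def forward(self, views):                 # views: (B, n_cameras, 3, H, W)
        B, C = views.shape[:2]
        scores = self.encoder(views.flatten(0, 1)).view(B, C)
        return scores.softmax(dim=1)          # probability of selecting each camera

probs = ViewScorer()(torch.randn(2, 5, 3, 128, 128))   # (2, 5), rows sum to 1
```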
arXiv Detail & Related papers (2023-03-28T13:00:08Z) - Neural Global Shutter: Learn to Restore Video from a Rolling Shutter Camera with Global Reset Feature [89.57742172078454]
Rolling shutter (RS) image sensors suffer from geometric distortion when the camera and object undergo motion during capture.
In this paper, we investigate using rolling shutter with a global reset feature (RSGR) to restore clean global shutter (GS) videos.
This feature enables us to turn the rectification problem into a deblur-like one, getting rid of inaccurate and costly explicit motion estimation.
arXiv Detail & Related papers (2022-04-03T02:49:28Z) - Know your sensORs – A Modality Study For Surgical Action Classification [39.546197658791]
The medical community seeks to leverage the wealth of video data recorded in operating rooms (ORs) to develop automated methods that advance interventional care, lower costs, and improve patient outcomes.
Existing datasets from OR cameras are thus far limited in size or in the modalities acquired, leaving it unclear which sensor modalities are best suited for tasks such as recognizing surgical actions from video.
This study demonstrates that surgical action recognition performance can vary depending on the image modalities used.
arXiv Detail & Related papers (2022-03-16T15:01:17Z) - Deep Homography Estimation in Dynamic Surgical Scenes for Laparoscopic Camera Motion Extraction [6.56651216023737]
We introduce a method for extracting a laparoscope holder's actions from videos of laparoscopic interventions.
We synthetically add camera motion to a newly acquired dataset of camera-motion-free da Vinci surgery image sequences.
We find that our method transfers from this camera-motion-free da Vinci dataset to videos of laparoscopic interventions, outperforming classical homography estimation approaches in both precision (by 41%) and CPU runtime (by 43%).
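For reference, the classical baseline alluded to here estimates a frame-to-frame homography from tracked features and decomposes it into candidate camera motions; a minimal OpenCV sketch follows, with the intrinsics K assumed to come from calibration.

```python
# Minimal OpenCV sketch of a classical baseline: estimate a frame-to-frame homography
# from tracked features and decompose it into candidate camera motions. The intrinsics
# K are assumed to come from calibration; this is not the paper's learned method.
import cv2
import numpy as np

def camera_motion_candidates(prev_gray, cur_gray, K):
    p0 = cv2.goodFeaturesToTrack(prev_gray, 500, 0.01, 7)
    p1, st, _ = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray, p0, None)
    good = st.ravel() == 1
    H, _ = cv2.findHomography(p0[good], p1[good], cv2.RANSAC, 3.0)
    # Each candidate is a (rotation, translation direction, plane normal) triple;
    # disambiguation (e.g., cheirality checks) is omitted in this sketch.
    _, Rs, ts, normals = cv2.decomposeHomographyMat(H, K)
    return Rs, ts, normals
```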
arXiv Detail & Related papers (2021-09-30T13:05:37Z) - From two rolling shutters to one global shutter [57.431998188805665]
We explore a surprisingly simple camera configuration that makes it possible to undo the rolling shutter distortion.
Such a setup is easy and cheap to build and it possesses the geometric constraints needed to correct rolling shutter distortion.
We derive equations that describe the underlying geometry for general and special motions and present an efficient method for finding their solutions.
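For orientation, the generic rolling-shutter timing model that such derivations build on is sketched below; this is standard background, stated under general assumptions, and not the paper's specific constraints.

```latex
% Generic rolling-shutter timing model (standard background, not this paper's
% specific constraints): row r of frame k is exposed at its own instant, so each
% image row observes the camera pose at a slightly different time.
\begin{align}
  t_k(r) &= t_k^{0} + r\,\tau, \qquad 0 \le r < H, \\
  \mathbf{x}(r) &\simeq K \,[\, R(t_k(r)) \mid \mathbf{t}(t_k(r)) \,]\, \mathbf{X},
\end{align}
```

Here $t_k^{0}$ is the frame start time, $\tau$ the line delay, and $H$ the image height; with two rolling-shutter sensors, corresponding rows of the two cameras constrain the motion during readout, which is what makes the distortion correctable.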
arXiv Detail & Related papers (2020-06-02T22:18:43Z) - DeProCams: Simultaneous Relighting, Compensation and Shape Reconstruction for Projector-Camera Systems [91.45207885902786]
We propose a novel end-to-end trainable model named DeProCams to learn the photometric and geometric mappings of ProCams.
DeProCams explicitly decomposes the projector-camera image mappings into three subprocesses: shading attributes estimation, rough direct light estimation and photorealistic neural rendering.
In our experiments, DeProCams shows clear advantages over prior methods, offering promising quality while being fully differentiable.
arXiv Detail & Related papers (2020-03-06T05:49:16Z)