Related papers: Disturbance-Free Surgical Video Generation from Multi-Camera Shadowless Lamps for Open Surgery

URL: http://arxiv.org/abs/2512.08577v1
Date: Tue, 09 Dec 2025 13:15:32 GMT
Title: Disturbance-Free Surgical Video Generation from Multi-Camera Shadowless Lamps for Open Surgery
Authors: Yuna Kato, Shohei Mori, Hideo Saito, Yoshifumi Takatsume, Hiroki Kajita, Mariko Isogawa
Abstract summary: The proposed method identifies frames in which the lighting system moves, realigns them, and selects the camera with the least occlusion to generate a video that consistently presents the surgical field from a fixed perspective. A user study involving surgeons demonstrated that videos generated by our method were superior to those produced by conventional methods in terms of the ease of confirming the surgical area and the comfort during video viewing.
Score: 12.046186466617696
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Video recordings of open surgeries are greatly required for education and research purposes. However, capturing unobstructed videos is challenging since surgeons frequently block the camera field of view. To avoid occlusion, the positions and angles of the camera must be frequently adjusted, which is highly labor-intensive. Prior work has addressed this issue by installing multiple cameras on a shadowless lamp and arranging them to fully surround the surgical area. This setup increases the chances of some cameras capturing an unobstructed view. However, manual image alignment is needed in post-processing since camera configurations change every time surgeons move the lamp for optimal lighting. This paper aims to fully automate this alignment task. The proposed method identifies frames in which the lighting system moves, realigns them, and selects the camera with the least occlusion to generate a video that consistently presents the surgical field from a fixed perspective. A user study involving surgeons demonstrated that videos generated by our method were superior to those produced by conventional methods in terms of the ease of confirming the surgical area and the comfort during video viewing. Additionally, our approach showed improvements in video quality over existing techniques. Furthermore, we implemented several synthesis options for the proposed view-synthesis method and conducted a user study to assess surgeons' preferences for each option.
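Illustrative sketch (not the authors' implementation): the abstract describes detecting frames in which the lamp moves and, per frame, selecting the camera with the least occlusion. The Python below shows one simple way these two steps could be arranged. It assumes that per-frame displacement values and per-camera occlusion ratios have already been computed by some other means. The function names, the threshold, and the switch_margin hysteresis are hypothetical and do not come from the paper.

```python
from typing import List, Optional, Sequence


def detect_motion_frames(displacements: Sequence[float], threshold: float) -> List[int]:
    """Return indices of frames whose displacement exceeds the threshold.

    `displacements` is an assumed per-frame measure of how far the image
    shifted relative to the previous frame (e.g. mean feature motion).
    """
    return [i for i, d in enumerate(displacements) if d > threshold]


def select_cameras(
    occlusion: Sequence[Sequence[float]], switch_margin: float = 0.0
) -> List[int]:
    """Choose one camera index per frame, preferring the least occluded view.

    `occlusion[f][c]` is an assumed occlusion ratio in [0, 1] for camera c
    at frame f. To avoid rapid view flipping, the current camera is kept
    unless another camera is better by more than `switch_margin`
    (a hypothetical smoothing choice, not taken from the paper).
    """
    chosen: List[int] = []
    current: Optional[int] = None
    for frame in occlusion:
        best = min(range(len(frame)), key=lambda c: frame[c])
        if current is None or frame[current] - frame[best] > switch_margin:
            current = best
        chosen.append(current)
    return chosen


if __name__ == "__main__":
    # Toy data: 4 frames of displacement, 3 frames x 3 cameras of occlusion.
    print(detect_motion_frames([0.1, 5.0, 0.2, 7.5], threshold=1.0))
    print(select_cameras([[0.5, 0.1, 0.3], [0.2, 0.15, 0.3], [0.05, 0.4, 0.3]]))
```

Frames flagged by detect_motion_frames would be the ones needing realignment before view selection. The realignment step itself (for example, estimating a transform between camera views) is omitted here because the listing gives no detail about it.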

Related papers

SurgLLM: A Versatile Large Multimodal Model with Spatial Focus and Temporal Awareness for Surgical Video Understanding [75.00667948967848]
The SurgLLM framework is a large multimodal model tailored for versatile surgical video understanding tasks. To empower the spatial focus of surgical videos, we first devise Surgical Context-aware Multimodal Pretraining (Surg-Pretrain) for the video encoder of SurgLLM. To incorporate surgical temporal knowledge into SurgLLM, we further propose Temporal-aware Multimodal Tuning (TM-Tuning) to enhance temporal reasoning with interleaved multimodal embeddings.
arXiv Detail & Related papers (2025-08-30T04:36:41Z)
SurgVidLM: Towards Multi-grained Surgical Video Understanding with Large Language Model [67.8359850515282]
SurgVidLM is the first video language model designed to address both full and fine-grained surgical video comprehension. We show that SurgVidLM significantly outperforms state-of-the-art Vid-LLMs of comparable parameter scale in both full and fine-grained video understanding tasks.
arXiv Detail & Related papers (2025-06-22T02:16:18Z)
TSP-OCS: A Time-Series Prediction for Optimal Camera Selection in Multi-Viewpoint Surgical Video Analysis [19.40791972868592]
We propose a fully supervised learning-based time series prediction method to choose the best shot sequences from multiple simultaneously recorded video streams. Our method achieves competitive accuracy compared to traditional supervised methods, even when predicting over longer time horizons.
arXiv Detail & Related papers (2025-04-09T02:07:49Z)
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video [72.42376733537925]
ReCamMaster is a camera-controlled generative video re-rendering framework. It reproduces the dynamic scene of an input video at novel camera trajectories. Our method also finds promising applications in video stabilization, super-resolution, and outpainting.
arXiv Detail & Related papers (2025-03-14T17:59:31Z)
High-Quality Virtual Single-Viewpoint Surgical Video: Geometric Autocalibration of Multiple Cameras in Surgical Lights [9.993966376446744]
Occlusion-free video generation is challenging due to surgeons' obstructions in the camera field of view. Prior work has addressed this issue by installing multiple cameras on a surgical light. This paper proposes an algorithm to automate this alignment task.
arXiv Detail & Related papers (2025-03-05T14:45:32Z)
Vision-Based Neurosurgical Guidance: Unsupervised Localization and Camera-Pose Prediction [41.91807060434709]
Localizing oneself during endoscopic procedures can be problematic due to the lack of distinguishable textures and landmarks. We present a deep learning method based on anatomy recognition, that constructs a surgical path in an unsupervised manner from surgical videos.
arXiv Detail & Related papers (2024-05-15T14:09:11Z)
BASED: Bundle-Adjusting Surgical Endoscopic Dynamic Video Reconstruction using Neural Radiance Fields [5.773068487121897]
Reconstruction of deformable scenes from endoscopic videos is important for many applications. Our work adopts the Neural Radiance Fields (NeRF) approach to learning 3D implicit representations of scenes. We demonstrate this approach on endoscopic surgical scenes from robotic surgery.
arXiv Detail & Related papers (2023-09-27T00:20:36Z)
Learning Multi-modal Representations by Watching Hundreds of Surgical Video Lectures [50.09187683845788]
Recent advancements in surgical computer vision applications have been driven by vision-only models. These methods rely on manually annotated surgical videos to predict a fixed set of object categories. In this work, we put forward the idea that the surgical video lectures available through open surgical e-learning platforms can provide effective vision and language supervisory signals.
arXiv Detail & Related papers (2023-07-27T22:38:12Z)
Next-generation Surgical Navigation: Marker-less Multi-view 6DoF Pose Estimation of Surgical Instruments [64.59698930334012]
We present a multi-camera capture setup consisting of static and head-mounted cameras. Second, we publish a multi-view RGB-D video dataset of ex-vivo spine surgeries, captured in a surgical wet lab and a real operating theatre. Third, we evaluate three state-of-the-art single-view and multi-view methods for the task of 6DoF pose estimation of surgical instruments.
arXiv Detail & Related papers (2023-05-05T13:42:19Z)
Deep Selection: A Fully Supervised Camera Selection Network for Surgery Recordings [9.242157746114113]
We use a recording system in which multiple cameras are embedded in the surgical lamp. As the embedded cameras obtain multiple video sequences, we address the task of selecting the camera with the best view of the surgery. Unlike the conventional method, which selects the camera based on the area size of the surgery field, we propose a deep neural network that predicts the camera selection probability from multiple video sequences.
arXiv Detail & Related papers (2023-03-28T13:00:08Z)
The Anatomy of Video Editing: A Dataset and Benchmark Suite for AI-Assisted Video Editing [90.59584961661345]
This work introduces the Anatomy of Video Editing, a dataset, and benchmark, to foster research in AI-assisted video editing. Our benchmark suite focuses on video editing tasks, beyond visual effects, such as automatic footage organization and assisted video assembling. To enable research on these fronts, we annotate more than 1.5M tags, with relevant concepts to cinematography, from 196176 shots sampled from movie scenes.
arXiv Detail & Related papers (2022-07-20T10:53:48Z)

This list is automatically generated from the titles and abstracts of the papers on this site.