GaVS: 3D-Grounded Video Stabilization via Temporally-Consistent Local Reconstruction and Rendering
- URL: http://arxiv.org/abs/2506.23957v1
- Date: Mon, 30 Jun 2025 15:24:27 GMT
- Title: GaVS: 3D-Grounded Video Stabilization via Temporally-Consistent Local Reconstruction and Rendering
- Authors: Zinuo You, Stamatios Georgoulis, Anpei Chen, Siyu Tang, Dengxin Dai,
- Abstract summary: Video stabilization is pivotal for video processing, as it removes unwanted shakiness while preserving the original user motion intent. Existing approaches, depending on the domain in which they operate, suffer from several issues that degrade the user experience. We introduce GaVS, a novel 3D-grounded approach that reformulates video stabilization as a temporally-consistent 'local reconstruction and rendering' paradigm.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Video stabilization is pivotal for video processing, as it removes unwanted shakiness while preserving the original user motion intent. Existing approaches, depending on the domain in which they operate, suffer from several issues (e.g. geometric distortions, excessive cropping, poor generalization) that degrade the user experience. To address these issues, we introduce GaVS, a novel 3D-grounded approach that reformulates video stabilization as a temporally-consistent 'local reconstruction and rendering' paradigm. Given 3D camera poses, we augment a reconstruction model to predict Gaussian Splatting primitives, and finetune it at test time, with multi-view dynamics-aware photometric supervision and cross-frame regularization, to produce temporally-consistent local reconstructions. The model is then used to render each stabilized frame. We utilize a scene extrapolation module to avoid frame cropping. Our method is evaluated on a repurposed dataset, instilled with 3D-grounded information, covering samples with diverse camera motions and scene dynamics. Quantitatively, our method is competitive with or superior to state-of-the-art 2D and 2.5D approaches in terms of conventional task metrics and a new geometry consistency metric. Qualitatively, our method produces noticeably better results compared to alternatives, validated by a user study.
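Read as a pipeline, the paradigm in the abstract (reconstruct locally, render stabilized frames from a smoothed trajectory, supervise with photometric and cross-frame losses) can be sketched as below. This is a minimal, hypothetical illustration only: the function names, the moving-average smoother, and the exact loss forms are assumptions for exposition, not the paper's actual implementation, which predicts Gaussian Splatting primitives with a learned reconstruction model and finetunes it at test time.

```python
import numpy as np

def smooth_trajectory(poses, window=5):
    """Stand-in for the stabilized target trajectory: a moving average
    over per-frame camera pose parameters (rows = frames)."""
    poses = np.asarray(poses, dtype=float)
    pad = window // 2
    padded = np.pad(poses, ((pad, pad), (0, 0)), mode="edge")
    kernel = np.ones(window) / window
    return np.stack(
        [np.convolve(padded[:, d], kernel, mode="valid")
         for d in range(poses.shape[1])],
        axis=1,
    )

def photometric_loss(rendered, target):
    """Photometric supervision: L1 difference between the frame rendered
    from the local reconstruction and the observed frame."""
    return float(np.mean(np.abs(np.asarray(rendered) - np.asarray(target))))

def cross_frame_regularizer(primitives_t, primitives_t_next):
    """Cross-frame regularization: penalize drift between the per-frame
    local reconstructions (here, a plain squared parameter distance)."""
    diff = np.asarray(primitives_t) - np.asarray(primitives_t_next)
    return float(np.mean(diff ** 2))
```

In this sketch the total test-time objective would be `photometric_loss + lambda * cross_frame_regularizer`, minimized over the reconstruction model's parameters before rendering each frame from its smoothed pose.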
Related papers
- Revisiting an Old Perspective Projection for Monocular 3D Morphable Models Regression [0.0]
We introduce a novel camera model for monocular 3D Morphable Model (3DMM) regression methods. We capture the perspective distortion effect commonly seen in close-up facial images.
arXiv Detail & Related papers (2026-03-05T08:52:20Z) - JOintGS: Joint Optimization of Cameras, Bodies and 3D Gaussians for In-the-Wild Monocular Reconstruction [18.636227266388218]
We present JOintGS, a unified framework that jointly optimizes camera extrinsics, human poses, and 3D Gaussian representations. Experiments on the NeuMan and EMDB datasets demonstrate that JOintGS achieves superior reconstruction quality.
arXiv Detail & Related papers (2026-02-04T08:33:51Z) - ShapeGen4D: Towards High Quality 4D Shape Generation from Videos [85.45517487721257]
We introduce a native video-to-4D shape generation framework that synthesizes a single dynamic 3D representation end-to-end from the video. Our method accurately captures non-rigid motion, volume changes, and even topological transitions without per-frame optimization.
arXiv Detail & Related papers (2025-10-07T17:58:11Z) - Enhancing Novel View Synthesis from extremely sparse views with SfM-free 3D Gaussian Splatting Framework [14.927184256861807]
We propose a novel SfM-free 3DGS-based method that jointly estimates camera poses and reconstructs 3D scenes from extremely sparse-view inputs. Our method significantly outperforms other state-of-the-art 3DGS-based approaches, achieving a remarkable 2.75dB improvement in PSNR under extremely sparse-view conditions.
arXiv Detail & Related papers (2025-08-21T11:25:24Z) - Large-scale visual SLAM for in-the-wild videos [28.58692815339531]
We introduce a robust pipeline designed to improve 3D reconstruction from casual videos. We build upon recent deep visual odometry methods but increase robustness in several ways. We demonstrate large-scale contiguous 3D models from several online videos in various environments.
arXiv Detail & Related papers (2025-04-29T07:37:51Z) - Humans as a Calibration Pattern: Dynamic 3D Scene Reconstruction from Unsynchronized and Uncalibrated Videos [12.19207713016543]
Recent works on dynamic 3D neural field reconstruction assume input from multi-view videos whose poses are known. We show that unsynchronized and uncalibrated setups can still support dynamic reconstruction from videos capturing human motion.
arXiv Detail & Related papers (2024-12-26T07:04:20Z) - LiftImage3D: Lifting Any Single Image to 3D Gaussians with Video Generation Priors [107.83398512719981]
Single-image 3D reconstruction remains a fundamental challenge in computer vision. Recent advances in Latent Video Diffusion Models offer promising 3D priors learned from large-scale video data. We propose LiftImage3D, a framework that effectively releases LVDMs' generative priors while ensuring 3D consistency.
arXiv Detail & Related papers (2024-12-12T18:58:42Z) - Deblur4DGS: 4D Gaussian Splatting from Blurry Monocular Video [55.704264233274294]
We propose Deblur4DGS to reconstruct a high-quality 4D model from blurry monocular video. We transform the continuous dynamic representation within an exposure time into an exposure time estimation problem. Beyond novel-view synthesis, Deblur4DGS can be applied to improve blurry video from multiple perspectives.
arXiv Detail & Related papers (2024-12-09T12:02:11Z) - Gaussian Scenes: Pose-Free Sparse-View Scene Reconstruction using Depth-Enhanced Diffusion Priors [5.407319151576265]
We introduce a generative approach for pose-free (without camera parameters) reconstruction of 360° scenes from a sparse set of 2D images. We propose an image-to-image generative model designed to inpaint missing details and remove artifacts in novel view renders and depth maps of a 3D scene.
arXiv Detail & Related papers (2024-11-24T19:34:58Z) - Gaussian Splatting on the Move: Blur and Rolling Shutter Compensation for Natural Camera Motion [25.54868552979793]
We present a method that adapts to camera motion and allows high-quality scene reconstruction with handheld video data.
Our results with both synthetic and real data demonstrate superior performance in mitigating camera motion over existing methods.
arXiv Detail & Related papers (2024-03-20T06:19:41Z) - SceNeRFlow: Time-Consistent Reconstruction of General Dynamic Scenes [75.9110646062442]
We propose SceNeRFlow to reconstruct a general, non-rigid scene in a time-consistent manner.
Our method takes multi-view RGB videos and background images from static cameras with known camera parameters as input.
We show experimentally that, unlike prior work that only handles small motion, our method enables the reconstruction of studio-scale motions.
arXiv Detail & Related papers (2023-08-16T09:50:35Z) - FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction.
Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z) - Enhanced Stable View Synthesis [86.69338893753886]
We introduce an approach to enhance the novel view synthesis from images taken from a freely moving camera.
The introduced approach focuses on outdoor scenes where recovering accurate geometric scaffold and camera pose is challenging.
arXiv Detail & Related papers (2023-03-30T01:53:14Z) - Online Adaptation for Consistent Mesh Reconstruction in the Wild [147.22708151409765]
We pose video-based reconstruction as a self-supervised online adaptation problem applied to any incoming test video.
We demonstrate that our algorithm recovers temporally consistent and reliable 3D structures from videos of non-rigid objects including those of animals captured in the wild.
arXiv Detail & Related papers (2020-12-06T07:22:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.