SelfHVD: Self-Supervised Handheld Video Deblurring for Mobile Phones
- URL: http://arxiv.org/abs/2508.08605v1
- Date: Tue, 12 Aug 2025 03:38:14 GMT
- Title: SelfHVD: Self-Supervised Handheld Video Deblurring for Mobile Phones
- Authors: Honglei Xu, Zhilu Zhang, Junjie Fan, Xiaohe Wu, Wangmeng Zuo
- Abstract summary: We propose a self-supervised method for handheld video deblurring, driven by sharp clues in the video. To train the deblurring model, we extract the sharp clues from the video and take them as misalignment labels of neighboring blurry frames. We construct a synthetic and a real-world handheld video dataset for handheld video deblurring.
- Score: 54.427316707517406
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Shooting video with a handheld mobile phone, the most common photographic device, often results in blurry frames due to shaking hands and other instability factors. Although previous video deblurring methods have achieved impressive progress, they still struggle to perform satisfactorily on real-world handheld video due to the blur domain gap between training and testing data. To address the issue, we propose a self-supervised method for handheld video deblurring, which is driven by sharp clues in the video. First, to train the deblurring model, we extract the sharp clues from the video and take them as misalignment labels of neighboring blurry frames. Second, to improve the model's ability, we propose a novel Self-Enhanced Video Deblurring (SEVD) method to create higher-quality paired video data. Third, we propose a Self-Constrained Spatial Consistency Maintenance (SCSCM) method to regularize the model, preventing position shifts between the output and input frames. Moreover, we construct a synthetic and a real-world handheld video dataset for handheld video deblurring. Extensive experiments on these two and other common real-world datasets demonstrate that our method significantly outperforms existing self-supervised ones. The code and datasets are publicly available at https://github.com/cshonglei/SelfHVD.
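To make the pipeline described in the abstract concrete, below is a minimal PyTorch sketch of one self-supervised training step in the same spirit: the sharper of the two neighboring frames serves as a (possibly misaligned) pseudo-label for the current blurry frame, and a simple consistency term discourages spatial drift between the output and the input. The `DeblurNet` backbone, the Laplacian-variance sharpness score, and the downsampled-L1 consistency term are illustrative placeholders, not the paper's actual SEVD or SCSCM components.

```python
import torch
import torch.nn.functional as F

class DeblurNet(torch.nn.Module):
    """Tiny stand-in backbone; any video restoration network with the
    same input/output shape could be substituted (hypothetical)."""
    def __init__(self, channels=3, width=32):
        super().__init__()
        self.body = torch.nn.Sequential(
            torch.nn.Conv2d(channels, width, 3, padding=1),
            torch.nn.ReLU(inplace=True),
            torch.nn.Conv2d(width, channels, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)  # predict a residual over the blurry input

def sharpness(frames):
    """Per-sample variance of the Laplacian, a crude sharpness proxy
    (an assumption, not the paper's clue-extraction method)."""
    gray = frames.mean(dim=1, keepdim=True)
    lap = torch.tensor([[0., 1., 0.], [1., -4., 1.], [0., 1., 0.]],
                       device=frames.device).view(1, 1, 3, 3)
    return F.conv2d(gray, lap).var(dim=(1, 2, 3))

def train_step(model, optimizer, prev_frame, cur_frame, next_frame, lam=0.1):
    """One self-supervised step: the sharper neighbor acts as a
    (misaligned) pseudo-label; a downsampled L1 term keeps the output
    spatially consistent with the blurry input."""
    use_prev = (sharpness(prev_frame) > sharpness(next_frame)).view(-1, 1, 1, 1)
    pseudo_label = torch.where(use_prev, prev_frame, next_frame)

    output = model(cur_frame)
    recon = F.l1_loss(output, pseudo_label)
    consistency = F.l1_loss(F.avg_pool2d(output, 8), F.avg_pool2d(cur_frame, 8))

    loss = recon + lam * consistency
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice this would run over frame triplets from the handheld video; the paper's clue extraction, SEVD data synthesis, and SCSCM constraint are all more elaborate than this single-loss sketch.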
Related papers
- OmniVCus: Feedforward Subject-driven Video Customization with Multimodal Control Conditions [77.04071342405055]
We develop an Image-Video Transfer Mixed (IVTM) training with image editing data to enable instructive editing for the subject in the customized video. We also propose a diffusion Transformer framework, OmniVCus, with two embedding mechanisms, Lottery Embedding (LE) and Temporally Aligned Embedding (TAE). Our method significantly surpasses state-of-the-art methods in both quantitative and qualitative evaluations.
arXiv Detail & Related papers (2025-06-29T18:43:00Z) - Generating Fit Check Videos with a Handheld Camera [21.020454186769655]
We propose a more convenient solution that enables full-body video capture using handheld mobile devices. Our approach takes as input two static photos (front and back) of you in a mirror, along with an IMU motion reference that you perform while holding your mobile phone. We enable rendering into a new scene, with consistent illumination and shadows.
arXiv Detail & Related papers (2025-05-29T17:58:49Z) - ReCamMaster: Camera-Controlled Generative Rendering from A Single Video [72.42376733537925]
ReCamMaster is a camera-controlled generative video re-rendering framework. It reproduces the dynamic scene of an input video at novel camera trajectories. Our method also finds promising applications in video stabilization, super-resolution, and outpainting.
arXiv Detail & Related papers (2025-03-14T17:59:31Z) - Repurposing Pre-trained Video Diffusion Models for Event-based Video Interpolation [20.689304579898728]
Event-based Video Frame Interpolation (EVFI) uses sparse, high-temporal-resolution event measurements as motion guidance. We adapt pre-trained video diffusion models trained on internet-scale datasets to EVFI. Our method outperforms existing approaches and generalizes far better across cameras.
arXiv Detail & Related papers (2024-12-10T18:55:30Z) - WildVidFit: Video Virtual Try-On in the Wild via Image-Based Controlled Diffusion Models [132.77237314239025]
Video virtual try-on aims to generate realistic sequences that maintain garment identity and adapt to a person's pose and body shape in source videos.
Traditional image-based methods, relying on warping and blending, struggle with complex human movements and occlusions.
We reconceptualize video try-on as a process of generating videos conditioned on garment descriptions and human motion.
Our solution, WildVidFit, employs image-based controlled diffusion models for a streamlined, one-stage approach.
arXiv Detail & Related papers (2024-07-15T11:21:03Z) - AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction [88.70116693750452]
Text-guided video prediction (TVP) involves predicting the motion of future frames from the initial frame according to an instruction.
Previous TVP methods have made significant breakthroughs by adapting Stable Diffusion for this task.
We introduce the Multi-Modal Large Language Model (MLLM) to predict future video states based on initial frames and text instructions.
arXiv Detail & Related papers (2024-06-10T17:02:08Z) - Collaborative Video Diffusion: Consistent Multi-video Generation with Camera Control [70.17137528953953]
Collaborative video diffusion (CVD) is trained on top of a state-of-the-art camera-control module for video generation.
CVD generates multiple videos rendered from different camera trajectories with significantly better consistency than baselines.
arXiv Detail & Related papers (2024-05-27T17:58:01Z) - Video Deblurring by Fitting to Test Data [39.41334067434719]
Motion blur in videos captured by autonomous vehicles and robots can degrade their perception capability.
We present a novel approach to video deblurring by fitting a deep network to the test video.
Our approach selects sharp frames from a video and then trains a convolutional neural network on these sharp frames.
arXiv Detail & Related papers (2020-12-09T18:49:24Z)
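The last entry hinges on picking sharp frames from the test video and fitting the network to them. As a simplified, assumed version of that selection step (using Laplacian variance as the sharpness proxy, which may differ from the authors' actual criterion), frames could be ranked like this:

```python
import cv2
import numpy as np

def select_sharp_frames(video_path, top_k=10):
    """Rank frames of a video by Laplacian variance (a common sharpness
    proxy) and return the indices of the top_k sharpest ones.
    Illustrative only; the selection rule in the paper may differ."""
    cap = cv2.VideoCapture(video_path)
    scores = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        scores.append(cv2.Laplacian(gray, cv2.CV_64F).var())
    cap.release()
    order = np.argsort(scores)[::-1]
    return order[:top_k].tolist()
```

The selected frames would then serve as the training set for test-time fitting of the deblurring network.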