EgoVSR: Towards High-Quality Egocentric Video Super-Resolution
- URL: http://arxiv.org/abs/2305.14708v2
- Date: Wed, 26 Jul 2023 08:44:50 GMT
- Title: EgoVSR: Towards High-Quality Egocentric Video Super-Resolution
- Authors: Yichen Chi, Junhao Gu, Jiamiao Zhang, Wenming Yang, Yapeng Tian
- Abstract summary: EgoVSR is a Video Super-Resolution framework specifically designed for egocentric videos.
We explicitly tackle motion blurs in egocentric videos using a Dual Branch Deblur Network (DB$^2$Net) in the VSR framework.
An online motion blur synthesis model for common VSR training data is proposed to simulate motion blurs as in egocentric videos.
- Score: 23.50915512118989
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the limitations of capture devices and scenarios, egocentric videos
frequently have low visual quality, mainly caused by high compression and
severe motion blur. With the increasing application of egocentric videos, there
is an urgent need to enhance the quality of these videos through
super-resolution. However, existing Video Super-Resolution (VSR) works,
focusing on third-person view videos, are actually unsuitable for handling
blurring artifacts caused by rapid ego-motion and object motion in egocentric
videos. To this end, we propose EgoVSR, a VSR framework specifically designed
for egocentric videos. We explicitly tackle motion blurs in egocentric videos
using a Dual Branch Deblur Network (DB$^2$Net) in the VSR framework. Meanwhile,
a blurring mask is introduced to guide the DB$^2$Net learning, and can be used
to localize blurred areas in video frames. We also design a MaskNet to predict
the mask, as well as a mask loss to optimize the mask estimation. Additionally,
an online motion blur synthesis model for common VSR training data is proposed
to simulate motion blurs as in egocentric videos. In order to validate the
effectiveness of our proposed method, we introduce an EgoVSR dataset containing
a large amount of fast-motion egocentric video sequences. Extensive experiments
demonstrate that our EgoVSR model can efficiently super-resolve low-quality
egocentric videos and outperform strong comparison baselines. Our code,
pre-trained models and data can be found at https://github.com/chiyich/EGOVSR/.
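The abstract mentions an online motion blur synthesis model and a blurring mask that supervises DB$^2$Net, but gives no formula. The snippet below is only a minimal sketch of one plausible formulation, assuming (as in common blur-synthesis pipelines) that blur is approximated by temporally averaging neighboring sharp frames and that the mask marks pixels that change noticeably under that averaging; the function name, window size, and threshold are illustrative assumptions, not the paper's exact recipe.

```python
import torch

def synthesize_motion_blur(frames: torch.Tensor,
                           max_window: int = 7,
                           mask_threshold: float = 0.05):
    """Toy online blur synthesis for a sharp clip.

    frames: (T, C, H, W) tensor with values in [0, 1].
    Returns (blurred, masks), where masks is (T, 1, H, W) and marks
    regions treated as blurred. The averaging model, window sizes, and
    threshold are illustrative assumptions, not EgoVSR's exact recipe.
    """
    T = frames.shape[0]
    blurred = torch.empty_like(frames)
    masks = torch.empty(T, 1, *frames.shape[-2:])

    for t in range(T):
        # Random odd temporal window; a larger window means heavier blur.
        win = 2 * int(torch.randint(0, max_window // 2 + 1, (1,))) + 1
        lo, hi = max(0, t - win // 2), min(T, t + win // 2 + 1)
        # Approximate motion blur by averaging the neighboring sharp frames.
        blurred[t] = frames[lo:hi].mean(dim=0)
        # Mark pixels that changed noticeably as "blurred".
        diff = (blurred[t] - frames[t]).abs().mean(dim=0, keepdim=True)
        masks[t] = (diff > mask_threshold).float()

    return blurred, masks
```

Under these assumptions, the blurred frames would play the role of degraded VSR training inputs built from a common sharp dataset, and the per-frame masks would serve as a supervision target of the kind the abstract describes for MaskNet and the mask loss.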
Related papers
- EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval [52.375143786641196]
EgoCVR is an evaluation benchmark for fine-grained Composed Video Retrieval.
EgoCVR consists of 2,295 queries that specifically focus on high-quality temporal video understanding.
arXiv Detail & Related papers (2024-07-23T17:19:23Z)
- Hybrid Structure-from-Motion and Camera Relocalization for Enhanced Egocentric Localization [64.08563002366812]
We propose a model ensemble strategy to improve the camera pose estimation part of the VQ3D task.
The core idea is not only to do SfM for egocentric videos but also to do 2D-3D matching between existing 3D scans and 2D video frames.
Our method achieves the best performance on the most important metric, the overall success rate.
arXiv Detail & Related papers (2024-07-10T20:01:35Z)
- EMAG: Ego-motion Aware and Generalizable 2D Hand Forecasting from Egocentric Videos [9.340890244344497]
Existing methods for forecasting 2D hand positions rely on visual representations and mainly focus on hand-object interactions.
We propose EMAG, an ego-motion-aware and generalizable 2D hand forecasting method.
Our model outperforms prior methods by 1.7% and 7.0% on intra- and cross-dataset evaluations, respectively.
arXiv Detail & Related papers (2024-05-30T13:15:18Z)
- Retrieval-Augmented Egocentric Video Captioning [53.2951243928289]
EgoInstructor is a retrieval-augmented multimodal captioning model that automatically retrieves semantically relevant third-person instructional videos.
We train the cross-view retrieval module with a novel EgoExoNCE loss that pulls egocentric and exocentric video features closer by aligning them to shared text features that describe similar actions.
arXiv Detail & Related papers (2024-01-01T15:31:06Z)
- 3D Human Pose Perception from Egocentric Stereo Videos [67.9563319914377]
We propose a new transformer-based framework to improve egocentric stereo 3D human pose estimation.
Our method is able to accurately estimate human poses even in challenging scenarios, such as crouching and sitting.
We will release UnrealEgo2, UnrealEgo-RW, and trained models on our project page.
arXiv Detail & Related papers (2023-12-30T21:21:54Z)
- EgoDistill: Egocentric Head Motion Distillation for Efficient Video Understanding [90.9111678470214]
We propose EgoDistill, a distillation-based approach that learns to reconstruct heavy egocentric video clip features.
Our method leads to significant improvements in efficiency, requiring 200x fewer GFLOPs than equivalent video models.
We demonstrate its effectiveness on the Ego4D and EPICKitchens datasets, where our method outperforms state-of-the-art efficient video understanding methods.
arXiv Detail & Related papers (2023-01-05T18:39:23Z)
- Ego-Body Pose Estimation via Ego-Head Pose Estimation [22.08240141115053]
Estimating 3D human motion from an egocentric video sequence plays a critical role in human behavior understanding and has various applications in VR/AR.
We propose a new method, Ego-Body Pose Estimation via Ego-Head Pose Estimation (EgoEgo), which decomposes the problem into two stages, connected by the head motion as an intermediate representation.
This disentanglement of head and body pose eliminates the need for training datasets with paired egocentric videos and 3D human motion.
arXiv Detail & Related papers (2022-12-09T02:25:20Z)
- Egocentric Video-Language Pretraining [74.04740069230692]
Video-Language Pretraining aims to learn transferable representations that advance a wide range of video-text downstream tasks.
We exploit the recently released Ego4D dataset to pioneer egocentric video-language pretraining along three directions.
We demonstrate strong performance on five egocentric downstream tasks across three datasets.
arXiv Detail & Related papers (2022-06-03T16:28:58Z)