Towards Real-World HDR Video Reconstruction: A Large-Scale Benchmark Dataset and A Two-Stage Alignment Network
- URL: http://arxiv.org/abs/2405.00244v1
- Date: Tue, 30 Apr 2024 23:29:26 GMT
- Title: Towards Real-World HDR Video Reconstruction: A Large-Scale Benchmark Dataset and A Two-Stage Alignment Network
- Authors: Yong Shu, Liquan Shen, Xiangyu Hu, Mengyao Li, Zihao Zhou,
- Abstract summary: Existing methods are mostly trained on synthetic datasets and thus perform poorly in real scenes.
We present Real-HDRV, a large-scale real-world benchmark dataset for HDR video reconstruction.
- Score: 16.39592423564326
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As an important and practical way to obtain high dynamic range (HDR) video, HDR video reconstruction from sequences with alternating exposures remains underexplored, mainly due to the lack of large-scale real-world datasets. Existing methods are mostly trained on synthetic datasets and thus perform poorly in real scenes. In this work, to facilitate the development of real-world HDR video reconstruction, we present Real-HDRV, a large-scale real-world benchmark dataset for HDR video reconstruction, featuring various scenes, diverse motion patterns, and high-quality labels. Specifically, our dataset contains 500 LDRs-HDRs video pairs, comprising about 28,000 LDR frames and 4,000 HDR labels, covering daytime, nighttime, indoor, and outdoor scenes. To the best of our knowledge, our dataset is the largest real-world HDR video reconstruction dataset. Correspondingly, we propose an end-to-end network for HDR video reconstruction, where a novel two-stage strategy is designed to perform alignment sequentially. Specifically, the first stage performs global alignment with the adaptively estimated global offsets, reducing the difficulty of subsequent alignment. The second stage implicitly performs local alignment in a coarse-to-fine manner at the feature level using the adaptive separable convolution. Extensive experiments demonstrate that: (1) models trained on our dataset can achieve better performance on real scenes than those trained on synthetic datasets; (2) our method outperforms previous state-of-the-art methods. Our dataset is available at https://github.com/yungsyu99/Real-HDRV.
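The two-stage alignment described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: an exhaustive integer-shift search stands in for the adaptively estimated global offsets of stage one, and a single fixed separable kernel pair stands in for the per-pixel adaptive separable convolution of stage two. All function names here are illustrative.

```python
import numpy as np

def estimate_global_offset(ref, alt, max_shift=4):
    """Stage 1 (toy version): search integer (dy, dx) shifts and pick the
    one minimizing mean absolute error against the reference frame.
    The paper instead *adaptively estimates* these global offsets."""
    best, best_err = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(alt, (dy, dx), axis=(0, 1))
            err = np.mean(np.abs(shifted - ref))
            if err < best_err:
                best, best_err = (dy, dx), err
    return best

def separable_local_refine(img, kv, kh):
    """Stage 2 (toy version): apply a separable convolution -- a vertical
    kernel kv followed by a horizontal kernel kh (assumed equal length) --
    as a stand-in for the per-pixel *adaptive* separable convolution the
    paper uses for implicit local alignment at the feature level."""
    h_dim, w_dim = img.shape
    pad = len(kv) // 2
    padded = np.pad(img, pad, mode='edge')
    # Vertical pass: weighted sum of vertically shifted slices.
    v = sum(k * padded[i:i + h_dim, pad:pad + w_dim]
            for i, k in enumerate(kv))
    padded = np.pad(v, pad, mode='edge')
    # Horizontal pass on the vertically filtered result.
    return sum(k * padded[pad:pad + h_dim, i:i + w_dim]
               for i, k in enumerate(kh))
```

The design point of the two-stage split: once `estimate_global_offset` has removed the dominant camera motion, the residual misalignment is small, so the local stage only needs compact kernels instead of a large search window.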
Related papers
- BVI-RLV: A Fully Registered Dataset and Benchmarks for Low-Light Video Enhancement [56.97766265018334]
This paper introduces a low-light video dataset, consisting of 40 scenes with various motion scenarios under two distinct low-lighting conditions.
We provide fully registered ground truth data captured in normal light using a programmable motorized dolly and refine it via an image-based approach for pixel-wise frame alignment across different light levels.
Our experimental results demonstrate the significance of fully registered video pairs for low-light video enhancement (LLVE) and the comprehensive evaluation shows that the models trained with our dataset outperform those trained with the existing datasets.
arXiv Detail & Related papers (2024-07-03T22:41:49Z) - HDR-GS: Efficient High Dynamic Range Novel View Synthesis at 1000x Speed via Gaussian Splatting [76.5908492298286]
Existing HDR NVS methods are mainly based on NeRF.
They suffer from long training time and slow inference speed.
We propose a new framework, High Dynamic Range Gaussian Splatting (HDR-GS).
arXiv Detail & Related papers (2024-05-24T00:46:58Z) - GTA-HDR: A Large-Scale Synthetic Dataset for HDR Image Reconstruction [11.610543327501995]
High Dynamic Range (HDR) content (i.e., images and videos) has a broad range of applications.
The challenging task of reconstructing visually accurate HDR images from their Low Dynamic Range (LDR) counterparts is gaining attention in the vision research community.
arXiv Detail & Related papers (2024-03-26T16:24:42Z) - Towards Efficient SDRTV-to-HDRTV by Learning from Image Formation [51.26219245226384]
Modern displays are capable of rendering video content with high dynamic range (HDR) and wide color gamut (WCG).
The majority of available resources are still in standard dynamic range (SDR).
We define and analyze the SDRTV-to-HDRTV task by modeling the formation of SDRTV/HDRTV content.
Our method is primarily designed for ultra-high-definition TV content and is therefore effective and lightweight for processing 4K resolution images.
arXiv Detail & Related papers (2023-09-08T02:50:54Z) - RawHDR: High Dynamic Range Image Reconstruction from a Single Raw Image [36.17182977927645]
High dynamic range (HDR) images capture many more intensity levels than standard ones.
Current methods predominantly generate HDR images from 8-bit low dynamic range (LDR) images that have been degraded by the camera processing pipeline.
Unlike existing methods, the core idea of this work is to incorporate more informative Raw sensor data to generate HDR images.
arXiv Detail & Related papers (2023-09-05T07:58:21Z) - HDR Video Reconstruction with a Large Dynamic Dataset in Raw and sRGB Domains [23.309488653045026]
High dynamic range (HDR) video reconstruction is attracting more and more attention due to its superior visual quality compared with low dynamic range (LDR) videos.
There are still no real LDR-HDR pairs for dynamic scenes due to the difficulty of capturing LDR-HDR frames simultaneously.
In this work, we propose to utilize a staggered sensor to capture two alternate exposure images simultaneously, which are then fused into an HDR frame in both raw and sRGB domains.
arXiv Detail & Related papers (2023-04-10T11:59:03Z) - Benchmark Dataset and Effective Inter-Frame Alignment for Real-World Video Super-Resolution [65.20905703823965]
Video super-resolution (VSR) aiming to reconstruct a high-resolution (HR) video from its low-resolution (LR) counterpart has made tremendous progress in recent years.
It remains challenging to deploy existing VSR methods to real-world data with complex degradations.
EAVSR takes the proposed multi-layer adaptive spatial transform network (MultiAdaSTN) to refine the offsets provided by the pre-trained optical flow estimation network.
arXiv Detail & Related papers (2022-12-10T17:41:46Z) - PVDD: A Practical Video Denoising Dataset with Real-World Dynamic Scenes [56.4361151691284]
The "Practical Video Denoising Dataset" (PVDD) contains 200 noisy-clean dynamic video pairs in both sRGB and RAW format.
Compared with existing datasets consisting of limited motion information, PVDD covers dynamic scenes with varying natural motion.
arXiv Detail & Related papers (2022-07-04T12:30:22Z) - A Two-stage Deep Network for High Dynamic Range Image Reconstruction [0.883717274344425]
This study tackles the challenges of single-shot LDR to HDR mapping by proposing a novel two-stage deep network.
Notably, our proposed method aims to reconstruct an HDR image without knowing hardware information, including camera response function (CRF) and exposure settings.
arXiv Detail & Related papers (2021-04-19T15:19:17Z) - HDR Video Reconstruction: A Coarse-to-fine Network and A Real-world Benchmark Dataset [30.249052175655606]
We introduce a coarse-to-fine deep learning framework for HDR video reconstruction.
Firstly, we perform coarse alignment and pixel blending in the image space to estimate the coarse HDR video.
Secondly, we conduct more sophisticated alignment and temporal fusion in the feature space of the coarse HDR video to produce better reconstruction.
arXiv Detail & Related papers (2021-03-27T16:40:05Z) - HDR-GAN: HDR Image Reconstruction from Multi-Exposed LDR Images with Large Motions [62.44802076971331]
We propose a novel GAN-based model, HDR-GAN, for synthesizing HDR images from multi-exposed LDR images.
By incorporating adversarial learning, our method is able to produce faithful information in the regions with missing content.
arXiv Detail & Related papers (2020-07-03T11:42:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.