DarkVision: A Benchmark for Low-light Image/Video Perception
- URL: http://arxiv.org/abs/2301.06269v1
- Date: Mon, 16 Jan 2023 05:55:59 GMT
- Authors: Bo Zhang, Yuchen Guo, Runzhao Yang, Zhihong Zhang, Jiayi Xie, Jinli Suo and Qionghai Dai
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Imaging and perception in photon-limited scenarios are essential for various
applications, e.g., night surveillance or photography, high-speed photography,
and autonomous driving. In these cases, cameras suffer from a low signal-to-noise
ratio, which severely degrades image quality and poses challenges for
downstream high-level vision tasks such as object detection and recognition.
Data-driven methods have achieved enormous success in both image restoration
and high-level vision tasks. However, the lack of a high-quality benchmark
dataset with accurate, task-specific annotations for photon-limited
images/videos has heavily delayed research progress. In this paper, we
contribute the first multi-illuminance, multi-camera, low-light dataset,
named DarkVision, serving both image enhancement and object detection. We
provide bright-dark pairs with pixel-wise registration, in which the bright
counterpart offers a reliable reference for restoration and annotation. The
dataset consists of bright-dark pairs of 900 static scenes with objects from 15
categories, and 32 dynamic scenes with objects from 4 categories. For each scene,
images/videos were captured at 5 illuminance levels using three cameras of
different grades, and the average photon count can be reliably estimated from the
calibration data for quantitative studies. The static-scene images and dynamic
videos contain around 7,344 and 320,667 instances in total, respectively. With
DarkVision, we establish baselines for image/video enhancement and object
detection using representative algorithms. To demonstrate an exemplary application
of DarkVision, we propose two simple yet effective approaches for improving
performance in video enhancement and object detection, respectively. We believe
DarkVision will advance the state of the art in both imaging and related
computer vision tasks in low-light environments.
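The abstract notes that average photon counts can be estimated from camera calibration data. A standard way to do this is the photon-transfer (mean-variance) method: under shot-noise-limited conditions the temporal variance of the bias-corrected signal grows linearly with its mean, and the slope of that line is the camera gain. The sketch below is illustrative only; the function names, the single-slope fit, and the `quantum_efficiency` parameter are assumptions, not DarkVision's actual calibration pipeline.

```python
import numpy as np

def estimate_gain(flat_frames, dark_frames):
    """Photon-transfer estimate of camera gain (digital numbers per electron).

    flat_frames / dark_frames: stacks of shape (num_frames, H, W) captured
    under uniform illumination and with the lens capped, respectively.
    """
    dark = np.mean(dark_frames, axis=0)            # fixed-pattern offset
    signal = np.mean(flat_frames, axis=0) - dark   # mean signal in DN
    # Temporal variance per pixel (ddof=1 for an unbiased estimate);
    # using per-pixel temporal statistics removes fixed-pattern noise.
    variance = np.var(flat_frames, axis=0, ddof=1)
    # Shot noise implies variance = gain * signal, so fit a slope
    # through the origin by least squares over all pixels.
    return float(np.sum(signal * variance) / np.sum(signal ** 2))

def mean_photons(frame, dark, gain, quantum_efficiency=1.0):
    """Average detected photons per pixel, given a calibrated gain."""
    electrons = (frame - dark) / gain
    return float(np.mean(electrons) / quantum_efficiency)
```

With a gain calibrated this way, any raw frame from the same camera and settings can be converted from digital numbers to an average photon (photo-electron) count, which is what enables the quantitative, illuminance-level studies the abstract describes.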
Related papers
- BlinkVision: A Benchmark for Optical Flow, Scene Flow and Point Tracking Estimation using RGB Frames and Events [72.25918104830252]
We propose BlinkVision, a large-scale and diverse benchmark with multiple modalities and dense correspondence annotations.
BlinkVision delivers photorealistic data and covers various naturalistic factors, such as camera shake and deformation.
It enables extensive benchmarks on three types of correspondence tasks (optical flow, point tracking, and scene flow estimation) for both image-based and event-based methods.
arXiv Detail & Related papers (2024-10-27T13:59:21Z)
- HUE Dataset: High-Resolution Event and Frame Sequences for Low-Light Vision [16.432164340779266]
We introduce the HUE dataset, a collection of high-resolution event and frame sequences captured in low-light conditions.
Our dataset includes 106 sequences, encompassing indoor, cityscape, twilight, night, driving, and controlled scenarios.
We employ both qualitative and quantitative evaluations to assess state-of-the-art low-light enhancement and event-based image reconstruction methods.
arXiv Detail & Related papers (2024-10-24T21:15:15Z)
- PIV3CAMS: a multi-camera dataset for multiple computer vision problems and its application to novel view-point synthesis [120.4361056355332]
This thesis introduces Paired Image and Video data from three CAMeraS, namely PIV3CAMS.
The PIV3CAMS dataset consists of 8385 pairs of images and 82 pairs of videos taken from three different cameras.
In addition to reproducing a current state-of-the-art algorithm, we investigate several alternative models that integrate depth information geometrically.
arXiv Detail & Related papers (2024-07-26T12:18:29Z)
- BVI-RLV: A Fully Registered Dataset and Benchmarks for Low-Light Video Enhancement [56.97766265018334]
This paper introduces a low-light video dataset, consisting of 40 scenes with various motion scenarios under two distinct low-lighting conditions.
We provide fully registered ground truth data captured in normal light using a programmable motorized dolly and refine it via an image-based approach for pixel-wise frame alignment across different light levels.
Our experimental results demonstrate the significance of fully registered video pairs for low-light video enhancement (LLVE) and the comprehensive evaluation shows that the models trained with our dataset outperform those trained with the existing datasets.
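Comparative evaluations like the one above typically score enhanced frames against the registered normal-light ground truth with full-reference metrics such as PSNR, which is only meaningful when the pairs are pixel-wise aligned. A minimal sketch of that metric follows; it is illustrative and not the benchmark's actual evaluation code.

```python
import numpy as np

def psnr(reference, restored, peak=255.0):
    """Peak signal-to-noise ratio (dB) between a ground-truth frame
    and a restored frame of the same shape and intensity range."""
    mse = np.mean((reference.astype(np.float64)
                   - restored.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return float(10.0 * np.log10(peak ** 2 / mse))
```

Because PSNR averages squared error per pixel, even a small spatial misalignment between the dark and bright frames inflates the error, which is why these datasets emphasize full registration.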
arXiv Detail & Related papers (2024-07-03T22:41:49Z)
- Neuromorphic Synergy for Video Binarization [54.195375576583864]
Bimodal objects serve as a visual form to embed information that can be easily recognized by vision systems.
Neuromorphic cameras offer new capabilities for alleviating motion blur, but it is non-trivial to first de-blur and then binarize the images in a real-time manner.
We propose an event-based binary reconstruction method that leverages the prior knowledge of the bimodal target's properties to perform inference independently in both event space and image space.
We also develop an efficient integration method to propagate this binary image to high frame rate binary video.
arXiv Detail & Related papers (2024-02-20T01:43:51Z)
- BVI-Lowlight: Fully Registered Benchmark Dataset for Low-Light Video Enhancement [44.1973928137492]
This paper introduces a novel low-light video dataset, consisting of 40 scenes in various motion scenarios under two low-lighting conditions.
We provide fully registered ground truth data captured in normal light using a programmable motorized dolly.
We refine them via image-based post-processing to ensure the pixel-wise alignment of frames in different light levels.
arXiv Detail & Related papers (2024-02-03T00:40:22Z)
- Deep Learning for Event-based Vision: A Comprehensive Survey and Benchmarks [55.81577205593956]
Event cameras are bio-inspired sensors that capture the per-pixel intensity changes asynchronously.
Deep learning (DL) has been brought to this emerging field and inspired active research endeavors in mining its potential.
arXiv Detail & Related papers (2023-02-17T14:19:28Z)
- LIGHTS: LIGHT Specularity Dataset for specular detection in Multi-view [12.612981566441908]
We propose a novel physically-based rendered LIGHT Specularity (LIGHTS) dataset for the evaluation of the specular highlight detection task.
Our dataset consists of 18 high quality architectural scenes, where each scene is rendered with multiple views.
In total we have 2,603 views with an average of 145 views per scene.
arXiv Detail & Related papers (2021-01-26T13:26:49Z)