An End-to-End Real-World Camera Imaging Pipeline
- URL: http://arxiv.org/abs/2411.10773v1
- Date: Sat, 16 Nov 2024 11:19:03 GMT
- Title: An End-to-End Real-World Camera Imaging Pipeline
- Authors: Kepeng Xu, Zijia Ma, Li Xu, Gang He, Yunsong Li, Wenxin Yu, Taichu Han, Cheng Yang
- Abstract summary: We propose an end-to-end camera imaging pipeline (RealCamNet) to enhance real-world camera imaging performance.
RealCamNet is designed for high-quality conversion from RAW to RGB and compact image compression.
Experimental results show that RealCamNet achieves the best rate-distortion performance with lower inference latency.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in neural camera imaging pipelines have demonstrated notable progress. Nevertheless, the real-world imaging pipeline still faces challenges including the lack of joint optimization in system components, computational redundancies, and optical distortions such as lens shading. In light of this, we propose an end-to-end camera imaging pipeline (RealCamNet) to enhance real-world camera imaging performance. Our methodology diverges from conventional, fragmented multi-stage image signal processing toward an end-to-end architecture. This architecture facilitates joint optimization across the full pipeline and the restoration of coordinate-biased distortions. RealCamNet is designed for high-quality conversion from RAW to RGB and compact image compression. Specifically, we deeply analyze coordinate-dependent optical distortions, e.g., vignetting and dark shading, and design a novel Coordinate-Aware Distortion Restoration (CADR) module to restore coordinate-biased distortions. Furthermore, we propose a Coordinate-Independent Mapping Compression (CIMC) module to implement tone mapping and redundant information compression. Existing datasets suffer from misalignment and overly idealized conditions, making them inadequate for training real-world imaging pipelines. Therefore, we collected a real-world imaging dataset. Experimental results show that RealCamNet achieves the best rate-distortion performance with lower inference latency.
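To make the coordinate-dependent distortions concrete, a classical cos^4 vignetting falloff can be sketched in plain Python. The gain model and function names below are illustrative textbook assumptions, not the paper's learned CADR module:

```python
import math

def vignette_gain(x, y, w, h, f=1.0):
    """Approximate cos^4 vignetting falloff at pixel (x, y).

    `f` is the focal length in units of the half-diagonal; this is a
    common textbook model, not the module proposed in the paper.
    """
    # Normalized radial distance from the optical center.
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    half_diag = math.hypot(cx, cy)
    r = math.hypot(x - cx, y - cy) / half_diag
    theta = math.atan2(r, f)            # off-axis angle
    return math.cos(theta) ** 4         # relative illumination

def correct_vignetting(img, f=1.0):
    """Divide each pixel by its modeled gain to flatten the falloff."""
    h, w = len(img), len(img[0])
    return [[img[y][x] / vignette_gain(x, y, w, h, f)
             for x in range(w)] for y in range(h)]
```

Dividing by the modeled gain flattens the radial falloff; RealCamNet instead learns the coordinate-biased restoration end to end rather than assuming a fixed analytic model.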
Related papers
- Towards Realistic Low-Light Image Enhancement via ISP Driven Data Modeling [61.95831392879045]
Deep neural networks (DNNs) have recently become the leading method for low-light image enhancement (LLIE).
Despite significant progress, their outputs may still exhibit issues such as amplified noise, incorrect white balance, or unnatural enhancements when deployed in real-world applications.
A key challenge is the lack of diverse, large-scale training data that captures the complexities of low-light conditions and imaging pipelines.
We propose a novel image signal processing (ISP) driven data synthesis pipeline that addresses these challenges by generating unlimited paired training data.
arXiv Detail & Related papers (2025-04-16T15:53:53Z)
- Deblur Gaussian Splatting SLAM [57.35366732452066]
Deblur-SLAM is a robust RGB SLAM pipeline designed to recover sharp reconstructions from motion-blurred inputs.
We model the physical image formation process of motion-blurred images and minimize the error between the observed blurry images and rendered blurry images.
We achieve state-of-the-art results for sharp map estimation and sub-frame trajectory recovery both on synthetic and real-world blurry input data.
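The formation model described above, where a blurry observation is explained as the average of sharp renderings over the exposure and fitted by minimizing photometric error, can be sketched in plain Python. The helper names are hypothetical; Deblur-SLAM's actual renderer is a Gaussian-splatting map:

```python
def synthesize_blur(sharp_frames):
    """Average sharp sub-frame renderings to model a motion-blurred
    exposure (a standard discretization of the blur integral)."""
    n = len(sharp_frames)
    h, w = len(sharp_frames[0]), len(sharp_frames[0][0])
    return [[sum(f[y][x] for f in sharp_frames) / n
             for x in range(w)] for y in range(h)]

def photometric_error(observed, rendered):
    """L1 error between observed and rendered blurry images, the
    quantity minimized when fitting the sharp map and trajectory."""
    return sum(abs(o - r)
               for row_o, row_r in zip(observed, rendered)
               for o, r in zip(row_o, row_r))
```

In the real pipeline the sub-frame renderings come from the estimated sharp map and sub-frame camera poses, so minimizing this error recovers both.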
arXiv Detail & Related papers (2025-03-16T16:59:51Z)
- Self-Calibrating Gaussian Splatting for Large Field of View Reconstruction [30.529707438964596]
We present a self-calibrating framework that jointly optimizes camera parameters, lens distortion, and 3D Gaussian representations.
Our technique enables high-quality scene reconstruction from large field-of-view (FOV) imagery taken with wide-angle lenses, allowing the scene to be modeled from a smaller number of images.
arXiv Detail & Related papers (2025-02-13T18:15:10Z)
- Rethinking High-speed Image Reconstruction Framework with Spike Camera [48.627095354244204]
Spike cameras generate continuous spike streams to capture high-speed scenes with lower bandwidth and higher dynamic range than traditional RGB cameras.
We introduce SpikeCLIP, a novel spike-to-image reconstruction framework that goes beyond traditional training paradigms.
Our experiments on real-world low-light datasets demonstrate that SpikeCLIP significantly enhances texture details and the luminance balance of recovered images.
arXiv Detail & Related papers (2025-01-08T13:00:17Z)
- Decoupling Fine Detail and Global Geometry for Compressed Depth Map Super-Resolution [55.9977636042469]
Bit-depth compression produces a uniform depth representation in regions with subtle variations, hindering the recovery of detailed information.
Meanwhile, densely distributed random noise reduces the accuracy of estimating the global geometric structure of the scene.
We propose a novel framework, termed geometry-decoupled network (GDNet), for compressed depth map super-resolution.
arXiv Detail & Related papers (2024-11-05T16:37:30Z)
- CT-NeRF: Incremental Optimizing Neural Radiance Field and Poses with Complex Trajectory [12.460959809597213]
We propose CT-NeRF, an incremental reconstruction optimization pipeline using only RGB images without pose and depth input.
We evaluate the performance of CT-NeRF on two real-world datasets, NeRFBuster and Free-Dataset.
arXiv Detail & Related papers (2024-04-22T06:07:06Z)
- DGNet: Dynamic Gradient-Guided Network for Water-Related Optics Image Enhancement [77.0360085530701]
Underwater image enhancement (UIE) is a challenging task due to the complex degradation caused by underwater environments.
Previous methods often idealize the degradation process, and neglect the impact of medium noise and object motion on the distribution of image features.
Our approach utilizes predicted images to dynamically update pseudo-labels, adding a dynamic gradient to optimize the network's gradient space.
arXiv Detail & Related papers (2023-12-12T06:07:21Z)
- An Asynchronous Linear Filter Architecture for Hybrid Event-Frame Cameras [9.69495347826584]
We present an asynchronous linear filter architecture, fusing event and frame camera data, for HDR video reconstruction and spatial convolution.
The proposed AKF pipeline outperforms other state-of-the-art methods in both absolute intensity error (69.4% reduction) and image similarity indices (average 35.5% improvement).
arXiv Detail & Related papers (2023-09-03T12:37:59Z)
- Single Image LDR to HDR Conversion using Conditional Diffusion [18.466814193413487]
Digital imaging aims to replicate realistic scenes, but Low Dynamic Range (LDR) cameras cannot represent the wide dynamic range of real scenes.
This paper presents a deep learning-based approach for recovering intricate details from shadows and highlights.
We incorporate a deep autoencoder in our proposed framework to enhance the quality of the latent representation of the LDR image used for conditioning.
arXiv Detail & Related papers (2023-07-06T07:19:47Z)
- Learning Adaptive Warping for Real-World Rolling Shutter Correction [52.45689075940234]
Mobile devices in the consumer market with CMOS-based sensors for video capture often exhibit rolling shutter effects when relative movements occur during video acquisition.
This paper proposes the first real-world rolling shutter (RS) correction dataset, BS-RSC, and a corresponding model to correct the RS frames in a distorted video.
arXiv Detail & Related papers (2022-04-29T05:13:50Z)
- RBSRICNN: Raw Burst Super-Resolution through Iterative Convolutional Neural Network [23.451063587138393]
We propose a Raw Burst Super-Resolution Iterative Convolutional Neural Network (RBSRICNN).
The proposed network produces the final output by an iterative refinement of the intermediate SR estimates.
We demonstrate the effectiveness of our proposed approach in quantitative and qualitative experiments.
arXiv Detail & Related papers (2021-10-25T19:01:28Z)
- Frequency Consistent Adaptation for Real World Super Resolution [64.91914552787668]
We propose a novel Frequency Consistent Adaptation (FCA) that ensures the frequency domain consistency when applying Super-Resolution (SR) methods to the real scene.
We estimate degradation kernels from unsupervised images and generate the corresponding Low-Resolution (LR) images.
Based on the domain-consistent LR-HR pairs, we train easy-to-implement Convolutional Neural Network (CNN) SR models.
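The degradation step above, blurring HR images with an estimated kernel and then downsampling to synthesize paired LR data, can be sketched in 1-D as follows. The kernel here is given rather than estimated from unsupervised images, and the helper name is illustrative:

```python
def degrade(hr, kernel, scale=2):
    """Blur a 1-D HR signal with a degradation kernel, then decimate
    by `scale` to synthesize the paired LR signal. A 1-D sketch of
    the standard SR degradation model; in FCA the kernel would be
    estimated from real-scene images first."""
    pad = len(kernel) // 2
    padded = hr[:1] * pad + hr + hr[-1:] * pad   # replicate borders
    blurred = [sum(padded[i + j] * k for j, k in enumerate(kernel))
               for i in range(len(hr))]
    return blurred[::scale]                      # decimate
```

Training on pairs produced this way keeps the LR inputs frequency-consistent with the target real-scene degradation, which is the point of FCA.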
arXiv Detail & Related papers (2020-12-18T08:25:39Z)
- Single-Image HDR Reconstruction by Learning to Reverse the Camera Pipeline [100.5353614588565]
We propose to incorporate the domain knowledge of the LDR image formation pipeline into our model.
We model the HDR-to-LDR image formation pipeline as (1) dynamic range clipping, (2) non-linear mapping from a camera response function, and (3) quantization.
We demonstrate that the proposed method performs favorably against state-of-the-art single-image HDR reconstruction algorithms.
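The three-stage formation model can be sketched in plain Python. A fixed gamma curve stands in for the learned camera response function, and the names are illustrative:

```python
def hdr_to_ldr(hdr, exposure=1.0, gamma=2.2, bits=8):
    """Simulate the three-stage LDR formation model: (1) dynamic range
    clipping, (2) a non-linear camera response (a gamma curve stands in
    for the real CRF), and (3) quantization to `bits` levels."""
    levels = (1 << bits) - 1
    ldr = []
    for v in hdr:
        v = min(max(v * exposure, 0.0), 1.0)   # (1) clip to [0, 1]
        v = v ** (1.0 / gamma)                 # (2) camera response
        v = round(v * levels) / levels         # (3) quantize
        ldr.append(v)
    return ldr
```

Reversing these three stages in order (dequantization, inverse CRF, and hallucinating the clipped range) is what the paper's model learns.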
arXiv Detail & Related papers (2020-04-02T17:59:04Z)
- EventSR: From Asynchronous Events to Image Reconstruction, Restoration, and Super-Resolution via End-to-End Adversarial Learning [75.17497166510083]
Event cameras sense intensity changes and have many advantages over conventional cameras.
Some methods have been proposed to reconstruct intensity images from event streams.
However, the reconstructed outputs are still low-resolution (LR), noisy, and unrealistic.
We propose EventSR, a novel end-to-end pipeline that reconstructs LR images from event streams, enhances their quality, and upsamples the enhanced images.
arXiv Detail & Related papers (2020-03-17T10:58:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.