Enabling ISP-less Low-Power Computer Vision
- URL: http://arxiv.org/abs/2210.05451v1
- Date: Tue, 11 Oct 2022 13:47:30 GMT
- Title: Enabling ISP-less Low-Power Computer Vision
- Authors: Gourav Datta, Zeyu Liu, Zihan Yin, Linyu Sun, Akhilesh R. Jaiswal,
Peter A. Beerel
- Abstract summary: We release the raw version of the COCO dataset, a large-scale benchmark for generic high-level vision tasks.
For ISP-less CV systems, training on raw images results in a 7.1% increase in test accuracy.
We propose an energy-efficient form of analog in-pixel demosaicing that may be coupled with in-pixel CNN computations.
- Score: 4.102254385058941
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In order to deploy current computer vision (CV) models on
resource-constrained low-power devices, recent works have proposed in-sensor
and in-pixel computing approaches that try to partly/fully bypass the image
signal processor (ISP) and yield significant bandwidth reduction between the
image sensor and the CV processing unit by downsampling the activation maps in
the initial convolutional neural network (CNN) layers. However, direct
inference on the raw images degrades the test accuracy due to the difference in
covariance of the raw images captured by the image sensors compared to the
ISP-processed images used for training. Moreover, it is difficult to train deep
CV models on raw images, because most (if not all) large-scale open-source
datasets consist of RGB images. To mitigate this concern, we propose to invert
the ISP pipeline, which can convert the RGB images of any dataset to its raw
counterparts, and enable model training on raw images. We release the raw
version of the COCO dataset, a large-scale benchmark for generic high-level
vision tasks. For ISP-less CV systems, training on these raw images results in a
7.1% increase in test accuracy on the visual wake words (VWW) dataset compared
to training with traditional ISP-processed RGB datasets. To further
improve the accuracy of ISP-less CV models and to increase the energy and
bandwidth benefits obtained by in-sensor/in-pixel computing, we propose an
energy-efficient form of analog in-pixel demosaicing that may be coupled with
in-pixel CNN computations. When evaluated on raw images captured by real
sensors from the PASCALRAW dataset, our approach results in an 8.1% increase in
mAP. Lastly, we demonstrate a further 20.5% increase in mAP through a novel
application of few-shot learning, with thirty shots per class, on the PASCALRAW
dataset, which comprises 3 classes.
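The ISP-inversion idea described in the abstract can be illustrated with a minimal sketch: a simplified inverse pipeline that undoes the sRGB gamma curve and white balance, then mosaics the result into an RGGB Bayer pattern to simulate a raw sensor readout. The function name, the white-balance gains, and the three-stage structure here are illustrative assumptions, not the authors' actual pipeline.

```python
import numpy as np

def invert_isp(rgb, wb_gains=(2.0, 1.0, 1.8)):
    """Convert an sRGB image (H, W, 3) in [0, 1] to a simulated raw
    Bayer mosaic (H, W) by inverting a simplified ISP:
    inverse gamma -> inverse white balance -> RGGB mosaicing."""
    # 1. Inverse sRGB gamma: recover approximately linear intensities.
    linear = np.where(rgb <= 0.04045, rgb / 12.92,
                      ((rgb + 0.055) / 1.055) ** 2.4)
    # 2. Inverse white balance: divide each channel by its assumed gain.
    linear = linear / np.asarray(wb_gains)
    # 3. Mosaic to an RGGB Bayer pattern: keep one channel per pixel.
    h, w, _ = linear.shape
    raw = np.empty((h, w), dtype=linear.dtype)
    raw[0::2, 0::2] = linear[0::2, 0::2, 0]  # R at even rows, even cols
    raw[0::2, 1::2] = linear[0::2, 1::2, 1]  # G at even rows, odd cols
    raw[1::2, 0::2] = linear[1::2, 0::2, 1]  # G at odd rows, even cols
    raw[1::2, 1::2] = linear[1::2, 1::2, 2]  # B at odd rows, odd cols
    return np.clip(raw, 0.0, 1.0)
```

Applying such an inverse pipeline to an existing RGB dataset (e.g., COCO) yields raw-like training images, which is the mechanism the abstract relies on to close the covariance gap between sensor raw captures and ISP-processed training data.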
Related papers
- Rethinking Image Super-Resolution from Training Data Perspectives [54.28824316574355]
We investigate the understudied effect of the training data used for image super-resolution (SR)
With this, we propose an automated image evaluation pipeline.
We find that datasets with (i) low compression artifacts, (ii) high within-image diversity as judged by the number of different objects, and (iii) a large number of images from ImageNet or PASS all positively affect SR performance.
arXiv Detail & Related papers (2024-09-01T16:25:04Z) - RAW-Adapter: Adapting Pre-trained Visual Model to Camera RAW Images [51.68432586065828]
We introduce RAW-Adapter, a novel approach aimed at adapting sRGB pre-trained models to camera RAW data.
RAW-Adapter comprises input-level adapters that employ learnable ISP stages to adjust RAW inputs, as well as model-level adapters to build connections between ISP stages and subsequent high-level networks.
arXiv Detail & Related papers (2024-08-27T06:14:54Z) - Dual-Scale Transformer for Large-Scale Single-Pixel Imaging [11.064806978728457]
We propose a deep unfolding network with hybrid-attention Transformer on Kronecker SPI model, dubbed HATNet, to improve the imaging quality of real SPI cameras.
The gradient descent module avoids the high computational overhead of previous gradient descent modules based on vectorized SPI.
The denoising module is an encoder-decoder architecture powered by dual-scale spatial attention for high- and low-frequency aggregation and channel attention for global information recalibration.
arXiv Detail & Related papers (2024-04-07T15:53:21Z) - Ultra-High-Definition Low-Light Image Enhancement: A Benchmark and
Transformer-Based Method [51.30748775681917]
We consider the task of low-light image enhancement (LLIE) and introduce a large-scale database consisting of images at 4K and 8K resolution.
We conduct systematic benchmarking studies and provide a comparison of current LLIE algorithms.
As a second contribution, we introduce LLFormer, a transformer-based low-light enhancement method.
arXiv Detail & Related papers (2022-12-22T09:05:07Z) - Reversed Image Signal Processing and RAW Reconstruction. AIM 2022
Challenge Report [109.2135194765743]
This paper introduces the AIM 2022 Challenge on Reversed Image Signal Processing and RAW Reconstruction.
We aim to recover raw sensor images from the corresponding RGBs without metadata and, by doing this, "reverse" the ISP transformation.
arXiv Detail & Related papers (2022-10-20T10:43:53Z) - LW-ISP: A Lightweight Model with ISP and Deep Learning [17.972611191715888]
We show the possibility of learning-based method to achieve real-time high-performance processing in the ISP pipeline.
We propose LW-ISP, a novel architecture designed to implicitly learn the image mapping from RAW data to RGB image.
Experiments demonstrate that LW-ISP has achieved a 0.38 dB improvement in PSNR compared to the previous best method.
arXiv Detail & Related papers (2022-10-08T04:00:03Z) - GenISP: Neural ISP for Low-Light Machine Cognition [19.444297600977546]
In low-light conditions, object detectors using raw image data are more robust than detectors using image data processed by an ISP pipeline.
We propose a minimal neural ISP pipeline for machine cognition, named GenISP, that explicitly incorporates Color Space Transformation into a device-independent color space.
arXiv Detail & Related papers (2022-05-07T17:17:24Z) - An Empirical Study of Remote Sensing Pretraining [117.90699699469639]
We conduct an empirical study of remote sensing pretraining (RSP) on aerial images.
RSP can help deliver distinctive performances in scene recognition tasks.
RSP mitigates the data discrepancies of traditional ImageNet pretraining on RS images, but it may still suffer from task discrepancies.
arXiv Detail & Related papers (2022-04-06T13:38:11Z) - Toward Efficient Hyperspectral Image Processing inside Camera Pixels [1.6449390849183356]
Hyperspectral cameras generate a large amount of data due to the presence of hundreds of spectral bands.
To mitigate this problem, we propose a form of processing-in-pixel (PIP)
Our PIP-optimized custom CNN layers effectively compress the input data, significantly reducing the bandwidth required to transmit the data downstream to the HSI processing unit.
arXiv Detail & Related papers (2022-03-11T01:06:02Z) - Model-Based Image Signal Processors via Learnable Dictionaries [6.766416093990318]
Digital cameras transform sensor RAW readings into RGB images by means of their Image Signal Processor (ISP)
Recent approaches have attempted to bridge this gap by estimating the RGB to RAW mapping.
We present a novel hybrid model-based and data-driven ISP that is both learnable and interpretable.
arXiv Detail & Related papers (2022-01-10T08:36:10Z) - CNNs for JPEGs: A Study in Computational Cost [49.97673761305336]
Convolutional neural networks (CNNs) have achieved astonishing advances over the past decade.
CNNs are capable of learning robust representations of the data directly from the RGB pixels.
Deep learning methods capable of learning directly from the compressed domain have been gaining attention in recent years.
arXiv Detail & Related papers (2020-12-26T15:00:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.