Practical cross-sensor color constancy using a dual-mapping strategy
- URL: http://arxiv.org/abs/2311.11773v1
- Date: Mon, 20 Nov 2023 13:58:59 GMT
- Title: Practical cross-sensor color constancy using a dual-mapping strategy
- Authors: Shuwei Yue and Minchen Wei
- Abstract summary: The proposed method uses a dual-mapping strategy and only requires a simple white point from a test sensor under a D65 condition.
In the second mapping phase, we transform the re-constructed image data into sparse features, which are then optimized with a lightweight multi-layer perceptron (MLP) model.
This approach effectively reduces sensor discrepancies and delivers performance on par with leading cross-sensor methods.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks (DNNs) have been widely used for illumination
estimation, which is time-consuming and requires sensor-specific data
collection. Our proposed method uses a dual-mapping strategy and only requires
a simple white point from a test sensor under a D65 condition. This allows us
to derive a mapping matrix, enabling the reconstruction of image data and
illuminants. In the second mapping phase, we transform the re-constructed image
data into sparse features, which are then optimized with a lightweight
multi-layer perceptron (MLP) model using the re-constructed illuminants as
ground truths. This approach effectively reduces sensor discrepancies and
delivers performance on par with leading cross-sensor methods. It only requires
a small amount of memory (~0.003 MB), and takes ~1 hour of training on an
RTX3070Ti GPU. More importantly, the method runs very fast, taking
~0.3 ms on a GPU and ~1 ms on a CPU, and is not sensitive to the
input image resolution. Therefore, it offers a practical solution to the great
challenge of data recollection faced by the industry.
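The first mapping phase described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the form of the mapping matrix, so the code below assumes a simple diagonal (von Kries-style) matrix built from the D65 white points of a training sensor and the test sensor; this is an illustrative stand-in, not necessarily the authors' method. The function names `diagonal_mapping` and `remap` and all white-point values are hypothetical.

```python
import numpy as np


def diagonal_mapping(wp_src, wp_dst):
    """Build a diagonal 3x3 matrix that maps wp_src onto wp_dst.

    Assumption: both white points are raw RGB responses to the same D65
    illuminant; each is normalized so that its green channel equals 1.
    """
    wp_src = np.asarray(wp_src, dtype=float)
    wp_dst = np.asarray(wp_dst, dtype=float)
    src = wp_src / wp_src[1]
    dst = wp_dst / wp_dst[1]
    return np.diag(dst / src)


def remap(rgb, M):
    """Apply the 3x3 matrix M to an (..., 3) array of RGB values."""
    return np.asarray(rgb, dtype=float) @ M.T


# Hypothetical D65 white points (raw RGB, green normalized to 1).
wp_train = np.array([0.45, 1.00, 0.62])  # sensor with labeled training data
wp_test = np.array([0.50, 1.00, 0.58])   # new sensor: one white point only

M = diagonal_mapping(wp_train, wp_test)

# Re-construct training image data and its ground-truth illuminant
# in the (assumed) colour space of the test sensor.
image = np.full((2, 2, 3), [0.20, 0.40, 0.30])
illuminant = np.array([0.40, 1.00, 0.70])
mapped_image = remap(image, M)
mapped_illuminant = remap(illuminant, M)
```

Under these assumptions, the re-constructed images and illuminants would then serve as the training inputs and ground truths for the second mapping phase, in which the lightweight MLP is fitted on sparse features.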
Related papers
- Multimodal Object Detection using Depth and Image Data for Manufacturing Parts [1.0819408603463427]
This work proposes a multi-sensor system combining a red-green-blue (RGB) camera and a 3D point cloud sensor.
A novel multimodal object detection method is developed to process both RGB and depth data.
The results show that the multimodal model significantly outperforms the depth-only and RGB-only baselines on established object detection metrics.
arXiv Detail & Related papers (2024-11-13T22:43:15Z)
- bit2bit: 1-bit quanta video reconstruction via self-supervised photon prediction [57.199618102578576]
We propose bit2bit, a new method for reconstructing high-quality image stacks at original resolution from sparse binary quanta image data.
Inspired by recent work on Poisson denoising, we developed an algorithm that creates a dense image sequence from sparse binary photon data.
We present a novel dataset containing a wide range of real SPAD high-speed videos under various challenging imaging conditions.
arXiv Detail & Related papers (2024-10-30T17:30:35Z)
- Robust Depth Enhancement via Polarization Prompt Fusion Tuning [112.88371907047396]
We present a framework that leverages polarization imaging to improve inaccurate depth measurements from various depth sensors.
Our method first adopts a learning-based strategy where a neural network is trained to estimate a dense and complete depth map from polarization data and a sensor depth map from different sensors.
To further improve the performance, we propose a Polarization Prompt Fusion Tuning (PPFT) strategy to effectively utilize RGB-based models pre-trained on large-scale datasets.
arXiv Detail & Related papers (2024-04-05T17:55:33Z)
- GenISP: Neural ISP for Low-Light Machine Cognition [19.444297600977546]
In low-light conditions, object detectors using raw image data are more robust than detectors using image data processed by an ISP pipeline.
We propose a minimal neural ISP pipeline for machine cognition, named GenISP, that explicitly incorporates Color Space Transformation to a device-independent color space.
arXiv Detail & Related papers (2022-05-07T17:17:24Z)
- VPFNet: Improving 3D Object Detection with Virtual Point based LiDAR and Stereo Data Fusion [62.24001258298076]
VPFNet is a new architecture that cleverly aligns and aggregates the point cloud and image data at the 'virtual' points.
Our VPFNet achieves 83.21% moderate 3D AP and 91.86% moderate BEV AP on the KITTI test set, ranking 1st since May 21st, 2021.
arXiv Detail & Related papers (2021-11-29T08:51:20Z)
- Facial Depth and Normal Estimation using Single Dual-Pixel Camera [81.02680586859105]
We introduce a DP-oriented Depth/Normal network that reconstructs the 3D facial geometry.
It contains the corresponding ground-truth 3D models including depth map and surface normal in metric scale.
It achieves state-of-the-art performances over recent DP-based depth/normal estimation methods.
arXiv Detail & Related papers (2021-11-25T05:59:27Z)
- High-speed object detection with a single-photon time-of-flight image sensor [2.648554238948439]
We present results from a portable SPAD camera system that outputs 16-bin photon timing histograms with 64x32 spatial resolution.
The results are relevant for safety-critical computer vision applications which would benefit from faster than human reaction times.
arXiv Detail & Related papers (2021-07-28T14:53:44Z)
- EPMF: Efficient Perception-aware Multi-sensor Fusion for 3D Semantic Segmentation [62.210091681352914]
We study multi-sensor fusion for 3D semantic segmentation for many applications, such as autonomous driving and robotics.
In this work, we investigate a collaborative fusion scheme called perception-aware multi-sensor fusion (PMF).
We propose a two-stream network to extract features from the two modalities separately. The extracted features are fused by effective residual-based fusion modules.
arXiv Detail & Related papers (2021-06-21T10:47:26Z)
- A Single Stream Network for Robust and Real-time RGB-D Salient Object Detection [89.88222217065858]
We design a single stream network to use the depth map to guide early fusion and middle fusion between RGB and depth.
This model is 55.5% lighter than the current lightest model and runs at a real-time speed of 32 FPS when processing a $384 \times 384$ image.
arXiv Detail & Related papers (2020-07-14T04:40:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.