SEL-CIE: Knowledge-Guided Self-Supervised Learning Framework for CIE-XYZ Reconstruction from Non-Linear sRGB Images
- URL: http://arxiv.org/abs/2405.12265v1
- Date: Mon, 20 May 2024 17:20:41 GMT
- Title: SEL-CIE: Knowledge-Guided Self-Supervised Learning Framework for CIE-XYZ Reconstruction from Non-Linear sRGB Images
- Authors: Shir Barzel, Moshe Salhov, Ofir Lindenbaum, Amir Averbuch
- Abstract summary: The CIE-XYZ color space is a device-independent linear space used as part of the camera pipeline.
Images are usually saved in non-linear states, and recovering CIE-XYZ images from them with conventional methods is not always possible.
This paper proposes a framework for using SSL methods alongside paired data to reconstruct CIE-XYZ images and re-render sRGB images, outperforming existing approaches.
- Score: 7.932206255996779
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Modern cameras typically offer two types of image states: a minimally processed linear raw RGB image representing the raw sensor data, and a highly-processed non-linear image state, such as the sRGB state. The CIE-XYZ color space is a device-independent linear space used as part of the camera pipeline and can be helpful for computer vision tasks, such as image deblurring, dehazing, and color recognition tasks in medical applications, where color accuracy is important. However, images are usually saved in non-linear states, and achieving CIE-XYZ color images using conventional methods is not always possible. To tackle this issue, classical methodologies have been developed that focus on reversing the acquisition pipeline. More recently, supervised learning has been employed, using paired CIE-XYZ and sRGB representations of identical images. However, obtaining a large-scale dataset of CIE-XYZ and sRGB pairs can be challenging. To overcome this limitation and mitigate the reliance on large amounts of paired data, self-supervised learning (SSL) can be utilized as a substitute for relying solely on paired data. This paper proposes a framework for using SSL methods alongside paired data to reconstruct CIE-XYZ images and re-render sRGB images, outperforming existing approaches. The proposed framework is applied to the sRGB2XYZ dataset.
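The abstract does not spell out SEL-CIE's networks or losses, so the PyTorch sketch below is only an illustration of the general idea it describes: combining a supervised loss on paired sRGB/CIE-XYZ data with a self-supervised sRGB → XYZ → sRGB cycle term on unpaired sRGB images. The module names (`SRGBToXYZ`, `XYZToSRGB`), the toy architectures, and the `cycle_weight` parameter are assumptions for illustration, not the authors' design.

```python
# Illustrative sketch only: the module names, toy CNNs, and the cycle-consistency
# objective are assumptions inferred from the abstract, not SEL-CIE's actual method.
import torch
import torch.nn as nn


class SRGBToXYZ(nn.Module):
    """Toy CNN standing in for the sRGB -> CIE-XYZ reconstruction network."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, srgb):
        return self.net(srgb)


class XYZToSRGB(nn.Module):
    """Toy CNN standing in for the CIE-XYZ -> sRGB re-rendering network."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, xyz):
        return self.net(xyz)


def training_loss(to_xyz, to_srgb, srgb_paired, xyz_gt, srgb_unpaired, cycle_weight=0.1):
    """Paired supervision on (sRGB, XYZ) pairs plus a self-supervised
    sRGB -> XYZ -> sRGB cycle term on unpaired sRGB images."""
    l1 = nn.L1Loss()
    # Supervised term: uses ground-truth CIE-XYZ targets (e.g., sRGB2XYZ pairs).
    xyz_pred = to_xyz(srgb_paired)
    loss_sup = l1(xyz_pred, xyz_gt) + l1(to_srgb(xyz_gt), srgb_paired)
    # Self-supervised term: needs only sRGB images, no CIE-XYZ ground truth.
    loss_cycle = l1(to_srgb(to_xyz(srgb_unpaired)), srgb_unpaired)
    return loss_sup + cycle_weight * loss_cycle


if __name__ == "__main__":
    to_xyz, to_srgb = SRGBToXYZ(), XYZToSRGB()
    srgb_p, xyz_gt = torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64)
    srgb_u = torch.rand(2, 3, 64, 64)
    print(training_loss(to_xyz, to_srgb, srgb_p, xyz_gt, srgb_u).item())
```

In this sketch, unpaired sRGB images contribute only through the cycle term, which is what reduces the dependence on a large paired sRGB/CIE-XYZ dataset.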
Related papers
- Leveraging Color Channel Independence for Improved Unsupervised Object Detection [7.030688465389997]
We challenge the common assumption that RGB images are the optimal color space for unsupervised learning in computer vision.
We show that models improve when requiring them to predict additional color channels.
Composite color spaces can be used with essentially no computational overhead.
arXiv Detail & Related papers (2024-12-19T18:28:37Z)
- Diffusion-based RGB-D Semantic Segmentation with Deformable Attention Transformer [10.982521876026281]
We introduce a diffusion-based framework to address the RGB-D semantic segmentation problem.
We demonstrate that utilizing a Deformable Attention Transformer as the encoder to extract features from depth images effectively captures the characteristics of invalid regions in depth measurements.
arXiv Detail & Related papers (2024-09-23T15:23:01Z)
- Beyond Learned Metadata-based Raw Image Reconstruction [86.1667769209103]
Raw images have distinct advantages over sRGB images, e.g., linearity and fine-grained quantization levels.
However, they are not widely adopted by general users due to their substantial storage requirements.
We propose a novel framework that learns a compact representation in the latent space, serving as metadata.
arXiv Detail & Related papers (2023-06-21T06:59:07Z)
- Raw Image Reconstruction with Learned Compact Metadata [61.62454853089346]
We propose a novel framework to learn a compact representation in the latent space serving as the metadata in an end-to-end manner.
We show how the proposed raw image compression scheme can adaptively allocate more bits to image regions that are important from a global perspective.
arXiv Detail & Related papers (2023-02-25T05:29:45Z)
- Model-Based Image Signal Processors via Learnable Dictionaries [6.766416093990318]
Digital cameras transform sensor RAW readings into RGB images by means of their Image Signal Processor (ISP).
Recent approaches have attempted to bridge this gap by estimating the RGB to RAW mapping.
We present a novel hybrid model-based and data-driven ISP that is both learnable and interpretable.
arXiv Detail & Related papers (2022-01-10T08:36:10Z)
- Learning RAW-to-sRGB Mappings with Inaccurately Aligned Supervision [76.41657124981549]
This paper presents a joint learning model for image alignment and RAW-to-sRGB mapping.
Experiments show that our method performs favorably against state-of-the-art methods on the ZRR and SR-RAW datasets.
arXiv Detail & Related papers (2021-08-18T12:41:36Z)
- Semantic-embedded Unsupervised Spectral Reconstruction from Single RGB Images in the Wild [48.44194221801609]
We propose a new lightweight and end-to-end learning-based framework to tackle this challenge.
We progressively spread the differences between input RGB images and RGB images re-projected from the recovered hyperspectral (HS) images via effective estimation of the camera spectral response function.
Our method significantly outperforms state-of-the-art unsupervised methods and even exceeds the latest supervised method under some settings.
arXiv Detail & Related papers (2021-08-15T05:19:44Z)
- Self-Supervised Representation Learning for RGB-D Salient Object Detection [93.17479956795862]
We use Self-Supervised Representation Learning to design two pretext tasks: the cross-modal auto-encoder and the depth-contour estimation.
Our pretext tasks require only a few unlabeled RGB-D datasets for pre-training, which makes the network capture rich semantic contexts.
For the inherent problem of cross-modal fusion in RGB-D SOD, we propose a multi-path fusion module.
arXiv Detail & Related papers (2021-01-29T09:16:06Z)
- CIE XYZ Net: Unprocessing Images for Low-Level Computer Vision Tasks [45.820956016608314]
Cameras currently allow access to two image states: (i) a minimally processed linear raw-RGB image state (i.e., raw sensor data) or (ii) a highly processed nonlinear image state (e.g., sRGB).
There are many computer vision tasks that work best with a linear image state, such as image deblurring and image dehazing.
We propose a deep learning framework, CIE XYZ Net, that can unprocess a nonlinear image back to the canonical CIE XYZ image; a sketch of the classical per-pixel conversion that such learned unprocessing replaces follows this list.
arXiv Detail & Related papers (2020-06-23T02:59:11Z)
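For reference, the conventional, camera-agnostic route from non-linear sRGB to CIE-XYZ is the inverse sRGB transfer function followed by the standard D65 linear-RGB-to-XYZ matrix. The NumPy sketch below shows only this idealized formula; it ignores the device-specific rendering pipeline that learned unprocessing methods such as CIE XYZ Net and SEL-CIE are designed to model, and the function names are illustrative.

```python
# Minimal sketch of the conventional, camera-agnostic sRGB -> CIE-XYZ conversion:
# invert the sRGB transfer function, then apply the standard D65 matrix.
# This idealized formula ignores the device-specific rendering (white balance,
# tone mapping, local operators) that learned unprocessing methods aim to undo.
import numpy as np

# Standard sRGB (D65) linear-RGB -> XYZ matrix.
M_SRGB_TO_XYZ = np.array([
    [0.4124, 0.3576, 0.1805],
    [0.2126, 0.7152, 0.0722],
    [0.0193, 0.1192, 0.9505],
])


def srgb_to_linear(srgb):
    """Invert the sRGB transfer function; input in [0, 1], shape (..., 3)."""
    srgb = np.asarray(srgb, dtype=np.float64)
    return np.where(srgb <= 0.04045,
                    srgb / 12.92,
                    ((srgb + 0.055) / 1.055) ** 2.4)


def srgb_to_xyz(srgb):
    """Convert non-linear sRGB values to CIE-XYZ via the standard formula."""
    linear = srgb_to_linear(srgb)
    return linear @ M_SRGB_TO_XYZ.T


if __name__ == "__main__":
    # White (1, 1, 1) should map approximately to the D65 white point.
    print(srgb_to_xyz([1.0, 1.0, 1.0]))  # ~[0.9505, 1.0000, 1.0890]
```

Because real camera pipelines deviate from this idealized model, the learned approaches above typically recover more accurate CIE-XYZ values from rendered sRGB images than this closed-form conversion.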