Multi-scale HSV Color Feature Embedding for High-fidelity NIR-to-RGB Spectrum Translation
- URL: http://arxiv.org/abs/2404.16685v1
- Date: Thu, 25 Apr 2024 15:33:23 GMT
- Title: Multi-scale HSV Color Feature Embedding for High-fidelity NIR-to-RGB Spectrum Translation
- Authors: Huiyu Zhai, Mo Chen, Xingxing Yang, Gusheng Kang
- Abstract summary: Existing methods fail to reconcile the tension between maintaining texture detail fidelity and achieving diverse color variations.
We propose a Multi-scale HSV Color Feature Embedding Network (MCFNet) that decomposes the mapping process into three sub-tasks.
The proposed MCFNet demonstrates substantial performance gains on the NIR image colorization task.
- Score: 5.596598303356484
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The NIR-to-RGB spectral domain translation is a formidable task due to the inherent spectral mapping ambiguities within NIR inputs and RGB outputs. Thus, existing methods fail to reconcile the tension between maintaining texture detail fidelity and achieving diverse color variations. In this paper, we propose a Multi-scale HSV Color Feature Embedding Network (MCFNet) that decomposes the mapping process into three sub-tasks, including NIR texture maintenance, coarse geometry reconstruction, and RGB color prediction. Accordingly, we propose three key modules for each corresponding sub-task: the Texture Preserving Block (TPB), the HSV Color Feature Embedding Module (HSV-CFEM), and the Geometry Reconstruction Module (GRM). These modules contribute to our MCFNet methodically tackling spectral translation through a series of escalating resolutions, progressively enriching images with color and texture fidelity in a scale-coherent fashion. The proposed MCFNet demonstrates substantial performance gains on the NIR image colorization task. Code is released at: https://github.com/AlexYangxx/MCFNet.
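The abstract names three modules (TPB, HSV-CFEM, GRM) applied over escalating resolutions. Below is a minimal PyTorch sketch of that coarse-to-fine, three-sub-task decomposition; the module internals and the cross-scale fusion are illustrative placeholders, not the released MCFNet code (see the repository above for the authors' implementation).

```python
# Minimal sketch of the three-sub-task decomposition from the abstract
# (texture maintenance, geometry reconstruction, HSV color prediction).
# All module internals are placeholders, not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True))

class MCFNetSketch(nn.Module):
    def __init__(self, feats=32, scales=(0.25, 0.5, 1.0)):
        super().__init__()
        self.scales = scales
        self.tpb = conv_block(1, feats)        # stand-in for the Texture Preserving Block
        self.grm = conv_block(feats, feats)    # stand-in for the Geometry Reconstruction Module
        self.hsv_cfem = nn.Conv2d(feats, 3, 3, padding=1)  # stand-in for HSV-CFEM

    def forward(self, nir):                    # nir: (B, 1, H, W)
        out = None
        for s in self.scales:                  # coarse-to-fine, scale-coherent refinement
            x = F.interpolate(nir, scale_factor=s, mode='bilinear', align_corners=False)
            h = self.grm(self.tpb(x))
            hsv = torch.sigmoid(self.hsv_cfem(h))
            hsv = F.interpolate(hsv, size=nir.shape[-2:], mode='bilinear', align_corners=False)
            out = hsv if out is None else 0.5 * (out + hsv)  # toy cross-scale fusion
        return out                             # HSV prediction, to be converted to RGB

print(MCFNetSketch()(torch.randn(1, 1, 64, 64)).shape)  # torch.Size([1, 3, 64, 64])
```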
Related papers
- Enhancing RAW-to-sRGB with Decoupled Style Structure in Fourier Domain [27.1716081216131]
Current methods ignore the difference between cell phone RAW images and DSLR camera RGB images.
We present a novel Neural ISP framework, named FourierISP.
This approach breaks the image down into style and structure within the frequency domain, allowing for independent optimization.
arXiv Detail & Related papers (2024-01-04T09:18:31Z)
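The FourierISP entry above hinges on decoupling "style" and "structure" in the frequency domain. A common way to realize such a split (the paper's exact formulation may differ) is to treat the FFT amplitude as style and the phase as structure:

```python
# Lossless style/structure split in the Fourier domain.
import torch

def split_style_structure(img):           # img: (B, C, H, W) float tensor
    spec = torch.fft.fft2(img)            # complex spectrum
    return spec.abs(), spec.angle()       # amplitude ("style"), phase ("structure")

def recombine(amplitude, phase):
    spec = torch.polar(amplitude, phase)  # amplitude * exp(i * phase)
    return torch.fft.ifft2(spec).real

a = torch.rand(1, 3, 32, 32)
b = torch.rand(1, 3, 32, 32)
amp_a, pha_a = split_style_structure(a)
_, pha_b = split_style_structure(b)
swapped = recombine(amp_a, pha_b)         # a's "style" imposed on b's "structure"
print(torch.allclose(recombine(amp_a, pha_a), a, atol=1e-5))  # True: lossless split
```

Because the split is lossless, the two components can be optimized independently and recombined, which is the independent optimization the entry describes.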
- Multi-scale Progressive Feature Embedding for Accurate NIR-to-RGB Spectral Domain Translation [6.580484964018551]
We introduce a domain translation module that translates NIR source images into the grayscale target domain.
By incorporating a progressive training strategy, the statistical and semantic knowledge from both task domains is efficiently aligned.
Experiments show that our MPFNet outperforms state-of-the-art counterparts by 2.55 dB in the NIR-to-RGB spectral domain translation task.
arXiv Detail & Related papers (2023-12-26T13:07:45Z)
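MPFNet (above) routes NIR through an easier intermediate target: first translate NIR into the grayscale domain, then colorize. A two-stage sketch of that pipeline, with placeholder layers rather than the paper's progressive multi-scale architecture:

```python
# Two-stage NIR -> grayscale -> RGB pipeline (placeholder layers).
import torch
import torch.nn as nn

class TwoStageNIR2RGB(nn.Module):
    def __init__(self):
        super().__init__()
        # Stage 1: NIR -> grayscale (bridges the spectral gap)
        self.nir2gray = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                                      nn.Conv2d(16, 1, 3, padding=1))
        # Stage 2: grayscale -> RGB (a standard colorization problem)
        self.gray2rgb = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                                      nn.Conv2d(16, 3, 3, padding=1))

    def forward(self, nir):
        gray = self.nir2gray(nir)     # supervised against true grayscale targets
        return gray, self.gray2rgb(gray)

gray, rgb = TwoStageNIR2RGB()(torch.randn(2, 1, 64, 64))
print(gray.shape, rgb.shape)          # (2, 1, 64, 64) (2, 3, 64, 64)
```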
- Cooperative Colorization: Exploring Latent Cross-Domain Priors for NIR Image Spectrum Translation [5.28882362783108]
Near-infrared (NIR) image spectrum translation is a challenging problem with many promising applications.
We propose a cooperative learning paradigm that colorizes NIR images in parallel with another proxy grayscale colorization task.
Experiments show that our proposed cooperative learning framework produces satisfactory spectrum translation outputs with diverse colors and rich textures.
arXiv Detail & Related papers (2023-08-07T07:02:42Z)
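The cooperative paradigm above colorizes NIR in parallel with a proxy grayscale colorization task. A minimal sketch of the joint training; the shared decoder here is an assumption for illustration, not the paper's exact coupling:

```python
# Joint NIR + grayscale colorization with a shared decoder (illustrative only).
import torch
import torch.nn as nn

shared_decoder = nn.Sequential(nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
                               nn.Conv2d(16, 3, 3, padding=1))
nir_encoder  = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
gray_encoder = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())

nir  = torch.randn(2, 1, 32, 32)
gray = torch.randn(2, 1, 32, 32)
rgb_gt = torch.randn(2, 3, 32, 32)

loss = nn.functional.l1_loss(shared_decoder(nir_encoder(nir)), rgb_gt) \
     + nn.functional.l1_loss(shared_decoder(gray_encoder(gray)), rgb_gt)
loss.backward()   # gradients from both tasks shape the shared colorization prior
```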
- Spherical Space Feature Decomposition for Guided Depth Map Super-Resolution [123.04455334124188]
Guided depth map super-resolution (GDSR) aims to upsample low-resolution (LR) depth maps with additional information involved in high-resolution (HR) RGB images from the same scene.
In this paper, we propose the Spherical Space feature Decomposition Network (SSDNet) to solve the above issues.
Our method can achieve state-of-the-art results on four test datasets, as well as successfully generalize to real-world scenes.
arXiv Detail & Related papers (2023-03-15T21:22:21Z)
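The GDSR setup above upsamples a low-resolution depth map under high-resolution RGB guidance. A generic residual-guidance baseline for that problem (not SSDNet's spherical-space decomposition, which is the paper's actual contribution):

```python
# RGB-guided residual refinement of a bicubically upsampled depth map.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GuidedDepthSR(nn.Module):
    def __init__(self):
        super().__init__()
        self.refine = nn.Sequential(nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(),
                                    nn.Conv2d(32, 1, 3, padding=1))

    def forward(self, lr_depth, hr_rgb):
        up = F.interpolate(lr_depth, size=hr_rgb.shape[-2:],
                           mode='bicubic', align_corners=False)
        return up + self.refine(torch.cat([up, hr_rgb], dim=1))  # RGB-guided residual

out = GuidedDepthSR()(torch.randn(1, 1, 32, 32), torch.randn(1, 3, 128, 128))
print(out.shape)  # torch.Size([1, 1, 128, 128])
```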
- Towards Reliable Image Outpainting: Learning Structure-Aware Multimodal Fusion with Depth Guidance [49.94504248096527]
We propose a Depth-Guided Outpainting Network (DGONet) to model the feature representations of different modalities.
Two components are designed: 1) the Multimodal Learning Module produces unique depth and RGB feature representations from the perspectives of their different modal characteristics.
We specially design an additional constraint strategy consisting of Cross-modal Loss and Edge Loss to enhance ambiguous contours and expedite reliable content generation.
arXiv Detail & Related papers (2022-04-12T06:06:50Z)
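DGONet's constraint strategy above pairs a Cross-modal Loss with an Edge Loss to enhance ambiguous contours. A sketch of a Sobel-based edge loss; the paper's exact loss definitions may differ:

```python
# Sobel-gradient edge loss: penalize mismatched contours between prediction and target.
import torch
import torch.nn.functional as F

def sobel_edges(x):                        # x: (B, 1, H, W)
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)                # y-direction Sobel kernel
    gx, gy = F.conv2d(x, kx, padding=1), F.conv2d(x, ky, padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)

def edge_loss(pred, target):
    return F.l1_loss(sobel_edges(pred), sobel_edges(target))

print(edge_loss(torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)))
```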
- Cross-modality Discrepant Interaction Network for RGB-D Salient Object Detection [78.47767202232298]
We propose a novel Cross-modality Discrepant Interaction Network (CDINet) for RGB-D SOD.
Two components are designed to implement the effective cross-modality interaction.
Our network outperforms 15 state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2021-08-04T11:24:42Z)
- Attention-Guided NIR Image Colorization via Adaptive Fusion of Semantic and Texture Clues [6.437931036166344]
Near infrared (NIR) imaging has been widely applied in low-light imaging scenarios.
It is difficult for humans and algorithms to perceive the real scene in the colorless NIR domain.
We propose a novel Attention-based NIR image colorization framework via Adaptive Fusion of Semantic and Texture clues.
arXiv Detail & Related papers (2021-07-20T03:00:51Z)
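The colorization framework above adaptively fuses semantic and texture clues. One minimal reading of "adaptive fusion" is per-pixel attention that blends the two feature streams (placeholder modules, not the paper's architecture):

```python
# Per-pixel attention blending of semantic and texture feature streams.
import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.att = nn.Sequential(nn.Conv2d(2 * c, c, 1), nn.Sigmoid())

    def forward(self, semantic, texture):   # both (B, C, H, W)
        a = self.att(torch.cat([semantic, texture], dim=1))
        return a * semantic + (1 - a) * texture  # per-pixel soft selection

fused = AdaptiveFusion(16)(torch.randn(1, 16, 32, 32), torch.randn(1, 16, 32, 32))
print(fused.shape)  # torch.Size([1, 16, 32, 32])
```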
- Self-Supervised Representation Learning for RGB-D Salient Object Detection [93.17479956795862]
We use Self-Supervised Representation Learning to design two pretext tasks: the cross-modal auto-encoder and the depth-contour estimation.
Our pretext tasks require only a few unlabeled RGB-D datasets to perform pre-training, which makes the network capture rich semantic contexts.
For the inherent problem of cross-modal fusion in RGB-D SOD, we propose a multi-path fusion module.
arXiv Detail & Related papers (2021-01-29T09:16:06Z)
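The self-supervised entry above pre-trains with a cross-modal auto-encoder pretext task. The core idea needs no labels: one modality of a paired RGB-D sample supervises the prediction of the other, as in this sketch:

```python
# Cross-modal pretext task: predict depth from RGB; the paired depth map is the label.
import torch
import torch.nn as nn

rgb2depth = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(16, 1, 3, padding=1))

rgb, depth = torch.rand(2, 3, 64, 64), torch.rand(2, 1, 64, 64)
pretext_loss = nn.functional.mse_loss(rgb2depth(rgb), depth)
pretext_loss.backward()  # no human annotation needed for pre-training
```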
- Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation [59.94819184452694]
Depth information has proven to be a useful cue in the semantic segmentation of RGBD images for providing a geometric counterpart to the RGB representation.
Most existing works simply assume that depth measurements are accurate and well-aligned with the RGB pixels, and model the problem as cross-modal feature fusion.
In this paper, we propose a unified and efficient cross-modality guided encoder that not only effectively recalibrates RGB feature responses, but also distills accurate depth information via multiple stages and aggregates the two recalibrated representations alternately.
arXiv Detail & Related papers (2020-07-17T18:35:24Z)
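The segmentation paper above recalibrates RGB features with depth (and vice versa) before aggregating them. A channel-wise sketch in the spirit of that description; the actual Separation-and-Aggregation Gate is more elaborate:

```python
# Channel-wise cross-modal recalibration, then aggregation (illustrative only).
import torch
import torch.nn as nn

class CrossRecalibrate(nn.Module):
    def __init__(self, c):
        super().__init__()
        # a single shared gate for both directions, purely for brevity
        self.gate = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Conv2d(c, c, 1), nn.Sigmoid())

    def forward(self, rgb_feat, depth_feat):
        rgb_r = rgb_feat * self.gate(depth_feat)    # depth decides which RGB channels to trust
        depth_r = depth_feat * self.gate(rgb_feat)  # and vice versa
        return rgb_r + depth_r                      # aggregate the recalibrated features

out = CrossRecalibrate(16)(torch.randn(1, 16, 32, 32), torch.randn(1, 16, 32, 32))
print(out.shape)  # torch.Size([1, 16, 32, 32])
```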
- Cross-Modal Weighting Network for RGB-D Salient Object Detection [76.0965123893641]
We propose a novel Cross-Modal Weighting (CMW) strategy to encourage comprehensive interactions between RGB and depth channels for RGB-D SOD.
Specifically, three RGB-depth interaction modules, named CMW-L, CMW-M and CMW-H, are developed to deal with respectively low-, middle- and high-level cross-modal information fusion.
CMWNet consistently outperforms 15 state-of-the-art RGB-D SOD methods on seven popular benchmarks.
arXiv Detail & Related papers (2020-07-09T16:01:44Z)
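CMW above applies cross-modal weighting at low, middle, and high feature levels (CMW-L/M/H). A sketch of the level-wise gating pattern, with placeholder internals for the three modules:

```python
# Level-wise cross-modal weighting: depth-derived weights gate RGB features.
import torch
import torch.nn as nn

class CMWBlock(nn.Module):                    # stand-in for CMW-L / CMW-M / CMW-H
    def __init__(self, c):
        super().__init__()
        self.w = nn.Sequential(nn.Conv2d(c, c, 1), nn.Sigmoid())

    def forward(self, rgb_feat, depth_feat):
        return rgb_feat * self.w(depth_feat)  # depth-derived weights gate RGB

levels = nn.ModuleList([CMWBlock(c) for c in (16, 32, 64)])  # low / middle / high
feats = [(torch.randn(1, c, s, s), torch.randn(1, c, s, s))
         for c, s in ((16, 64), (32, 32), (64, 16))]
fused = [blk(r, d) for blk, (r, d) in zip(levels, feats)]
print([f.shape for f in fused])
```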
- Fast Generation of High Fidelity RGB-D Images by Deep-Learning with Adaptive Convolution [10.085742605397124]
We propose a deep-learning based approach to efficiently generate RGB-D images with completed information in high resolution.
As an end-to-end approach, high fidelity RGB-D images can be generated efficiently at the rate of around 21 frames per second.
arXiv Detail & Related papers (2020-02-12T16:14:38Z)
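"Adaptive convolution" in the last entry suggests spatially varying kernels. A common implementation predicts a per-pixel kernel and applies it with F.unfold, as sketched below; whether the paper uses this exact kernel-prediction form is an assumption:

```python
# Kernel-prediction style adaptive convolution: a per-pixel 3x3 filter from RGB-D guidance.
import torch
import torch.nn as nn
import torch.nn.functional as F

k = 3
predict_kernels = nn.Conv2d(4, k * k, 3, padding=1)    # per-pixel 3x3 kernel from RGB-D

x = torch.randn(1, 1, 32, 32)                          # channel to be filtered (e.g. depth)
guide = torch.randn(1, 4, 32, 32)                      # RGB-D guidance
kernels = torch.softmax(predict_kernels(guide), dim=1) # (1, 9, 32, 32), normalized weights

patches = F.unfold(x, k, padding=1).view(1, k * k, 32, 32)  # 3x3 neighborhoods per pixel
out = (patches * kernels).sum(dim=1, keepdim=True)     # per-pixel adaptive filtering
print(out.shape)  # torch.Size([1, 1, 32, 32])
```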
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.