Multi-scale Progressive Feature Embedding for Accurate NIR-to-RGB
Spectral Domain Translation
- URL: http://arxiv.org/abs/2312.16040v1
- Date: Tue, 26 Dec 2023 13:07:45 GMT
- Title: Multi-scale Progressive Feature Embedding for Accurate NIR-to-RGB
Spectral Domain Translation
- Authors: Xingxing Yang, Jie Chen, Zaifeng Yang
- Abstract summary: We introduce a domain translation module that translates NIR source images into the grayscale target domain.
By incorporating a progressive training strategy, the statistical and semantic knowledge from both task domains is efficiently aligned.
Experiments show that our MPFNet outperforms state-of-the-art counterparts by 2.55 dB in the NIR-to-RGB spectral domain translation task.
- Score: 6.580484964018551
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: NIR-to-RGB spectral domain translation is a challenging task due to the
mapping ambiguities, and existing methods show limited learning capacities. To
address these challenges, we propose to colorize NIR images via a multi-scale
progressive feature embedding network (MPFNet), with the guidance of grayscale
image colorization. Specifically, we first introduce a domain translation
module that translates NIR source images into the grayscale target domain. By
incorporating a progressive training strategy, the statistical and semantic
knowledge from both task domains is efficiently aligned with a series of
pixel- and feature-level consistency constraints. In addition, a multi-scale
progressive feature embedding network is designed to improve learning
capabilities. Experiments show that our MPFNet outperforms state-of-the-art
counterparts by 2.55 dB in the NIR-to-RGB spectral domain translation task in
terms of PSNR.
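As a reading aid (not part of the paper), PSNR is computed from mean squared error as 10·log10(peak²/MSE), so the reported 2.55 dB gain corresponds to roughly a 10^0.255 ≈ 1.8× reduction in MSE. A minimal pure-Python sketch:

```python
import math

def psnr(reference, estimate, peak=1.0):
    """Peak Signal-to-Noise Ratio in dB between two images given as
    flat sequences of pixel values in [0, peak]."""
    assert len(reference) == len(estimate) and len(reference) > 0
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

# A uniform error of 0.1 on a peak-1.0 image gives MSE = 0.01, i.e. 20 dB.
print(round(psnr([1.0, 0.5, 0.0, 0.25], [0.9, 0.4, 0.1, 0.35]), 2))  # prints 20.0
```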
Related papers
- Pix2Next: Leveraging Vision Foundation Models for RGB to NIR Image Translation [0.536022165180739]
We propose a novel image-to-image translation framework, Pix2Next, to generate high-quality Near-Infrared (NIR) images from RGB inputs.
A multi-scale PatchGAN discriminator ensures realistic image generation at various detail levels, while carefully designed loss functions couple global context understanding with local feature preservation.
The proposed approach enables the scaling up of NIR datasets without additional data acquisition or annotation efforts, potentially accelerating advancements in NIR-based computer vision applications.
arXiv Detail & Related papers (2024-09-25T07:51:47Z) - Near-Infrared and Low-Rank Adaptation of Vision Transformers in Remote Sensing [3.2088888904556123]
Plant health can be monitored dynamically using multispectral sensors that measure Near-Infrared (NIR) reflectance.
Despite this potential, obtaining and annotating high-resolution NIR images poses a significant challenge for training deep neural networks.
This study investigates the potential benefits of using vision transformer (ViT) backbones pre-trained in the RGB domain, with low-rank adaptation for downstream tasks in the NIR domain.
arXiv Detail & Related papers (2024-05-28T07:24:07Z) - Multi-scale HSV Color Feature Embedding for High-fidelity NIR-to-RGB Spectrum Translation [5.596598303356484]
Existing methods fail to reconcile the tension between maintaining texture detail fidelity and achieving diverse color variations.
We propose a Multi-scale HSV Color Feature Embedding Network (MCFNet) that decomposes the mapping process into three sub-tasks.
The proposed MCFNet demonstrates substantial performance gains on the NIR image colorization task.
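For illustration only (the exact sub-task formulation is not given in this summary), decomposing RGB into HSV separates color (hue, saturation) from brightness (value), which is the kind of split such a network can exploit as distinct prediction targets. Python's standard `colorsys` module sketches the round trip:

```python
import colorsys

def decompose(r, g, b):
    """Split an RGB pixel (components in [0, 1]) into hue, saturation, value,
    so color (H, S) and brightness (V) can be handled separately."""
    return colorsys.rgb_to_hsv(r, g, b)

def recompose(h, s, v):
    """Inverse mapping back to RGB."""
    return colorsys.hsv_to_rgb(h, s, v)

pixel = (0.2, 0.6, 0.4)        # a greenish pixel
h, s, v = decompose(*pixel)    # v equals the max channel, here 0.6
restored = recompose(h, s, v)  # round-trips to (0.2, 0.6, 0.4)
```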
arXiv Detail & Related papers (2024-04-25T15:33:23Z) - You Only Need One Color Space: An Efficient Network for Low-light Image Enhancement [50.37253008333166]
The Low-Light Image Enhancement (LLIE) task aims to restore detail and visual information from corrupted low-light images.
We propose a novel trainable color space, named Horizontal/Vertical-Intensity (HVI).
It not only decouples brightness and color from the RGB channels to mitigate instability during enhancement, but also adapts to low-light images across different illumination ranges thanks to its trainable parameters.
arXiv Detail & Related papers (2024-02-08T16:47:43Z) - Cooperative Colorization: Exploring Latent Cross-Domain Priors for NIR
Image Spectrum Translation [5.28882362783108]
Near-infrared (NIR) image spectrum translation is a challenging problem with many promising applications.
We propose a cooperative learning paradigm that colorizes NIR images in parallel with another proxy grayscale colorization task.
Experiments show that our proposed cooperative learning framework produces satisfactory spectrum translation outputs with diverse colors and rich textures.
arXiv Detail & Related papers (2023-08-07T07:02:42Z) - Towards Reliable Image Outpainting: Learning Structure-Aware Multimodal
Fusion with Depth Guidance [49.94504248096527]
We propose a Depth-Guided Outpainting Network (DGONet) to model the feature representations of different modalities.
Two components are designed: 1) the Multimodal Learning Module, which produces unique depth and RGB feature representations from the perspectives of different modal characteristics.
We specially design an additional constraint strategy consisting of Cross-modal Loss and Edge Loss to enhance ambiguous contours and expedite reliable content generation.
arXiv Detail & Related papers (2022-04-12T06:06:50Z) - TBNet:Two-Stream Boundary-aware Network for Generic Image Manipulation
Localization [49.521622399483846]
We propose a novel end-to-end two-stream boundary-aware network (abbreviated as TBNet) for generic image manipulation localization.
The proposed TBNet can significantly outperform state-of-the-art generic image manipulation localization methods in terms of both MCC and F1.
arXiv Detail & Related papers (2021-08-10T08:22:05Z) - Cross-modality Discrepant Interaction Network for RGB-D Salient Object
Detection [78.47767202232298]
We propose a novel Cross-modality Discrepant Interaction Network (CDINet) for RGB-D SOD.
Two components are designed to implement effective cross-modality interaction.
Our network outperforms 15 state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2021-08-04T11:24:42Z) - Attention-Guided NIR Image Colorization via Adaptive Fusion of Semantic
and Texture Clues [6.437931036166344]
Near infrared (NIR) imaging has been widely applied in low-light imaging scenarios.
It is difficult for humans and algorithms to perceive the real scene in the colorless NIR domain.
We propose a novel Attention-based NIR image colorization framework via Adaptive Fusion of Semantic and Texture clues.
arXiv Detail & Related papers (2021-07-20T03:00:51Z) - Self-Supervised Representation Learning for RGB-D Salient Object
Detection [93.17479956795862]
We use Self-Supervised Representation Learning to design two pretext tasks: the cross-modal auto-encoder and the depth-contour estimation.
Our pretext tasks require only a few unlabeled RGB-D datasets for pre-training, enabling the network to capture rich semantic contexts.
For the inherent problem of cross-modal fusion in RGB-D SOD, we propose a multi-path fusion module.
arXiv Detail & Related papers (2021-01-29T09:16:06Z) - Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration tasks.
We present a novel architecture with the collective goal of maintaining spatially precise, high-resolution representations throughout the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.