Histogram Assisted Quality Aware Generative Model for Resolution Invariant NIR Image Colorization
- URL: http://arxiv.org/abs/2601.01103v1
- Date: Sat, 03 Jan 2026 07:46:59 GMT
- Title: Histogram Assisted Quality Aware Generative Model for Resolution Invariant NIR Image Colorization
- Authors: Abhinav Attri, Rajeev Ranjan Dwivedi, Samiran Das, Vinod Kumar Kurmi
- Abstract summary: We present HAQAGen, a unified generative model for resolution-invariant NIR-to-RGB colorization. The proposed model introduces (i) a combined loss term that aligns global color statistics through differentiable histogram matching, a perceptual image-quality measure, and feature-based similarity to preserve texture information. We also introduce an adaptive-resolution inference engine that enables high-resolution translation without sacrificing quality.
- Score: 0.9064664319018063
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present HAQAGen, a unified generative model for resolution-invariant NIR-to-RGB colorization that balances chromatic realism with structural fidelity. The proposed model introduces (i) a combined loss term that aligns global color statistics through differentiable histogram matching, a perceptual image-quality measure, and feature-based similarity to preserve texture information, (ii) local hue-saturation priors injected via Spatially Adaptive Denormalization (SPADE) to stabilize chromatic reconstruction, and (iii) texture-aware supervision within a Mamba backbone to preserve fine details. We further introduce an adaptive-resolution inference engine that enables high-resolution translation without sacrificing quality. The proposed NIR-to-RGB translation model simultaneously enforces global color statistics and local chromatic consistency while scaling to native resolutions without compromising texture fidelity or generalization. Extensive evaluations on FANVID, OMSIV, VCIP2020, and RGB2NIR across multiple evaluation metrics demonstrate consistent improvements over state-of-the-art baselines. HAQAGen produces images with sharper textures and more natural colors, attaining significant gains on perceptual metrics. These results position HAQAGen as a scalable and effective solution for NIR-to-RGB translation across diverse imaging scenarios. Project Page: https://rajeev-dw9.github.io/HAQAGen/
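The abstract does not give the exact loss formulation, but the differentiable histogram-matching idea can be sketched with soft (Gaussian-kernel) binning, which keeps the histogram differentiable with respect to pixel values. The bin count, kernel width, and L1 comparison below are illustrative assumptions, shown in NumPy for clarity:

```python
import numpy as np

def soft_histogram(x, bins=64, sigma=0.02):
    """Soft histogram of values in [0, 1]: each pixel contributes a
    Gaussian weight to every bin center instead of a hard count,
    so the result is differentiable w.r.t. the pixel values."""
    centers = np.linspace(0.0, 1.0, bins)
    w = np.exp(-0.5 * ((x[:, None] - centers[None, :]) / sigma) ** 2)
    hist = w.sum(axis=0)
    return hist / hist.sum()  # normalize to a probability distribution

def histogram_matching_loss(pred, target, bins=64):
    """Mean L1 distance between per-channel soft histograms.
    pred, target: (H, W, C) images with values in [0, 1]."""
    losses = []
    for ch in range(pred.shape[-1]):
        hp = soft_histogram(pred[..., ch].ravel(), bins)
        ht = soft_histogram(target[..., ch].ravel(), bins)
        losses.append(np.abs(hp - ht).sum())
    return float(np.mean(losses))
```

In a training loop this term would be combined with the perceptual and feature-similarity losses; a PyTorch version would replace the NumPy ops one-for-one so that gradients flow through the Gaussian weights into the generator.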
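The adaptive-resolution inference engine is likewise not specified in the abstract. One common way to obtain resolution-invariant inference is overlapping tiled processing with feathered blending; the tile size, overlap, and linear feathering window below are hypothetical choices, not the paper's design:

```python
import numpy as np

def tiled_inference(img, model, tile=64, overlap=16):
    """Run `model` (any function mapping a (tile, tile, C) patch to a
    same-shaped output) over overlapping tiles of an arbitrarily sized
    image, blending overlaps with a linear feathering window."""
    h, w = img.shape[:2]
    out = np.zeros(img.shape, dtype=np.float64)
    acc = np.zeros((h, w, 1), dtype=np.float64)
    step = tile - overlap
    # Triangular ramp: weight 1 at tile edges, rising toward the center.
    ramp = np.minimum(np.arange(1, tile + 1), np.arange(tile, 0, -1)).astype(float)
    win = np.outer(ramp, ramp)[:, :, None]
    for y in range(0, max(h - overlap, 1), step):
        for x in range(0, max(w - overlap, 1), step):
            y0, x0 = min(y, h - tile), min(x, w - tile)  # clamp last tiles
            patch = img[y0:y0 + tile, x0:x0 + tile]
            out[y0:y0 + tile, x0:x0 + tile] += model(patch) * win
            acc[y0:y0 + tile, x0:x0 + tile] += win
    return out / acc  # weighted average over overlapping tiles
```

A quick sanity check is to pass the identity as `model`: the feathered blend then reconstructs the input exactly, confirming that the window weights normalize correctly.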
Related papers
- RAW-Flow: Advancing RGB-to-RAW Image Reconstruction with Deterministic Latent Flow Matching [55.03149221192589]
We introduce a novel framework named RAW-Flow to bridge the gap between RGB and RAW representations. We also introduce a cross-scale context guidance module that injects hierarchical RGB features into the flow estimation process. RAW-Flow outperforms state-of-the-art approaches both quantitatively and visually.
arXiv Detail & Related papers (2026-01-28T08:27:38Z) - Manifold-aware Representation Learning for Degradation-agnostic Image Restoration [135.90908995927194]
Image Restoration (IR) aims to recover high-quality images from degraded inputs affected by corruptions such as noise, blur, haze, rain, and low-light conditions. We present MIRAGE, a unified framework for all-in-one IR that explicitly decomposes the input feature space into three semantically aligned parallel branches. This modular decomposition significantly improves generalization and efficiency across diverse degradations.
arXiv Detail & Related papers (2025-05-24T12:52:10Z) - Unleashing Correlation and Continuity for Hyperspectral Reconstruction from RGB Images [64.80875911446937]
We propose a Correlation and Continuity Network (CCNet) for HSI reconstruction from RGB images. For the correlation of the local spectrum, we introduce the Group-wise Spectral Correlation Modeling (GrSCM) module. For the continuity of the global spectrum, we design the Neighborhood-wise Spectral Continuity Modeling (NeSCM) module.
arXiv Detail & Related papers (2025-01-02T15:14:40Z) - Pix2Next: Leveraging Vision Foundation Models for RGB to NIR Image Translation [0.536022165180739]
We propose a novel image-to-image translation framework, Pix2Next, to generate high-quality near-infrared (NIR) images from RGB inputs. A multi-scale PatchGAN discriminator ensures realistic image generation at various detail levels, while carefully designed loss functions couple global context understanding with local feature preservation. The proposed approach enables scaling up NIR datasets without additional data acquisition or annotation effort, potentially accelerating advances in NIR-based computer vision applications.
arXiv Detail & Related papers (2024-09-25T07:51:47Z) - Learning Color Equivariant Representations [1.9594704501292781]
We introduce group convolutional neural networks (GCNNs) equivariant to color variation. GCNNs have been designed for a variety of geometric transformations, from 2D and 3D rotation groups to semi-groups such as scale.
arXiv Detail & Related papers (2024-06-13T21:02:03Z) - IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model [7.842507196763463]
IRSRMamba is a novel framework integrating wavelet-transform feature modulation for multi-scale adaptation. IRSRMamba outperforms state-of-the-art methods in PSNR, SSIM, and perceptual quality. This work establishes Mamba-based architectures as a promising direction for high-fidelity IR image enhancement.
arXiv Detail & Related papers (2024-05-16T07:49:24Z) - NIR-Assisted Image Denoising: A Selective Fusion Approach and A Real-World Benchmark Dataset [53.79524776100983]
Leveraging near-infrared (NIR) images to assist visible-RGB image denoising shows potential to address this issue. Existing works still struggle to exploit NIR information effectively for real-world image denoising. We propose an efficient Selective Fusion Module (SFM) that can be plugged into advanced denoising networks.
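The SFM's architecture is not detailed in this summary, but the idea of selectively blending NIR features into RGB features can be conveyed with a generic gated-fusion sketch. The linear per-position gate, the weight shapes, and the NumPy formulation are illustrative assumptions, not the paper's module:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def selective_fusion(f_rgb, f_nir, w, b):
    """Gated fusion: a learned gate decides, per spatial position and
    channel, how much NIR information to blend into the RGB features.
    f_rgb, f_nir: (H, W, C) feature maps; w: (2C, C) gate weights; b: (C,)."""
    stacked = np.concatenate([f_rgb, f_nir], axis=-1)  # (H, W, 2C)
    gate = sigmoid(stacked @ w + b)                    # (H, W, C), in (0, 1)
    return gate * f_nir + (1.0 - gate) * f_rgb
```

At the gate's extremes the module degrades gracefully: a gate near 1 passes the NIR features through, and a gate near 0 leaves the RGB features untouched, which is what makes such a block safe to drop into an existing denoising network.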
arXiv Detail & Related papers (2024-04-12T14:54:26Z) - Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z) - Spectral Graphormer: Spectral Graph-based Transformer for Egocentric Two-Hand Reconstruction using Multi-View Color Images [33.70056950818641]
We propose a novel transformer-based framework that reconstructs two high fidelity hands from multi-view RGB images.
We show that our framework is able to produce realistic two-hand reconstructions and demonstrate the generalisation of synthetic-trained models to real data.
arXiv Detail & Related papers (2023-08-21T20:07:02Z) - Beyond Learned Metadata-based Raw Image Reconstruction [86.1667769209103]
Raw images have distinct advantages over sRGB images, e.g., linearity and fine-grained quantization levels.
They are not widely adopted by general users due to their substantial storage requirements.
We propose a novel framework that learns a compact representation in the latent space, serving as metadata.
arXiv Detail & Related papers (2023-06-21T06:59:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.