Pan-LUT: Efficient Pan-sharpening via Learnable Look-Up Tables
- URL: http://arxiv.org/abs/2503.23793v1
- Date: Mon, 31 Mar 2025 07:13:59 GMT
- Title: Pan-LUT: Efficient Pan-sharpening via Learnable Look-Up Tables
- Authors: Zhongnan Cai, Yingying Wang, Yunlong Lin, Hui Zheng, Ge Meng, Zixu Lin, Jiaxin Xie, Junbin Lu, Yue Huang, Xinghao Ding
- Abstract summary: We propose Pan-LUT, a learnable look-up table framework for pan-sharpening. Pan-LUT balances performance and computational efficiency for high-resolution remote sensing images. Our proposed method contains fewer than 300K parameters and processes an 8K-resolution image in under 1 ms.
- Score: 32.23794092167474
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, deep learning-based pan-sharpening algorithms have achieved notable advancements over traditional methods. However, many deep learning-based approaches incur substantial computational overhead during inference, especially with high-resolution images. This excessive computational demand limits the applicability of these methods in real-world scenarios, particularly in the absence of dedicated computing devices such as GPUs and TPUs. To address these challenges, we propose Pan-LUT, a novel learnable look-up table (LUT) framework for pan-sharpening that strikes a balance between performance and computational efficiency for high-resolution remote sensing images. To finely control the spectral transformation, we devise the PAN-guided look-up table (PGLUT) for channel-wise spectral mapping. To effectively capture fine-grained spatial details and adaptively learn local contexts, we introduce the spatial details look-up table (SDLUT) and adaptive aggregation look-up table (AALUT). Our proposed method contains fewer than 300K parameters and processes an 8K-resolution image in under 1 ms using a single NVIDIA GeForce RTX 2080 Ti GPU, demonstrating significantly faster performance compared to other methods. Experiments reveal that Pan-LUT efficiently processes large remote sensing images in a lightweight manner, bridging the gap to real-world applications. Furthermore, our model surpasses SOTA methods in full-resolution scenes under real-world conditions, highlighting its effectiveness and efficiency.
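The abstract does not give implementation details, but the core look-up-table idea it relies on can be illustrated with a minimal sketch: a small 1D table of learnable values applied to pixel intensities with linear interpolation, so inference reduces to a couple of memory reads and a blend per pixel. This is an assumption-laden illustration of a generic learnable LUT, not the authors' PGLUT/SDLUT/AALUT implementation; the function name `lut_apply` and the table size are hypothetical.

```python
import numpy as np

def lut_apply(x, table):
    """Apply a 1D look-up table to intensities x in [0, 1],
    linearly interpolating between the table's sample points."""
    n = len(table) - 1                   # number of bins
    pos = np.clip(x, 0.0, 1.0) * n       # fractional index into the table
    lo = np.floor(pos).astype(int)
    hi = np.minimum(lo + 1, n)
    frac = pos - lo
    return (1.0 - frac) * table[lo] + frac * table[hi]

# A toy "learnable" table, initialized to the identity mapping; during
# training, the table entries themselves would be the optimized parameters.
table = np.linspace(0.0, 1.0, 17)        # 17 sample points
pixels = np.array([0.0, 0.25, 0.5, 1.0])
out = lut_apply(pixels, table)           # identity table leaves values unchanged
```

Because the per-pixel cost is constant and independent of network depth, a LUT-based pipeline in this spirit can plausibly keep pace with very large (e.g. 8K) inputs, which is the efficiency argument the paper makes.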
Related papers
- CAT: A Conditional Adaptation Tailor for Efficient and Effective Instance-Specific Pansharpening on Real-World Data [7.471505633354803]
We propose an efficient framework that adapts to a specific input instance, completing both training and inference in a short time.
Our method achieves state-of-the-art performance on cross-sensor real-world data, while completing both training and inference for a $512\times512$ image within 0.4 seconds.
arXiv Detail & Related papers (2025-04-14T14:04:55Z) - PolygoNet: Leveraging Simplified Polygonal Representation for Effective Image Classification [6.3286311412189304]
We propose an efficient approach that leverages polygonal representations of images using dominant points or contour coordinates.
Our method significantly reduces computational requirements, accelerates training, and conserves resources.
Experiments on benchmark datasets validate the effectiveness of our approach in reducing complexity, improving generalization, and facilitating edge computing applications.
arXiv Detail & Related papers (2025-04-01T22:05:00Z) - Striving for Faster and Better: A One-Layer Architecture with Auto Re-parameterization for Low-Light Image Enhancement [50.93686436282772]
We aim to probe the limits of image enhancers in both visual quality and computational efficiency. By rethinking the task demands, we build an explicit connection: visual quality and computational efficiency correspond to model learning and structure design, respectively. Ultimately, this achieves efficient low-light image enhancement using only a single convolutional layer, while maintaining excellent visual quality.
arXiv Detail & Related papers (2025-02-27T08:20:03Z) - FLARES: Fast and Accurate LiDAR Multi-Range Semantic Segmentation [52.89847760590189]
3D scene understanding is a critical yet challenging task in autonomous driving. Recent methods leverage the range-view representation to improve processing efficiency. We re-design the workflow for range-view-based LiDAR semantic segmentation.
arXiv Detail & Related papers (2025-02-13T12:39:26Z) - Multi-Head Attention Residual Unfolded Network for Model-Based Pansharpening [2.874893537471256]
Unfolding fusion methods integrate the powerful representation capabilities of deep learning with the robustness of model-based approaches.
In this paper, we propose a model-based deep unfolded method for satellite image fusion.
Experimental results on PRISMA, Quickbird, and WorldView2 datasets demonstrate the superior performance of our method.
arXiv Detail & Related papers (2024-09-04T13:05:00Z) - EPNet: An Efficient Pyramid Network for Enhanced Single-Image Super-Resolution with Reduced Computational Requirements [12.439807086123983]
Single-image super-resolution (SISR) has seen significant advancements through the integration of deep learning.
This paper introduces a new Efficient Pyramid Network (EPNet) that harmoniously merges an Edge Split Pyramid Module (ESPM) with a Panoramic Feature Extraction Module (PFEM) to overcome the limitations of existing methods.
arXiv Detail & Related papers (2023-12-20T19:56:53Z) - Low-Resolution Self-Attention for Semantic Segmentation [93.30597515880079]
We introduce the Low-Resolution Self-Attention (LRSA) mechanism to capture global context at a significantly reduced computational cost.
Our approach involves computing self-attention in a fixed low-resolution space regardless of the input image's resolution.
We demonstrate the effectiveness of our LRSA approach by building the LRFormer, a vision transformer with an encoder-decoder structure.
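The LRSA idea described above can be sketched in a few lines: downsample features to a fixed small grid, run plain self-attention there, and upsample the result back, so the quadratic attention cost stays constant regardless of input resolution. This is a simplified illustration under stated assumptions (single head, Q = K = V, nearest-neighbor resizing), not the LRFormer implementation; `low_res_self_attention` is a hypothetical name.

```python
import numpy as np

def low_res_self_attention(feat, size=8):
    """Self-attention in a fixed low-resolution space:
    downsample -> attention -> upsample back to the input resolution."""
    h, w, c = feat.shape
    # Nearest-neighbor downsample to a fixed (size x size) grid.
    ys = np.arange(size) * h // size
    xs = np.arange(size) * w // size
    small = feat[ys][:, xs]                      # (size, size, c)
    tokens = small.reshape(size * size, c)       # flatten to tokens
    # Scaled dot-product self-attention with Q = K = V = tokens.
    scores = tokens @ tokens.T / np.sqrt(c)
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)
    out_small = (attn @ tokens).reshape(size, size, c)
    # Nearest-neighbor upsample back to the input resolution.
    yu = np.arange(h) * size // h
    xu = np.arange(w) * size // w
    return out_small[yu][:, xu]                  # (h, w, c)

feat = np.random.rand(32, 48, 4)
out = low_res_self_attention(feat)
```

The attention matrix here is always `size*size` by `size*size`, which is what makes the global-context cost independent of the input image's resolution.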
arXiv Detail & Related papers (2023-10-08T06:10:09Z) - Adaptive Multi-NeRF: Exploit Efficient Parallelism in Adaptive Multiple Scale Neural Radiance Field Rendering [3.8200916793910973]
Recent advances in Neural Radiance Fields (NeRF) have demonstrated significant potential for representing 3D scene appearances as implicit neural networks.
However, the lengthy training and rendering process hinders the widespread adoption of this promising technique for real-time rendering applications.
We present an effective adaptive multi-NeRF method designed to accelerate the neural rendering process for large scenes.
arXiv Detail & Related papers (2023-10-03T08:34:49Z) - Pixel Adapter: A Graph-Based Post-Processing Approach for Scene Text Image Super-Resolution [22.60056946339325]
We propose the Pixel Adapter Module (PAM) based on graph attention to address pixel distortion caused by upsampling.
The PAM effectively captures local structural information by allowing each pixel to interact with its neighbors and update features.
We demonstrate that our proposed method generates high-quality super-resolution images, surpassing existing methods in recognition accuracy.
arXiv Detail & Related papers (2023-09-16T08:12:12Z) - Ultra-High-Definition Low-Light Image Enhancement: A Benchmark and Transformer-Based Method [51.30748775681917]
We consider the task of low-light image enhancement (LLIE) and introduce a large-scale database consisting of images at 4K and 8K resolution.
We conduct systematic benchmarking studies and provide a comparison of current LLIE algorithms.
As a second contribution, we introduce LLFormer, a transformer-based low-light enhancement method.
arXiv Detail & Related papers (2022-12-22T09:05:07Z) - Panoptic SwiftNet: Pyramidal Fusion for Real-time Panoptic Segmentation [0.0]
Many applications require fast inference over large input resolutions on affordable or even embedded hardware.
We propose to achieve this goal by trading off backbone capacity for multi-scale feature extraction.
We present panoptic experiments on Cityscapes, Vistas, COCO and the BSB-Aerial dataset.
arXiv Detail & Related papers (2022-03-15T13:47:40Z) - Image-specific Convolutional Kernel Modulation for Single Image Super-resolution [85.09413241502209]
To address this issue, we propose a novel image-specific convolutional kernel modulation (IKM) method.
We exploit the global contextual information of image or feature to generate an attention weight for adaptively modulating the convolutional kernels.
Experiments on single image super-resolution show that the proposed method achieves superior performance over state-of-the-art methods.
arXiv Detail & Related papers (2021-11-16T11:05:10Z) - GridMask Data Augmentation [76.79300104795966]
We propose a novel data augmentation method, GridMask, in this paper.
It utilizes information removal to achieve state-of-the-art results in a variety of computer vision tasks.
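The "information removal" in GridMask amounts to zeroing out a regular grid of square regions so the network never loses all context at once. A minimal sketch of that idea (the parameter names `d` and `drop` are illustrative, not the paper's exact parameterization):

```python
import numpy as np

def grid_mask(img, d=8, drop=0.5):
    """Zero out a square of side int(d * drop) in the top-left corner
    of every d x d cell of the image ("information removal")."""
    out = img.copy().astype(float)
    m = max(1, int(d * drop))            # side length of each masked square
    for y in range(0, img.shape[0], d):
        for x in range(0, img.shape[1], d):
            out[y:y + m, x:x + m] = 0.0
    return out

img = np.ones((16, 16))
masked = grid_mask(img, d=8, drop=0.5)   # zeroes a 4x4 square per 8x8 cell
```

Because the removed squares are evenly spaced rather than one large block, neighboring unmasked regions always retain enough signal for the model to learn from, which is the intuition behind the method.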
arXiv Detail & Related papers (2020-01-13T07:27:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.