EfficienT-HDR: An Efficient Transformer-Based Framework via Multi-Exposure Fusion for HDR Reconstruction
- URL: http://arxiv.org/abs/2509.19779v1
- Date: Wed, 24 Sep 2025 06:01:37 GMT
- Title: EfficienT-HDR: An Efficient Transformer-Based Framework via Multi-Exposure Fusion for HDR Reconstruction
- Authors: Yu-Shen Huang, Tzu-Han Chen, Cheng-Yen Hsiao, Shaou-Gang Miaou
- Abstract summary: This study proposes a lightweight Vision Transformer architecture designed explicitly for HDR reconstruction. It employs an Intersection-Aware Adaptive Fusion module to suppress ghosting effectively. Experimental results demonstrate that, compared to the baseline, the main version reduces FLOPs by approximately 67%.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Achieving high-quality High Dynamic Range (HDR) imaging on resource-constrained edge devices is a critical challenge in computer vision, as its performance directly impacts downstream tasks such as intelligent surveillance and autonomous driving. Multi-Exposure Fusion (MEF) is a mainstream technique to achieve this goal; however, existing methods generally face the dual bottlenecks of high computational costs and ghosting artifacts, hindering their widespread deployment. To this end, this study proposes a lightweight Vision Transformer architecture designed explicitly for HDR reconstruction to overcome these limitations. This study is based on the Context-Aware Vision Transformer and begins by converting input images to the YCbCr color space to separate luminance and chrominance information. It then employs an Intersection-Aware Adaptive Fusion (IAAF) module to suppress ghosting effectively. To further achieve a lightweight design, we introduce Inverted Residual Embedding (IRE), Dynamic Tanh (DyT), and propose Enhanced Multi-Scale Dilated Convolution (E-MSDC) to reduce computational complexity at multiple levels. Our study ultimately contributes two model versions: a main version for high visual quality and a lightweight version with advantages in computational efficiency, both of which achieve an excellent balance between performance and image quality. Experimental results demonstrate that, compared to the baseline, the main version reduces FLOPs by approximately 67% and increases inference speed by more than fivefold on CPU and 2.5 times on an edge device. These results confirm that our method provides an efficient and ghost-free HDR imaging solution for edge devices, demonstrating versatility and practicality across various dynamic scenarios.
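Two of the building blocks named in the abstract have simple, well-known forms that can be sketched concretely: the YCbCr conversion used to separate luminance from chrominance, and Dynamic Tanh (DyT), a normalization-free activation of the form γ·tanh(αx)+β from the recent normalization-free Transformer literature. The sketch below is a minimal NumPy illustration, not the paper's implementation: it assumes full-range BT.601 coefficients for the color conversion, and the DyT parameter values are illustrative (α, γ, β are learnable in the original formulation).

```python
import numpy as np

def dyt(x, alpha=0.5, gamma=1.0, beta=0.0):
    """Dynamic Tanh (DyT): y = gamma * tanh(alpha * x) + beta.
    In the DyT literature alpha is a learnable scalar and gamma/beta are
    learnable per-channel parameters; fixed scalars here for illustration."""
    return gamma * np.tanh(alpha * x) + beta

def rgb_to_ycbcr(rgb):
    """Full-range BT.601 RGB -> YCbCr for inputs in [0, 1].
    Channel 0 (Y) carries luminance; channels 1-2 (Cb, Cr) carry chrominance."""
    m = np.array([[ 0.299,     0.587,     0.114   ],
                  [-0.168736, -0.331264,  0.5     ],
                  [ 0.5,      -0.418688, -0.081312]])
    ycc = np.asarray(rgb) @ m.T
    ycc[..., 1:] += 0.5  # center the chroma channels at 0.5
    return ycc
```

For a neutral gray pixel the chroma channels land exactly at the 0.5 midpoint, which is what makes a luminance/chrominance split useful for fusion: exposure changes move Y while leaving Cb/Cr comparatively stable.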
Related papers
- HAD: Hierarchical Asymmetric Distillation to Bridge Spatio-Temporal Gaps in Event-Based Object Tracking [80.07224739976911]
RGB cameras excel at capturing rich texture with high resolution, whereas event cameras offer exceptional temporal resolution and a wide dynamic range.
arXiv Detail & Related papers (2025-10-22T13:15:13Z) - High-resolution Photo Enhancement in Real-time: A Laplacian Pyramid Network [73.19214585791268]
This paper introduces a pyramid network called LLF-LUT++, which integrates global and local operators through closed-form Laplacian pyramid decomposition and reconstruction. Specifically, we utilize an image-adaptive 3D LUT that capitalizes on the global tonal characteristics of downsampled images. LLF-LUT++ not only achieves a 2.64 dB improvement in PSNR on the HDR+ dataset but also further reduces runtime, processing 4K-resolution images in just 13 ms on a single GPU.
arXiv Detail & Related papers (2025-10-13T16:52:32Z) - Enhancing Infrared Vision: Progressive Prompt Fusion Network and Benchmark [58.61079960074608]
Existing infrared image enhancement methods focus on tackling individual degradations. All-in-one enhancement methods, commonly applied to RGB sensors, often demonstrate limited effectiveness.
arXiv Detail & Related papers (2025-10-10T12:55:54Z) - Learned Off-aperture Encoding for Wide Field-of-view RGBD Imaging [31.931929519577402]
This work explores an additional design choice by positioning a DOE off-aperture, enabling a spatial unmixing of the degrees of freedom. Experimental results reveal that the off-aperture DOE enhances imaging quality by over 5 dB in PSNR at a FoV of approximately 45° when paired with a simple thin lens.
arXiv Detail & Related papers (2025-07-30T09:49:47Z) - Bidirectional Image-Event Guided Fusion Framework for Low-Light Image Enhancement [24.5584423318892]
Under extreme low-light conditions, frame-based cameras suffer from severe detail loss due to limited dynamic range. Recent studies have introduced event cameras for event-guided low-light image enhancement. We propose BiLIE, a Bidirectional Image-Event guided fusion framework for Low-Light Image Enhancement.
arXiv Detail & Related papers (2025-06-06T14:28:17Z) - Autoregressive High-Order Finite Difference Modulo Imaging: High-Dynamic Range for Computer Vision Applications [3.4956406636452626]
High dynamic range (HDR) imaging is vital for capturing the full range of light tones in scenes, essential for computer vision tasks such as autonomous driving. Standard commercial imaging systems face limitations in well-depth capacity and quantization precision, hindering their HDR capabilities. We develop a modulo analog-to-digital approach that resets signals upon saturation, enabling estimation of pixel resets through neighboring pixel intensities.
arXiv Detail & Related papers (2025-04-05T16:41:15Z) - PASTA: Towards Flexible and Efficient HDR Imaging Via Progressively Aggregated Spatio-Temporal Alignment [91.38256332633544]
PASTA is a Progressively Aggregated Spatio-Temporal Alignment framework for HDR deghosting.
Our approach achieves effectiveness and efficiency by harnessing hierarchical representation during feature disentanglement.
Experimental results showcase PASTA's superiority over current SOTA methods in both visual quality and performance metrics.
arXiv Detail & Related papers (2024-03-15T15:05:29Z) - HDRTransDC: High Dynamic Range Image Reconstruction with Transformer Deformation Convolution [21.870772317331447]
High Dynamic Range (HDR) imaging aims to generate an artifact-free HDR image with realistic details by fusing multi-exposure Low Dynamic Range (LDR) images.
To eliminate fusion distortions, we propose DWFB, which spatially and adaptively selects useful information across frames.
arXiv Detail & Related papers (2024-03-11T15:48:17Z) - Hybrid-Supervised Dual-Search: Leveraging Automatic Learning for Loss-free Multi-Exposure Image Fusion [60.221404321514086]
Multi-exposure image fusion (MEF) has emerged as a prominent solution to address the limitations of digital imaging in representing varied exposure levels.
This paper presents a Hybrid-Supervised Dual-Search approach for MEF, dubbed HSDS-MEF, which introduces a bi-level optimization search scheme for automatic design of both network structures and loss functions.
arXiv Detail & Related papers (2023-09-03T08:07:26Z) - Searching a Compact Architecture for Robust Multi-Exposure Image Fusion [55.37210629454589]
Two major stumbling blocks hinder the development, including pixel misalignment and inefficient inference.
This study introduces an architecture search-based paradigm incorporating self-alignment and detail repletion modules for robust multi-exposure image fusion.
The proposed method outperforms various competitive schemes, achieving a noteworthy 3.19% improvement in PSNR for general scenarios and an impressive 23.5% enhancement in misaligned scenarios.
arXiv Detail & Related papers (2023-05-20T17:01:52Z) - Scale-aware Two-stage High Dynamic Range Imaging [13.587403084724015]
We propose a scale-aware two-stage high dynamic range imaging framework (ST) to generate high-quality, ghost-free image compositions.
Specifically, our framework consists of feature alignment and two-stage fusion.
In the first stage of feature fusion, we obtain a preliminary result with few ghosting artifacts. Experiments validate the effectiveness of the proposed ST in terms of speed and quality.
arXiv Detail & Related papers (2023-03-12T05:17:24Z) - Universal and Flexible Optical Aberration Correction Using Deep-Prior Based Deconvolution [51.274657266928315]
We propose a PSF-aware plug-and-play deep network, which takes the aberrant image and PSF map as input and produces the latent high-quality version by incorporating lens-specific deep priors.
Specifically, we pre-train a base model from a set of diverse lenses and then adapt it to a given lens by quickly refining the parameters.
arXiv Detail & Related papers (2021-04-07T12:00:38Z) - Single-Image HDR Reconstruction by Learning to Reverse the Camera
Pipeline [100.5353614588565]
We propose to incorporate the domain knowledge of the LDR image formation pipeline into our model.
We model the HDR-to-LDR image formation pipeline as (1) dynamic range clipping, (2) non-linear mapping from a camera response function, and (3) quantization.
We demonstrate that the proposed method performs favorably against state-of-the-art single-image HDR reconstruction algorithms.
arXiv Detail & Related papers (2020-04-02T17:59:04Z)
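The last entry above describes the HDR-to-LDR formation model as three concrete stages: clipping, a camera response function (CRF), and quantization. A toy sketch of that forward model, using a gamma curve as a stand-in for the true CRF (the exposure, gamma, and bit-depth values are illustrative assumptions, not the paper's settings):

```python
import numpy as np

def hdr_to_ldr(hdr, exposure=1.0, gamma=2.2, bits=8):
    """Toy HDR-to-LDR formation model with the three stages described above:
    (1) dynamic range clipping, (2) non-linear camera response (a gamma
    curve stands in for the true CRF), (3) quantization to 2**bits levels."""
    x = np.clip(np.asarray(hdr, dtype=float) * exposure, 0.0, 1.0)  # (1) clipping
    x = np.power(x, 1.0 / gamma)                                    # (2) CRF
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels                            # (3) quantization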
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.