DehazeDCT: Towards Effective Non-Homogeneous Dehazing via Deformable Convolutional Transformer
- URL: http://arxiv.org/abs/2407.05169v1
- Date: Fri, 24 May 2024 10:59:18 GMT
- Title: DehazeDCT: Towards Effective Non-Homogeneous Dehazing via Deformable Convolutional Transformer
- Authors: Wei Dong, Han Zhou, Ruiyi Wang, Xiaohong Liu, Guangtao Zhai, Jun Chen,
- Abstract summary: We introduce an innovative non-homogeneous Dehazing method via Deformable Convolutional Transformer-like architecture (DehazeDCT)
We first design a transformer-like network based on deformable convolution v4, which offers long-range dependency and adaptive spatial aggregation capabilities.
Furthermore, we leverage a lightweight Retinex-inspired transformer to achieve color correction and structure refinement.
- Score: 43.807338032286346
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image dehazing, a pivotal task in low-level vision, aims to restore the visibility and detail from hazy images. Many deep learning methods with powerful representation learning capability demonstrate advanced performance on non-homogeneous dehazing, however, these methods usually struggle with processing high-resolution images (e.g., $4000 \times 6000$) due to their heavy computational demands. To address these challenges, we introduce an innovative non-homogeneous Dehazing method via Deformable Convolutional Transformer-like architecture (DehazeDCT). Specifically, we first design a transformer-like network based on deformable convolution v4, which offers long-range dependency and adaptive spatial aggregation capabilities and demonstrates faster convergence and forward speed. Furthermore, we leverage a lightweight Retinex-inspired transformer to achieve color correction and structure refinement. Extensive experiment results and highly competitive performance of our method in NTIRE 2024 Dense and Non-Homogeneous Dehazing Challenge, ranking second among all 16 submissions, demonstrate the superior capability of our proposed method. The code is available: https://github.com/movingforward100/Dehazing_R.
Related papers
- Effective Diffusion Transformer Architecture for Image Super-Resolution [63.254644431016345]
We design an effective diffusion transformer for image super-resolution (DiT-SR)
In practice, DiT-SR leverages an overall U-shaped architecture, and adopts a uniform isotropic design for all the transformer blocks.
We analyze the limitation of the widely used AdaLN, and present a frequency-adaptive time-step conditioning module.
arXiv Detail & Related papers (2024-09-29T07:14:16Z) - Training Transformer Models by Wavelet Losses Improves Quantitative and Visual Performance in Single Image Super-Resolution [6.367865391518726]
Transformer-based models have achieved remarkable results in low-level vision tasks including image super-resolution (SR)
To activate more input pixels globally, hybrid attention models have been proposed.
We employ wavelet losses to train Transformer models to improve quantitative and subjective performance.
arXiv Detail & Related papers (2024-04-17T11:25:19Z) - ViTAR: Vision Transformer with Any Resolution [80.95324692984903]
Vision Transformers experience a performance decline when processing resolutions different from those seen during training.
We introduce fuzzy positional encoding in the Vision Transformer to provide consistent positional awareness across multiple resolutions.
Our resulting model, ViTAR, demonstrates impressive adaptability, achieving 83.3% top-1 accuracy at a 1120x1120 resolution and 80.4% accuracy at a 4032x4032 resolution.
arXiv Detail & Related papers (2024-03-27T08:53:13Z) - FiT: Flexible Vision Transformer for Diffusion Model [81.85667773832279]
We present a transformer architecture specifically designed for generating images with unrestricted resolutions and aspect ratios.
Unlike traditional methods that perceive images as static-resolution grids, FiT conceptualizes images as sequences of dynamically-sized tokens.
Comprehensive experiments demonstrate the exceptional performance of FiT across a broad range of resolutions.
arXiv Detail & Related papers (2024-02-19T18:59:07Z) - LKFormer: Large Kernel Transformer for Infrared Image Super-Resolution [5.478440050117844]
We propose a potent Transformer model, termed Large Kernel Transformer (LKFormer) to capture infrared images.
This mainly employs depth-wise convolution with large kernels to execute non-local feature modeling.
We have devised a novel feed-forward network structure called Gated-Pixel Feed-Forward Network (GPFN) to augment the LKFormer's capacity to manage the information flow within the network.
arXiv Detail & Related papers (2024-01-22T11:28:24Z) - Cross-receptive Focused Inference Network for Lightweight Image
Super-Resolution [64.25751738088015]
Transformer-based methods have shown impressive performance in single image super-resolution (SISR) tasks.
Transformers that need to incorporate contextual information to extract features dynamically are neglected.
We propose a lightweight Cross-receptive Focused Inference Network (CFIN) that consists of a cascade of CT Blocks mixed with CNN and Transformer.
arXiv Detail & Related papers (2022-07-06T16:32:29Z) - AdaViT: Adaptive Vision Transformers for Efficient Image Recognition [78.07924262215181]
We introduce AdaViT, an adaptive framework that learns to derive usage policies on which patches, self-attention heads and transformer blocks to use.
Our method obtains more than 2x improvement on efficiency compared to state-of-the-art vision transformers with only 0.8% drop of accuracy.
arXiv Detail & Related papers (2021-11-30T18:57:02Z) - Toward Transformer-Based Object Detection [12.704056181392415]
Vision Transformers can be used as a backbone by a common detection task head to produce competitive COCO results.
ViT-FRCNN demonstrates several known properties associated with transformers, including large pretraining capacity and fast fine-tuning performance.
We view ViT-FRCNN as an important stepping stone toward a pure-transformer solution of complex vision tasks such as object detection.
arXiv Detail & Related papers (2020-12-17T22:33:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.