Wavelet-Attention CNN for Image Classification
- URL: http://arxiv.org/abs/2201.09271v1
- Date: Sun, 23 Jan 2022 14:00:33 GMT
- Title: Wavelet-Attention CNN for Image Classification
- Authors: Zhao Xiangyu
- Abstract summary: We propose a Wavelet-Attention convolutional neural network (WA-CNN) for image classification.
WA-CNN decomposes the feature maps into low-frequency and high-frequency components, which store the structures of the basic objects and the detailed information and noise, respectively.
With MobileNetV2 backbones, WA-CNN achieves a 1.26% Top-1 accuracy improvement on CIFAR-10 and a 1.54% Top-1 accuracy improvement on CIFAR-100.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Feature learning methods based on convolutional neural networks (CNNs) have
achieved tremendous success in image classification tasks.
However, the inherent noise and some other factors may weaken the effectiveness
of the convolutional feature statistics. In this paper, we investigate Discrete
Wavelet Transform (DWT) in the frequency domain and design a new
Wavelet-Attention (WA) block to only implement attention in the high-frequency
domain. Based on this, we propose a Wavelet-Attention convolutional neural
network (WA-CNN) for image classification. Specifically, WA-CNN decomposes the
feature maps into low-frequency and high-frequency components for storing the
structures of the basic objects, as well as the detailed information and noise,
respectively. Then, the WA block is leveraged to capture the detailed
information in the high-frequency domain with different attention factors while
preserving the basic object structures in the low-frequency domain. Experimental
results on CIFAR-10 and CIFAR-100 datasets show that our proposed WA-CNN
achieves significant improvements in classification accuracy compared to other
related networks. Specifically, based on MobileNetV2 backbones, WA-CNN achieves
a 1.26% Top-1 accuracy improvement on the CIFAR-10 benchmark and a 1.54% Top-1
accuracy improvement on the CIFAR-100 benchmark.
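The decomposition described in the abstract can be illustrated with a minimal sketch. It assumes a single-level Haar wavelet on one 2D feature map, and the attention rule (a spatial softmax over the summed high-frequency magnitudes, added back onto the untouched low-frequency band) is a hypothetical stand-in, not the paper's WA block. The function names `haar_dwt2`, `spatial_softmax`, and `wavelet_attention` are illustrative only.

```python
import math


def haar_dwt2(x):
    """Single-level 2D Haar DWT of a 2D list with even height and width.

    Returns (ll, lh, hl, hh), each half the size of x in both dimensions.
    Sub-band naming conventions vary between libraries.
    """
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, len(x), 2):
        r_ll, r_lh, r_hl, r_hh = [], [], [], []
        for j in range(0, len(x[0]), 2):
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            r_ll.append((a + b + c + d) / 2)  # local average: coarse structure
            r_lh.append((a + b - c - d) / 2)  # top-minus-bottom difference
            r_hl.append((a - b + c - d) / 2)  # left-minus-right difference
            r_hh.append((a - b - c + d) / 2)  # diagonal difference
        ll.append(r_ll)
        lh.append(r_lh)
        hl.append(r_hl)
        hh.append(r_hh)
    return ll, lh, hl, hh


def spatial_softmax(m):
    """Softmax over all positions of a 2D list (weights sum to 1)."""
    peak = max(max(row) for row in m)
    exps = [[math.exp(v - peak) for v in row] for row in m]
    total = sum(sum(row) for row in exps)
    return [[v / total for v in row] for row in exps]


def wavelet_attention(x):
    """Hypothetical attention rule: reweight detail, keep LL unchanged."""
    ll, lh, hl, hh = haar_dwt2(x)
    rows, cols = len(ll), len(ll[0])
    # Sum the three high-frequency sub-bands into one detail map.
    detail = [[lh[i][j] + hl[i][j] + hh[i][j] for j in range(cols)]
              for i in range(rows)]
    # Assumption: attention favors positions with large detail magnitude.
    attn = spatial_softmax([[abs(v) for v in row] for row in detail])
    # The low-frequency band passes through; only the detail is reweighted.
    return [[ll[i][j] + attn[i][j] * detail[i][j] for j in range(cols)]
            for i in range(rows)]
```

On a constant feature map every high-frequency coefficient is zero, so the output of this sketch equals the low-frequency band alone. Each output map has half the height and width of its input, so the operation also acts as a down-sampling step.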
Related papers
- FE-UNet: Frequency Domain Enhanced U-Net with Segment Anything Capability for Versatile Image Segmentation [50.9040167152168]
We experimentally quantify the contrast sensitivity function of CNNs and compare it with that of the human visual system.
We propose the Wavelet-Guided Spectral Pooling Module (WSPM) to enhance and balance image features across the frequency domain.
To further emulate the human visual system, we introduce the Frequency Domain Enhanced Receptive Field Block (FE-RFB).
We develop FE-UNet, a model that utilizes SAM2 as its backbone and incorporates Hiera-Large as a pre-trained block.
arXiv Detail & Related papers (2025-02-06T07:24:34Z)
- DCNN: Dual Cross-current Neural Networks Realized Using An Interactive Deep Learning Discriminator for Fine-grained Objects [48.65846477275723]
This study proposes novel dual-current neural networks (DCNN) to improve the accuracy of fine-grained image classification.
The main novel design features for constructing a weakly supervised learning backbone model DCNN include (a) extracting heterogeneous data, (b) keeping the feature map resolution unchanged, (c) expanding the receptive field, and (d) fusing global representations and local features.
arXiv Detail & Related papers (2024-05-07T07:51:28Z)
- Multi-stage image denoising with the wavelet transform [125.2251438120701]
Deep convolutional neural networks (CNNs) are used for image denoising via automatically mining accurate structure information.
We propose a multi-stage image denoising CNN with the wavelet transform (MWDCNN) built from three stages: a dynamic convolutional block (DCB), two cascaded wavelet transform and enhancement blocks (WEBs), and a residual block (RB).
arXiv Detail & Related papers (2022-09-26T03:28:23Z)
- Synthetic Aperture Radar Image Change Detection via Layer Attention-Based Noise-Tolerant Network [36.860069663770226]
We propose a layer attention-based noise-tolerant network, termed LANTNet.
In particular, we design a layer attention module that adaptively weights the features of different convolutional layers.
The experimental results on three SAR datasets show that the proposed LANTNet performs better than several state-of-the-art methods.
arXiv Detail & Related papers (2022-08-09T01:04:39Z)
- Hierarchical Spherical CNNs with Lifting-based Adaptive Wavelets for Pooling and Unpooling [101.72318949104627]
We propose a novel framework of hierarchical convolutional neural networks (HS-CNNs) with a lifting structure to learn adaptive spherical wavelets for pooling and unpooling.
LiftHS-CNN ensures a more efficient hierarchical feature learning for both image- and pixel-level tasks.
arXiv Detail & Related papers (2022-05-31T07:23:42Z)
- WaveCNet: Wavelet Integrated CNNs to Suppress Aliasing Effect for Noise-Robust Image Classification [41.94702591058716]
Convolutional neural networks (CNNs) are prone to noise interruptions.
We integrate CNNs with wavelets by replacing the common down-sampling operations with the discrete wavelet transform (DWT).
We have also tested the performance of WaveCNets on the noisy version of ImageNet, ImageNet-C and six adversarial attacks.
arXiv Detail & Related papers (2021-07-28T12:59:15Z)
- Deep Networks for Direction-of-Arrival Estimation in Low SNR [89.45026632977456]
We introduce a Convolutional Neural Network (CNN) that is trained from multi-channel data of the true array manifold matrix.
We train a CNN in the low-SNR regime to predict DoAs across all SNRs.
Our robust solution can be applied in several fields, ranging from wireless array sensors to acoustic microphones or sonars.
arXiv Detail & Related papers (2020-11-17T12:52:18Z)
- Wavelet Integrated CNNs for Noise-Robust Image Classification [51.18193090255933]
We enhance CNNs by replacing max-pooling, strided-convolution, and average-pooling with the Discrete Wavelet Transform (DWT).
WaveCNets, the wavelet integrated versions of VGG, ResNets, and DenseNet, achieve higher accuracy and better noise-robustness than their vanilla versions.
arXiv Detail & Related papers (2020-05-07T09:10:41Z)
- Learning in the Frequency Domain [20.045740082113845]
We propose a learning-based frequency selection method to identify the trivial frequency components that can be removed without accuracy loss.
Experimental results show that learning in the frequency domain with static channel selection can achieve higher accuracy than the conventional spatial downsampling approach.
arXiv Detail & Related papers (2020-02-27T19:57:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.