Frequency Domain Convolutional Neural Network: Accelerated CNN for Large
Diabetic Retinopathy Image Classification
- URL: http://arxiv.org/abs/2106.12736v1
- Date: Thu, 24 Jun 2021 02:52:54 GMT
- Authors: Ee Fey Goh, ZhiYuan Chen and Wei Xiang Lim
- Abstract summary: The image size of 256x256 pixels is too small for applications like Diabetic Retinopathy (DR) classification.
This research proposed Frequency Domain Convolution (FDC) and Frequency Domain Pooling (FDP) layers.
FDC and FDP layers are used to build a Frequency Domain Convolutional Neural Network (FDCNN) to accelerate the training of large images for DR classification.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The conventional spatial convolution layers in Convolutional Neural
Networks (CNNs) are computationally expensive, to the point where training
could take days unless the number of layers, the number of training images
or the size of the training images is reduced. The image size of 256x256
pixels is commonly used for most of the applications of CNN, but this image
size is too small for applications like Diabetic Retinopathy (DR)
classification where the image details are important for accurate
classification. This research proposed Frequency Domain Convolution (FDC) and
Frequency Domain Pooling (FDP) layers which were built with RFFT, kernel
initialization strategy, convolution artifact removal and Channel Independent
Convolution (CIC) to replace the conventional convolution and pooling layers.
The FDC and FDP layers are used to build a Frequency Domain Convolutional
Neural Network (FDCNN) to accelerate the training of large images for DR
classification. The Full FDC layer extends the FDC layer to allow
direct use in conventional CNNs; it is also used to modify the VGG16
architecture. FDCNN is shown to be at least 54.21% faster and 70.74% more
memory efficient compared to an equivalent CNN architecture. The modified VGG16
architecture with Full FDC layer is reported to achieve a shorter training time
and a higher accuracy at 95.63% compared to the original VGG16 architecture for
DR classification.
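The abstract's central idea, replacing spatial convolution with an operation on RFFT outputs, rests on the convolution theorem: an element-wise product in the frequency domain equals a circular convolution in the spatial domain. The sketch below illustrates only that identity with NumPy. It is not the authors' FDC or FDP layer, and it omits their kernel initialization strategy, artifact removal and Channel Independent Convolution. The function names are hypothetical and chosen for illustration.

```python
import numpy as np


def fft_circular_conv2d(image, kernel):
    """Circular 2D convolution computed with the real FFT (illustrative only)."""
    h, w = image.shape
    # rfft2 exploits the real-valued input and stores about half the spectrum.
    image_f = np.fft.rfft2(image)
    # s=(h, w) zero-pads the kernel to the image size before transforming.
    kernel_f = np.fft.rfft2(kernel, s=(h, w))
    # Element-wise product in frequency space equals circular convolution in space.
    return np.fft.irfft2(image_f * kernel_f, s=(h, w))


def direct_circular_conv2d(image, kernel):
    """Reference circular convolution in the spatial domain, for comparison."""
    out = np.zeros(image.shape)
    for i in range(kernel.shape[0]):
        for j in range(kernel.shape[1]):
            # np.roll shifts the image so that out[m, n] accumulates
            # kernel[i, j] * image[(m - i) mod h, (n - j) mod w].
            out += kernel[i, j] * np.roll(image, shift=(i, j), axis=(0, 1))
    return out


# Small demonstration on random data (sizes are arbitrary assumptions).
rng = np.random.default_rng(0)
image = rng.standard_normal((8, 9))
kernel = rng.standard_normal((3, 3))
same = np.allclose(fft_circular_conv2d(image, kernel),
                   direct_circular_conv2d(image, kernel))
print("frequency-domain result matches spatial result:", same)
```

The wrap-around at the image borders in this sketch is a property of circular convolution. It is presumably the kind of effect the paper's "convolution artifact removal" step addresses, though the abstract does not describe how.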
Related papers
- LeRF: Learning Resampling Function for Adaptive and Efficient Image Interpolation [64.34935748707673]
Recent deep neural networks (DNNs) have made impressive progress in performance by introducing learned data priors.
We propose a novel method of Learning Resampling (termed LeRF) which takes advantage of both the structural priors learned by DNNs and the locally continuous assumption.
LeRF assigns spatially varying resampling functions to input image pixels and learns to predict the shapes of these resampling functions with a neural network.
arXiv Detail & Related papers (2024-07-13T16:09:45Z)
- TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals [58.865901821451295]
We present a novel two-stream feature fusion "Tensor-Convolution and Convolution-Transformer Network" (TCCT-Net) architecture.
To better learn the meaningful patterns in the temporal-spatial domain, we design a "CT" stream that integrates a hybrid convolutional-transformer.
In parallel, to efficiently extract rich patterns from the temporal-frequency domain, we introduce a "TC" stream that uses Continuous Wavelet Transform (CWT) to represent information in a 2D tensor form.
arXiv Detail & Related papers (2024-04-15T06:01:48Z)
- Image Reconstruction for Accelerated MR Scan with Faster Fourier Convolutional Neural Networks [87.87578529398019]
Partial scan is a common approach to accelerate Magnetic Resonance Imaging (MRI) data acquisition in both 2D and 3D settings.
We propose a novel convolutional operator called Faster Fourier Convolution (FasterFC) to replace the two consecutive convolution operations.
A 2D accelerated MRI method, FasterFC-End-to-End-VarNet, uses FasterFC to improve the sensitivity maps and reconstruction quality.
A 3D accelerated MRI method called FasterFC-based Single-to-group Network (FAS-Net) utilizes a single-to-group algorithm to guide k-space domain reconstruction.
arXiv Detail & Related papers (2023-06-05T13:53:57Z)
- Improving Convolutional Neural Networks for Fault Diagnosis by Assimilating Global Features [0.0]
This paper proposes a novel local-global CNN architecture that accounts for both local and global features for fault diagnosis.
The proposed LG-CNN can greatly improve the fault diagnosis performance without significantly increasing the model complexity.
arXiv Detail & Related papers (2022-10-03T16:49:16Z)
- Increasing the Accuracy of a Neural Network Using Frequency Selective Mesh-to-Grid Resampling [4.211128681972148]
We propose the use of keypoint frequency selective mesh-to-grid resampling (FSMR) for the processing of input data for neural networks.
We show that, depending on the network architecture and classification task, the application of FSMR during training aids the learning process.
The classification accuracy can be increased by up to 4.31 percentage points for ResNet50 and the Oxflower17 dataset.
arXiv Detail & Related papers (2022-09-28T21:34:47Z)
- Fault Detection and Classification of Aerospace Sensors using a VGG16-based Deep Neural Network [1.2599533416395765]
A concept known as imagefication-based intelligent FDC has been studied in recent years.
In this paper, we first propose a data augmentation method which inflates the stacked image to a larger size.
The FDC neural network is then trained via fine-tuning the VGG16 directly.
arXiv Detail & Related papers (2022-07-27T03:14:17Z)
- CNNs for JPEGs: A Study in Computational Cost [49.97673761305336]
Convolutional neural networks (CNNs) have achieved astonishing advances over the past decade.
CNNs are capable of learning robust representations of the data directly from the RGB pixels.
Deep learning methods capable of learning directly from the compressed domain have been gaining attention in recent years.
arXiv Detail & Related papers (2020-12-26T15:00:10Z)
- DCT-SNN: Using DCT to Distribute Spatial Information over Time for Learning Low-Latency Spiking Neural Networks [7.876001630578417]
Spiking Neural Networks (SNNs) offer a promising alternative to traditional deep learning frameworks.
SNNs suffer from high inference latency which is a major bottleneck to their deployment.
We propose a scalable time-based encoding scheme that utilizes the Discrete Cosine Transform (DCT) to reduce the number of timesteps required for inference.
arXiv Detail & Related papers (2020-10-05T05:55:34Z)
- Learning CNN filters from user-drawn image markers for coconut-tree image classification [78.42152902652215]
We present a method that needs a minimal set of user-selected images to train the CNN's feature extractor.
The method learns the filters of each convolutional layer from user-drawn markers in image regions that discriminate classes.
It does not rely on optimization based on backpropagation, and we demonstrate its advantages on the binary classification of coconut-tree aerial images.
arXiv Detail & Related papers (2020-08-08T15:50:23Z)
- Frequency learning for image classification [1.9336815376402716]
This paper presents a new approach for exploring the Fourier transform of the input images, which is composed of trainable frequency filters.
We propose a slicing procedure to allow the network to learn both global and local features from the frequency-domain representations of the image blocks.
arXiv Detail & Related papers (2020-06-28T00:32:47Z)
- R-FCN: Object Detection via Region-based Fully Convolutional Networks [87.62557357527861]
We present region-based, fully convolutional networks for accurate and efficient object detection.
Our result is achieved at a test-time speed of 170ms per image, 2.5-20x faster than the Faster R-CNN counterpart.
arXiv Detail & Related papers (2016-05-20T15:50:11Z)