Related papers: A Single Graph Convolution Is All You Need: Efficient Grayscale Image Classification

A Single Graph Convolution Is All You Need: Efficient Grayscale Image Classification

URL: http://arxiv.org/abs/2402.00564v6
Date: Wed, 26 Jun 2024 10:08:51 GMT
Title: A Single Graph Convolution Is All You Need: Efficient Grayscale Image Classification
Authors: Jacob Fein-Ashley, Sachini Wickramasinghe, Bingyi Zhang, Rajgopal Kannan, Viktor Prasanna,
Abstract summary: Grayscale image classification has critical applications in fields such as medical imaging and SAR ATR. We present a novel grayscale image classification approach using a vectorized view of images. Our approach incorporates a single graph convolutional layer in a batch-wise manner, enhancing accuracy and reducing performance variance.
Score: 3.0299904110792255
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Image classifiers for domain-specific tasks like Synthetic Aperture Radar Automatic Target Recognition (SAR ATR) and chest X-ray classification often rely on convolutional neural networks (CNNs). These networks, while powerful, experience high latency due to the number of operations they perform, which can be problematic in real-time applications. Many image classification models are designed to work with both RGB and grayscale datasets, but classifiers that operate solely on grayscale images are less common. Grayscale image classification has critical applications in fields such as medical imaging and SAR ATR. In response, we present a novel grayscale image classification approach using a vectorized view of images. By leveraging the lightweight nature of Multi-Layer Perceptrons (MLPs), we treat images as vectors, simplifying the problem to grayscale image classification. Our approach incorporates a single graph convolutional layer in a batch-wise manner, enhancing accuracy and reducing performance variance. Additionally, we develop a customized accelerator on FPGA for our model, incorporating several optimizations to improve performance. Experimental results on benchmark grayscale image datasets demonstrate the effectiveness of our approach, achieving significantly lower latency (up to $16\times$ less on MSTAR) and competitive or superior performance compared to state-of-the-art models for SAR ATR and medical image classification.

Related papers

Color-Quality Invariance for Robust Medical Image Segmentation [4.710921988115686]
Single-source domain generalization is a significant challenge in medical image segmentation. We propose two novel techniques to enhance generalization: dynamic color image normalization (DCIN) module and color-quality generalization (CQG) loss.
arXiv Detail & Related papers (2025-02-11T02:47:37Z)
Enhanced Convolutional Neural Networks for Improved Image Classification [0.40964539027092917]
CIFAR-10 is a widely used benchmark to evaluate the performance of classification models on small-scale, multi-class datasets. We propose an enhanced CNN architecture that integrates deeper convolutional blocks, batch normalization, and dropout regularization to achieve superior performance.
arXiv Detail & Related papers (2025-02-02T04:32:25Z)
Predicting Satisfied User and Machine Ratio for Compressed Images: A Unified Approach [58.71009078356928]
We create a deep learning-based model to predict Satisfied User Ratio (SUR) and Satisfied Machine Ratio (SMR) of compressed images simultaneously. Experimental results indicate that the proposed model significantly outperforms state-of-the-art SUR and SMR prediction methods.
arXiv Detail & Related papers (2024-12-23T11:09:30Z)
Image-GS: Content-Adaptive Image Representation via 2D Gaussians [55.15950594752051]
We propose Image-GS, a content-adaptive image representation. Using anisotropic 2D Gaussians as the basis, Image-GS shows high memory efficiency, supports fast random access, and offers a natural level of detail stack. General efficiency and fidelity of Image-GS are validated against several recent neural image representations and industry-standard texture compressors. We hope this research offers insights for developing new applications that require adaptive quality and resource control, such as machine perception, asset streaming, and content generation.
arXiv Detail & Related papers (2024-07-02T00:45:21Z)
RATLIP: Generative Adversarial CLIP Text-to-Image Synthesis Based on Recurrent Affine Transformations [0.0]
Conditional affine transformations (CAT) have been applied to different layers of GAN to control content synthesis in images. We first model CAT and a recurrent neural network (RAT) to ensure that different layers can access global information. We then introduce shuffle attention between RAT to mitigate the characteristic of information forgetting in recurrent neural networks.
arXiv Detail & Related papers (2024-05-13T18:49:18Z)
Additional Look into GAN-based Augmentation for Deep Learning COVID-19 Image Classification [57.1795052451257]
We study the dependence of the GAN-based augmentation performance on dataset size with a focus on small samples. We train StyleGAN2-ADA with both sets and then, after validating the quality of generated images, we use trained GANs as one of the augmentations approaches in multi-class classification problems. The GAN-based augmentation approach is found to be comparable with classical augmentation in the case of medium and large datasets but underperforms in the case of smaller datasets.
arXiv Detail & Related papers (2024-01-26T08:28:13Z)
Grid Jigsaw Representation with CLIP: A New Perspective on Image Clustering [37.15595383168132]
Jigsaw based strategy method for image clustering called Grid Jigsaw Representation (GJR) with systematic exposition from pixel to feature in discrepancy against human and computer. GJR modules are appended to a variety of deep convolutional networks and tested with significant improvements on a wide range of benchmark datasets. Experiment results show the effectiveness on the clustering task with respect to the ACC, NMI and ARI three metrics and super fast convergence speed.
arXiv Detail & Related papers (2023-10-27T03:07:05Z)
Adaptive Input-image Normalization for Solving the Mode Collapse Problem in GAN-based X-ray Images [0.08192907805418582]
This work contributes an empirical demonstration of the benefits of integrating the adaptive input-image normalization with the Deep Conversaal GAN and Auxiliary GAN to alleviate the mode collapse problems. Results demonstrate that the DCGAN and the ACGAN with adaptive input-image normalization outperform the DCGAN and ACGAN with un-normalized X-ray images.
arXiv Detail & Related papers (2023-09-21T16:43:29Z)
Performance of GAN-based augmentation for deep learning COVID-19 image classification [57.1795052451257]
The biggest challenge in the application of deep learning to the medical domain is the availability of training data. Data augmentation is a typical methodology used in machine learning when confronted with a limited data set. In this work, a StyleGAN2-ADA model of Generative Adversarial Networks is trained on the limited COVID-19 chest X-ray image set.
arXiv Detail & Related papers (2023-04-18T15:39:58Z)
A novel approach for glaucoma classification by wavelet neural networks using graph-based, statisitcal features of qualitatively improved images [0.0]
We have proposed a new glaucoma classification approach that employs a wavelet neural network (WNN) on optimally enhanced retinal images features. The performance of the WNN classifier is compared with multilayer perceptron neural networks with various datasets.
arXiv Detail & Related papers (2022-06-24T06:19:30Z)
Learning Hierarchical Graph Representation for Image Manipulation Detection [50.04902159383709]
The objective of image manipulation detection is to identify and locate the manipulated regions in the images. Recent approaches mostly adopt the sophisticated Convolutional Neural Networks (CNNs) to capture the tampering artifacts left in the images. We propose a hierarchical Graph Convolutional Network (HGCN-Net), which consists of two parallel branches.
arXiv Detail & Related papers (2022-01-15T01:54:25Z)
Scalable Visual Transformers with Hierarchical Pooling [61.05787583247392]
We propose a Hierarchical Visual Transformer (HVT) which progressively pools visual tokens to shrink the sequence length. It brings a great benefit by scaling dimensions of depth/width/resolution/patch size without introducing extra computational complexity. Our HVT outperforms the competitive baselines on ImageNet and CIFAR-100 datasets.
arXiv Detail & Related papers (2021-03-19T03:55:58Z)
Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture. We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions. Results show our method's effectiveness in detecting small-scaled pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z)

This list is automatically generated from the titles and abstracts of the papers in this site.