EffCNet: An Efficient CondenseNet for Image Classification on NXP BlueBox
- URL: http://arxiv.org/abs/2111.14243v1
- Date: Sun, 28 Nov 2021 21:32:31 GMT
- Title: EffCNet: An Efficient CondenseNet for Image Classification on NXP BlueBox
- Authors: Priyank Kalgaonkar, Mohamed El-Sharkawy
- Abstract summary: Edge devices offer limited processing power due to their inexpensive hardware, and limited cooling and computational resources.
We propose a novel deep convolutional neural network architecture called EffCNet for edge devices.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Intelligent edge devices with built-in processors vary widely in
capability and physical form when performing advanced Computer Vision (CV) tasks
such as image classification and object detection. With constant
advances in the field of autonomous cars and UAVs, embedded systems and mobile
devices, there has been an ever-growing demand for extremely efficient
Artificial Neural Networks (ANN) for real-time inference on these smart edge
devices with constrained computational resources. With unreliable network
connections in remote regions and the added complexity of data transmission, it
is of utmost importance to capture and process data locally instead of
sending the data to cloud servers for remote processing. Edge devices, on the
other hand, offer limited processing power due to their inexpensive hardware
and limited cooling and computational resources. In this paper, we propose a
novel deep convolutional neural network architecture called EffCNet, an
improved and efficient version of the CondenseNet Convolutional Neural Network
(CNN) for edge devices. It uses self-querying data augmentation and depthwise
separable convolutions to improve real-time inference performance
and to reduce the final trained model size, trainable parameters, and
Floating-Point Operations (FLOPs). Furthermore, extensive
supervised image classification analyses are conducted on two benchmark
datasets, CIFAR-10 and CIFAR-100, to verify the real-time inference performance of
our proposed CNN. Finally, we deploy the trained weights on NXP BlueBox, which
is an intelligent edge development platform designed for self-driving vehicles
and UAVs, and conclusions are extrapolated accordingly.
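The abstract credits depthwise separable convolutions with reducing trainable parameters and FLOPs. The sketch below illustrates the general arithmetic behind that claim, not EffCNet's actual layer configuration: the kernel size, channel counts, and feature-map size are hypothetical values chosen for illustration, and the function names are assumptions introduced here. It assumes stride 1, no bias terms, and the same output resolution for both variants.

```python
# Illustrative parameter/MAC comparison between a standard convolution and a
# depthwise separable convolution. All layer sizes below are hypothetical and
# are NOT taken from the EffCNet paper.

def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a k x k standard convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def macs(params: int, out_h: int, out_w: int) -> int:
    """Multiply-accumulates: each weight is applied once per output position."""
    return params * out_h * out_w


if __name__ == "__main__":
    k, c_in, c_out = 3, 64, 128   # hypothetical layer shape
    out_h = out_w = 32            # hypothetical feature-map size

    std = standard_conv_params(k, c_in, c_out)
    sep = depthwise_separable_params(k, c_in, c_out)

    print("standard params:  ", std)
    print("separable params: ", sep)
    print("standard MACs:    ", macs(std, out_h, out_w))
    print("separable MACs:   ", macs(sep, out_h, out_w))
    # The reduction factor equals 1/c_out + 1/k^2 under these assumptions.
    print("ratio:            ", sep / std)
```

Under these assumptions the cost ratio is 1/c_out + 1/k², so for 3x3 kernels the separable form needs roughly one ninth of the weights and multiply-accumulates once the output channel count is large.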
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
(2024-10-29T19:02:54Z)
- TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals [58.865901821451295]
We present a novel two-stream feature fusion "Tensor-Convolution and Convolution-Transformer Network" (TCCT-Net) architecture.
To better learn the meaningful patterns in the temporal-spatial domain, we design a "CT" stream that integrates a hybrid convolutional-transformer.
In parallel, to efficiently extract rich patterns from the temporal-frequency domain, we introduce a "TC" stream that uses Continuous Wavelet Transform (CWT) to represent information in a 2D tensor form.
(2024-04-15T06:01:48Z)
- Evolution of Convolutional Neural Network (CNN): Compute vs Memory bandwidth for Edge AI [0.0]
This article explores the relationship between CNN compute requirements and memory bandwidth in the context of Edge AI.
We examine the impact of increasing model complexity on both computational requirements and memory access patterns.
This analysis provides insights into designing efficient architectures and potential hardware accelerators in enhancing CNN performance on edge devices.
(2023-09-24T09:11:22Z)
- Attention-based Feature Compression for CNN Inference Offloading in Edge Computing [93.67044879636093]
This paper studies the computational offloading of CNN inference in device-edge co-inference systems.
We propose a novel autoencoder-based CNN architecture (AECNN) for effective feature extraction at end-device.
Experiments show that AECNN can compress the intermediate data by more than 256x with only about 4% accuracy loss.
(2022-11-24T18:10:01Z)
- Dynamic Split Computing for Efficient Deep Edge Intelligence [78.4233915447056]
We introduce dynamic split computing, where the optimal split location is dynamically selected based on the state of the communication channel.
We show that dynamic split computing achieves faster inference in edge computing environments where the data rate and server load vary over time.
(2022-05-23T12:35:18Z)
- An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL)-Soft Actor Critic for discrete (SAC-d), which generates the exit point and compressing bits by soft policy iterations.
Based on the latency- and accuracy-aware reward design, such a computation can adapt well to complex environments such as dynamic wireless channels and arbitrary processing, and is capable of supporting the 5G URL
(2022-01-09T09:31:50Z)
- CondenseNeXt: An Ultra-Efficient Deep Neural Network for Embedded Systems [0.0]
A Convolutional Neural Network (CNN) is a class of Deep Neural Network (DNN) widely used in the analysis of visual images captured by an image sensor.
In this paper, we propose a neoteric variant of deep convolutional neural network architecture to ameliorate the performance of existing CNN architectures for real-time inference on embedded systems.
(2021-12-01T18:20:52Z)
- perf4sight: A toolflow to model CNN training performance on Edge GPUs [16.61258138725983]
This work proposes perf4sight, an automated methodology for developing accurate models that predict CNN training memory footprint and latency.
With PyTorch as the framework and NVIDIA Jetson TX2 as the target device, the developed models predict training memory footprint and latency with 95% and 91% accuracy respectively.
(2021-08-12T07:55:37Z)
- Cost-effective Machine Learning Inference Offload for Edge Computing [0.3149883354098941]
This paper proposes a novel offloading mechanism by leveraging installed-base on-premises (edge) computational resources.
The proposed mechanism allows the edge devices to offload heavy and compute-intensive workloads to edge nodes instead of using remote cloud.
(2020-12-07T21:11:02Z)
- Neural Compression and Filtering for Edge-assisted Real-time Object Detection in Challenged Networks [8.291242737118482]
We focus on edge computing supporting remote object detection by means of Deep Neural Networks (DNNs).
We develop a framework to reduce the amount of data transmitted over the wireless link.
The proposed technique represents an effective intermediate option between local and edge computing in a parameter region.
(2020-07-31T03:11:46Z)
- PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-based Weight Pruning [57.20262984116752]
We introduce a new dimension, fine-grained pruning patterns inside the coarse-grained structures, revealing a previously unknown point in design space.
With the higher accuracy enabled by fine-grained pruning patterns, the unique insight is to use the compiler to re-gain and guarantee high hardware efficiency.
(2020-01-01T04:52:07Z)