Related papers: Optimizing Neural Network for Computer Vision task in Edge Device

URL: http://arxiv.org/abs/2110.00791v1
Date: Sat, 2 Oct 2021 12:25:18 GMT
Title: Optimizing Neural Network for Computer Vision task in Edge Device
Authors: Ranjith M S, S Parameshwara, Pavan Yadav A, Shriganesh Hegde
Abstract summary: We deploy a convolutional neural network on the edge device itself. The computational expense for edge devices is reduced by lowering the floating-point precision of the model's parameters. This lets an edge device run predictions from the neural network entirely on its own.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The field of computer vision has grown very rapidly in the past few years thanks to networks such as convolutional neural networks and their variants. The memory required to store such a model and its computational expense are very high, which limits deployment on edge devices. Applications often rely on the cloud instead, but round-trip delays make real-time operation difficult. We overcome these problems by deploying the neural network on the edge device itself. The computational expense for edge devices is reduced by lowering the floating-point precision of the model's parameters. This decreases the memory required for the model and increases the speed of computation, while the model's performance is only minimally affected. This lets an edge device run predictions from the neural network entirely on its own.
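
As a rough illustration of the precision-reduction idea in the abstract, the sketch below stores the same parameters as 32-bit and as 16-bit floats and compares storage size and rounding error. It is not the authors' method or code: the weight values, the helper names `pack_weights` and `unpack_weights`, and the choice of 16-bit floats are all assumptions made for the example, and it uses only Python's standard `struct` module.

```python
import struct

def pack_weights(weights, fmt):
    """Serialize floats with a struct format code: 'f' = 32-bit, 'e' = 16-bit."""
    return struct.pack(f"<{len(weights)}{fmt}", *weights)

def unpack_weights(blob, fmt):
    """Read the serialized floats back into a Python list."""
    count = len(blob) // struct.calcsize("<" + fmt)
    return list(struct.unpack(f"<{count}{fmt}", blob))

# Hypothetical parameters in [-1.0, 0.9]; a real model would supply its own.
weights = [0.1 * i - 1.0 for i in range(20)]

fp32 = pack_weights(weights, "f")  # 4 bytes per parameter
fp16 = pack_weights(weights, "e")  # 2 bytes per parameter

# Rounding error introduced by storing the parameters at half precision.
restored = unpack_weights(fp16, "e")
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(len(fp32), len(fp16), max_error)
```

Under these assumptions the 16-bit copy takes half the bytes of the 32-bit copy, at the cost of a small rounding error per parameter. Whether that error is acceptable depends on the model and task, which is the trade-off the abstract describes.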

Related papers

Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge. Existing methods struggle to balance high model performance with low resource consumption. We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
A Dynamical Model of Neural Scaling Laws [79.59705237659547]
We analyze a random feature model trained with gradient descent as a solvable model of network training and generalization. Our theory shows how the gap between training and test loss can gradually build up over time due to repeated reuse of data.
arXiv Detail & Related papers (2024-02-02T01:41:38Z)
OLLA: Decreasing the Memory Usage of Neural Networks by Optimizing the Lifetime and Location of Arrays [6.418232942455968]
OLLA is an algorithm that optimizes the lifetime and memory location of the tensors used to train neural networks. We present several techniques to simplify the encoding of the problem and enable our approach to scale to the size of state-of-the-art neural networks.
arXiv Detail & Related papers (2022-10-24T02:39:13Z)
Variable Bitrate Neural Fields [75.24672452527795]
We present a dictionary method for compressing feature grids, reducing their memory consumption by up to 100x. We formulate the dictionary optimization as a vector-quantized auto-decoder problem which lets us learn end-to-end discrete neural representations in a space where no direct supervision is available.
arXiv Detail & Related papers (2022-06-15T17:58:34Z)
FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task. The design targets a Xilinx Artix-7 FPGA, using in total around 40% of the available hardware resources. It reduces the classification time by three orders of magnitude, with a small 4.5% impact on accuracy, compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
Network Augmentation for Tiny Deep Learning [73.57192520534585]
We introduce Network Augmentation (NetAug), a new training method for improving the performance of tiny neural networks. We demonstrate the effectiveness of NetAug on image classification and object detection.
arXiv Detail & Related papers (2021-10-17T18:48:41Z)
perf4sight: A toolflow to model CNN training performance on Edge GPUs [16.61258138725983]
This work proposes perf4sight, an automated methodology for developing accurate models that predict CNN training memory footprint and latency. With PyTorch as the framework and NVIDIA Jetson TX2 as the target device, the developed models predict training memory footprint and latency with 95% and 91% accuracy, respectively.
arXiv Detail & Related papers (2021-08-12T07:55:37Z)
ItNet: iterative neural networks with small graphs for accurate and efficient anytime prediction [1.52292571922932]
In this study, we introduce a class of network models that have a small memory footprint in terms of their computational graphs. We show state-of-the-art results for semantic segmentation on the CamVid and Cityscapes datasets.
arXiv Detail & Related papers (2021-01-21T15:56:29Z)
Robust error bounds for quantised and pruned neural networks [1.8083503268672914]
Machine learning algorithms are moving towards decentralisation with the data and algorithms stored, and even trained, locally on devices. The device hardware becomes the main bottleneck for model capability in this set-up, creating a need for slimmed down, more efficient neural networks. A semi-definite program is introduced to bound the worst-case error caused by pruning or quantising a neural network. It is hoped that the computed bounds will provide certainty to the performance of these algorithms when deployed on safety-critical systems.
arXiv Detail & Related papers (2020-11-30T22:19:44Z)
Making DensePose fast and light [78.49552144907513]
Existing neural network models capable of solving this task are heavily parameterized. To enable DensePose inference on the end device with current models, one needs to support an expensive server-side infrastructure and have a stable internet connection. In this work, we target the problem of redesigning the DensePose R-CNN model's architecture so that the final network retains most of its accuracy but becomes more lightweight and fast.
arXiv Detail & Related papers (2020-06-26T19:42:20Z)
Ordering Chaos: Memory-Aware Scheduling of Irregularly Wired Neural Networks for Edge Devices [10.876317610988059]
We present a memory-aware compiler, dubbed SERENITY, that finds a schedule with optimal memory footprint. Our solution also comprises a graph rewriting technique that allows further reduction beyond the optimum.
arXiv Detail & Related papers (2020-03-04T23:38:54Z)

This list is automatically generated from the titles and abstracts of the papers on this site.