Benchmarking Deep Learning Models on NVIDIA Jetson Nano for Real-Time Systems: An Empirical Investigation
- URL: http://arxiv.org/abs/2406.17749v1
- Date: Tue, 25 Jun 2024 17:34:52 GMT
- Title: Benchmarking Deep Learning Models on NVIDIA Jetson Nano for Real-Time Systems: An Empirical Investigation
- Authors: Tushar Prasanna Swaminathan, Christopher Silver, Thangarajah Akilan,
- Abstract summary: This work empirically investigates the optimization of complex deep learning models to analyze their functionality on an embedded device.
It evaluates the effectiveness of the optimized models in terms of their inference speed for image classification and video action detection.
- Score: 2.3636539018632616
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The proliferation of complex deep learning (DL) models has revolutionized various applications, including computer vision-based solutions, prompting their integration into real-time systems. However, the resource-intensive nature of these models poses challenges for deployment on low-computational power and low-memory devices, like embedded and edge devices. This work empirically investigates the optimization of such complex DL models to analyze their functionality on an embedded device, particularly on the NVIDIA Jetson Nano. It evaluates the effectiveness of the optimized models in terms of their inference speed for image classification and video action detection. The experimental results reveal that, on average, optimized models exhibit a 16.11% speed improvement over their non-optimized counterparts. This not only emphasizes the critical need to consider hardware constraints and environmental sustainability in model development and deployment but also underscores the pivotal role of model optimization in enabling the widespread deployment of AI-assisted technologies on resource-constrained computational systems. It also serves as proof that prioritizing hardware-specific model optimization leads to efficient and scalable solutions that substantially decrease energy consumption and carbon footprint.
Related papers
- On Accelerating Edge AI: Optimizing Resource-Constrained Environments [1.7355861031903428]
Resource-constrained edge deployments demand AI solutions that balance high performance with stringent compute, memory, and energy limitations.
We present a comprehensive overview of the primary strategies for accelerating deep learning models under such constraints.
arXiv Detail & Related papers (2025-01-25T01:37:03Z) - Low-Rank Adapters Meet Neural Architecture Search for LLM Compression [1.8434042562191815]
The rapid expansion of Large Language Models (LLMs) has posed significant challenges regarding the computational resources required for fine-tuning and deployment.
Recent advancements in low-rank adapters have demonstrated their efficacy in parameter-efficient fine-tuning (PEFT) of these models.
This paper comprehensively discusses innovative approaches that synergize low-rank representations with Neural Architecture Search (NAS) techniques.
arXiv Detail & Related papers (2025-01-23T02:14:08Z) - A Survey on Inference Optimization Techniques for Mixture of Experts Models [50.40325411764262]
Large-scale Mixture of Experts (MoE) models offer enhanced model capacity and computational efficiency through conditional computation.
deploying and running inference on these models presents significant challenges in computational resources, latency, and energy efficiency.
This survey analyzes optimization techniques for MoE models across the entire system stack.
arXiv Detail & Related papers (2024-12-18T14:11:15Z) - Synergistic Development of Perovskite Memristors and Algorithms for Robust Analog Computing [53.77822620185878]
We propose a synergistic methodology to concurrently optimize perovskite memristor fabrication and develop robust analog DNNs.
We develop "BayesMulti", a training strategy utilizing BO-guided noise injection to improve the resistance of analog DNNs to memristor imperfections.
Our integrated approach enables use of analog computing in much deeper and wider networks, achieving up to 100-fold improvements.
arXiv Detail & Related papers (2024-12-03T19:20:08Z) - Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - Structured Model Pruning for Efficient Inference in Computational Pathology [2.9687381456164004]
We develop a methodology for pruning the widely used U-Net-style architectures in biomedical imaging.
We empirically demonstrate that pruning can compress models by at least 70% with a negligible drop in performance.
arXiv Detail & Related papers (2024-04-12T22:05:01Z) - Robustness and Generalization Performance of Deep Learning Models on
Cyber-Physical Systems: A Comparative Study [71.84852429039881]
Investigation focuses on the models' ability to handle a range of perturbations, such as sensor faults and noise.
We test the generalization and transfer learning capabilities of these models by exposing them to out-of-distribution (OOD) samples.
arXiv Detail & Related papers (2023-06-13T12:43:59Z) - RLFlow: Optimising Neural Network Subgraph Transformation with World
Models [0.0]
We propose a model-based agent which learns to optimise the architecture of neural networks by performing a sequence of subgraph transformations to reduce model runtime.
We show our approach can match the performance of state of the art on common convolutional networks and outperform those by up to 5% on transformer-style architectures.
arXiv Detail & Related papers (2022-05-03T11:52:54Z) - Real-time Neural-MPC: Deep Learning Model Predictive Control for
Quadrotors and Agile Robotic Platforms [59.03426963238452]
We present Real-time Neural MPC, a framework to efficiently integrate large, complex neural network architectures as dynamics models within a model-predictive control pipeline.
We show the feasibility of our framework on real-world problems by reducing the positional tracking error by up to 82% when compared to state-of-the-art MPC approaches without neural network dynamics.
arXiv Detail & Related papers (2022-03-15T09:38:15Z) - Hybrid modeling: Applications in real-time diagnosis [64.5040763067757]
We outline a novel hybrid modeling approach that combines machine learning inspired models and physics-based models.
We are using such models for real-time diagnosis applications.
arXiv Detail & Related papers (2020-03-04T00:44:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.