Related papers: Benchmarking Jetson Edge Devices with an End-to-end Video-based Anomaly Detection System

Benchmarking Jetson Edge Devices with an End-to-end Video-based Anomaly Detection System

URL: http://arxiv.org/abs/2307.16834v3
Date: Tue, 12 Sep 2023 22:42:53 GMT
Title: Benchmarking Jetson Edge Devices with an End-to-end Video-based Anomaly Detection System
Authors: Hoang Viet Pham, Thinh Gia Tran, Chuong Dinh Le, An Dinh Le, Hien Bich Vo
Abstract summary: We implement an end-to-end video-based crime-scene anomaly detection system inputting from surveillance videos. The system is deployed and operates on multiple Jetson edge devices (Nano, AGX Xavier, Orin Nano) We provide the experience of an AI-based system deployment on various Jetson Edge devices with Docker technology.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Innovative enhancement in embedded system platforms, specifically hardware accelerations, significantly influence the application of deep learning in real-world scenarios. These innovations translate human labor efforts into automated intelligent systems employed in various areas such as autonomous driving, robotics, Internet-of-Things (IoT), and numerous other impactful applications. NVIDIA's Jetson platform is one of the pioneers in offering optimal performance regarding energy efficiency and throughput in the execution of deep learning algorithms. Previously, most benchmarking analysis was based on 2D images with a single deep learning model for each comparison result. In this paper, we implement an end-to-end video-based crime-scene anomaly detection system inputting from surveillance videos and the system is deployed and completely operates on multiple Jetson edge devices (Nano, AGX Xavier, Orin Nano). The comparison analysis includes the integration of Torch-TensorRT as a software developer kit from NVIDIA for the model performance optimisation. The system is built based on the PySlowfast open-source project from Facebook as the coding template. The end-to-end system process comprises the videos from camera, data preprocessing pipeline, feature extractor and the anomaly detection. We provide the experience of an AI-based system deployment on various Jetson Edge devices with Docker technology. Regarding anomaly detectors, a weakly supervised video-based deep learning model called Robust Temporal Feature Magnitude Learning (RTFM) is applied in the system. The approach system reaches 47.56 frames per second (FPS) inference speed on a Jetson edge device with only 3.11 GB RAM usage total. We also discover the promising Jetson device that the AI system achieves 15% better performance than the previous version of Jetson devices while consuming 50% less energy power.

Related papers

RAMOTS: A Real-Time System for Aerial Multi-Object Tracking based on Deep Learning and Big Data Technology [0.0]
Multi-object tracking (MOT) in UAV-based video is challenging due to variations in viewpoint, low resolution, and the presence of small objects. We propose a novel real-time MOT framework that integrates Apache Kafka and Apache Spark for efficient and fault-tolerant video stream processing.
arXiv Detail & Related papers (2025-02-06T03:46:18Z)
SparseGrasp: Robotic Grasping via 3D Semantic Gaussian Splatting from Sparse Multi-View RGB Images [125.66499135980344]
We propose SparseGrasp, a novel open-vocabulary robotic grasping system. SparseGrasp operates efficiently with sparse-view RGB images and handles scene updates fastly. We show that SparseGrasp significantly outperforms state-of-the-art methods in terms of both speed and adaptability.
arXiv Detail & Related papers (2024-12-03T03:56:01Z)
Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge. Existing methods struggle to balance high model performance with low resource consumption. We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
Benchmarking Deep Learning Models for Object Detection on Edge Computing Devices [0.0]
We evaluate state-of-the-art object detection models, including YOLOv8 (Nano, Small, Medium), EfficientDet Lite (Lite0, Lite1, Lite2), and SSD (SSD MobileNet V1, SSDLite MobileDet) We deployed these models on popular edge devices like the Raspberry Pi 3, 4, and 5 with/without TPU accelerators, and Jetson Orin Nano, collecting key performance metrics such as energy consumption, inference time, and Mean Average Precision (mAP) Our findings highlight that lower mAP models such as SSD MobileNet V1 are more energy-efficient and faster in
arXiv Detail & Related papers (2024-09-25T10:56:49Z)
Arena: A Patch-of-Interest ViT Inference Acceleration System for Edge-Assisted Video Analytics [18.042752812489276]
We introduce an end-to-end edge-assisted video inference acceleration system based on Vision Transformer (ViT) Our findings reveal that Arena can boost inference speeds by up to 1.58(times) and 1.82(times) on average while consuming only 47% and 31% of the bandwidth, respectively, all with high inference accuracy.
arXiv Detail & Related papers (2024-04-14T13:14:13Z)
Random resistive memory-based deep extreme point learning machine for unified visual processing [67.51600474104171]
We propose a novel hardware-software co-design, random resistive memory-based deep extreme point learning machine (DEPLM) Our co-design system achieves huge energy efficiency improvements and training cost reduction when compared to conventional systems.
arXiv Detail & Related papers (2023-12-14T09:46:16Z)
Self-Distilled Masked Auto-Encoders are Efficient Video Anomaly Detectors [117.61449210940955]
We propose an efficient abnormal event detection model based on a lightweight masked auto-encoder (AE) applied at the video frame level. We introduce an approach to weight tokens based on motion gradients, thus shifting the focus from the static background scene to the foreground objects. We generate synthetic abnormal events to augment the training videos, and task the masked AE model to jointly reconstruct the original frames.
arXiv Detail & Related papers (2023-06-21T06:18:05Z)
Benchmarking Edge Computing Devices for Grape Bunches and Trunks Detection using Accelerated Object Detection Single Shot MultiBox Deep Learning Models [2.1922186455344796]
This work benchmarks the performance of different platforms for object detection in real-time. Authors used the RetinaNet ResNet-50 fine-tuned using the natural Vine dataset.
arXiv Detail & Related papers (2022-11-21T17:02:33Z)
Deep Learning Computer Vision Algorithms for Real-time UAVs On-board Camera Image Processing [77.34726150561087]
This paper describes how advanced deep learning based computer vision algorithms are applied to enable real-time on-board sensor processing for small UAVs. All algorithms have been developed using state-of-the-art image processing methods based on deep neural networks.
arXiv Detail & Related papers (2022-11-02T11:10:42Z)
ETAD: A Unified Framework for Efficient Temporal Action Detection [70.21104995731085]
Untrimmed video understanding such as temporal action detection (TAD) often suffers from the pain of huge demand for computing resources. We build a unified framework for efficient end-to-end temporal action detection (ETAD) ETAD achieves state-of-the-art performance on both THUMOS-14 and ActivityNet-1.3.
arXiv Detail & Related papers (2022-05-14T21:16:21Z)
Real-time HOG+SVM based object detection using SoC FPGA for a UHD video stream [0.0]
We present a real-time implementation of the well-known pedestrian detector with HOG (Histogram of Oriented Gradients) feature extraction and SVM (Support Vector Machine) classification. The system is capable of detecting a pedestrian in a single scale.
arXiv Detail & Related papers (2022-04-22T10:29:21Z)
FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task. The design targets a Xilinx Artix-7 FPGA, using in total around the 40% of the available hardware resources. It reduces the classification time by three orders of magnitude, with a small 4.5% impact on the accuracy, if compared to its software, full precision counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
Accelerating Deep Learning Applications in Space [0.0]
We investigate the performance of CNN-based object detectors on constrained devices. We take a closer look at the Single Shot MultiBox Detector (SSD) and Region-based Fully Convolutional Network (R-FCN) The performance is measured in terms of inference time, memory consumption, and accuracy.
arXiv Detail & Related papers (2020-07-21T21:06:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.