Context-aware Multi-Model Object Detection for Diversely Heterogeneous
Compute Systems
- URL: http://arxiv.org/abs/2402.07415v1
- Date: Mon, 12 Feb 2024 05:38:11 GMT
- Title: Context-aware Multi-Model Object Detection for Diversely Heterogeneous
Compute Systems
- Authors: Justin Davis and Mehmet E. Belviranli
- Abstract summary: One-size-fits-all approach to object detection using deep neural networks (DNNs) leads to inefficient utilization of computational resources.
We propose SHIFT which continuously selects from a variety of DNN-based OD models depending on the dynamically changing contextual information and computational constraints.
Our proposed methodology results in improvements of up to 7.5x in energy usage and 2.8x in latency compared to state-of-the-art GPU-based single model OD approaches.
- Score: 0.32634122554914
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, deep neural networks (DNNs) have gained widespread adoption
for continuous mobile object detection (OD) tasks, particularly in autonomous
systems. However, a prevalent issue in their deployment is the
one-size-fits-all approach, where a single DNN is used, resulting in
inefficient utilization of computational resources. This inefficiency is
particularly detrimental in energy-constrained systems, as it degrades overall
system efficiency. We identify that, the contextual information embedded in the
input data stream (e.g. the frames in the camera feed that the OD models are
run on) could be exploited to allow a more efficient multi-model-based OD
process. In this paper, we propose SHIFT which continuously selects from a
variety of DNN-based OD models depending on the dynamically changing contextual
information and computational constraints. During this selection, SHIFT
uniquely considers multi-accelerator execution to better optimize the
energy-efficiency while satisfying the latency constraints. Our proposed
methodology results in improvements of up to 7.5x in energy usage and 2.8x in
latency compared to state-of-the-art GPU-based single model OD approaches.
Related papers
- Energy-Aware FPGA Implementation of Spiking Neural Network with LIF Neurons [0.5243460995467893]
Spiking Neural Networks (SNNs) stand out as a cutting-edge solution for TinyML.
This paper presents a novel SNN architecture based on the 1st Order Leaky Integrate-and-Fire (LIF) neuron model.
A hardware-friendly LIF design is also proposed, and implemented on a Xilinx Artix-7 FPGA.
arXiv Detail & Related papers (2024-11-03T16:42:10Z) - CARIn: Constraint-Aware and Responsive Inference on Heterogeneous Devices for Single- and Multi-DNN Workloads [4.556037016746581]
This article addresses the challenges inherent in optimising the execution of deep neural networks (DNNs) on mobile devices.
We introduce CARIn, a novel framework designed for the optimised deployment of both single- and multi-DNN applications.
We observe a substantial enhancement in the fair treatment of the problem's objectives, reaching 1.92x when compared to single-model designs and up to 10.69x in contrast to the state-of-the-art OODIn framework.
arXiv Detail & Related papers (2024-09-02T09:18:11Z) - DNN Partitioning, Task Offloading, and Resource Allocation in Dynamic Vehicular Networks: A Lyapunov-Guided Diffusion-Based Reinforcement Learning Approach [49.56404236394601]
We formulate the problem of joint DNN partitioning, task offloading, and resource allocation in Vehicular Edge Computing.
Our objective is to minimize the DNN-based task completion time while guaranteeing the system stability over time.
We propose a Multi-Agent Diffusion-based Deep Reinforcement Learning (MAD2RL) algorithm, incorporating the innovative use of diffusion models.
arXiv Detail & Related papers (2024-06-11T06:31:03Z) - Sparse-DySta: Sparsity-Aware Dynamic and Static Scheduling for Sparse
Multi-DNN Workloads [65.47816359465155]
Running multiple deep neural networks (DNNs) in parallel has become an emerging workload in both edge devices.
We propose Dysta, a novel scheduler that utilizes both static sparsity patterns and dynamic sparsity information for the sparse multi-DNN scheduling.
Our proposed approach outperforms the state-of-the-art methods with up to 10% decrease in latency constraint violation rate and nearly 4X reduction in average normalized turnaround time.
arXiv Detail & Related papers (2023-10-17T09:25:17Z) - A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical
Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs)
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z) - Energy-efficient Task Adaptation for NLP Edge Inference Leveraging
Heterogeneous Memory Architectures [68.91874045918112]
adapter-ALBERT is an efficient model optimization for maximal data reuse across different tasks.
We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z) - Fluid Batching: Exit-Aware Preemptive Serving of Early-Exit Neural
Networks on Edge NPUs [74.83613252825754]
"smart ecosystems" are being formed where sensing happens concurrently rather than standalone.
This is shifting the on-device inference paradigm towards deploying neural processing units (NPUs) at the edge.
We propose a novel early-exit scheduling that allows preemption at run time to account for the dynamicity introduced by the arrival and exiting processes.
arXiv Detail & Related papers (2022-09-27T15:04:01Z) - Energy-Efficient Model Compression and Splitting for Collaborative
Inference Over Time-Varying Channels [52.60092598312894]
We propose a technique to reduce the total energy bill at the edge device by utilizing model compression and time-varying model split between the edge and remote nodes.
Our proposed solution results in minimal energy consumption and $CO$ emission compared to the considered baselines.
arXiv Detail & Related papers (2021-06-02T07:36:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.