Performance Implications of Multi-Chiplet Neural Processing Units on Autonomous Driving Perception
- URL: http://arxiv.org/abs/2411.16007v1
- Date: Sun, 24 Nov 2024 22:59:11 GMT
- Title: Performance Implications of Multi-Chiplet Neural Processing Units on Autonomous Driving Perception
- Authors: Mohanad Odema, Luke Chen, Hyoukjun Kwon, Mohammad Abdullah Al Faruque
- Abstract summary: We study the application of emerging chiplet-based Neural Processing Units to accelerate vehicular AI perception workloads in constrained automotive settings.
Our approach realizes an 82% increase in throughput and a 2.8x increase in processing-engine utilization compared to monolithic accelerator designs.
- Score: 12.416683044819955
- Abstract: We study the application of emerging chiplet-based Neural Processing Units to accelerate vehicular AI perception workloads in constrained automotive settings. The motivation stems from how chiplet technology is becoming integral to emerging vehicular architectures, providing a cost-effective trade-off between performance, modularity, and customization; and from perception models being the most computationally demanding workloads in an autonomous driving system. Using the Tesla Autopilot perception pipeline as a case study, we first break down its constituent models and profile their performance on different chiplet accelerators. From the insights, we propose a novel scheduling strategy to efficiently deploy perception workloads on multi-chip AI accelerators. Our experiments using a standard DNN performance simulator, MAESTRO, show our approach realizes an 82% increase in throughput and a 2.8x increase in processing-engine utilization compared to monolithic accelerator designs.
Related papers
- Efficient Motion Prediction: A Lightweight & Accurate Trajectory Prediction Model With Fast Training and Inference Speed [56.27022390372502]
We propose a new efficient motion prediction model, which achieves highly competitive benchmark results while training only a few hours on a single GPU.
Its low inference latency makes it particularly suitable for deployment in autonomous applications with limited computing resources.
arXiv Detail & Related papers (2024-09-24T14:58:27Z) - Inference Optimization of Foundation Models on AI Accelerators [68.24450520773688]
Powerful foundation models, including large language models (LLMs), with Transformer architectures have ushered in a new era of Generative AI.
As the number of model parameters reaches hundreds of billions, their deployment incurs prohibitive inference costs and high latency in real-world scenarios.
This tutorial offers a comprehensive discussion on complementary inference optimization techniques using AI accelerators.
arXiv Detail & Related papers (2024-07-12T09:24:34Z) - MetaFollower: Adaptable Personalized Autonomous Car Following [63.90050686330677]
We propose an adaptable personalized car-following framework - MetaFollower.
We first utilize Model-Agnostic Meta-Learning (MAML) to extract common driving knowledge from various CF events.
We additionally combine Long Short-Term Memory (LSTM) and Intelligent Driver Model (IDM) to reflect temporal heterogeneity with high interpretability.
arXiv Detail & Related papers (2024-06-23T15:30:40Z) - A Car Model Identification System for Streamlining the Automobile Sales
Process [0.0]
This project presents an automated solution for the efficient identification of car models and makes from images.
We achieved a notable accuracy of 81.97% employing the EfficientNet (V2 b2) architecture.
The trained model offers the potential for automating information extraction, promising enhanced user experiences across car-selling websites.
arXiv Detail & Related papers (2023-10-19T23:36:17Z) - FastRLAP: A System for Learning High-Speed Driving via Deep RL and
Autonomous Practicing [71.76084256567599]
We present a system that enables an autonomous small-scale RC car to drive aggressively from visual observations using reinforcement learning (RL).
Our system, FastRLAP (faster lap), trains autonomously in the real world, without human interventions, and without requiring any simulation or expert demonstrations.
The resulting policies exhibit emergent aggressive driving skills, such as timing braking and acceleration around turns and avoiding areas which impede the robot's motion, approaching the performance of a human driver using a similar first-person interface over the course of training.
arXiv Detail & Related papers (2023-04-19T17:33:47Z) - Penalty-Based Imitation Learning With Cross Semantics Generation Sensor
Fusion for Autonomous Driving [1.2749527861829049]
In this paper, we provide a penalty-based imitation learning approach to integrate multiple modalities of information.
We observe a remarkable increase in the driving score by more than 12% when compared to the state-of-the-art (SOTA) model, InterFuser.
Our model delivers this improvement alongside a 7-fold increase in inference speed and a reduction in model size of approximately 30%.
arXiv Detail & Related papers (2023-03-21T14:29:52Z) - Tackling Real-World Autonomous Driving using Deep Reinforcement Learning [63.3756530844707]
In this work, we propose a model-free Deep Reinforcement Learning Planner training a neural network that predicts acceleration and steering angle.
In order to deploy the system on board the real self-driving car, we also develop a module represented by a tiny neural network.
arXiv Detail & Related papers (2022-07-05T16:33:20Z) - Bayesian Optimization and Deep Learning forsteering wheel angle
prediction [58.720142291102135]
This work aims to obtain an accurate model for the prediction of the steering angle in an automated driving system.
BO was able to identify, within a limited number of trials, a model (namely BOST-LSTM) that proved the most accurate when compared to classical end-to-end driving models.
arXiv Detail & Related papers (2021-10-22T15:25:14Z) - Vision-Based Autonomous Car Racing Using Deep Imitative Reinforcement
Learning [13.699336307578488]
Deep imitative reinforcement learning approach (DIRL) achieves agile autonomous racing using visual inputs.
We validate our algorithm both in a high-fidelity driving simulation and on a real-world 1/20-scale RC-car with limited onboard computation.
arXiv Detail & Related papers (2021-07-18T00:00:48Z) - Cloud2Edge Elastic AI Framework for Prototyping and Deployment of AI
Inference Engines in Autonomous Vehicles [1.688204090869186]
This paper proposes a novel framework for developing AI Inference Engines for autonomous driving applications based on deep learning modules.
We introduce a simple yet elegant solution for the AI components development cycle, where prototyping takes place in the cloud according to the Software-in-the-Loop (SiL) paradigm.
The effectiveness of the proposed framework is demonstrated using two real-world use-cases of AI inference engines for autonomous vehicles.
arXiv Detail & Related papers (2020-09-23T09:23:29Z) - A Learned Performance Model for Tensor Processing Units [5.733911161090224]
We demonstrate a method of learning performance models from a corpus of graph programs for Tensor Processing Unit (TPU) instances.
We show that our learned model outperforms a heavily-optimized analytical performance model on two tasks.
It helps an autotuner discover faster programs in a setting where access to TPUs is limited or expensive.
arXiv Detail & Related papers (2020-08-03T17:24:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.