A Many-ported and Shared Memory Architecture for High-Performance ADAS SoCs
- URL: http://arxiv.org/abs/2209.05731v1
- Date: Tue, 13 Sep 2022 04:58:27 GMT
- Title: A Many-ported and Shared Memory Architecture for High-Performance ADAS SoCs
- Authors: Hao Luan, Yu Yao, Chang Huang
- Abstract summary: We present a shared memory architecture that enables high data throughput among the multiple parallel accesses native to ADAS applications.
The results validate that the proposed architecture provides close to 100% throughput for both read and write accesses.
It also provides consistent QoS to domain-specific payloads while enabling the scalability and modularity of the design.
- Score: 11.760927352147798
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Increasing investment in computing technologies and advancements in
silicon technology have fueled rapid growth in advanced driver assistance
systems (ADAS) and the corresponding SoC developments. An ADAS SoC is a
heterogeneous architecture that consists of CPUs, GPUs, and artificial
intelligence (AI) accelerators. To guarantee safety and reliability, it must
process massive amounts of raw data collected from multiple redundant sources,
such as high-definition video cameras, radars, and lidars, to recognize objects
correctly and to make the right decisions promptly. A domain-specific memory
architecture is essential to achieve these goals. We present a shared memory
architecture that enables high data throughput among the multiple parallel
accesses native to ADAS applications. It also provides deterministic access
latency with proper isolation under stringent real-time QoS constraints. A
prototype has been built and analyzed. The results validate that the proposed
architecture provides close to 100% throughput for both read and write accesses
generated simultaneously by many accessing masters at the full injection rate.
It also provides consistent QoS to domain-specific payloads while enabling the
scalability and modularity of the design.
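To make the throughput claim concrete, below is a minimal simulation sketch of a banked shared memory with low-order address interleaving. This is not the authors' design; the master count, bank count, word size, and traffic patterns are illustrative assumptions. Streaming masters, which resemble the sequential camera/radar payloads typical of ADAS, walk the banks in lockstep and sustain near-100% throughput at full injection, while uniformly random traffic is capped by bank conflicts:

```python
import random

NUM_MASTERS = 8    # accessing masters (CPU/GPU/AI-accelerator ports); illustrative
NUM_BANKS = 16     # independent single-ported memory banks; illustrative
WORD = 8           # bytes per bank word; illustrative
CYCLES = 10_000

def bank_of(addr: int) -> int:
    """Low-order interleaving: consecutive words land in consecutive banks."""
    return (addr // WORD) % NUM_BANKS

def run(next_addr) -> float:
    """Full injection rate: every master issues one access per cycle.
    Each single-ported bank grants at most one access per cycle;
    conflicting requests are simply dropped in this sketch."""
    granted = 0
    for cycle in range(CYCLES):
        claimed = set()
        for m in range(NUM_MASTERS):
            b = bank_of(next_addr(m, cycle))
            if b not in claimed:
                claimed.add(b)
                granted += 1
    return granted / (CYCLES * NUM_MASTERS)

# Streaming masters (camera/radar-like sequential bursts): each master
# starts in its own bank and all advance one bank per cycle, so with
# low-order interleaving they never collide.
streaming = run(lambda m, cycle: (2 * m + cycle) * WORD)

# Uniformly random addresses for comparison: bank conflicts cap throughput.
random.seed(0)
uniform = run(lambda m, cycle: random.randrange(1 << 20))

print(f"streaming traffic: {streaming:.1%} throughput")
print(f"random traffic:    {uniform:.1%} throughput")
```

The sketch drops conflicting requests rather than queuing them; a real arbiter would buffer or retry the losers and, as the paper emphasizes, enforce per-master isolation so that access latency stays deterministic under real-time QoS constraints.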
Related papers
- AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation [48.82264764771652]
We introduce AsCAN, a hybrid architecture combining convolutional and transformer blocks.
AsCAN supports a variety of tasks: recognition, segmentation, and class-conditional image generation.
We then scale the same architecture to a large-scale text-to-image task and show state-of-the-art performance.
arXiv Detail & Related papers (2024-11-07T18:43:17Z)
- Inference Optimization of Foundation Models on AI Accelerators [68.24450520773688]
Powerful foundation models, including large language models (LLMs), with Transformer architectures have ushered in a new era of Generative AI.
As the number of model parameters reaches hundreds of billions, their deployment incurs prohibitive inference costs and high latency in real-world scenarios.
This tutorial offers a comprehensive discussion on complementary inference optimization techniques using AI accelerators.
arXiv Detail & Related papers (2024-07-12T09:24:34Z)
- Using the Abstract Computer Architecture Description Language to Model AI Hardware Accelerators [77.89070422157178]
Manufacturers of AI-integrated products face a critical challenge: selecting an accelerator that aligns with their product's performance requirements.
The Abstract Computer Architecture Description Language (ACADL) is a concise formalization of computer architecture block diagrams.
In this paper, we demonstrate how to use the ACADL to model AI hardware accelerators, use their ACADL description to map DNNs onto them, and explain the timing simulation semantics to gather performance results.
arXiv Detail & Related papers (2024-01-30T19:27:16Z)
- Search-time Efficient Device Constraints-Aware Neural Architecture Search [6.527454079441765]
Deep learning techniques like computer vision and natural language processing can be computationally expensive and memory-intensive.
We automate the construction of task-specific deep learning architectures optimized for device constraints through Neural Architecture Search (NAS).
We present DCA-NAS, a principled method of fast neural network architecture search that incorporates edge-device constraints.
arXiv Detail & Related papers (2023-07-10T09:52:28Z)
- Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures [68.91874045918112]
The adapter-ALBERT model is an efficient optimization for maximal data reuse across different tasks.
We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z)
- MAPLE-X: Latency Prediction with Explicit Microprocessor Prior Knowledge [87.41163540910854]
Deep neural network (DNN) latency characterization is a time-consuming process.
We propose MAPLE-X which extends MAPLE by incorporating explicit prior knowledge of hardware devices and DNN architecture latency.
arXiv Detail & Related papers (2022-05-25T11:08:20Z)
- How to Reach Real-Time AI on Consumer Devices? Solutions for Programmable and Custom Architectures [7.085772863979686]
Deep neural networks (DNNs) have led to large strides in various Artificial Intelligence (AI) inference tasks, such as object and speech recognition.
However, deploying such AI models across commodity devices faces significant challenges.
We present techniques for achieving real-time performance following a cross-stack approach.
arXiv Detail & Related papers (2021-06-21T11:23:12Z)
- KILT: a Benchmark for Knowledge Intensive Language Tasks [102.33046195554886]
We present a benchmark for knowledge-intensive language tasks (KILT).
All tasks in KILT are grounded in the same snapshot of Wikipedia.
We find that a shared dense vector index coupled with a seq2seq model is a strong baseline.
arXiv Detail & Related papers (2020-09-04T15:32:19Z)