Distributed On-Sensor Compute System for AR/VR Devices: A
Semi-Analytical Simulation Framework for Power Estimation
- URL: http://arxiv.org/abs/2203.07474v1
- Date: Mon, 14 Mar 2022 20:18:24 GMT
- Title: Distributed On-Sensor Compute System for AR/VR Devices: A
Semi-Analytical Simulation Framework for Power Estimation
- Authors: Jorge Gomez, Saavan Patel, Syed Shakib Sarwar, Ziyun Li, Raffaele
Capoccia, Zhao Wang, Reid Pinkham, Andrew Berkovich, Tsung-Hsun Tsai, Barbara
De Salvo and Chiao Liu
- Abstract summary: We show that a novel distributed on-sensor compute architecture can reduce system power consumption compared to a centralized system.
We demonstrate this for the compute-intensive, machine-learning-based Hand Tracking algorithm.
- Score: 2.5696683295721883
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Augmented Reality/Virtual Reality (AR/VR) glasses are widely foreseen as the
next-generation computing platform. AR/VR glasses are a complex "system of
systems" that must satisfy stringent form-factor, computing, power, and
thermal requirements. In this paper, we show that a novel distributed
on-sensor compute architecture, coupled with new semiconductor technologies
(such as dense 3D-IC interconnects and Spin-Transfer Torque Magnetoresistive
Random Access Memory, STT-MRAM) and, most importantly, full hardware-software
co-optimization are the solutions for achieving attractive and socially acceptable
AR/VR glasses. To this end, we developed a semi-analytical simulation framework
to estimate the power consumption of novel AR/VR distributed on-sensor
computing architectures. The model allows the optimization of the main
technological features of the system modules, as well as the computer-vision
algorithm partition strategy across the distributed compute architecture. We
show that, for the compute-intensive, machine-learning-based Hand
Tracking algorithm, the distributed on-sensor compute architecture can reduce
the system power consumption compared to a centralized system, with
additional benefits in terms of latency and privacy.
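The kind of comparison the abstract describes can be illustrated with a toy analytical power model. The sketch below is not the authors' framework: the class names, the three-term energy breakdown (on-sensor compute, link transfer, aggregator compute), and every parameter are assumptions introduced here for illustration. It assumes power equals energy per frame times frame rate, and that the distributed system runs the early part of the algorithm on each sensor and ships only compact features over the link.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical per-sensor workload description (all fields are assumptions)."""
    frame_rate_hz: float   # frames processed per second
    raw_frame_bits: float  # bits in one raw sensor frame
    feature_bits: float    # bits sent per frame after on-sensor processing
    front_macs: float      # multiply-accumulates in the early (on-sensor) part
    back_macs: float       # multiply-accumulates in the remaining part


@dataclass(frozen=True)
class Technology:
    """Hypothetical energy costs of the hardware modules (all fields are assumptions)."""
    sensor_j_per_mac: float      # energy per MAC on the sensor-side processor
    aggregator_j_per_mac: float  # energy per MAC on the central processor
    link_j_per_bit: float        # energy per bit moved from sensor to aggregator


def centralized_power_w(w: Workload, t: Technology, n_sensors: int) -> float:
    """Every sensor sends raw frames; all compute runs on the aggregator."""
    link_j = n_sensors * w.raw_frame_bits * t.link_j_per_bit
    compute_j = n_sensors * (w.front_macs + w.back_macs) * t.aggregator_j_per_mac
    return w.frame_rate_hz * (link_j + compute_j)


def distributed_power_w(w: Workload, t: Technology, n_sensors: int) -> float:
    """Each sensor runs the early part locally and sends only features."""
    sensor_j = n_sensors * w.front_macs * t.sensor_j_per_mac
    link_j = n_sensors * w.feature_bits * t.link_j_per_bit
    aggregator_j = n_sensors * w.back_macs * t.aggregator_j_per_mac
    return w.frame_rate_hz * (sensor_j + link_j + aggregator_j)


if __name__ == "__main__":
    # Placeholder numbers chosen only to exercise the model, not measured values.
    workload = Workload(frame_rate_hz=30.0, raw_frame_bits=8e5,
                        feature_bits=8e3, front_macs=1e7, back_macs=1e7)
    tech = Technology(sensor_j_per_mac=2e-12, aggregator_j_per_mac=1e-12,
                      link_j_per_bit=1e-10)
    print("centralized [W]:", centralized_power_w(workload, tech, n_sensors=4))
    print("distributed [W]:", distributed_power_w(workload, tech, n_sensors=4))
```

In this toy model the distributed configuration wins only when the link energy saved by sending features instead of raw frames outweighs any extra cost of computing on the sensor, which is why the partition point of the algorithm and the technology parameters have to be optimized together.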
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- A Realistic Simulation Framework for Analog/Digital Neuromorphic Architectures [73.65190161312555]
ARCANA is a spiking neural network simulator designed to account for the properties of mixed-signal neuromorphic circuits.
We show how the results obtained provide a reliable estimate of the behavior of the spiking neural network trained in software.
arXiv Detail & Related papers (2024-09-23T11:16:46Z)
- Efficient and accurate neural field reconstruction using resistive memory [52.68088466453264]
Traditional signal reconstruction methods on digital computers face both software and hardware challenges.
We propose a systematic approach with software-hardware co-optimizations for signal reconstruction from sparse inputs.
This work advances the AI-driven signal restoration technology and paves the way for future efficient and robust medical AI and 3D vision applications.
arXiv Detail & Related papers (2024-04-15T09:33:09Z)
- Random resistive memory-based deep extreme point learning machine for unified visual processing [67.51600474104171]
We propose a novel hardware-software co-design, random resistive memory-based deep extreme point learning machine (DEPLM).
Our co-design system achieves huge energy efficiency improvements and training cost reduction when compared to conventional systems.
arXiv Detail & Related papers (2023-12-14T09:46:16Z)
- Virtualization of Tiny Embedded Systems with a robust real-time capable and extensible Stack Virtual Machine REXAVM supporting Material-integrated Intelligent Systems and Tiny Machine Learning [0.0]
This paper shows and evaluates the suitability of the proposed VM architecture for operationally equivalent software and hardware (FPGA) implementations.
In a holistic architecture approach, the VM specifically addresses digital signal processing and tiny machine learning.
arXiv Detail & Related papers (2023-02-17T17:13:35Z)
- An In-Memory Analog Computing Co-Processor for Energy-Efficient CNN Inference on Mobile Devices [4.117012092777604]
We develop an in-memory analog computing (IMAC) architecture realizing both synaptic behavior and activation functions within non-volatile memory arrays.
Spin-orbit torque magnetoresistive random-access memory (SOT-MRAM) devices are leveraged to realize sigmoidal neurons as well as binarized synapses.
A heterogeneous mixed-signal and mixed-precision CPU-IMAC architecture is proposed for convolutional neural networks (CNNs) inference on mobile processors.
arXiv Detail & Related papers (2021-05-24T23:01:36Z)
- Reconfigurable Intelligent Surface Assisted Mobile Edge Computing with Heterogeneous Learning Tasks [53.1636151439562]
Mobile edge computing (MEC) provides a natural platform for AI applications.
We present an infrastructure to perform machine learning tasks at an MEC with the assistance of a reconfigurable intelligent surface (RIS).
Specifically, we minimize the learning error of all participating users by jointly optimizing transmit power of mobile users, beamforming vectors of the base station, and the phase-shift matrix of the RIS.
arXiv Detail & Related papers (2020-12-25T07:08:50Z)
- Large-scale neuromorphic optoelectronic computing with a reconfigurable diffractive processing unit [38.898230519968116]
We propose an optoelectronic reconfigurable computing paradigm by constructing a diffractive processing unit.
It can efficiently support different neural networks and achieve a high model complexity with millions of neurons.
Our prototype system built with off-the-shelf optoelectronic components surpasses the performance of state-of-the-art graphics processing units.
arXiv Detail & Related papers (2020-08-26T16:34:58Z)
- One-step regression and classification with crosspoint resistive memory arrays [62.997667081978825]
High speed, low energy computing machines are in demand to enable real-time artificial intelligence at the edge.
One-step learning is supported by simulations of the prediction of the cost of a house in Boston and the training of a 2-layer neural network for MNIST digit recognition.
Results are all obtained in one computational step, thanks to the physical, parallel, and analog computing within the crosspoint array.
arXiv Detail & Related papers (2020-05-05T08:00:07Z)
- Near-Optimal Hardware Design for Convolutional Neural Networks [0.0]
This study proposes a novel, special-purpose, and high-efficiency hardware architecture for convolutional neural networks.
The proposed architecture maximizes the utilization of multipliers by designing the computational circuit with the same structure as that of the computational flow of the model.
An implementation based on the proposed hardware architecture has been applied in commercial AI products.
arXiv Detail & Related papers (2020-02-06T09:15:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.