Virtualization of Tiny Embedded Systems with a robust real-time capable
and extensible Stack Virtual Machine REXAVM supporting Material-integrated
Intelligent Systems and Tiny Machine Learning
- URL: http://arxiv.org/abs/2302.09002v1
- Date: Fri, 17 Feb 2023 17:13:35 GMT
- Authors: Stefan Bosse, Sarah Bornemann, Björn Lüssem
- Abstract summary: This paper shows and evaluates the suitability of the proposed VM architecture for operationally equivalent software and hardware (FPGA) implementations.
In a holistic architecture approach, the VM specifically addresses digital signal processing and tiny machine learning.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the past decades, there has been a significant increase in sensor density
and sensor deployment, driven by a significant miniaturization and decrease in
size down to the chip level, addressing ubiquitous computing, edge computing,
as well as distributed sensor networks. Material-integrated and intelligent
systems (MIIS) provide the next integration and application level, but they
create new challenges and introduce hard constraints (resources, energy supply,
communication, resilience, and security). Commonly, low-resource systems are
statically programmed processors with application-specific software or
application-specific hardware (FPGA). This work demonstrates the need for, and a
solution to, virtualization in such low-resource and constrained systems towards
resilient distributed sensor and cyber-physical networks using a unified
low-resource, customizable, and real-time capable embedded and extensible stack
virtual machine (REXAVM) that can be implemented and cooperate in both software
and hardware. In a holistic architecture approach, the VM specifically
addresses digital signal processing and tiny machine learning. The REXAVM is
highly customizable through the use of VM program code generators at compile
time and incremental code processing at run time. The VM uses an integrated,
highly efficient just-in-time compiler to create bytecode from text code. This
paper shows and evaluates the suitability of the proposed VM architecture for
operationally equivalent software and hardware (FPGA) implementations. Specific
components supporting tiny ML and DSP using fixed-point arithmetic with respect
to efficiency and accuracy are discussed. An extended use-case section
demonstrates the usability of the introduced VM architecture for a broad range
of applications.
Related papers
- Hybrid Oscillator-Qubit Quantum Processors: Instruction Set Architectures, Abstract Machine Models, and Applications [32.40067565226366]
We show that hybrid CV-DV hardware offers a powerful computational paradigm that inherits the strengths of both DV and CV processors.
We present a variety of new hybrid CV-DV compilation techniques, algorithms, and applications.
Hybrid CV-DV quantum computations are beginning to be performed in superconducting, trapped ion, and neutral atom platforms.
arXiv Detail & Related papers (2024-07-15T01:23:47Z)
- Efficient and accurate neural field reconstruction using resistive memory [52.68088466453264]
Traditional signal reconstruction methods on digital computers face both software and hardware challenges.
We propose a systematic approach with software-hardware co-optimizations for signal reconstruction from sparse inputs.
This work advances the AI-driven signal restoration technology and paves the way for future efficient and robust medical AI and 3D vision applications.
arXiv Detail & Related papers (2024-04-15T09:33:09Z)
- Using the Abstract Computer Architecture Description Language to Model AI Hardware Accelerators [77.89070422157178]
Manufacturers of AI-integrated products face a critical challenge: selecting an accelerator that aligns with their product's performance requirements.
The Abstract Computer Architecture Description Language (ACADL) is a concise formalization of computer architecture block diagrams.
In this paper, we demonstrate how to use the ACADL to model AI hardware accelerators, use their ACADL description to map DNNs onto them, and explain the timing simulation semantics to gather performance results.
arXiv Detail & Related papers (2024-01-30T19:27:16Z)
- Random resistive memory-based deep extreme point learning machine for unified visual processing [67.51600474104171]
We propose a novel hardware-software co-design, random resistive memory-based deep extreme point learning machine (DEPLM)
Our co-design system achieves huge energy efficiency improvements and training cost reduction when compared to conventional systems.
arXiv Detail & Related papers (2023-12-14T09:46:16Z)
- FusionAI: Decentralized Training and Deploying LLMs with Massive Consumer-Level GPUs [57.12856172329322]
We envision a decentralized system unlocking the vast untapped potential of consumer-level GPUs.
This system faces critical challenges, including limited CPU and GPU memory, low network bandwidth, and peer and device heterogeneity.
arXiv Detail & Related papers (2023-09-03T13:27:56Z)
- Efficient Machine Learning, Compilers, and Optimizations for Embedded Systems [21.098443474303462]
Deep Neural Networks (DNNs) have achieved great success in a massive number of artificial intelligence (AI) applications by delivering high-quality computer vision, natural language processing, and virtual reality applications.
These emerging AI applications also come with increasing computation and memory demands, which are challenging to handle, especially for embedded systems with limited compute/memory resources, tight power budgets, and small form factors.
This book chapter introduces a series of effective design methods to enable efficient algorithms, compilers, and various optimizations for embedded systems.
arXiv Detail & Related papers (2022-06-06T02:54:05Z)
- Distributed On-Sensor Compute System for AR/VR Devices: A Semi-Analytical Simulation Framework for Power Estimation [2.5696683295721883]
We show that a novel distributed on-sensor compute architecture can reduce the system power consumption compared to a centralized system.
We show that, in the case of the compute-intensive machine learning based Hand Tracking algorithm, the distributed on-sensor compute architecture can reduce the system power consumption.
arXiv Detail & Related papers (2022-03-14T20:18:24Z)
- FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using in total around the 40% of the available hardware resources.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on accuracy, compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
- An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimension parameter model and large-scale mathematical calculation restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) method, Soft Actor-Critic for discrete actions (SAC-d), which generates the exit point and compressing bits by soft policy iterations.
Based on the latency- and accuracy-aware reward design, such a computation can adapt well to complex environments like dynamic wireless channels and arbitrary processing, and is capable of supporting 5G URLLC services.
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
- A Heterogeneous In-Memory Computing Cluster For Flexible End-to-End Inference of Real-World Deep Neural Networks [12.361842554233558]
Deployment of modern TinyML tasks on small battery-constrained IoT devices requires high computational energy efficiency.
Analog In-Memory Computing (IMC) using non-volatile memory (NVM) promises major efficiency improvements in deep neural network (DNN) inference.
We present a heterogeneous tightly-coupled architecture integrating 8 RISC-V cores, an in-memory computing accelerator (IMA), and digital accelerators.
arXiv Detail & Related papers (2022-01-04T11:12:01Z)
- Resistive Neural Hardware Accelerators [0.46198289193451136]
The shift towards ReRAM-based in-memory computing has great potential in the implementation of area- and power-efficient inference.
In this survey, we review the state-of-the-art ReRAM-based Deep Neural Networks (DNNs) many-core accelerators.
arXiv Detail & Related papers (2021-09-08T21:11:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.