Low-Latency Asynchronous Logic Design for Inference at the Edge
- URL: http://arxiv.org/abs/2012.03402v1
- Date: Mon, 7 Dec 2020 00:40:52 GMT
- Title: Low-Latency Asynchronous Logic Design for Inference at the Edge
- Authors: Adrian Wheeldon, Alex Yakovlev, Rishad Shafik, Jordan Morris
- Abstract summary: We propose a method that reduces the area and power overhead of self-timed early-propagative asynchronous inference circuits.
Thanks to the natural resilience of both their timing and their logic underpinnings, the circuits tolerate variations in environment and supply voltage.
Average latency of the proposed circuit is reduced by 10x compared with the synchronous implementation.
- Score: 0.9831489366502301
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern internet of things (IoT) devices leverage machine learning inference
using sensed data on-device rather than offloading them to the cloud. Commonly
known as inference at-the-edge, this gives many benefits to the users,
including personalization and security. However, such applications demand high
energy efficiency and robustness. In this paper we propose a method that reduces
the area and power overhead of self-timed early-propagative asynchronous inference
circuits designed using the principles of learning automata. Thanks to the natural
resilience of both their timing and their logic underpinnings, the circuits tolerate
variations in environment and supply voltage whilst enabling the lowest
possible latency. Our method is exemplified through an inference datapath for a
low power machine learning application. The circuit builds on the Tsetlin
machine algorithm, further enhancing its energy efficiency. Average latency of
the proposed circuit is reduced by 10x compared with the synchronous
implementation whilst maintaining similar area. Robustness of the proposed
circuit is proven through post-synthesis simulation with 0.25 V to 1.2 V
supply. Functional correctness is maintained and latency scales with gate delay
as voltage is decreased.
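The datapath realises Tsetlin machine inference, whose core operations are simple Boolean clause conjunctions followed by a class-vote summation; presumably the early-propagation advantage comes from each clause AND being resolvable as soon as any of its included literals is known to be 0. As a reference for what the hardware computes, here is a minimal software sketch of that inference step; the array layout and names are illustrative, not the paper's circuit.
```python
import numpy as np

def clause_outputs(literals, include_mask):
    """Each clause is the AND of the literals it includes.

    literals:     (2*n_features,) bools laid out as [x, NOT x]
    include_mask: (n_clauses, 2*n_features) bools fixed by training
    """
    # A clause fails only where it includes a literal that is 0;
    # a clause that includes nothing trivially outputs 1 here.
    return np.all(literals | ~include_mask, axis=1)

def tm_infer(x, include_masks):
    """Majority vote: even-indexed clauses vote for their class,
    odd-indexed clauses vote against; the largest class sum wins."""
    literals = np.concatenate([x, 1 - x]).astype(bool)
    class_sums = []
    for mask in include_masks:            # one include matrix per class
        c = clause_outputs(literals, mask).astype(int)
        class_sums.append(c[0::2].sum() - c[1::2].sum())
    return int(np.argmax(class_sums))

# Toy usage: 2 classes, 8 clauses each, 6 Boolean features.
rng = np.random.default_rng(0)
masks = [rng.random((8, 12)) < 0.2 for _ in range(2)]
print(tm_infer(rng.integers(0, 2, 6), masks))
```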
Related papers
- Digital Twin-Assisted Federated Learning with Blockchain in Multi-tier Computing Systems [67.14406100332671]
In Industry 4.0 systems, resource-constrained edge devices engage in frequent data interactions.
This paper proposes a digital twin (DT)-assisted federated learning (FL) scheme.
The efficacy of our proposed cooperative interference-based FL process has been verified through numerical analysis.
arXiv Detail & Related papers (2024-11-04T17:48:02Z)
- Resource Efficient Asynchronous Federated Learning for Digital Twin Empowered IoT Network [29.895766751146155]
Digital twin (DT) can provide real-time status and dynamic topology mapping for Internet of Things (IoT) devices.
We develop a dynamic resource scheduling algorithm tailored for the asynchronous federated learning (FL)-based lightweight DT empowered IoT network.
Specifically, our approach aims to minimize a multi-objective function that encompasses both energy consumption and latency.
arXiv Detail & Related papers (2024-08-26T14:28:51Z)
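The summary above names the objective only at a high level. A common way to make such an energy/latency trade-off concrete is weighted-sum scalarization; the sketch below assumes that form, with made-up cost numbers, and is not taken from the paper.
```python
def scheduling_cost(energy_j, latency_s, w=0.5):
    """Generic weighted-sum scalarization of a two-objective cost:
    cost = w * E + (1 - w) * T, minimized over scheduling choices."""
    return w * energy_j + (1.0 - w) * latency_s

# Pick the candidate schedule with the lowest combined cost.
candidates = {"local": (1.2, 0.8), "offload": (0.6, 1.9)}  # (joules, seconds)
best = min(candidates, key=lambda k: scheduling_cost(*candidates[k]))
print(best)
```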
- Fourier Controller Networks for Real-Time Decision-Making in Embodied Learning [42.862705980039784]
The Transformer has shown promise in reinforcement learning for modelling time-varying features, but it still suffers from low data efficiency and high inference latency.
In this paper, we propose to investigate the task from a new perspective of the frequency domain.
arXiv Detail & Related papers (2024-05-30T09:43:59Z)
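The summary gives only the high-level idea of working in the frequency domain. One generic way to realise it (not necessarily the paper's architecture) is to compress a window of recent states into its lowest Fourier modes before feeding a policy, which shortens the effective sequence the controller must process:
```python
import numpy as np

def frequency_features(state_history, n_modes=4):
    """Encode a (T, d) window of recent states by its lowest Fourier
    modes along time; a generic frequency-domain feature sketch."""
    coeffs = np.fft.rfft(state_history, axis=0)[:n_modes]      # (n_modes, d), complex
    return np.concatenate([coeffs.real, coeffs.imag]).ravel()  # real-valued features

window = np.random.randn(16, 3)        # toy 16-step history of 3-dim states
features = frequency_features(window)  # 24-dim input for a policy head
```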
- Time-Series Forecasting and Sequence Learning Using Memristor-based Reservoir System [2.6473021051027534]
We develop a memristor-based echo state network accelerator that features efficient temporal data processing and in-situ online learning.
The proposed design is benchmarked on datasets drawn from real-world tasks, such as forecasting load energy consumption and weather conditions.
The system remains reasonably robust at device-failure rates below 10%, as may arise from stuck-at faults.
arXiv Detail & Related papers (2024-05-22T05:07:56Z)
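For reference, the echo state network such a reservoir accelerator implements is compact to state in software. Below is a generic NumPy sketch with an offline ridge-regression readout; the paper's design instead holds the fixed random weights in memristors and trains the readout in situ and online, which this sketch does not model.
```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 100, 500                        # reservoir size, series length
u = np.sin(np.arange(T + 1) * 0.1)     # toy series; target is the next step

W_in = rng.uniform(-0.5, 0.5, N)       # fixed random input weights
W = rng.uniform(-0.5, 0.5, (N, N))     # fixed random recurrent weights
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # spectral radius < 1

# Drive the reservoir: x(t+1) = tanh(W x(t) + W_in u(t)).
X = np.zeros((T, N))
x = np.zeros(N)
for t in range(T):
    x = np.tanh(W @ x + W_in * u[t])
    X[t] = x

# Ridge-regression readout mapping states to one-step-ahead targets.
ridge = 1e-6
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(N), X.T @ u[1 : T + 1])
pred = X @ W_out                       # one-step-ahead predictions
```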
- Resistive Memory-based Neural Differential Equation Solver for Score-based Diffusion Model [55.116403765330084]
Current AIGC methods, such as score-based diffusion, still fall short in speed and efficiency.
We propose a time-continuous and analog in-memory neural differential equation solver for score-based diffusion.
We experimentally validate our solution with 180 nm resistive memory in-memory computing macros.
arXiv Detail & Related papers (2024-04-08T16:34:35Z)
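Sampling from a score-based diffusion model can be cast as integrating the probability-flow ODE, which is what makes a time-continuous analog solver attractive: the integration happens in physical dynamics rather than discrete steps. Below is a discrete Euler sketch of that ODE; the drift, diffusion, and score functions are toy stand-ins, not the paper's 180 nm macro.
```python
import numpy as np

def probability_flow_ode(score_fn, x, t_grid, f, g):
    """Euler integration of dx/dt = f(x, t) - 0.5 * g(t)**2 * score_fn(x, t),
    the deterministic counterpart of the reverse diffusion SDE."""
    for t0, t1 in zip(t_grid[:-1], t_grid[1:]):
        dx = f(x, t0) - 0.5 * g(t0) ** 2 * score_fn(x, t0)
        x = x + (t1 - t0) * dx
    return x

# Toy stand-ins: standard-normal target with a VP-style drift.
score = lambda x, t: -x                # exact score of N(0, I)
drift = lambda x, t: -0.5 * x
diffusion = lambda t: 1.0
t_grid = np.linspace(1.0, 0.0, 100)    # integrate from noise toward data
sample = probability_flow_ode(score, np.random.randn(8), t_grid, drift, diffusion)
```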
- PreRoutGNN for Timing Prediction with Order Preserving Partition: Global Circuit Pre-training, Local Delay Learning and Attentional Cell Modeling [84.34811206119619]
We propose a two-stage approach to pre-routing timing prediction.
First, we propose global circuit training to pre-train a graph auto-encoder that learns the global graph embedding from the circuit netlist.
Second, we use a novel node-updating scheme for message passing on the GCN, following the topological sorting sequence of the learned graph embedding and circuit graph.
Experiments on 21 real-world circuits achieve a new SOTA R^2 of 0.93 for slack prediction, significantly surpassing the previous SOTA method's 0.59.
arXiv Detail & Related papers (2024-02-27T02:23:07Z)
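The key idea above is that message passing follows the circuit's topological order, the same order in which delays accumulate from inputs to outputs, so each node aggregates already-updated fan-in states. A minimal generic sketch of such order-preserving updates on a DAG follows; the aggregation rule is illustrative and stands in for the paper's attentional cell modeling.
```python
import numpy as np
from graphlib import TopologicalSorter

def topo_message_passing(edges, h, agg_w=0.5):
    """Update node embeddings in topological order so that every node
    sees the already-updated states of its predecessors (its fan-in).

    edges: (src, dst) pairs of a DAG, e.g. a gate-level netlist
    h:     dict mapping node -> initial embedding (np.ndarray)
    """
    preds = {n: [] for n in h}
    for s, d in edges:
        preds[d].append(s)
    for n in TopologicalSorter(preds).static_order():
        if preds[n]:                   # aggregate updated fan-in states
            msg = np.mean([h[p] for p in preds[n]], axis=0)
            h[n] = np.tanh(h[n] + agg_w * msg)
    return h

# Toy 4-node graph: two inputs feed a gate, which feeds an output.
emb = {n: np.random.randn(8) for n in "abcd"}
out = topo_message_passing([("a", "c"), ("b", "c"), ("c", "d")], emb)
```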
- Neuromorphic analog circuits for robust on-chip always-on learning in spiking neural networks [1.9809266426888898]
Mixed-signal neuromorphic systems represent a promising solution for solving extreme-edge computing tasks.
Their spiking neural network circuits are optimized to process sensory data online and in continuous time.
We design on-chip learning circuits with short-term analog dynamics and long-term tristate discretization mechanisms.
arXiv Detail & Related papers (2023-07-12T11:14:25Z)
- Self-timed Reinforcement Learning using Tsetlin Machine [1.104960878651584]
We present a hardware design for the learning datapath of the Tsetlin machine algorithm, along with a latency analysis of the inference datapath.
Results illustrate the advantages of asynchronous design in applications such as personalized healthcare and battery-powered internet of things devices.
arXiv Detail & Related papers (2021-09-02T11:24:23Z)
- Energy-Efficient Model Compression and Splitting for Collaborative Inference Over Time-Varying Channels [52.60092598312894]
We propose a technique to reduce the total energy bill at the edge device by utilizing model compression and time-varying model split between the edge and remote nodes.
Our proposed solution results in minimal energy consumption and CO2 emissions compared to the considered baselines.
arXiv Detail & Related papers (2021-06-02T07:36:27Z)
- EdgeBERT: Sentence-Level Energy Optimizations for Latency-Aware Multi-Task NLP Inference [82.1584439276834]
Transformer-based language models such as BERT provide significant accuracy improvement for a multitude of natural language processing (NLP) tasks.
We present EdgeBERT, an in-depth algorithm-hardware co-design for latency-aware energy optimization in multi-task NLP.
arXiv Detail & Related papers (2020-11-28T19:21:47Z)
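Sentence-level latency and energy savings in designs like EdgeBERT rest on early exit: inference stops at the first transformer layer whose attached classifier is already confident. The sketch below shows the generic entropy-threshold form of that idea; the `layers` and `classifiers` callables are hypothetical stand-ins, not the paper's co-designed pipeline.
```python
import numpy as np

def entropy(p):
    """Shannon entropy of a probability vector."""
    p = np.clip(p, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())

def early_exit_inference(layers, classifiers, x, threshold=0.3):
    """Run layers one by one; exit at the first per-layer classifier
    whose prediction entropy falls below the confidence threshold."""
    for layer, clf in zip(layers, classifiers):
        x = layer(x)
        probs = clf(x)
        if entropy(probs) < threshold:   # confident: skip the rest
            break
    return probs

# Toy stand-ins: identity layers and one shared softmax classifier.
softmax = lambda z: np.exp(z) / np.exp(z).sum()
layers = [lambda x: x] * 4
clfs = [lambda x: softmax(x[:2])] * 4
print(early_exit_inference(layers, clfs, np.array([3.0, -3.0, 0.0])))
```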
- Training End-to-End Analog Neural Networks with Equilibrium Propagation [64.0476282000118]
We introduce a principled method to train end-to-end analog neural networks by gradient descent.
We show mathematically that a class of analog neural networks (called nonlinear resistive networks) are energy-based models.
Our work can guide the development of a new generation of ultra-fast, compact and low-power neural networks supporting on-chip learning.
arXiv Detail & Related papers (2020-06-02T23:38:35Z)
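Equilibrium propagation trains an energy-based network with two relaxation phases: a free phase, and a phase whose outputs are weakly nudged toward the target; each weight then moves in proportion to the difference of its local correlations between the two equilibria. The toy NumPy sketch below uses simplified layered dynamics (it drops the activation derivative and all analog-circuit detail), so it illustrates the two-phase rule rather than the paper's nonlinear resistive networks.
```python
import numpy as np

rng = np.random.default_rng(1)
rho = lambda u: np.clip(u, 0.0, 1.0)   # hard-sigmoid activation

nx, nh, ny = 4, 8, 2                   # input -> hidden -> output sizes
W1 = rng.normal(0, 0.1, (nx, nh))
W2 = rng.normal(0, 0.1, (nh, ny))

def relax(x, h, y, target=None, beta=0.0, steps=50, dt=0.1):
    """Settle the state by gradient descent on the energy; with
    beta > 0 the output is weakly nudged toward the target."""
    for _ in range(steps):
        dh = -h + rho(x) @ W1 + rho(y) @ W2.T
        dy = -y + rho(h) @ W2
        if beta > 0.0:
            dy += beta * (target - rho(y))
        h, y = h + dt * dh, y + dt * dy
    return h, y

def eqprop_step(x, target, beta=0.5, lr=0.1):
    """Contrast free and nudged equilibria to update the weights."""
    global W1, W2
    h0, y0 = relax(x, np.zeros(nh), np.zeros(ny))         # free phase
    h1, y1 = relax(x, h0, y0, target=target, beta=beta)   # nudged phase
    W1 += lr / beta * (np.outer(rho(x), rho(h1)) - np.outer(rho(x), rho(h0)))
    W2 += lr / beta * (np.outer(rho(h1), rho(y1)) - np.outer(rho(h0), rho(y0)))

x, t = rng.random(nx), np.array([1.0, 0.0])
for _ in range(100):                   # a few toy training iterations
    eqprop_step(x, t)
```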
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and accepts no responsibility for any consequences of its use.