Deep Learning on Edge TPUs
- URL: http://arxiv.org/abs/2108.13732v1
- Date: Tue, 31 Aug 2021 10:23:37 GMT
- Title: Deep Learning on Edge TPUs
- Authors: Andreas M Kist
- Abstract summary: I review the Edge TPU platform, the tasks that have been accomplished using the Edge TPU, and which steps are necessary to deploy a model to the Edge TPU hardware.
The Edge TPU is not only capable of tackling common computer vision tasks, but also surpasses other hardware accelerators.
Co-embedding the Edge TPU in cameras allows a seamless analysis of primary data.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Computing at the edge is important in remote settings; however, conventional
hardware is not optimized for utilizing deep neural networks. The Google Edge
TPU is an emerging hardware accelerator that is cost-, power- and speed-efficient,
and is available for prototyping and production purposes. Here, I
review the Edge TPU platform, the tasks that have been accomplished using the
Edge TPU, and which steps are necessary to deploy a model to the Edge TPU
hardware. The Edge TPU is not only capable of tackling common computer vision
tasks, but also surpasses other hardware accelerators, especially when the
entire model can be deployed to the Edge TPU. Co-embedding the Edge TPU in
cameras allows a seamless analysis of primary data. In summary, the Edge TPU is
a maturing system that has proven its usability across multiple tasks.
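The deployment steps the paper reviews follow Coral's documented workflow: convert a trained model to a fully integer-quantized TensorFlow Lite model, compile it with edgetpu_compiler, and run it on-device via PyCoral. A minimal sketch of that pipeline follows; the tiny Keras model and the random calibration data are placeholders for a real trained model and dataset.

```python
import numpy as np
import tensorflow as tf

# Placeholder model; in practice, load your own trained Keras model.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

def representative_data_gen():
    # Real input samples should be yielded here so the converter can
    # calibrate the int8 quantization ranges; random data is a placeholder.
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

# Step 1: full-integer quantization (the Edge TPU executes only int8 ops).
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
with open("model_quant.tflite", "wb") as f:
    f.write(converter.convert())

# Step 2: compile for the Edge TPU (shell command, produces
# model_quant_edgetpu.tflite):
#   edgetpu_compiler model_quant.tflite

# Step 3: run on-device inference with PyCoral.
from pycoral.utils.edgetpu import make_interpreter
from pycoral.adapters import common, classify

interpreter = make_interpreter("model_quant_edgetpu.tflite")
interpreter.allocate_tensors()
width, height = common.input_size(interpreter)
common.set_input(interpreter, np.zeros((height, width, 3), dtype=np.uint8))
interpreter.invoke()
print(classify.get_classes(interpreter, top_k=1))
```

Ops that the compiler cannot map to the Edge TPU fall back to the host CPU, which is why the abstract notes the accelerator performs best when the entire model can be deployed to it.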
Related papers
- Exploration of TPUs for AI Applications [0.0]
Tensor Processing Units (TPUs) are specialized hardware accelerators for deep learning developed by Google.
This paper aims to explore TPUs in cloud and edge computing, focusing on their applications in AI.
arXiv Detail & Related papers (2023-09-16T07:58:05Z)
- FLEdge: Benchmarking Federated Machine Learning Applications in Edge Computing Systems [61.335229621081346]
Federated Learning (FL) has become a viable technique for realizing privacy-enhancing distributed deep learning on the network edge.
In this paper, we propose FLEdge, which complements existing FL benchmarks by enabling a systematic evaluation of client capabilities.
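As background for what such a federated-learning benchmark exercises, here is a minimal sketch of the FedAvg aggregation step in plain NumPy; the client weights and dataset sizes are hypothetical, and this is not FLEdge's actual code.

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """Weighted average of per-client model weights (FedAvg).

    client_weights: one list of np.ndarrays (layer weights) per client.
    client_sizes:   number of local training samples per client.
    """
    total = sum(client_sizes)
    coeffs = [n / total for n in client_sizes]
    # Average each layer across clients, weighted by local dataset size.
    return [sum(c * w[layer] for c, w in zip(coeffs, client_weights))
            for layer in range(len(client_weights[0]))]

# Example: three clients, one weight matrix each.
clients = [[np.full((2, 2), v, dtype=np.float32)] for v in (1.0, 2.0, 3.0)]
print(fed_avg(clients, client_sizes=[10, 20, 70]))  # skewed toward client 3
```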
arXiv Detail & Related papers (2023-06-08T13:11:20Z)
- Fast GraspNeXt: A Fast Self-Attention Neural Network Architecture for Multi-task Learning in Computer Vision Tasks for Robotic Grasping on the Edge [80.88063189896718]
High architectural and computational complexity can result in poor suitability for deployment on embedded devices.
Fast GraspNeXt is a fast self-attention neural network architecture tailored for embedded multi-task learning in computer vision tasks for robotic grasping.
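The paper's exact architecture is not reproduced here; as a generic illustration of the self-attention operation that such networks build on, a single-head scaled dot-product sketch in NumPy:

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention.

    x: (seq_len, d_model) features; wq/wk/wv: (d_model, d_k) projections.
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])           # (seq_len, seq_len)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over keys
    return weights @ v                                # (seq_len, d_k)

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 16))
wq, wk, wv = (rng.standard_normal((16, 8)) for _ in range(3))
print(self_attention(x, wq, wk, wv).shape)  # (5, 8)
```

Embedded-friendly variants like Fast GraspNeXt primarily reduce the cost of exactly this operation, since the attention matrix grows quadratically with sequence length.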
arXiv Detail & Related papers (2023-04-21T18:07:14Z)
- Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures [68.91874045918112]
adapter-ALBERT is an efficient model optimization that maximizes data reuse across different tasks.
We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
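The adapter idea, small task-specific bottleneck layers inserted into a frozen shared backbone so that most weights are reused across tasks, can be sketched as follows; this is a generic illustration, not the paper's actual adapter-ALBERT design.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def adapter(h, w_down, w_up):
    """Bottleneck adapter: down-project, nonlinearity, up-project, plus a
    residual connection. Only w_down/w_up are task-specific; the backbone
    producing h stays frozen and is shared across all tasks.
    """
    return h + relu(h @ w_down) @ w_up

rng = np.random.default_rng(0)
h = rng.standard_normal((4, 128))             # hidden states from a frozen layer
w_down = rng.standard_normal((128, 16)) * 0.01
w_up = rng.standard_normal((16, 128)) * 0.01
print(adapter(h, w_down, w_up).shape)         # (4, 128)
```

Because only the small adapter matrices differ per task, the large shared weights can stay resident in one memory tier, which is what makes a heterogeneous on-chip memory mapping attractive.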
arXiv Detail & Related papers (2023-03-25T14:40:59Z)
- MAPLE-X: Latency Prediction with Explicit Microprocessor Prior Knowledge [87.41163540910854]
Deep neural network (DNN) latency characterization is a time-consuming process.
We propose MAPLE-X, which extends MAPLE by incorporating explicit prior knowledge of hardware devices and DNN architecture latency.
arXiv Detail & Related papers (2022-05-25T11:08:20Z)
- MAPLE-Edge: A Runtime Latency Predictor for Edge Devices [80.01591186546793]
We propose MAPLE-Edge, an edge-device-oriented extension of MAPLE, the state-of-the-art latency predictor for general-purpose hardware.
Compared to MAPLE, MAPLE-Edge can describe the runtime and target device platform using a much smaller set of CPU performance counters.
We also demonstrate that, unlike MAPLE, which performs best when trained on a pool of devices sharing a common runtime, MAPLE-Edge can effectively generalize across runtimes.
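Both MAPLE variants regress latency from hardware descriptors such as performance counters. A hypothetical, much-simplified version of that idea using scikit-learn on synthetic counter features (not the papers' actual dataset or model):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic training set: each row holds CPU performance counters
# (e.g. instructions retired, cache misses, branch misses) recorded
# while running a candidate network; the target is measured latency.
counters = rng.uniform(size=(500, 10))
latency_ms = (5.0 + 20.0 * counters[:, 0] + 8.0 * counters[:, 3]
              + rng.normal(scale=0.5, size=500))

predictor = RandomForestRegressor(n_estimators=100, random_state=0)
predictor.fit(counters[:400], latency_ms[:400])

# Predict latency for unseen configurations.
print(predictor.predict(counters[400:405]))
```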
arXiv Detail & Related papers (2022-04-27T14:00:48Z)
- An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) method, Soft Actor-Critic for discrete (SAC-d), which generates the exit point and compressing bits by soft policy iterations.
With its latency- and accuracy-aware reward design, the computation can adapt well to complex environments such as dynamic wireless channels and arbitrary processing loads, and is capable of supporting 5G URLLC.
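The reward is described only at a high level; a hypothetical latency- and accuracy-aware reward of the kind such a DRL agent could optimize might look like this (the budget and penalty values are illustrative):

```python
def reward(accuracy, latency_ms, latency_budget_ms=50.0, penalty=2.0):
    """Hypothetical latency/accuracy-aware reward: full credit for
    accuracy when the latency budget is met, minus a penalty that grows
    with the amount by which the budget is exceeded.
    """
    overshoot = max(0.0, latency_ms - latency_budget_ms)
    return accuracy - penalty * overshoot / latency_budget_ms

print(reward(accuracy=0.91, latency_ms=40.0))  # within budget -> 0.91
print(reward(accuracy=0.95, latency_ms=80.0))  # 30 ms over -> 0.95 - 1.2
```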
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
- Exploring Deep Neural Networks on Edge TPU [2.9573904824595614]
This paper explores the performance of Google's Edge TPU on feed-forward neural networks.
We compare the energy efficiency of the Edge TPU with that of the widely used embedded ARM Cortex-A53 CPU.
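A minimal sketch of such a CPU-versus-Edge-TPU latency comparison, assuming the tflite_runtime package, an attached Edge TPU, and placeholder model paths:

```python
import time
import numpy as np
import tflite_runtime.interpreter as tflite

def mean_latency_ms(interpreter, runs=100):
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    data = np.zeros(inp["shape"], dtype=inp["dtype"])
    interpreter.set_tensor(inp["index"], data)
    interpreter.invoke()                      # warm-up run
    start = time.perf_counter()
    for _ in range(runs):
        interpreter.set_tensor(inp["index"], data)
        interpreter.invoke()
    return 1000.0 * (time.perf_counter() - start) / runs

# Same quantized model, once on the CPU and once via the Edge TPU delegate.
cpu = tflite.Interpreter(model_path="model_quant.tflite")
tpu = tflite.Interpreter(
    model_path="model_quant_edgetpu.tflite",
    experimental_delegates=[tflite.load_delegate("libedgetpu.so.1")])

print("CPU     :", mean_latency_ms(cpu), "ms")
print("Edge TPU:", mean_latency_ms(tpu), "ms")
```

Energy comparisons additionally require an external power meter or board-level telemetry; latency is the part that can be measured from software alone.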
arXiv Detail & Related papers (2021-10-17T14:01:26Z)
- Exploring Edge TPU for Network Intrusion Detection in IoT [2.8873930745906957]
This paper explores Google's Edge TPU for implementing a practical network intrusion detection system (NIDS) at the edge of IoT, based on a deep learning approach.
Models of two major deep neural network architectures, at various scaled sizes, are used to investigate three key performance metrics.
The performance of the Edge TPU-based implementation is compared with that of an energy-efficient embedded CPU (ARM Cortex-A53).
arXiv Detail & Related papers (2021-03-30T12:43:57Z)
- An Evaluation of Edge TPU Accelerators for Convolutional Neural Networks [2.7584363116322863]
Edge TPUs are accelerators for low-power edge devices and are widely used in various Google products such as Coral and Pixel devices.
We extensively evaluate three classes of Edge TPUs, covering different computing ecosystems, that are either currently deployed in Google products or are in the product pipeline.
We present our efforts in developing high-accuracy learned models that estimate the major performance metrics of these accelerators.
arXiv Detail & Related papers (2021-02-20T19:25:09Z)
- Accelerator-aware Neural Network Design using AutoML [5.33024001730262]
We present a class of computer vision models designed using hardware-aware neural architecture search and customized to run on the Edge TPU.
For the Edge TPU in Coral devices, these models enable real-time image classification performance while achieving accuracy typically seen only with larger, compute-heavy models running in data centers.
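Hardware-aware architecture search is described only at a high level here; a toy random-search sketch with a latency budget conveys the general idea (the search space and scoring functions below are synthetic stand-ins for a trained predictor or on-device measurement):

```python
import random

def hardware_aware_search(candidates, accuracy_of, latency_of,
                          latency_budget_ms, trials=200):
    """Toy hardware-aware search: sample candidate configurations,
    reject those over the latency budget, keep the most accurate survivor.
    """
    best = None
    for _ in range(trials):
        arch = random.choice(candidates)
        if latency_of(arch) > latency_budget_ms:
            continue  # too slow for the target accelerator
        if best is None or accuracy_of(arch) > accuracy_of(best):
            best = arch
    return best

# Hypothetical search space: (depth, width) pairs with synthetic scores.
space = [(d, w) for d in range(4, 17) for w in (16, 32, 64, 128)]
acc = lambda a: 0.5 + 0.02 * a[0] + 0.001 * a[1]  # deeper/wider -> better...
lat = lambda a: 1.5 * a[0] + 0.05 * a[1]          # ...but slower
print(hardware_aware_search(space, acc, lat, latency_budget_ms=20.0))
```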
arXiv Detail & Related papers (2020-03-05T21:34:22Z)