Exploration of TPUs for AI Applications
- URL: http://arxiv.org/abs/2309.08918v2
- Date: Tue, 14 Nov 2023 18:30:45 GMT
- Title: Exploration of TPUs for AI Applications
- Authors: Diego Sanmartín Carrión, Vera Prohaska
- Abstract summary: Tensor Processing Units (TPUs) are specialized hardware accelerators for deep learning developed by Google.
This paper aims to explore TPUs in cloud and edge computing, focusing on their applications in AI.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Tensor Processing Units (TPUs) are specialized hardware accelerators for deep
learning developed by Google. This paper aims to explore TPUs in cloud and edge
computing, focusing on their applications in AI. We provide an overview of TPUs,
their general architecture, specifically their design in relation to neural
networks, compilation techniques and supporting frameworks. Furthermore, we
provide a comparative analysis of Cloud and Edge TPU performance against other
counterpart chip architectures. Our results show that TPUs can provide
significant performance improvements in both cloud and edge computing.
Additionally, this paper underscores the imperative need for further research
in optimization techniques for efficient deployment of AI architectures on the
Edge TPU and benchmarking standards for a more robust comparative analysis in
edge computing scenarios. The primary motivation behind this push for research
is that efficient AI acceleration, facilitated by TPUs, can lead to substantial
savings in terms of time, money, and environmental resources.
Related papers
- Inference Optimization of Foundation Models on AI Accelerators [68.24450520773688]
Powerful foundation models, including large language models (LLMs), with Transformer architectures have ushered in a new era of Generative AI.
As the number of model parameters reaches hundreds of billions, their deployment incurs prohibitive inference costs and high latency in real-world scenarios.
This tutorial offers a comprehensive discussion on complementary inference optimization techniques using AI accelerators.
arXiv Detail & Related papers (2024-07-12T09:24:34Z)
- Benchmarking End-To-End Performance of AI-Based Chip Placement Algorithms [77.71341200638416]
ChiPBench is a benchmark designed to evaluate the effectiveness of AI-based chip placement algorithms.
We have gathered 20 circuits from various domains (e.g., CPU, GPU, and microcontrollers) for evaluation.
Results show that even if the intermediate metric of a single-point algorithm is dominant, the final PPA results are unsatisfactory.
arXiv Detail & Related papers (2024-07-03T03:29:23Z)
- Networking Systems for Video Anomaly Detection: A Tutorial and Survey [55.28514053969056]
Video Anomaly Detection (VAD) is a fundamental research task within the Artificial Intelligence (AI) community.
In this article, we delineate the foundational assumptions, learning frameworks, and applicable scenarios of various deep learning-driven VAD routes.
We showcase our latest NSVAD research in industrial IoT and smart cities, along with an end-cloud collaborative architecture for deployable NSVAD.
arXiv Detail & Related papers (2024-05-16T02:00:44Z)
- Machine Learning Insides OptVerse AI Solver: Design Principles and Applications [74.67495900436728]
We present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI solver.
We showcase our methods for generating complex SAT and MILP instances utilizing generative models that mirror the multifaceted structures of real-world problems.
We detail the incorporation of state-of-the-art parameter tuning algorithms which markedly elevate solver performance.
arXiv Detail & Related papers (2024-01-11T15:02:15Z)
- Heterogeneous Integration of In-Memory Analog Computing Architectures with Tensor Processing Units [0.0]
This paper introduces a novel, heterogeneous, mixed-signal, and mixed-precision architecture that integrates an IMAC unit with an edge TPU to enhance mobile CNN performance.
We propose a unified learning algorithm that incorporates mixed-precision training techniques to mitigate potential accuracy drops when deploying models on the TPU-IMAC architecture.
arXiv Detail & Related papers (2023-04-18T19:44:56Z)
- Edge-Cloud Polarization and Collaboration: A Comprehensive Survey [61.05059817550049]
We conduct a systematic review for both cloud and edge AI.
We are the first to set up the collaborative learning mechanism for cloud and edge modeling.
We discuss potentials and practical experiences of some on-going advanced edge AI topics.
arXiv Detail & Related papers (2021-11-11T05:58:23Z)
- Exploring Deep Neural Networks on Edge TPU [2.9573904824595614]
This paper explores the performance of Google's Edge TPU on feed forward neural networks.
We compare the energy efficiency of the Edge TPU with that of the widely used embedded CPU ARM Cortex-A53.
arXiv Detail & Related papers (2021-10-17T14:01:26Z)
- Deep Learning on Edge TPUs [0.0]
I review the Edge TPU platform, the tasks that have been accomplished using the Edge TPU, and which steps are necessary to deploy a model to the Edge TPU hardware.
The Edge TPU is not only capable of tackling common computer vision tasks, but also surpasses other hardware accelerators.
Co-embedding the Edge TPU in cameras allows a seamless analysis of primary data.
arXiv Detail & Related papers (2021-08-31T10:23:37Z)
- Exploring Edge TPU for Network Intrusion Detection in IoT [2.8873930745906957]
This paper explores Google's Edge TPU for implementing a practical network intrusion detection system (NIDS) at the edge of IoT, based on a deep learning approach.
Various scaled model sizes of two major deep neural network architectures are used to investigate these three metrics.
The performance of the Edge TPU-based implementation is compared with that of an energy-efficient embedded CPU (ARM Cortex A53).
arXiv Detail & Related papers (2021-03-30T12:43:57Z)
- An Evaluation of Edge TPU Accelerators for Convolutional Neural Networks [2.7584363116322863]
Edge TPUs are accelerators for low-power, edge devices and are widely used in various Google products such as Coral and Pixel devices.
We extensively evaluate three classes of Edge TPUs, covering different computing ecosystems, that are either currently deployed in Google products or are in the product pipeline.
We present our efforts in developing high-accuracy learned machine learning models to estimate the major performance metrics of accelerators.
arXiv Detail & Related papers (2021-02-20T19:25:09Z)
- Towards AIOps in Edge Computing Environments [60.27785717687999]
This paper describes the system design of an AIOps platform which is applicable in heterogeneous, distributed environments.
It is feasible to collect metrics with a high frequency and simultaneously run specific anomaly detection algorithms directly on edge devices.
arXiv Detail & Related papers (2021-02-12T09:33:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.