AI Accelerator Survey and Trends
- URL: http://arxiv.org/abs/2109.08957v1
- Date: Sat, 18 Sep 2021 15:57:47 GMT
- Title: AI Accelerator Survey and Trends
- Authors: Albert Reuther, Peter Michaleas, Michael Jones, Vijay Gadepally,
Siddharth Samsi, Jeremy Kepner
- Abstract summary: This paper updates the survey of AI accelerators and processors from the past two years.
This paper collects and summarizes the current commercial accelerators that have been publicly announced with peak performance and power consumption numbers.
- Score: 4.722078109242797
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Over the past several years, new machine learning accelerators were being
announced and released every month for a variety of applications, including speech
recognition, video object detection, assisted driving, and many data center
applications. This paper updates the survey of AI accelerators and processors
from the past two years. This paper collects and summarizes the current commercial
accelerators that have been publicly announced with peak performance and power
consumption numbers. The performance and power values are plotted on a scatter
graph, and a number of dimensions and observations from the trends on this plot
are again discussed and analyzed. This year, we also compile a list of
benchmarking performance results and compute the computational efficiency with
respect to peak performance.
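The abstract's "computational efficiency with respect to peak performance" can be read as the ratio of measured benchmark throughput to the vendor's announced peak. The sketch below is a minimal illustration under that assumption; the function names and the sample numbers are hypothetical and are not taken from the paper.

```python
def computational_efficiency(measured_ops_per_s: float, peak_ops_per_s: float) -> float:
    """Fraction of announced peak performance achieved on a benchmark.

    Assumed definition: measured throughput divided by peak throughput.
    """
    if peak_ops_per_s <= 0:
        raise ValueError("peak performance must be positive")
    return measured_ops_per_s / peak_ops_per_s


def peak_ops_per_watt(peak_ops_per_s: float, peak_power_w: float) -> float:
    """Peak operations per second per watt, the quantity a
    performance-versus-power scatter plot lets a reader compare."""
    if peak_power_w <= 0:
        raise ValueError("power must be positive")
    return peak_ops_per_s / peak_power_w


if __name__ == "__main__":
    # Hypothetical accelerator: 100 TOPS peak, 50 TOPS measured, 250 W.
    peak, measured, power = 100e12, 50e12, 250.0
    print(f"efficiency: {computational_efficiency(measured, peak):.1%}")
    print(f"peak ops/W: {peak_ops_per_watt(peak, power):.3e}")
```

Both quantities are plain ratios, so a survey can compare them across accelerators only when precision and benchmark conditions match.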
Related papers
- Accelerating AI Performance using Anderson Extrapolation on GPUs [2.114333871769023]
We present a novel approach for accelerating AI performance by leveraging Anderson extrapolation.
By identifying the crossover point where a mixing penalty is incurred, the method focuses on reducing iterations to convergence.
We demonstrate significant improvements in both training and inference, motivated by scalability and efficiency extensions to the realm of high-performance computing.
arXiv Detail & Related papers (2024-10-25T10:45:17Z)
- Dynamic Data Pruning for Automatic Speech Recognition [58.95758272440217]
We introduce Dynamic Data Pruning for ASR (DDP-ASR), which offers fine-grained pruning granularities specifically tailored for speech-related datasets.
Our experiments show that DDP-ASR can save up to 1.6x training time with negligible performance loss.
arXiv Detail & Related papers (2024-06-26T14:17:36Z)
- Insight Gained from Migrating a Machine Learning Model to Intelligence Processing Units [8.782847610934635]
Intelligence Processing Units (IPUs) offer a viable accelerator alternative to GPUs for machine learning (ML) applications.
We investigate the process of migrating a model from GPU to IPU and explore several optimization techniques, including pipelining and gradient accumulation.
We observe significantly improved performance with the Bow IPU when compared to its predecessor, the Colossus IPU.
arXiv Detail & Related papers (2024-04-16T17:02:52Z)
- G-MEMP: Gaze-Enhanced Multimodal Ego-Motion Prediction in Driving [71.9040410238973]
We focus on inferring the ego trajectory of a driver's vehicle using their gaze data.
Next, we develop G-MEMP, a novel multimodal ego-trajectory prediction network that combines GPS and video input with gaze data.
The results show that G-MEMP significantly outperforms state-of-the-art methods in both benchmarks.
arXiv Detail & Related papers (2023-12-13T23:06:30Z)
- Lincoln AI Computing Survey (LAICS) Update [8.790207519640472]
This paper is an update of the survey of AI accelerators and processors from the past four years.
It collects and summarizes the current commercial accelerators that have been publicly announced.
Market segments are highlighted on the scatter plot, and zoomed plots of each segment are also included.
arXiv Detail & Related papers (2023-10-13T14:36:26Z)
- Performance Embeddings: A Similarity-based Approach to Automatic Performance Optimization [71.69092462147292]
Performance embeddings enable knowledge transfer of performance tuning between applications.
We demonstrate this transfer tuning approach on case studies in deep neural networks, dense and sparse linear algebra compositions, and numerical weather prediction stencils.
arXiv Detail & Related papers (2023-03-14T15:51:35Z)
- Benchmarking Node Outlier Detection on Graphs [90.29966986023403]
Graph outlier detection is an emerging but crucial machine learning task with numerous applications.
We present the first comprehensive unsupervised node outlier detection benchmark for graphs called UNOD.
arXiv Detail & Related papers (2022-06-21T01:46:38Z)
- Dynamic GPU Energy Optimization for Machine Learning Training Workloads [9.156075372403421]
GPOEO is an online GPU energy optimization framework for machine learning training workloads.
It employs novel techniques for online measurement, multi-objective prediction modeling, and search optimization.
Compared with the NVIDIA default scheduling strategy, GPOEO delivers a mean energy saving of 16.2% with a modest average execution time increase of 5.1%.
arXiv Detail & Related papers (2022-01-05T16:25:48Z)
- Providing Meaningful Data Summarizations Using Examplar-based Clustering in Industry 4.0 [67.80123919697971]
We show that our GPU implementation provides speedups of up to 72x using single precision and up to 452x using half precision compared to conventional CPU algorithms.
We apply our algorithm to real-world data from injection molding manufacturing processes and discuss how found summaries help with steering this specific process to cut costs and reduce the manufacturing of bad parts.
arXiv Detail & Related papers (2021-05-25T15:55:14Z)
- Survey of Machine Learning Accelerators [15.163544680926474]
This paper updates the survey of AI accelerators and processors from last year's IEEE-HPEC paper.
This paper collects and summarizes the current accelerators that have been publicly announced with performance and power consumption numbers.
arXiv Detail & Related papers (2020-09-01T01:28:59Z)
- PnPNet: End-to-End Perception and Prediction with Tracking in the Loop [82.97006521937101]
We tackle the problem of joint perception and motion forecasting in the context of self-driving vehicles.
We propose PnPNet, an end-to-end model that takes sensor data as input and outputs, at each time step, object tracks and their future trajectories.
arXiv Detail & Related papers (2020-05-29T17:57:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.