Insight Gained from Migrating a Machine Learning Model to Intelligence Processing Units
- URL: http://arxiv.org/abs/2404.10730v1
- Date: Tue, 16 Apr 2024 17:02:52 GMT
- Title: Insight Gained from Migrating a Machine Learning Model to Intelligence Processing Units
- Authors: Hieu Le, Zhenhua He, Mai Le, Dhruva K. Chakravorty, Lisa M. Perez, Akhil Chilumuru, Yan Yao, Jiefu Chen
- Abstract summary: Intelligence Processing Units (IPUs) offer a viable accelerator alternative to GPUs for machine learning (ML) applications.
We investigate the process of migrating a model from GPU to IPU and explore several optimization techniques, including pipelining and gradient accumulation.
We observe significantly improved performance with the Bow IPU when compared to its predecessor, the Colossus IPU.
- Score: 8.782847610934635
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The findings in this paper show that Intelligence Processing Units (IPUs) offer a viable accelerator alternative to GPUs for machine learning (ML) applications within the fields of materials science and battery research. We investigate the process of migrating a model from GPU to IPU and explore several optimization techniques, including pipelining and gradient accumulation, aimed at enhancing the performance of IPU-based models. Furthermore, we have effectively migrated a specialized model to the IPU platform. This model is employed to predict effective conductivity, a parameter crucial to the ion transport processes that govern battery performance over repeated charge and discharge cycles. The model uses a Convolutional Neural Network (CNN) architecture for the effective-conductivity prediction task. The performance of this model on the IPU is found to be comparable to its execution on GPUs. We also analyze the utilization and performance of Graphcore's Bow IPU. Through benchmark tests, we observe significantly improved performance with the Bow IPU when compared to its predecessor, the Colossus IPU.
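The paper itself does not include code, but the two ideas it names (a CNN regressor for effective conductivity and gradient accumulation) are easy to illustrate. The sketch below is a minimal, hypothetical stand-in in plain PyTorch: the architecture, layer sizes, and names (`ConductivityCNN`, `train_with_accumulation`) are assumptions for illustration, not the authors' model.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the paper's CNN: a small regressor that maps a
# single-channel microstructure image to one scalar (effective conductivity).
class ConductivityCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 1)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))


def train_with_accumulation(model, loader, accum_steps=8, lr=1e-3):
    """Gradient accumulation: sum gradients over `accum_steps` micro-batches
    before each optimizer step, emulating a larger effective batch size."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    optimizer.zero_grad()
    for i, (images, targets) in enumerate(loader):
        # Scale the loss so the summed gradients match one large batch.
        loss = loss_fn(model(images), targets) / accum_steps
        loss.backward()
        if (i + 1) % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()


if __name__ == "__main__":
    # Synthetic data only, to keep the sketch self-contained.
    images = torch.rand(64, 1, 64, 64)
    targets = torch.rand(64, 1)
    loader = torch.utils.data.DataLoader(
        torch.utils.data.TensorDataset(images, targets), batch_size=8)
    train_with_accumulation(ConductivityCNN(), loader)
```

On Graphcore hardware, gradient accumulation and pipelining are normally expressed as compile-time options of the PopTorch/Poplar toolchain (an accumulation count and a partitioning of layers across IPUs) rather than as a hand-written loop; the loop above only illustrates the numerical idea of accumulating gradients over several micro-batches before each weight update.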
Related papers
- EPS-MoE: Expert Pipeline Scheduler for Cost-Efficient MoE Inference [49.94169109038806]
This paper introduces EPS-MoE, a novel expert pipeline scheduler for MoE.
Our results demonstrate an average 21% improvement in prefill throughput over existing parallel inference methods.
arXiv Detail & Related papers (2024-10-16T05:17:49Z)
- Performance Tuning for GPU-Embedded Systems: Machine-Learning-based and Analytical Model-driven Tuning Methodologies [0.0]
The study introduces an analytical model-driven tuning methodology and a Machine Learning (ML)-based tuning methodology.
We evaluate the performance of the two tuning methodologies for different parallel prefix implementations of the BPLG library in an NVIDIA Jetson system.
arXiv Detail & Related papers (2023-10-24T22:09:03Z)
- Heterogeneous Integration of In-Memory Analog Computing Architectures with Tensor Processing Units [0.0]
This paper introduces a novel, heterogeneous, mixed-signal, and mixed-precision architecture that integrates an IMAC unit with an edge TPU to enhance mobile CNN performance.
We propose a unified learning algorithm that incorporates mixed-precision training techniques to mitigate potential accuracy drops when deploying models on the TPU-IMAC architecture.
arXiv Detail & Related papers (2023-04-18T19:44:56Z)
- Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures [68.91874045918112]
The paper proposes adapter-ALBERT, an efficient model optimization for maximal data reuse across different tasks.
We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z)
- SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
Conventional approaches, however, do not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production-grade systems.
In this paper a unified deployment pipeline and freedom-to-operate approach is presented that supports all requirements while using basic cross-platform tensor framework and script language engines.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
- The Preliminary Results on Analysis of TAIGA-IACT Images Using Convolutional Neural Networks [68.8204255655161]
The aim of the work is to study the applicability of machine learning to the tasks set for TAIGA-IACT.
The method of Convolutional Neural Networks (CNN) was applied to process and analyze Monte-Carlo events simulated with CORSIKA.
arXiv Detail & Related papers (2021-12-19T15:17:20Z)
- Multitask Adaptation by Retrospective Exploration with Learned World Models [77.34726150561087]
We propose a meta-learned addressing model called RAMa that provides training samples for the MBRL agent taken from task-agnostic storage.
The model is trained to maximize the expected agent's performance by selecting promising trajectories solving prior tasks from the storage.
arXiv Detail & Related papers (2021-10-25T20:02:57Z)
- Performance and Power Modeling and Prediction Using MuMMI and Ten Machine Learning Methods [0.13764085113103217]
We use the modeling and prediction tool MuMMI together with ten machine learning methods to model and predict performance and power.
Experiment results show that the prediction error rates in performance and power using MuMMI are less than 10% for most cases.
arXiv Detail & Related papers (2020-11-12T21:24:11Z)
- Learning Discrete Energy-based Models via Auxiliary-variable Local Exploration [130.89746032163106]
We propose ALOE, a new algorithm for learning conditional and unconditional EBMs for discrete structured data.
We show that the energy function and sampler can be trained efficiently via a new variational form of power iteration.
We present an energy-model-guided fuzzer for software testing that achieves performance comparable to well-engineered fuzzing engines like libFuzzer.
arXiv Detail & Related papers (2020-11-10T19:31:29Z)
- Using Machine Learning to Emulate Agent-Based Simulations [0.0]
We evaluate the performance of multiple machine-learning methods as statistical emulators for use in the analysis of agent-based models (ABMs).
We propose that agent-based modelling would benefit from using machine-learning methods for emulation, as this can facilitate more robust sensitivity analyses for the models.
arXiv Detail & Related papers (2020-05-05T11:48:36Z)
- Graphcore C2 Card performance for image-based deep learning application: A Report [0.3149883354098941]
Graphcore has introduced an IPU Processor for accelerating machine learning applications.
We report on a benchmark in which we have evaluated the performance of IPU processors on deep neural networks for inference.
arXiv Detail & Related papers (2020-02-26T17:58:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.