A Comprehensive Performance Study of Large Language Models on Novel AI Accelerators
- URL: http://arxiv.org/abs/2310.04607v1
- Date: Fri, 6 Oct 2023 21:55:57 GMT
- Title: A Comprehensive Performance Study of Large Language Models on Novel AI Accelerators
- Authors: Murali Emani, Sam Foreman, Varuni Sastry, Zhen Xie, Siddhisanket Raskar, William Arnold, Rajeev Thakur, Venkatram Vishwanath, Michael E. Papka
- Abstract summary: Large language models (LLMs) are being considered a promising approach to address some of the challenging problems in scientific applications.
Specialized AI accelerator hardware systems have recently become available for accelerating AI applications.
- Score: 2.88634411143577
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Artificial intelligence (AI) methods have become critical in scientific
applications to help accelerate scientific discovery. Large language models
(LLMs) are being considered as a promising approach to address some of the
challenging problems because of their superior generalization capabilities
across domains. The effectiveness of the models and the accuracy of the
applications are contingent upon their efficient execution on the underlying
hardware infrastructure. Specialized AI accelerator hardware systems have
recently become available for accelerating AI applications. However, the
comparative performance of these AI accelerators on large language models has
not been previously studied. In this paper, we systematically study LLMs on
multiple AI accelerators and GPUs and evaluate their performance
characteristics for these models. We evaluate these systems with (i) a
micro-benchmark using a core transformer block, (ii) a GPT-2 model, and (iii)
an LLM-driven science use case, GenSLM. We present our findings and analyses of
the models' performance to better understand the intrinsic capabilities of AI
accelerators. Furthermore, our analysis takes into account key factors such as
sequence lengths, scaling behavior, sparsity, and sensitivity to gradient
accumulation steps.
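To make the micro-benchmark methodology concrete, below is a minimal PyTorch sketch of how such a transformer-block study might be structured: a latency sweep over sequence lengths, plus a training loop that varies gradient accumulation steps. The dimensions, iteration counts, and dummy loss are illustrative assumptions, not the paper's actual harness or configurations.

```python
import time

import torch
import torch.nn as nn

# Illustrative dimensions only -- the paper's exact model configurations
# and benchmark harness are not reproduced here.
D_MODEL, N_HEADS, BATCH = 1024, 16, 8
device = "cuda" if torch.cuda.is_available() else "cpu"

block = nn.TransformerEncoderLayer(
    d_model=D_MODEL, nhead=N_HEADS, batch_first=True
).to(device)


def timed_forward(seq_len: int, iters: int = 10) -> float:
    """Mean forward latency of the block (seconds) at one sequence length."""
    block.eval()
    x = torch.randn(BATCH, seq_len, D_MODEL, device=device)
    with torch.inference_mode():
        for _ in range(3):  # warm-up to exclude one-time setup costs
            block(x)
        if device == "cuda":
            torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            block(x)
        if device == "cuda":
            torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters


# Factor (i): sweep sequence length and record per-block latency.
for seq_len in (128, 256, 512, 1024, 2048):
    print(f"seq_len={seq_len:5d}  mean latency {timed_forward(seq_len) * 1e3:.2f} ms")

# Factor (ii): sensitivity to gradient accumulation -- run K micro-batches
# per optimizer step, scaling each loss by 1/K so the averaged gradient
# matches a single large batch.
block.train()
opt = torch.optim.AdamW(block.parameters())
K = 4  # accumulation steps; the kind of parameter such a study would sweep
for _ in range(2):  # a couple of training steps for illustration
    opt.zero_grad()
    for _ in range(K):
        x = torch.randn(BATCH, 512, D_MODEL, device=device)
        loss = block(x).pow(2).mean() / K  # dummy loss for illustration
        loss.backward()
    opt.step()
```

On dedicated AI accelerators, a sweep like this would be retargeted through each vendor's PyTorch backend or compiler toolchain rather than the CUDA path shown here.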
Related papers
- ML Research Benchmark [0.0]
We present the ML Research Benchmark (MLRB), comprising 7 competition-level tasks derived from recent machine learning conference tracks.
This paper introduces a novel benchmark and evaluates it using agent scaffolds powered by frontier models, including Claude-3 and GPT-4o.
The results indicate that the Claude-3.5 Sonnet agent performs best across our benchmark, excelling in planning and developing machine learning models.
arXiv Detail & Related papers (2024-10-29T21:38:42Z)
- Inference Optimization of Foundation Models on AI Accelerators [68.24450520773688]
Powerful foundation models, including large language models (LLMs), with Transformer architectures have ushered in a new era of Generative AI.
As the number of model parameters reaches hundreds of billions, their deployment incurs prohibitive inference costs and high latency in real-world scenarios.
This tutorial offers a comprehensive discussion on complementary inference optimization techniques using AI accelerators.
arXiv Detail & Related papers (2024-07-12T09:24:34Z)
- Over the Edge of Chaos? Excess Complexity as a Roadblock to Artificial General Intelligence [4.901955678857442]
We posited the existence of critical points, akin to phase transitions in complex systems, where AI performance might plateau or regress into instability upon exceeding a critical complexity threshold.
Our simulations demonstrated how increasing the complexity of the AI system could exceed an upper criticality threshold, leading to unpredictable performance behaviours.
arXiv Detail & Related papers (2024-07-04T05:46:39Z)
- Automated Text Scoring in the Age of Generative AI for the GPU-poor [49.1574468325115]
We analyze the performance and efficiency of open-source, small-scale generative language models for automated text scoring.
Results show that GLMs can be fine-tuned to achieve adequate, though not state-of-the-art, performance.
arXiv Detail & Related papers (2024-07-02T01:17:01Z)
- Using the Abstract Computer Architecture Description Language to Model AI Hardware Accelerators [77.89070422157178]
Manufacturers of AI-integrated products face a critical challenge: selecting an accelerator that aligns with their product's performance requirements.
The Abstract Computer Architecture Description Language (ACADL) is a concise formalization of computer architecture block diagrams.
In this paper, we demonstrate how to use the ACADL to model AI hardware accelerators, use their ACADL description to map DNNs onto them, and explain the timing simulation semantics to gather performance results.
arXiv Detail & Related papers (2024-01-30T19:27:16Z)
- Machine Learning Insides OptVerse AI Solver: Design Principles and Applications [74.67495900436728]
We present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI solver.
We showcase our methods for generating complex SAT and MILP instances utilizing generative models that mirror the multifaceted structures of real-world problems.
We detail the incorporation of state-of-the-art parameter tuning algorithms which markedly elevate solver performance.
arXiv Detail & Related papers (2024-01-11T15:02:15Z)
- Evaluating Emerging AI/ML Accelerators: IPU, RDU, and NVIDIA/AMD GPUs [14.397623940689487]
Graphcore Intelligence Processing Unit (IPU), SambaNova Reconfigurable Dataflow Unit (RDU), and enhanced GPU platforms are reviewed.
This research provides a preliminary evaluation and comparison of these commercial AI/ML accelerators.
arXiv Detail & Related papers (2023-11-08T01:06:25Z)
- AutoML-GPT: Automatic Machine Learning with GPT [74.30699827690596]
We propose developing task-oriented prompts and using large language models (LLMs) to automate the training pipeline.
We present AutoML-GPT, which employs GPT as the bridge to diverse AI models and dynamically trains models with optimized hyperparameters.
This approach achieves remarkable results in computer vision, natural language processing, and other challenging areas.
arXiv Detail & Related papers (2023-05-04T02:09:43Z)
- Deep learning applied to computational mechanics: A comprehensive review, state of the art, and the classics [77.34726150561087]
Recent developments in artificial neural networks, particularly deep learning (DL), are reviewed in detail.
Both hybrid and pure machine learning (ML) methods are discussed.
History and limitations of AI are recounted and discussed, with particular attention to pointing out misstatements or misconceptions of the classics.
arXiv Detail & Related papers (2022-12-18T02:03:00Z)
- Trends in Energy Estimates for Computing in AI/Machine Learning Accelerators, Supercomputers, and Compute-Intensive Applications [3.2634122554914]
We examine the computational energy requirements of different systems driven by the geometrical scaling law.
We show that energy-efficiency gains from geometrical scaling are slowing down.
At the application level, general-purpose AI-ML methods can be computationally energy intensive.
arXiv Detail & Related papers (2022-10-12T16:14:33Z)