SAIH: A Scalable Evaluation Methodology for Understanding AI Performance
Trend on HPC Systems
- URL: http://arxiv.org/abs/2212.03410v1
- Date: Wed, 7 Dec 2022 02:42:29 GMT
- Title: SAIH: A Scalable Evaluation Methodology for Understanding AI Performance
Trend on HPC Systems
- Authors: Jiangsu Du, Dongsheng Li, Yingpeng Wen, Jiazhi Jiang, Dan Huang,
Xiangke Liao, and Yutong Lu
- Abstract summary: We propose a scalable evaluation methodology (SAIH) for analyzing the AI performance trend of HPC systems.
As the data and model constantly scale, we can investigate the trend and range of AI performance on HPC systems.
- Score: 18.699431277588637
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Novel artificial intelligence (AI) technology has expedited various
scientific research, e.g., cosmology, physics and bioinformatics, inevitably
becoming a significant category of workload on high performance computing (HPC)
systems. Existing AI benchmarks tend to customize well-recognized AI
applications, so as to evaluate the AI performance of HPC systems under
a predefined problem size, in terms of datasets and AI models. Because they lack
scalability in problem size, static AI benchmarks may be inadequate for
understanding the performance trend of evolving AI applications on HPC
systems, in particular scientific AI applications on large-scale systems.
In this paper, we propose a scalable evaluation methodology (SAIH) for
analyzing the AI performance trend of HPC systems by scaling the problem
sizes of customized AI applications. To enable scalability, SAIH builds a set
of novel mechanisms for augmenting problem sizes. As the data and model
constantly scale, we can investigate the trend and range of AI performance on
HPC systems, and further diagnose system bottlenecks. To verify our
methodology, we augment a cosmological AI application to evaluate a real HPC
system equipped with GPUs as a case study of SAIH.
Related papers
- AI-Aided Kalman Filters [65.35350122917914]
The Kalman filter (KF) and its variants are among the most celebrated algorithms in signal processing.
Recent developments illustrate the possibility of fusing deep neural networks (DNNs) with classic Kalman-type filtering.
This article provides a tutorial-style overview of design approaches for incorporating AI in aiding KF-type algorithms.
arXiv Detail & Related papers (2024-10-16T06:47:53Z)
- Over the Edge of Chaos? Excess Complexity as a Roadblock to Artificial General Intelligence [4.901955678857442]
We posited the existence of critical points, akin to phase transitions in complex systems, where AI performance might plateau or regress into instability upon exceeding a critical complexity threshold.
Our simulations demonstrated how increasing the complexity of the AI system could exceed an upper criticality threshold, leading to unpredictable performance behaviours.
arXiv Detail & Related papers (2024-07-04T05:46:39Z)
- Revolutionizing System Reliability: The Role of AI in Predictive Maintenance Strategies [0.0]
The study explores how AI, especially machine learning and neural networks, is being used to enhance predictive maintenance strategies.
The article provides insights into the effectiveness and challenges of implementing AI-driven predictive maintenance.
arXiv Detail & Related papers (2024-04-20T19:31:05Z)
- Neuromorphic hardware for sustainable AI data centers [3.011658333753524]
Neuromorphic hardware takes inspiration from how the brain processes information.
Despite its potential, neuromorphic hardware has not found its way into commercial AI data centers.
This article aims to increase awareness of the challenges of integrating neuromorphic hardware into data centers.
arXiv Detail & Related papers (2024-02-04T15:08:50Z)
- Machine Learning Insides OptVerse AI Solver: Design Principles and Applications [74.67495900436728]
We present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI solver.
We showcase our methods for generating complex SAT and MILP instances utilizing generative models that mirror multifaceted structures of real-world problems.
We detail the incorporation of state-of-the-art parameter tuning algorithms which markedly elevate solver performance.
arXiv Detail & Related papers (2024-01-11T15:02:15Z)
- Brain-Inspired Computational Intelligence via Predictive Coding [89.6335791546526]
Predictive coding (PC) has shown promising performance in machine intelligence tasks.
PC can model information processing in different brain areas and can be used in cognitive control and robotics.
arXiv Detail & Related papers (2023-08-15T16:37:16Z)
- The Future of Fundamental Science Led by Generative Closed-Loop Artificial Intelligence [67.70415658080121]
Recent advances in machine learning and AI are disrupting technological innovation, product development, and society as a whole.
AI has contributed less to fundamental science in part because large data sets of high-quality data for scientific practice and model discovery are more difficult to access.
Here we explore and investigate aspects of an AI-driven, automated, closed-loop approach to scientific discovery.
arXiv Detail & Related papers (2023-07-09T21:16:56Z)
- AI for IT Operations (AIOps) on Cloud Platforms: Reviews, Opportunities and Challenges [60.56413461109281]
Artificial Intelligence for IT operations (AIOps) aims to combine the power of AI with the big data generated by IT Operations processes.
We discuss in depth the key types of data emitted by IT Operations activities, the scale and challenges in analyzing them, and where they can be helpful.
We categorize the key AIOps tasks as incident detection, failure prediction, root cause analysis, and automated actions.
arXiv Detail & Related papers (2023-04-10T15:38:12Z)
- Enabling Automated Machine Learning for Model-Driven AI Engineering [60.09869520679979]
We propose a novel approach to enable Model-Driven Software Engineering and Model-Driven AI Engineering.
In particular, we support Automated ML, thus assisting software engineers without deep AI knowledge in developing AI-intensive systems.
arXiv Detail & Related papers (2022-03-06T10:12:56Z)
- Integrating Deep Learning in Domain Sciences at Exascale [2.241545093375334]
We evaluate existing packages for their ability to run deep learning models and applications on large-scale HPC systems efficiently.
We propose new asynchronous parallelization and optimization techniques for current large-scale heterogeneous systems.
We present illustrations and potential solutions for enhancing traditional compute- and data-intensive applications with AI.
arXiv Detail & Related papers (2020-11-23T03:09:58Z)
- AIPerf: Automated machine learning as an AI-HPC benchmark [17.57686674304368]
We propose an end-to-end benchmark suite utilizing automated machine learning (AutoML).
We implement the algorithms in a highly parallel and flexible way to ensure the efficiency and optimization potential on diverse systems.
With a flexible workload and a single metric, our benchmark can scale and rank AI-HPC easily.
arXiv Detail & Related papers (2020-08-17T08:06:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.