Related papers: The Role of Advanced Computer Architectures in Accelerating Artificial Intelligence Workloads

The Role of Advanced Computer Architectures in Accelerating Artificial Intelligence Workloads

URL: http://arxiv.org/abs/2511.10010v1
Date: Fri, 14 Nov 2025 01:25:42 GMT
Title: The Role of Advanced Computer Architectures in Accelerating Artificial Intelligence Workloads
Authors: Shahid Amin, Syed Pervez Hussnain Shah,
Abstract summary: The remarkable progress in Artificial Intelligence (AI) is foundation-ally linked to a concurrent revolution in computer architecture.<n>As AI models, particularly Deep Neural Networks (DNNs), have grown in complexity, their massive computational demands have pushed traditional architectures to their limits.<n>This paper provides a structured review of this co-evolution, analyzing the architectural landscape designed to accelerate modern AI workloads.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The remarkable progress in Artificial Intelligence (AI) is foundation-ally linked to a concurrent revolution in computer architecture. As AI models, particularly Deep Neural Networks (DNNs), have grown in complexity, their massive computational demands have pushed traditional architectures to their limits. This paper provides a structured review of this co-evolution, analyzing the architectural landscape designed to accelerate modern AI workloads. We explore the dominant architectural paradigms Graphics Processing Units (GPUs), Appli-cation-Specific Integrated Circuits (ASICs), and Field-Programmable Gate Ar-rays (FPGAs) by breaking down their design philosophies, key features, and per-formance trade-offs. The core principles essential for performance and energy efficiency, including dataflow optimization, advanced memory hierarchies, spar-sity, and quantization, are analyzed. Furthermore, this paper looks ahead to emerging technologies such as Processing-in-Memory (PIM) and neuromorphic computing, which may redefine future computation. By synthesizing architec-tural principles with quantitative performance data from industry-standard benchmarks, this survey presents a comprehensive picture of the AI accelerator landscape. We conclude that AI and computer architecture are in a symbiotic relationship, where hardware-software co-design is no longer an optimization but a necessity for future progress in computing.

Related papers

A Survey on Cloud-Edge-Terminal Collaborative Intelligence in AIoT Networks [49.90474228895655]
Cloud-edge-terminal collaborative intelligence (CETCI) is a fundamental paradigm within the artificial intelligence of things (AIoT) community.<n>CETCI has made significant progress with emerging AIoT applications, moving beyond isolated layer optimization to deployable collaborative intelligence systems.<n>This survey describes foundational architectures, enabling technologies, and scenarios of CETCI paradigms, offering a tutorial-style review for CISAIOT beginners.
arXiv Detail & Related papers (2025-08-26T08:38:01Z)
Sustainable AI Training via Hardware-Software Co-Design on NVIDIA, AMD, and Emerging GPU Architectures [0.0]
Large-scale deep learning and artificial intelligence model training uses a lot of computational power and energy, so it poses serious sustainability issues.<n>This work explores environmentally driven performance optimization methods for advanced GPU architectures from NVIDIA, AMD, and other emerging GPU architectures.<n>Our main focus is on investigating hardware-software compiler co-design techniques meant to significantly increase memory-level and kernel-level operations.
arXiv Detail & Related papers (2025-07-28T03:25:44Z)
Edge-Cloud Collaborative Computing on Distributed Intelligence and Model Optimization: A Survey [58.50944604905037]
Edge-cloud collaborative computing (ECCC) has emerged as a pivotal paradigm for addressing the computational demands of modern intelligent applications.<n>Recent advancements in AI, particularly deep learning and large language models (LLMs), have dramatically enhanced the capabilities of these distributed systems.<n>This survey provides a structured tutorial on fundamental architectures, enabling technologies, and emerging applications.
arXiv Detail & Related papers (2025-05-03T13:55:38Z)
Inference Optimization of Foundation Models on AI Accelerators [68.24450520773688]
Powerful foundation models, including large language models (LLMs), with Transformer architectures have ushered in a new era of Generative AI. As the number of model parameters reaches to hundreds of billions, their deployment incurs prohibitive inference costs and high latency in real-world scenarios. This tutorial offers a comprehensive discussion on complementary inference optimization techniques using AI accelerators.
arXiv Detail & Related papers (2024-07-12T09:24:34Z)
Mechanistic Design and Scaling of Hybrid Architectures [114.3129802943915]
We identify and test new hybrid architectures constructed from a variety of computational primitives. We experimentally validate the resulting architectures via an extensive compute-optimal and a new state-optimal scaling law analysis. We find MAD synthetics to correlate with compute-optimal perplexity, enabling accurate evaluation of new architectures.
arXiv Detail & Related papers (2024-03-26T16:33:12Z)
Using the Abstract Computer Architecture Description Language to Model AI Hardware Accelerators [77.89070422157178]
Manufacturers of AI-integrated products face a critical challenge: selecting an accelerator that aligns with their product's performance requirements. The Abstract Computer Architecture Description Language (ACADL) is a concise formalization of computer architecture block diagrams. In this paper, we demonstrate how to use the ACADL to model AI hardware accelerators, use their ACADL description to map DNNs onto them, and explain the timing simulation semantics to gather performance results.
arXiv Detail & Related papers (2024-01-30T19:27:16Z)
Evaluating Emerging AI/ML Accelerators: IPU, RDU, and NVIDIA/AMD GPUs [14.397623940689487]
Graphcore Intelligence Processing Unit (IPU), Sambanova Reconfigurable Dataflow Unit (RDU), and enhanced GPU platforms are reviewed. This research provides a preliminary evaluation and comparison of these commercial AI/ML accelerators.
arXiv Detail & Related papers (2023-11-08T01:06:25Z)
Bottom-up and top-down approaches for the design of neuromorphic processing systems: Tradeoffs and synergies between natural and artificial intelligence [3.874729481138221]
Moore's law has driven exponential computing power expectations, its nearing end calls for new avenues for improving the overall system performance. One of these avenues is the exploration of alternative brain-inspired computing architectures that aim at achieving the flexibility and computational efficiency of biological neural processing systems. We provide a comprehensive overview of the field, highlighting the different levels of granularity at which this paradigm shift is realized.
arXiv Detail & Related papers (2021-06-02T16:51:45Z)
A Semi-Supervised Assessor of Neural Architectures [157.76189339451565]
We employ an auto-encoder to discover meaningful representations of neural architectures. A graph convolutional neural network is introduced to predict the performance of architectures.
arXiv Detail & Related papers (2020-05-14T09:02:33Z)
Convergence of Artificial Intelligence and High Performance Computing on NSF-supported Cyberinfrastructure [3.4291439418246177]
Artificial Intelligence (AI) applications have powered transformational solutions for big data challenges in industry and technology. As AI continues to evolve into a computing paradigm endowed with statistical and mathematical rigor, it has become apparent that single- GPU solutions for training, validation, and testing are no longer sufficient. This realization has been driving the confluence of AI and high performance computing to reduce time-to-insight.
arXiv Detail & Related papers (2020-03-18T18:00:02Z)
CAAI -- A Cognitive Architecture to Introduce Artificial Intelligence in Cyber-Physical Production Systems [1.5701326192371183]
CAAI is a cognitive architecture for artificial intelligence in cyber-physical production systems. The core of CAAI is a cognitive module that processes declarative goals of the user. Constant observation and evaluation against performance criteria assess the performance of pipelines.
arXiv Detail & Related papers (2020-02-26T16:27:07Z)

This list is automatically generated from the titles and abstracts of the papers in this site.