Related papers: Compiler-Driven Simulation of Reconfigurable Hardware Accelerators

URL: http://arxiv.org/abs/2202.00739v1
Date: Tue, 1 Feb 2022 20:31:04 GMT
Title: Compiler-Driven Simulation of Reconfigurable Hardware Accelerators
Authors: Zhijing Li, Yuwei Ye, Stephen Neuendorffer, Adrian Sampson
Abstract summary: Existing simulators tend toward two extremes. At one end are low-level, general approaches, such as RTL simulation, that can model any hardware but require substantial effort and long execution times. This work proposes a compiler-driven simulation workflow that can model configurable hardware accelerators.
Score: 0.8807375890824978
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: As customized accelerator design has become increasingly popular to keep up with the demand for high performance computing, it poses challenges for modern simulator design to adapt to such a large variety of accelerators. Existing simulators tend toward two extremes: low-level and general approaches, such as RTL simulation, that can model any hardware but require substantial effort and long execution times; and higher-level application-specific models that can be much faster and easier to use but require one-off engineering effort. This work proposes a compiler-driven simulation workflow that can model configurable hardware accelerators. The key idea is to separate structure representation from simulation by developing an intermediate language that can flexibly represent a wide variety of hardware constructs. We design the Event Queue (EQueue) dialect of MLIR, a dialect that can model arbitrary hardware accelerators with explicit data movement and distributed event-based control; we also implement a generic simulation engine to model EQueue programs with hybrid MLIR dialects representing different abstraction levels. We demonstrate two case studies of EQueue-implemented accelerators: a systolic array for convolution and SIMD processors in a modern FPGA. In the former we show that EQueue simulation is as accurate as a state-of-the-art simulator, while offering higher extensibility and lower iteration cost via compiler passes. In the latter we demonstrate that our simulation flow can guide designers to efficiently improve their designs using visualizable simulation outputs.
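
The separation the abstract describes, a structural description on one side and a generic event-driven engine on the other, can be illustrated with a small sketch. This is not the EQueue dialect or any MLIR API; the names `Op`, `simulate`, and the example units `dma` and `pe` are hypothetical, and the model assumes each unit runs one operation at a time in launch order.

```python
import heapq
from dataclasses import dataclass, field

@dataclass
class Op:
    """Structural description of one operation (hypothetical, not EQueue syntax)."""
    name: str
    unit: str      # hardware unit that executes the op
    latency: int   # cycles the op occupies its unit
    deps: list = field(default_factory=list)  # ops that must finish first

def simulate(ops):
    """Generic event-driven engine: returns {op name: (start, finish)} in cycles."""
    by_name = {op.name: op for op in ops}
    remaining = {op.name: len(op.deps) for op in ops}
    users = {op.name: [] for op in ops}
    for op in ops:
        for dep in op.deps:
            users[dep].append(op.name)

    unit_free = {}  # unit -> cycle at which it becomes idle
    schedule = {}
    events = []     # min-heap of (finish_cycle, sequence, op_name)
    seq = 0

    def launch(name, now):
        nonlocal seq
        op = by_name[name]
        start = max(now, unit_free.get(op.unit, 0))  # wait for a busy unit
        finish = start + op.latency
        unit_free[op.unit] = finish
        schedule[name] = (start, finish)
        heapq.heappush(events, (finish, seq, name))
        seq += 1

    # Ops without dependencies are launched at cycle 0.
    for op in ops:
        if not op.deps:
            launch(op.name, 0)

    # Each completion event may release dependent ops.
    while events:
        now, _, name = heapq.heappop(events)
        for user in users[name]:
            remaining[user] -= 1
            if remaining[user] == 0:
                launch(user, now)
    return schedule

# Structure only: two loads share a DMA engine, a MAC runs on a processing element.
program = [
    Op("load_a", "dma", 4),
    Op("load_b", "dma", 4),
    Op("mac", "pe", 2, deps=["load_a", "load_b"]),
    Op("store", "dma", 3, deps=["mac"]),
]
schedule = simulate(program)
for name, (start, finish) in schedule.items():
    print(f"{name}: cycles {start}..{finish}")
```

Changing the `program` list (for example, adding a second DMA unit) changes the modeled hardware without touching `simulate`, which is the kind of low iteration cost the abstract attributes to keeping structure and simulation apart.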

Related papers

Fast, Modular, and Differentiable Framework for Machine Learning-Enhanced Molecular Simulations [12.00988094580341]
We present an end-to-end differentiable molecular simulation framework (DIMOS) for molecular dynamics and Monte Carlo simulations. Thanks to its modularity, both classical and machine-learning-based approaches can be easily combined into a hybrid description of the system (ML/MM). The superior performance and the high versatility are probed in different benchmarks and applications, with speed-up factors of up to $170\times$.
arXiv Detail & Related papers (2025-03-26T13:39:10Z)
Simulation Streams: A Programming Paradigm for Controlling Large Language Models and Building Complex Systems with Generative AI [3.3126968968429407]
Simulation Streams is a programming paradigm designed to efficiently control and leverage Large Language Models (LLMs). Our primary goal is to create a framework that harnesses the agentic abilities of LLMs while addressing their limitations in maintaining consistency.
arXiv Detail & Related papers (2025-01-30T16:38:03Z)
Tao: Re-Thinking DL-based Microarchitecture Simulation [8.501776613988484]
Existing microarchitecture simulators excel and fall short in different aspects. Deep learning (DL)-based simulations are remarkably fast and have acceptable accuracy but fail to provide adequate low-level microarchitectural performance metrics. This paper introduces TAO, which redesigns DL-based simulation with three primary contributions.
arXiv Detail & Related papers (2024-04-16T21:45:10Z)
CityFlowER: An Efficient and Realistic Traffic Simulator with Embedded Machine Learning Models [25.567208505574072]
CityFlowER is an advanced simulator for efficient and realistic city-wide traffic simulation. It pre-embeds Machine Learning models within the simulator, eliminating the need for external API interactions. It offers unparalleled flexibility and efficiency, particularly in large-scale simulations.
arXiv Detail & Related papers (2024-02-09T01:19:41Z)
Design-Space Exploration of SNN Models using Application-Specific Multi-Core Architectures [0.3599866690398789]
"RAVSim" is a cutting-edge SNN simulator, developed using and it is publicly available on their website as an official module. RAVSim is a runtime virtual simulation environment that enables the user to interact with the model, observe its behavior of output concentration, and modify the set of parametric values at any time while the simulation is in execution.
arXiv Detail & Related papers (2024-02-07T20:41:00Z)
Using the Abstract Computer Architecture Description Language to Model AI Hardware Accelerators [77.89070422157178]
Manufacturers of AI-integrated products face a critical challenge: selecting an accelerator that aligns with their product's performance requirements. The Abstract Computer Architecture Description Language (ACADL) is a concise formalization of computer architecture block diagrams. In this paper, we demonstrate how to use the ACADL to model AI hardware accelerators, use their ACADL description to map DNNs onto them, and explain the timing simulation semantics to gather performance results.
arXiv Detail & Related papers (2024-01-30T19:27:16Z)
DEAP: Design Space Exploration for DNN Accelerator Parallelism [0.0]
Large Language Models (LLMs) are becoming increasingly complex and powerful to train and serve. This paper showcases how hardware and software co-design can come together and allow us to create customized hardware systems.
arXiv Detail & Related papers (2023-12-24T02:43:01Z)
Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research [76.93956925360638]
Waymax is a new data-driven simulator for autonomous driving in multi-agent scenes. It runs entirely on hardware accelerators such as TPUs/GPUs and supports in-graph simulation for training. We benchmark a suite of popular imitation and reinforcement learning algorithms with ablation studies on different design decisions.
arXiv Detail & Related papers (2023-10-12T20:49:15Z)
In Situ Framework for Coupling Simulation and Machine Learning with Application to CFD [51.04126395480625]
Recent years have seen many successful applications of machine learning (ML) to facilitate fluid dynamic computations. As simulations grow, generating new training datasets for traditional offline learning creates I/O and storage bottlenecks. This work offers a solution by simplifying this coupling and enabling in situ training and inference on heterogeneous clusters.
arXiv Detail & Related papers (2023-06-22T14:07:54Z)
Data-Driven Offline Optimization For Architecting Hardware Accelerators [89.68870139177785]
We develop a data-driven offline optimization method for designing hardware accelerators, dubbed PRIME. PRIME improves performance upon state-of-the-art simulation-driven methods by about 1.54x and 1.20x, while considerably reducing the required total simulation time by 93% and 99%, respectively. In addition, PRIME also architects effective accelerators for unseen applications in a zero-shot setting, outperforming simulation-based methods by 1.26x.
arXiv Detail & Related papers (2021-10-20T17:06:09Z)
SimNet: Computer Architecture Simulation using Machine Learning [3.7019798164954336]
This work describes a concerted effort, where machine learning (ML) is used to accelerate discrete-event simulation. A GPU-accelerated parallel simulator is implemented based on the proposed instruction latency predictor. Its simulation accuracy and throughput are validated and evaluated against a state-of-the-art simulator.
arXiv Detail & Related papers (2021-05-12T17:31:52Z)
High-performance symbolic-numerics via multiple dispatch [52.77024349608834]
Symbolics.jl is an extendable symbolic system which uses dynamic multiple dispatch to change behavior depending on the domain needs. We show that by formalizing a generic API on actions independent of implementation, we can retroactively add optimized data structures to our system. We demonstrate the ability to swap between classical term-rewriting simplifiers and e-graph-based term-rewriting simplifiers.
arXiv Detail & Related papers (2021-05-09T14:22:43Z)
Learning Discrete Energy-based Models via Auxiliary-variable Local Exploration [130.89746032163106]
We propose ALOE, a new algorithm for learning conditional and unconditional EBMs for discrete structured data. We show that the energy function and sampler can be trained efficiently via a new variational form of power iteration. We present an energy model guided fuzzer for software testing that achieves comparable performance to well engineered fuzzing engines like libfuzzer.
arXiv Detail & Related papers (2020-11-10T19:31:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.