Related papers: Hardware.jl - An MLIR-based Julia HLS Flow (Work in Progress)

Related papers

A High-level Synthesis Toolchain for the Julia Language [1.7995266833057173]
We propose a new MLIR-based compiler toolchain that unifies the development process by automatically compiling kernels written in the Julia programming language into SystemVerilog.<n>Our toolchain supports both dynamic and static scheduling, directly integrates with the AXI4-Stream protocol, and generates vendor-agnostic RTL.<n>This prototype toolchain is able to synthesize a set of signal processing/mathematical benchmarks that can operate at 100MHz on real FPGA devices.
arXiv Detail & Related papers (2025-12-17T18:32:06Z)
Library Liberation: Competitive Performance Matmul Through Compiler-composed Nanokernels [37.00431889602245]
This paper introduces a compilation scheme that automatically generates scalable, high-performance micro Kernels.<n>We implement this technique in an MLIR-based compiler supporting both vector and tile based CPU instructions.<n>Experiments show that the generated nano Kernels are of production-quality, and competitive with state-of-the-art micro Kernel libraries.
arXiv Detail & Related papers (2025-11-14T14:32:28Z)
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning [57.09163579304332]
We introduce PaperCoder, a framework that transforms machine learning papers into functional code repositories. PaperCoder operates in three stages: planning, designs the system architecture with diagrams, identifies file dependencies, and generates configuration files. We then evaluate PaperCoder on generating code implementations from machine learning papers based on both model-based and human evaluations.
arXiv Detail & Related papers (2025-04-24T01:57:01Z)
RAG-Based Fuzzing of Cross-Architecture Compilers [0.8302146576157498]
OneAPI is an open standard that supports cross-architecture software development with minimal effort from developers. OneAPI brings DPC++ and C++ compilers which need to be thoroughly tested to verify their correctness, reliability, and security. This paper proposes a large-language model (LLM)-based compiler fuzzing tool that integrates the concept of retrieval-augmented generation (RAG)
arXiv Detail & Related papers (2025-04-11T20:46:52Z)
LLM-Aided Compilation for Tensor Accelerators [6.709490736813537]
We discuss how large language models (LLMs) could be leveraged to build a compiler for hardware accelerators. Specifically, we demonstrate the ability of GPT-4 to achieve high pass rates in translating code to the Gemmini accelerator. We also propose a 2-phase workflow for utilizing LLMs to generate hardware-optimized code.
arXiv Detail & Related papers (2024-08-06T19:10:25Z)
OpenHands: An Open Platform for AI Software Developers as Generalist Agents [109.8507367518992]
We introduce OpenHands, a platform for the development of AI agents that interact with the world in similar ways to a human developer. We describe how the platform allows for the implementation of new agents, safe interaction with sandboxed environments for code execution, and incorporation of evaluation benchmarks.
arXiv Detail & Related papers (2024-07-23T17:50:43Z)
The GitHub Development Workflow Automation Ecosystems [47.818229204130596]
Large-scale software development has become a highly collaborative endeavour. This chapter explores the ecosystems of development bots and GitHub Actions. It provides an extensive survey of the state-of-the-art in this domain.
arXiv Detail & Related papers (2023-05-08T15:24:23Z)
Harnessing Deep Learning and HPC Kernels via High-Level Loop and Tensor Abstractions on CPU Architectures [67.47328776279204]
This work introduces a framework to develop efficient, portable Deep Learning and High Performance Computing kernels. We decompose the kernel development in two steps: 1) Expressing the computational core using Processing Primitives (TPPs) and 2) Expressing the logical loops around TPPs in a high-level, declarative fashion. We demonstrate the efficacy of our approach using standalone kernels and end-to-end workloads that outperform state-of-the-art implementations on diverse CPU platforms.
arXiv Detail & Related papers (2023-04-25T05:04:44Z)
CFU Playground: Full-Stack Open-Source Framework for Tiny Machine Learning (tinyML) Acceleration on FPGAs [2.2177069086277195]
CFU Playground is a full-stack open-source framework that enables rapid and iterative design of machine learning (ML) accelerators for embedded ML systems. Our tool provides a completely open-source end-to-end flow for hardware-software co-design on FPGAs and future systems research. Our rapid, deploy-profile-optimization feedback loop lets ML hardware and software developers achieve significant returns out of a relatively small investment.
arXiv Detail & Related papers (2022-01-05T23:15:58Z)
Bring Your Own Codegen to Deep Learning Compiler [8.87545486816377]
This paper proposes an open source framework that enables users to only concentrate on the development of their proprietary code generation tools. Our framework provides users flexible and easy-to-use interfaces to partition their models into segments that can be executed on "the best" processors.
arXiv Detail & Related papers (2021-05-03T17:22:25Z)
PolyDL: Polyhedral Optimizations for Creation of High Performance DL primitives [55.79741270235602]
We present compiler algorithms to automatically generate high performance implementations of Deep Learning primitives. We develop novel data reuse analysis algorithms using the polyhedral model. We also show that such a hybrid compiler plus a minimal library-use approach results in state-of-the-art performance.
arXiv Detail & Related papers (2020-06-02T06:44:09Z)
Julia Language in Machine Learning: Algorithms, Applications, and Open Issues [5.666843255747851]
Machine learning is driving development across many fields in science and engineering. Currently, the programming languages most commonly used to develop machine learning algorithms include Python, and C/C ++. This paper summarizes the related research work and developments in the application of the Julia language in machine learning.
arXiv Detail & Related papers (2020-03-23T09:31:02Z)
PolyScientist: Automatic Loop Transformations Combined with Microkernels for Optimization of Deep Learning Primitives [55.79741270235602]
We develop a hybrid solution to the development of deep learning kernels. We use the advanced polyhedral technology to automatically tune the outer loops for performance.
arXiv Detail & Related papers (2020-02-06T08:02:34Z)

This list is automatically generated from the titles and abstracts of the papers in this site.