ProgSG: Cross-Modality Representation Learning for Programs in
Electronic Design Automation
- URL: http://arxiv.org/abs/2305.10838v2
- Date: Fri, 2 Jun 2023 22:27:27 GMT
- Title: ProgSG: Cross-Modality Representation Learning for Programs in
Electronic Design Automation
- Authors: Yunsheng Bai, Atefeh Sohrabizadeh, Zongyue Qin, Ziniu Hu, Yizhou Sun,
Jason Cong
- Abstract summary: High-level synthesis (HLS) allows a developer to compile a high-level description, in the form of software code in C or C++, into a hardware design.
HLS tools still require microarchitecture decisions, expressed in terms of pragmas.
We propose ProgSG, which allows the source-code sequence modality and the graph modality to interact with each other in a deep and fine-grained way.
- Score: 38.023395256208055
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent years have witnessed the growing popularity of domain-specific
accelerators (DSAs), such as Google's TPUs, for accelerating various
applications such as deep learning, search, autonomous driving, etc. To
facilitate DSA designs, high-level synthesis (HLS) is used, which allows a
developer to compile a high-level description, in the form of software code in C
or C++, into a design in a low-level hardware description language (such as VHDL
or Verilog) that is eventually synthesized into a DSA on an ASIC
(application-specific integrated circuit) or FPGA (field-programmable gate
array). However, existing HLS tools still require microarchitecture decisions,
expressed in terms of pragmas (such as directives for parallelization and
pipelining). To enable more people to design DSAs, it is desirable to automate
such decisions with the help of deep learning for predicting the quality of HLS
designs. This requires a deeper understanding of the program, which is a
combination of the original code and pragmas. Naturally, these programs can be
considered as sequence data, for which large language models (LLMs) can help. In
addition, these programs can be compiled and converted into a control data flow
graph (CDFG), and the compiler also provides fine-grained alignment between the
code tokens and the CDFG nodes. However, existing works either fail to leverage
both modalities or combine the two in shallow or coarse ways. We propose
ProgSG, which allows the source-code sequence modality and the graph modality
to interact with each other in a deep and fine-grained way. To alleviate the scarcity of
labeled designs, we propose a pre-training method based on a suite of compiler
data-flow analysis tasks. Experimental results on two benchmark
datasets show the superiority of ProgSG over baseline methods that either only
consider one modality or combine the two without utilizing the alignment
information.
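The abstract's core idea, letting code-token embeddings and CDFG node embeddings exchange information through the compiler-provided token-to-node alignment, can be illustrated with a minimal sketch. All names, dimensions, and the additive fusion rule below are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

def fuse_modalities(token_emb, node_emb, alignment):
    """Illustrative cross-modality fusion via a token-node alignment.

    token_emb : (T, d) source-code token embeddings
    node_emb  : (N, d) CDFG node embeddings
    alignment : (T, N) binary matrix; alignment[t, n] = 1 if token t
                maps to CDFG node n (provided by the compiler)
    Returns token embeddings enriched with their aligned nodes, and
    node embeddings enriched with their aligned tokens.
    """
    # Row-normalize so each token averages over the nodes it aligns to;
    # column-normalize so each node averages over its aligned tokens.
    # np.maximum(..., 1) avoids division by zero for unaligned rows.
    row = alignment / np.maximum(alignment.sum(1, keepdims=True), 1)
    col = alignment.T / np.maximum(alignment.sum(0, keepdims=True).T, 1)
    token_out = token_emb + row @ node_emb   # tokens read from nodes
    node_out = node_emb + col @ token_emb    # nodes read from tokens
    return token_out, node_out

rng = np.random.default_rng(0)
T, N, d = 5, 3, 4
tokens = rng.normal(size=(T, d))
nodes = rng.normal(size=(N, d))
align = np.zeros((T, N))
align[0, 0] = align[1, 0] = align[2, 1] = 1  # tokens 3-4 stay unaligned
tok2, node2 = fuse_modalities(tokens, nodes, align)
```

Unaligned tokens and nodes pass through unchanged, which mirrors the fine-grained nature of the alignment: only tokens the compiler maps to a CDFG node receive graph information.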
Related papers
- Hierarchical Mixture of Experts: Generalizable Learning for High-Level Synthesis [43.612837464039686]
High-level synthesis (HLS) is a widely used tool in Field Programmable Gate Array (FPGA) design.
We propose a more domain-generalizable model structure: a two-level hierarchical Mixture of Experts (MoE).
In the low-level MoE, we apply MoE on three natural granularities of a program: node, basic block, and graph.
The high-level MoE learns to aggregate the three granularities for the final decision.
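The two-level structure summarized above can be sketched roughly as follows; the gating functions, expert shapes, and the assumption that each granularity (node, basic block, graph) is already pooled into a single vector are all simplifications for illustration, not the paper's actual model:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def moe(x, experts, gate_w):
    """Low-level MoE: gate-weighted mix of expert outputs for one input."""
    gates = softmax(gate_w @ x)                   # (E,) expert weights
    outs = np.stack([w @ x for w in experts])     # (E, d) expert outputs
    return gates @ outs                           # (d,) mixed output

rng = np.random.default_rng(1)
d, E = 4, 3
experts = [rng.normal(size=(d, d)) for _ in range(E)]
gate_w = rng.normal(size=(E, d))

# One pooled embedding per granularity (assumed computed elsewhere).
node_repr, block_repr, graph_repr = rng.normal(size=(3, d))

# Low-level MoE applied independently at each granularity.
low = [moe(r, experts, gate_w) for r in (node_repr, block_repr, graph_repr)]

# High-level MoE: learn to weight the three granularities for the
# final decision (here gated on the graph-level embedding).
high_gate_w = rng.normal(size=(3, d))
weights = softmax(high_gate_w @ graph_repr)       # (3,) granularity weights
final = sum(w * o for w, o in zip(weights, low))  # (d,) final representation
```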
arXiv Detail & Related papers (2024-10-25T00:27:53Z)
- OMPar: Automatic Parallelization with AI-Driven Source-to-Source Compilation [4.266086505323998]
This paper introduces OMPar, an AI-driven tool designed to automate the parallelization of C/C++ code using OpenMP pragmas.
OMPar integrates Large Language Models (LLMs) through two key components: OMPify, which assesses loop parallelization potential, and MonoCoder-OMP, a new fine-tuned model which generates precise OpenMP pragmas.
arXiv Detail & Related papers (2024-09-23T07:39:01Z)
- Cross-Modality Program Representation Learning for Electronic Design Automation with High-Level Synthesis [45.471039079664656]
Domain-specific accelerators (DSAs) have gained popularity for applications such as deep learning and autonomous driving.
We propose ProgSG, a model that allows interaction between the source code sequence modality and the graph modality in a deep and fine-grained way.
We show that ProgSG reduces the RMSE of design performance predictions by up to 22%, and identifies designs with an average of 1.10×.
arXiv Detail & Related papers (2024-06-13T22:34:58Z)
- Deep Learning Assisted Multiuser MIMO Load Modulated Systems for Enhanced Downlink mmWave Communications [68.96633803796003]
This paper is focused on multiuser load modulation arrays (MU-LMAs) which are attractive due to their low system complexity and reduced cost for millimeter wave (mmWave) multi-input multi-output (MIMO) systems.
The existing precoding algorithm for downlink MU-LMA relies on a sub-array structured (SAS) transmitter which may suffer from decreased degrees of freedom and complex system configuration.
In this paper, we conceive an MU-LMA system employing a full-array structured (FAS) transmitter and propose two algorithms accordingly.
arXiv Detail & Related papers (2023-11-08T08:54:56Z)
- Guess & Sketch: Language Model Guided Transpilation [59.02147255276078]
Learned transpilation offers an alternative to manual re-writing and engineering efforts.
Probabilistic neural language models (LMs) produce plausible outputs for every input, but do so at the cost of guaranteed correctness.
Guess & Sketch extracts alignment and confidence information from features of the LM, then passes it to a symbolic solver to resolve semantic equivalence.
arXiv Detail & Related papers (2023-09-25T15:42:18Z)
- SEER: Super-Optimization Explorer for HLS using E-graph Rewriting with MLIR [0.3124884279860061]
High-level synthesis (HLS) is a process that automatically translates a software program in a high-level language into a low-level hardware description.
We propose a super-optimization approach for HLS that automatically rewrites an arbitrary software program into HLS efficient code.
We show that SEER achieves up to 38x the performance within 1.4x the area of the original program.
arXiv Detail & Related papers (2023-08-15T09:05:27Z)
- QParallel: Explicit Parallelism for Programming Quantum Computers [62.10004571940546]
We present a language extension for parallel quantum programming.
QParallel removes ambiguities concerning parallelism in current quantum programming languages.
We introduce a tool that guides programmers in the placement of parallel regions by identifying the subroutines that profit most from parallelization.
arXiv Detail & Related papers (2022-10-07T16:35:16Z)
- A Graph Deep Learning Framework for High-Level Synthesis Design Space Exploration [11.154086943903696]
High-Level Synthesis is a solution for fast prototyping of application-specific hardware.
We propose, for the first time in the literature, graph neural networks that jointly predict acceleration performance and hardware costs for HLS designs.
We show that our approach achieves prediction accuracy comparable with that of commonly used simulators.
arXiv Detail & Related papers (2021-11-29T18:17:45Z)
- Synthetic Datasets for Neural Program Synthesis [66.20924952964117]
We propose a new methodology for controlling and evaluating the bias of synthetic data distributions over both programs and specifications.
We demonstrate, using the Karel DSL and a small Calculator DSL, that training deep networks on these distributions leads to improved cross-distribution generalization performance.
arXiv Detail & Related papers (2019-12-27T21:28:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences.