EDALearn: A Comprehensive RTL-to-Signoff EDA Benchmark for Democratized and Reproducible ML for EDA Research
- URL: http://arxiv.org/abs/2312.01674v1
- Date: Mon, 4 Dec 2023 06:51:46 GMT
- Title: EDALearn: A Comprehensive RTL-to-Signoff EDA Benchmark for Democratized and Reproducible ML for EDA Research
- Authors: Jingyu Pan, Chen-Chia Chang, Zhiyao Xie, Yiran Chen
- Abstract summary: We introduce EDALearn, the first holistic, open-source benchmark suite specifically for Machine Learning tasks in EDA.
This benchmark suite presents an end-to-end flow from synthesis to physical implementation, enriching data collection across various stages.
Our contributions aim to encourage further advances in the ML-EDA domain.
- Score: 5.093676641214663
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The application of Machine Learning (ML) in Electronic Design Automation
(EDA) for Very Large-Scale Integration (VLSI) design has garnered significant
research attention. Despite the requirement for extensive datasets to build
effective ML models, most studies are limited to smaller, internally generated
datasets due to the lack of comprehensive public resources. In response, we
introduce EDALearn, the first holistic, open-source benchmark suite
specifically for ML tasks in EDA. This benchmark suite presents an end-to-end
flow from synthesis to physical implementation, enriching data collection
across various stages. It fosters reproducibility and promotes research into ML
transferability across different technology nodes. Accommodating a wide range
of VLSI design instances and sizes, our benchmark aptly represents the
complexity of contemporary VLSI designs. Additionally, we provide an in-depth
data analysis, enabling users to fully comprehend the attributes and
distribution of our data, which is essential for creating efficient ML models.
Our contributions aim to encourage further advances in the ML-EDA domain.
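To make the benchmark's intended use more concrete, the minimal sketch below shows how such data might be consumed for a simple ML task. It assumes a hypothetical CSV export with one row per design instance, post-synthesis features, and a post-route label; the file name, column names, and the prediction target (post-route wirelength) are illustrative assumptions, not the actual EDALearn data layout or API.

```python
# Hypothetical sketch: training a simple cross-stage predictor on EDALearn-style data.
# The CSV path, column names, and prediction target are illustrative assumptions;
# consult the actual EDALearn release for its real data layout.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Assumed export: one row per design instance, with post-synthesis features
# (cell count, net count, target clock period) and a post-route label (total wirelength).
df = pd.read_csv("edalearn_designs.csv")  # hypothetical file name

features = ["num_cells", "num_nets", "clock_period_ns"]  # assumed feature columns
target = "post_route_wirelength_um"                      # assumed label column

X_train, X_test, y_train, y_test = train_test_split(
    df[features], df[target], test_size=0.2, random_state=0
)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))
```

In the same spirit, the train/test split could instead be drawn across technology nodes to probe the cross-node transferability that the benchmark is designed to support.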
Related papers
- Align$^2$LLaVA: Cascaded Human and Large Language Model Preference Alignment for Multi-modal Instruction Curation [56.75665429851673]
This paper introduces a novel instruction curation algorithm, derived from two unique perspectives, human and LLM preference alignment.
Experiments demonstrate that we can maintain or even improve model performance by compressing synthetic multimodal instructions by up to 90%.
arXiv Detail & Related papers (2024-09-27T08:20:59Z)
- The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective [53.48484062444108]
We find that the development of models and data does not follow two separate paths but is instead interconnected.
On the one hand, vaster and higher-quality data contribute to better performance of MLLMs; on the other hand, MLLMs can facilitate the development of data.
To promote the data-model co-development for MLLM community, we systematically review existing works related to MLLMs from the data-model co-development perspective.
arXiv Detail & Related papers (2024-07-11T15:08:11Z)
- Evaluating the Generalization Ability of Quantized LLMs: Benchmark, Analysis, and Toolbox [46.39670209441478]
Large language models (LLMs) have exhibited exciting progress in multiple scenarios.
Although quantization is an effective means of reducing memory footprint and inference cost, it faces the challenge of performance degradation at low bit-widths.
This work provides a comprehensive benchmark suite for this research topic, including an evaluation system, detailed analyses, and a general toolbox.
arXiv Detail & Related papers (2024-06-15T12:02:14Z)
- Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets [22.29915616018026]
Large Language Models (LLMs) have demonstrated unparalleled effectiveness in various NLP tasks.
Our research aims to evaluate the impact of various configurations of speech encoders, LLMs, and projector modules.
We introduce a three-stage training approach, expressly developed to enhance the model's ability to align auditory and textual information.
arXiv Detail & Related papers (2024-05-03T14:35:58Z)
- Characterization of Large Language Model Development in the Datacenter [55.9909258342639]
Large Language Models (LLMs) have presented impressive performance across several transformative tasks.
However, it is non-trivial to efficiently utilize large-scale cluster resources to develop LLMs.
We present an in-depth characterization study of a six-month LLM development workload trace collected from our GPU datacenter Acme.
arXiv Detail & Related papers (2024-03-12T13:31:14Z)
- Are You Being Tracked? Discover the Power of Zero-Shot Trajectory Tracing with LLMs! [3.844253028598048]
This study introduces LLMTrack, a model that illustrates how LLMs can be leveraged for Zero-Shot Trajectory Recognition.
We evaluate the model using real-world datasets designed to challenge it with distinct trajectories characterized by indoor and outdoor scenarios.
arXiv Detail & Related papers (2024-03-10T12:50:35Z)
- Curated LLM: Synergy of LLMs and Data Curation for tabular augmentation in low-data regimes [57.62036621319563]
We introduce CLLM, which leverages the prior knowledge of Large Language Models (LLMs) for data augmentation in the low-data regime.
We demonstrate the superior performance of CLLM in the low-data regime compared to conventional generators.
arXiv Detail & Related papers (2023-12-19T12:34:46Z)
- Vision-Language Instruction Tuning: A Review and Analysis [52.218690619616474]
Vision-Language Instruction Tuning (VLIT) presents more complex characteristics compared to pure text instruction tuning.
We offer a detailed categorization for existing VLIT datasets and identify the characteristics that high-quality VLIT data should possess.
By incorporating these characteristics as guiding principles into the existing VLIT data construction process, we conduct extensive experiments and verify their positive impact on the performance of tuned multi-modal LLMs.
arXiv Detail & Related papers (2023-11-14T14:02:32Z)
- Large Language Models as Data Preprocessors [9.99065004972981]
Large Language Models (LLMs) have marked a significant advancement in artificial intelligence.
This study explores their potential in data preprocessing, a critical stage in data mining and analytics applications.
We propose an LLM-based framework for data preprocessing, which integrates cutting-edge prompt engineering techniques.
arXiv Detail & Related papers (2023-08-30T23:28:43Z)
- MLLM-DataEngine: An Iterative Refinement Approach for MLLM [62.30753425449056]
We propose a novel closed-loop system that bridges data generation, model training, and evaluation.
Within each loop, the MLLM-DataEngine first analyzes the weaknesses of the model based on the evaluation results.
For targeting, we propose an Adaptive Bad-case Sampling module, which adjusts the ratio of different types of data.
For quality, we resort to GPT-4 to generate high-quality data for each given data type.
arXiv Detail & Related papers (2023-08-25T01:41:04Z)
- Machine Learning for Electronic Design Automation: A Survey [23.803190584543863]
With the down-scaling of CMOS technology, the design complexity of very large-scale integrated (VLSI) circuits is increasing.
The recent breakthroughs in machine learning (ML) and the increasing complexity of EDA tasks have aroused greater interest in incorporating ML to solve EDA tasks.
arXiv Detail & Related papers (2021-01-10T12:54:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.