CircuitNet: An Open-Source Dataset for Machine Learning Applications in
Electronic Design Automation (EDA)
- URL: http://arxiv.org/abs/2208.01040v2
- Date: Thu, 4 Aug 2022 08:15:56 GMT
- Authors: Zhuomin Chai, Yuxiang Zhao, Yibo Lin, Wei Liu, Runsheng Wang, Ru Huang
- Abstract summary: We present the first open-source dataset for machine learning tasks in VLSI CAD called CircuitNet.
The dataset consists of more than 10K samples extracted from versatile runs of commercial design tools based on 6 open-source RISC-V designs.
- Score: 9.788869757486289
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The electronic design automation (EDA) community has been actively exploring
machine learning for very-large-scale-integrated computer-aided design (VLSI
CAD). Many studies have explored learning-based techniques for cross-stage
prediction tasks in the design flow to achieve faster design convergence.
Although building machine learning (ML) models usually requires a large amount
of data, most studies can only generate small internal datasets for validation
due to the lack of large public datasets. In this paper, we present the first
open-source dataset for machine learning tasks in VLSI CAD called CircuitNet.
The dataset consists of more than 10K samples extracted from versatile runs of
commercial design tools based on 6 open-source RISC-V designs.
Related papers
- PERC: a suite of software tools for the curation of cryoEM data with application to simulation, modelling and machine learning [0.3818645814949463]
In structural biology there are now numerous open repositories of experimental and simulated datasets.
The tools presented here are useful for collating existing public cryoEM datasets and/or creating new synthetic cryoEM datasets.
arXiv Detail & Related papers (2025-03-17T16:07:56Z)
- BlenderLLM: Training Large Language Models for Computer-Aided Design with Self-improvement [45.19076032719869]
We present BlenderLLM, a framework for training Large Language Models (LLMs) in Computer-Aided Design (CAD).
Our results reveal that existing models demonstrate significant limitations in generating accurate CAD scripts.
Through minimal instruction-based fine-tuning and iterative self-improvement, BlenderLLM significantly surpasses these models in both functionality and accuracy of CAD script generation.
arXiv Detail & Related papers (2024-12-16T14:34:02Z)
- CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Through Corpus Retrieval and Augmentation [51.2289822267563]
We propose Corpus Retrieval and Augmentation for Fine-Tuning (CRAFT), a method for generating synthetic datasets.
We use large-scale public web-crawled corpora and similarity-based document retrieval to find other relevant human-written documents.
We demonstrate that CRAFT can efficiently generate large-scale task-specific training datasets for four diverse tasks.
arXiv Detail & Related papers (2024-09-03T17:54:40Z)
- NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals [58.83169560132308]
We introduce NNsight and NDIF, technologies that work in tandem to enable scientific study of the representations and computations learned by very large neural networks.
arXiv Detail & Related papers (2024-07-18T17:59:01Z)
- Full-stack evaluation of Machine Learning inference workloads for RISC-V systems [0.2621434923709917]
This study evaluates the performance of a wide array of machine learning workloads on RISC-V architectures using gem5, an open-source architectural simulator.
Leveraging an open-source compilation toolchain based on Multi-Level Intermediate Representation (MLIR), the research presents benchmarking results specifically focused on deep learning inference workloads.
arXiv Detail & Related papers (2024-05-24T09:24:46Z)
- EDALearn: A Comprehensive RTL-to-Signoff EDA Benchmark for Democratized and Reproducible ML for EDA Research [5.093676641214663]
We introduce EDALearn, the first holistic, open-source benchmark suite specifically for Machine Learning tasks in EDA.
This benchmark suite presents an end-to-end flow from synthesis to physical implementation, enriching data collection across various stages.
Our contributions aim to encourage further advances in the ML-EDA domain.
arXiv Detail & Related papers (2023-12-04T06:51:46Z)
- A Deep Learning Framework for Verilog Autocompletion Towards Design and Verification Automation [0.33598755777055367]
This paper proposes a novel deep learning framework for training a Verilog autocompletion model.
The framework involves integrating models pretrained on general programming language data and finetuning them on a dataset curated to be similar to a target downstream task.
Experiments demonstrate that the proposed framework achieves better BLEU, ROUGE-L, and chrF scores by 9.5%, 6.7%, and 6.9%, respectively, compared to a model trained from scratch.
arXiv Detail & Related papers (2023-04-26T21:56:03Z)
- AutoTransfer: AutoML with Knowledge Transfer -- An Application to Graph Neural Networks [75.11008617118908]
AutoML techniques consider each task independently from scratch, leading to high computational cost.
Here we propose AutoTransfer, an AutoML solution that improves search efficiency by transferring the prior architectural design knowledge to the novel task of interest.
arXiv Detail & Related papers (2023-03-14T07:23:16Z)
- Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials [97.95400776235736]
We present a framework based on offline RL that attempts to effectively learn new tasks.
It combines pre-training on existing robotic datasets with rapid fine-tuning on a new task, with as few as 10 demonstrations.
To our knowledge, PTR is the first RL method that succeeds at learning new tasks in a new domain on a real WidowX robot.
arXiv Detail & Related papers (2022-10-11T06:30:53Z)
- Design Automation for Fast, Lightweight, and Effective Deep Learning Models: A Survey [53.258091735278875]
This survey covers studies of design automation techniques for deep learning models targeting edge computing.
It offers an overview and comparison of key metrics that are used commonly to quantify the proficiency of models in terms of effectiveness, lightness, and computational costs.
The survey proceeds to cover three categories of the state-of-the-art of deep model design automation techniques.
arXiv Detail & Related papers (2022-08-22T12:12:43Z)
- SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines.
This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
- Designing Machine Learning Toolboxes: Concepts, Principles and Patterns [0.0]
We provide an overview of key patterns in the design of AI modeling toolboxes.
Our analysis can not only explain the design of existing toolboxes, but also guide the development of new ones.
arXiv Detail & Related papers (2021-01-13T08:55:15Z)
- Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G Networks [84.2155885234293]
We first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC.
To address these open problems, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC.
arXiv Detail & Related papers (2020-02-22T14:38:11Z)