Chip Placement with Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2004.10746v1
- Date: Wed, 22 Apr 2020 17:56:07 GMT
- Title: Chip Placement with Deep Reinforcement Learning
- Authors: Azalia Mirhoseini, Anna Goldie, Mustafa Yazgan, Joe Jiang, Ebrahim
Songhori, Shen Wang, Young-Joon Lee, Eric Johnson, Omkar Pathak, Sungmin Bae,
Azade Nazi, Jiwoo Pak, Andy Tong, Kavya Srinivasa, William Hang, Emre Tuncer,
Anand Babu, Quoc V. Le, James Laudon, Richard Ho, Roger Carpenter, Jeff Dean
- Abstract summary: We present a learning-based approach to chip placement.
Unlike prior methods, our approach has the ability to learn from past experience and improve over time.
In under 6 hours, our method can generate placements that are superhuman or comparable on modern accelerator netlists.
- Score: 40.952111701288125
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we present a learning-based approach to chip placement, one of
the most complex and time-consuming stages of the chip design process. Unlike
prior methods, our approach has the ability to learn from past experience and
improve over time. In particular, as we train over a greater number of chip
blocks, our method becomes better at rapidly generating optimized placements
for previously unseen chip blocks. To achieve these results, we pose placement
as a Reinforcement Learning (RL) problem and train an agent to place the nodes
of a chip netlist onto a chip canvas. To enable our RL policy to generalize to
unseen blocks, we ground representation learning in the supervised task of
predicting placement quality. By designing a neural architecture that can
accurately predict reward across a wide variety of netlists and their
placements, we are able to generate rich feature embeddings of the input
netlists. We then use this architecture as the encoder of our policy and value
networks to enable transfer learning. Our objective is to minimize PPA (power,
performance, and area), and we show that, in under 6 hours, our method can
generate placements that are superhuman or comparable on modern accelerator
netlists, whereas existing baselines require human experts in the loop and take
several weeks.
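The abstract's core idea, posing placement as sequentially dropping netlist nodes onto a canvas and rewarding low wirelength, can be illustrated with a toy gridworld. The netlist, grid size, and random policy below are illustrative assumptions, not the paper's actual architecture; the reward uses a half-perimeter wirelength (HPWL) proxy, a standard stand-in for the PPA objective.

```python
import random

GRID = 4  # toy 4x4 chip canvas (assumption, not the paper's setup)

# toy netlist: each net is the set of node names it connects
NETS = [{"a", "b"}, {"b", "c"}, {"a", "c", "d"}]
NODES = ["a", "b", "c", "d"]

def hpwl(placement):
    """Half-perimeter wirelength summed over all nets."""
    total = 0
    for net in NETS:
        xs = [placement[n][0] for n in net]
        ys = [placement[n][1] for n in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

def rollout(policy):
    """Place nodes one at a time; return (placement, reward)."""
    free = [(x, y) for x in range(GRID) for y in range(GRID)]
    placement = {}
    for node in NODES:
        cell = policy(node, free, placement)
        free.remove(cell)
        placement[node] = cell
    return placement, -hpwl(placement)  # reward = negative wirelength

random.seed(0)
# a trained policy would replace this random one
best = max((rollout(lambda n, f, p: random.choice(f))
            for _ in range(200)), key=lambda r: r[1])
print("best reward (= -HPWL):", best[1])
```

A learned policy network would replace the random `policy` callable, mapping the netlist embedding and the set of free cells to a placement action.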
Related papers
- VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment [66.80143024475635]
We propose VinePPO, a straightforward approach to compute unbiased Monte Carlo-based estimates.
We show that VinePPO consistently outperforms PPO and other RL-free baselines across MATH and GSM8K datasets.
arXiv Detail & Related papers (2024-10-02T15:49:30Z)
- Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning [68.16998247593209]
The offline reinforcement learning (RL) paradigm provides a recipe for converting static behavior datasets into policies that can outperform the policy that collected the data.
In this paper, we propose an adaptive scheme for action quantization.
We show that several state-of-the-art offline RL methods such as IQL, CQL, and BRAC improve in performance on benchmarks when combined with our proposed discretization scheme.
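The benefit of adaptive discretization over uniform bins can be sketched with a simplified stand-in: placing bin centers at quantiles of the actions observed in the offline dataset concentrates resolution where the data actually lives. This quantile scheme is an illustrative assumption, not the paper's exact method.

```python
import numpy as np

rng = np.random.default_rng(0)
# toy offline dataset of 1-D continuous actions, clustered near -0.8 and 0.5
actions = np.concatenate([rng.normal(-0.8, 0.05, 500),
                          rng.normal(0.5, 0.05, 500)])

def fit_adaptive_bins(data, n_bins):
    """Bin centers at evenly spaced quantiles of the data."""
    qs = (np.arange(n_bins) + 0.5) / n_bins
    return np.quantile(data, qs)

def quantize(a, centers):
    """Map each action to its nearest bin center."""
    idx = np.abs(a[:, None] - centers[None, :]).argmin(axis=1)
    return centers[idx]

centers = fit_adaptive_bins(actions, 8)
err_adaptive = np.mean((actions - quantize(actions, centers)) ** 2)

# baseline: 8 uniformly spaced bins over the same range
uniform = np.linspace(actions.min(), actions.max(), 8)
err_uniform = np.mean((actions - quantize(actions, uniform)) ** 2)
print(err_adaptive < err_uniform)  # adaptive bins reconstruct better
```

Because the data is bimodal, quantile-placed centers land inside the two clusters, while uniform bins waste resolution on the empty region between them.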
arXiv Detail & Related papers (2023-10-18T06:07:10Z)
- ChiPFormer: Transferable Chip Placement via Offline Decision Transformer [35.69382855465161]
Reinforcement learning can improve on human performance in chip placement.
ChiPFormer enables learning a transferable placement policy from fixed offline data.
ChiPFormer achieves significantly better placement quality while reducing the runtime by 10x.
arXiv Detail & Related papers (2023-06-26T14:59:56Z)
- MaskPlace: Fast Chip Placement via Reinforced Visual Representation Learning [18.75057105112443]
This work presents MaskPlace to automatically generate a valid chip layout design within a few hours.
It recasts placement as a problem of learning pixel-level visual representation to comprehensively describe millions of modules on a chip.
It outperforms recent methods that represent a chip as a hypergraph.
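The pixel-level view described above can be sketched simply: the canvas is a binary occupancy grid, and the set of legal positions for the next module is itself a pixel mask. Grid and module sizes here are illustrative assumptions.

```python
import numpy as np

canvas = np.zeros((8, 8), dtype=bool)
canvas[0:3, 0:3] = True  # an already-placed 3x3 module

def valid_mask(canvas, h, w):
    """True at (y, x) iff an h x w module placed there fits on the
    canvas and overlaps nothing already placed."""
    H, W = canvas.shape
    mask = np.zeros_like(canvas)
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            mask[y, x] = not canvas[y:y + h, x:x + w].any()
    return mask

m = valid_mask(canvas, 2, 2)
print(int(m.sum()), "legal positions for a 2x2 module")  # 40
```

A policy operating on this representation can read the occupancy and validity masks as image channels, which is what makes convolutional visual encoders applicable.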
arXiv Detail & Related papers (2022-11-24T02:22:09Z)
- Routing and Placement of Macros using Deep Reinforcement Learning [0.0]
We train a model to place the nodes of a chip netlist onto a chip canvas.
We aim to build a neural architecture that accurately predicts reward across a wide variety of input netlists.
arXiv Detail & Related papers (2022-05-19T02:40:58Z)
- Towards Machine Learning for Placement and Routing in Chip Design: a Methodological Overview [72.79089075263985]
Placement and routing are two indispensable and challenging (NP-hard) tasks in modern chip design flows.
Machine learning has shown promising prospects thanks to its data-driven nature, which relies less on expert knowledge and priors.
arXiv Detail & Related papers (2022-02-28T06:28:44Z)
- On Joint Learning for Solving Placement and Routing in Chip Design [70.30640973026415]
We propose a joint learning method, DeepPlace, for the placement of macros and standard cells.
We also develop a joint learning approach via reinforcement learning to fulfill both macro placement and routing, which is called DeepPR.
Our method can effectively learn from experience and also provides intermediate placement for the subsequent standard cell placement, within a few hours of training.
arXiv Detail & Related papers (2021-10-30T11:41:49Z)
- Semi-supervised Network Embedding with Differentiable Deep Quantisation [81.49184987430333]
We develop d-SNEQ, a differentiable quantisation method for network embedding.
d-SNEQ incorporates a rank loss to equip the learned quantisation codes with rich high-order information.
It is able to substantially compress the size of trained embeddings, thus reducing storage footprint and accelerating retrieval speed.
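The storage argument above can be illustrated with a much simpler scalar scheme: quantising float32 embeddings to uint8 codes shrinks them 4x while keeping nearest-neighbour retrieval largely intact. This is a simplified stand-in, not d-SNEQ's learned quantisation.

```python
import numpy as np

rng = np.random.default_rng(1)
emb = rng.normal(size=(1000, 64)).astype(np.float32)  # toy embeddings

# scalar quantisation: map each float to one of 256 levels
lo, hi = emb.min(), emb.max()
codes = np.round((emb - lo) / (hi - lo) * 255).astype(np.uint8)
recon = codes.astype(np.float32) / 255 * (hi - lo) + lo

ratio = emb.nbytes / codes.nbytes
print("compression ratio:", ratio)  # 4.0 (float32 -> uint8)

# retrieval sanity check: nearest neighbour of a query is preserved
q = emb[0]
nn_full = np.linalg.norm(emb - q, axis=1).argmin()
nn_quant = np.linalg.norm(recon - q, axis=1).argmin()
print("same nearest neighbour:", nn_full == nn_quant)
```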
arXiv Detail & Related papers (2021-08-20T11:53:05Z)
- Decaying Clipping Range in Proximal Policy Optimization [0.0]
Proximal Policy Optimization (PPO) is among the most widely used algorithms in reinforcement learning.
Keys to its success are the reliable policy updates through the clipping mechanism and the multiple epochs of minibatch updates.
We propose linearly and exponentially decaying clipping range approaches throughout the training.
arXiv Detail & Related papers (2021-02-20T22:08:05Z)
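The decaying clipping ranges described above reduce to simple schedules: epsilon shrinks from its initial value toward zero over training, linearly or exponentially, so late updates are more conservative. The initial clip range, horizon, and decay rate below are illustrative assumptions.

```python
import math

EPS0, TOTAL = 0.2, 1000  # initial clip range, total policy updates

def linear_clip(step):
    """Clip range decays linearly to zero over TOTAL updates."""
    return EPS0 * max(0.0, 1.0 - step / TOTAL)

def exp_clip(step, rate=5.0):
    """Clip range decays exponentially with training progress."""
    return EPS0 * math.exp(-rate * step / TOTAL)

def ppo_clip_objective(ratio, advantage, eps):
    """PPO's clipped surrogate for one sample: the ratio is
    clamped to [1 - eps, 1 + eps] in the second term."""
    clamped = max(min(ratio, 1 + eps), 1 - eps)
    return min(ratio * advantage, clamped * advantage)

# early vs late: the clip range tightens as training progresses
print(linear_clip(0), linear_clip(900))
print(exp_clip(0), exp_clip(900))
```

With a tighter epsilon, `ppo_clip_objective` caps the incentive for large policy ratio changes sooner, which is the mechanism the decay schedules exploit.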
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.