DevFormer: A Symmetric Transformer for Context-Aware Device Placement
- URL: http://arxiv.org/abs/2205.13225v3
- Date: Wed, 7 Jun 2023 07:01:45 GMT
- Title: DevFormer: A Symmetric Transformer for Context-Aware Device Placement
- Authors: Haeyeon Kim, Minsu Kim, Federico Berto, Joungho Kim, Jinkyoo Park
- Abstract summary: We present DevFormer, a transformer-based architecture for addressing the complex and computationally demanding problem of hardware design optimization.
Our approach addresses this limitation by introducing strong inductive biases such as relative positional embeddings and action-permutation symmetricity.
We show that DevFormer outperforms state-of-the-art methods on both simulated and real hardware, improving performance while reducing the number of components by more than $30\%$.
- Score: 12.400790776196667
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present DevFormer, a novel transformer-based architecture
for addressing the complex and computationally demanding problem of hardware
design optimization. Despite the demonstrated efficacy of transformers in
domains including natural language processing and computer vision, their use in
hardware design has been limited by the scarcity of offline data. Our approach
addresses this limitation by introducing strong inductive biases such as
relative positional embeddings and action-permutation symmetricity that
effectively capture the hardware context and enable efficient design
optimization with limited offline data. We apply DevFormer to the problem of
decoupling capacitor placement and show that it outperforms state-of-the-art
methods in both simulated and real hardware, leading to improved performance
while reducing the number of components by more than $30\%$. Finally, we show
that our approach achieves promising results in other offline contextual
learning-based combinatorial optimization tasks.
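The following is a minimal, self-contained sketch (not the authors' implementation) of the two inductive biases the abstract names: attention scores that depend only on content plus a relative positional bias derived from pairwise distances between candidate decoupling-capacitor sites, which makes the layer equivariant under permutations of the candidate set. All function names, shapes, and the distance-bucketing scheme are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def relative_bias(coords, num_buckets=8, max_dist=10.0):
    # Bucket pairwise Euclidean distances between candidate decap sites into an
    # additive attention bias; the bucket table stands in for learned parameters.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)  # (n, n)
    buckets = np.clip((d / max_dist * num_buckets).astype(int), 0, num_buckets - 1)
    table = np.linspace(0.0, -1.0, num_buckets)
    return table[buckets]                                                 # (n, n)

def symmetric_attention(x, coords, d_k=16):
    # One self-attention layer whose scores depend only on content and relative
    # position, so permuting the rows of (x, coords) permutes the output rows
    # identically -- the permutation symmetry the abstract refers to.
    rng = np.random.default_rng(0)                # fixed weights for the demo
    d = x.shape[-1]
    wq, wk, wv = (rng.standard_normal((d, d_k)) / np.sqrt(d) for _ in range(3))
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(d_k) + relative_bias(coords)
    return softmax(scores) @ v

# Toy usage: 5 hypothetical candidate decap sites with 4-dim context features.
coords = np.array([[0, 0], [0, 3], [2, 1], [4, 4], [5, 0]], dtype=float)
feats = np.random.default_rng(1).standard_normal((5, 4))
perm = np.array([3, 1, 4, 0, 2])
out = symmetric_attention(feats, coords)
out_perm = symmetric_attention(feats[perm], coords[perm])
assert np.allclose(out[perm], out_perm)           # permutation equivariance holds
```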
Related papers
- Efficient and Accurate Scene Text Recognition with Cascaded-Transformers [11.638859439061164]
We propose an efficient and accurate Scene Text Recognition system.
We focus on improving the efficiency of encoder models by introducing a cascaded-transformers structure.
Our experimental results confirm that our STR system achieves comparable performance to state-of-the-art baselines.
arXiv Detail & Related papers (2025-03-24T16:58:37Z)
- ContextFormer: Redefining Efficiency in Semantic Segmentation [48.81126061219231]
Convolutional methods, although capturing local dependencies well, struggle with long-range relationships.
Vision Transformers (ViTs) excel in global context capture but are hindered by high computational demands.
We propose ContextFormer, a hybrid framework leveraging the strengths of CNNs and ViTs in the bottleneck to balance efficiency, accuracy, and robustness for real-time semantic segmentation.
arXiv Detail & Related papers (2025-01-31T16:11:04Z)
- Transformer^-1: Input-Adaptive Computation for Resource-Constrained Deployment [3.6219999155937113]
This paper proposes a Transformer$^{-1}$ architecture to address the resource waste caused by fixed computation paradigms in deep learning models under dynamic scenarios.
In a benchmark test, our method reduces FLOPs by 42.7% and peak memory usage by 3% compared to the standard Transformer.
We also conducted experiments on several natural language processing tasks and achieved significant improvements in resource efficiency.
arXiv Detail & Related papers (2025-01-26T15:31:45Z) - Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Co-Designing Binarized Transformer and Hardware Accelerator for Efficient End-to-End Edge Deployment [3.391499691517567]
Transformer models have revolutionized AI tasks, but their large size hinders real-world deployment on resource-constrained and latency-critical edge devices.
We propose a co-design method for efficient end-to-end edge deployment of Transformers from three aspects: algorithm, hardware, and joint optimization.
Experimental results show our co-design achieves up to 2.14-49.37x throughput gains and 3.72-88.53x better energy efficiency over state-of-the-art Transformer accelerators.
arXiv Detail & Related papers (2024-07-16T12:36:10Z)
- Learning for Semantic Knowledge Base-Guided Online Feature Transmission in Dynamic Channels [41.59960455142914]
We propose an online optimization framework to address the challenge of dynamic channel conditions and device mobility in an end-to-end communication system.
Our approach builds upon existing methods by leveraging a semantic knowledge base to drive multi-level feature transmission.
To solve the online optimization problem, we design a novel soft actor-critic-based deep reinforcement learning system with a carefully designed reward function for real-time decision-making.
arXiv Detail & Related papers (2023-11-30T07:35:56Z)
- TransCODE: Co-design of Transformers and Accelerators for Efficient Training and Inference [6.0093441900032465]
We propose a framework that simulates transformer inference and training on a design space of accelerators.
We use this simulator in conjunction with the proposed co-design technique, called TransCODE, to obtain the best-performing models.
The obtained transformer-accelerator pair achieves 0.3% higher accuracy than the state-of-the-art pair.
arXiv Detail & Related papers (2023-03-27T02:45:18Z)
- Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures [68.91874045918112]
adapter-ALBERT is an efficient model optimization for maximal data reuse across different tasks.
We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z)
- Full Stack Optimization of Transformer Inference: a Survey [58.55475772110702]
Transformer models achieve superior accuracy across a wide range of applications.
The amount of compute and bandwidth required for inference of recent Transformer models is growing at a significant rate.
There has been an increased focus on making Transformer models more efficient.
arXiv Detail & Related papers (2023-02-27T18:18:13Z)
- HEAT: Hardware-Efficient Automatic Tensor Decomposition for Transformer Compression [69.36555801766762]
We propose a hardware-aware tensor decomposition framework, dubbed HEAT, that enables efficient exploration of the exponential space of possible decompositions.
We experimentally show that our hardware-aware factorized BERT variants reduce the energy-delay product by 5.7x with less than 1.1% accuracy loss.
arXiv Detail & Related papers (2022-11-30T05:31:45Z)
- DynaST: Dynamic Sparse Transformer for Exemplar-Guided Image Generation [56.514462874501675]
We propose a dynamic sparse attention based Transformer model to achieve fine-level matching with favorable efficiency.
The heart of our approach is a novel dynamic-attention unit that adapts to the varying optimal number of tokens each position should attend to.
Experiments on three applications, pose-guided person image generation, edge-based face synthesis, and undistorted image style transfer, demonstrate that DynaST achieves superior performance in local details.
arXiv Detail & Related papers (2022-07-13T11:12:03Z)
- CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that integrates the detailed spatial information captured by CNNs with the global context provided by transformers for enhanced representation learning.
The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery.
The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
arXiv Detail & Related papers (2021-12-31T04:37:11Z)
- Optimizing Inference Performance of Transformers on CPUs [0.0]
Transformers-based models (e.g., BERT) power many important Web services, such as search, translation, question-answering, etc.
This paper presents an empirical analysis of the scalability and performance of Transformer-based model inference on CPUs.
arXiv Detail & Related papers (2021-02-12T17:01:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.