Related papers: KATO: Knowledge Alignment and Transfer for Transistor Sizing of Different Design and Technology

Transolver-3: Scaling Up Transformer Solvers to Industrial-Scale Geometries [51.028432812178266]
Transolver-3 is a new member of the Transolver family designed for high-fidelity physics simulations. We show that Transolver-3 is capable of handling meshes with over 160 million cells, achieving impressive performance across three challenging simulation benchmarks.
arXiv Detail & Related papers (2026-02-04T16:52:44Z)
STEM: Scaling Transformers with Embedding Modules [59.26825251273227]
We introduce STEM, a static, token-indexed approach that replaces the FFN up-projection with a layer-local embedding lookup. This removes runtime routing, enables CPU offload with asynchronous prefetch, and decouples capacity from both per-token FLOPs and cross-device communication. Overall, STEM is an effective way of scaling parametric memory while providing better interpretability, better training stability and improved efficiency.
arXiv Detail & Related papers (2026-01-15T18:00:27Z)
Data Efficient Any Transformer-to-Mamba Distillation via Attention Bridge [54.948715010753745]
State-space models (SSMs) have emerged as efficient alternatives to Transformers for sequence modeling, offering superior scalability through recurrent structures. We propose Cross-architecture distillation via Attention Bridge (CAB), a novel data-efficient distillation framework that efficiently transfers attention knowledge from Transformer teachers to state-space student models. Our findings suggest that attention-based knowledge can be efficiently transferred to recurrent models, enabling rapid utilization of Transformer expertise for building a stronger SSM community.
arXiv Detail & Related papers (2025-10-22T05:56:14Z)
BERT4beam: Large AI Model Enabled Generalized Beamforming Optimization [77.17508487745026]
This paper investigates the large-scale AI model designed for beamforming optimization to adapt and generalize to diverse tasks defined by system utilities and scales. We propose a novel framework based on bidirectional encoder representations from transformers (BERT), termed BERT4beam. Based on the framework, we propose two BERT-based approaches for single-task and multi-task beamforming optimization, respectively.
arXiv Detail & Related papers (2025-09-14T02:49:29Z)
Large-Scale Model Enabled Semantic Communication Based on Robust Knowledge Distillation [53.16213723669751]
Large-scale models (LSMs) can be an effective framework for semantic representation and understanding. However, their direct deployment is often hindered by high computational complexity and resource requirements. This paper proposes a novel knowledge distillation based semantic communication framework.
arXiv Detail & Related papers (2025-08-04T07:47:18Z)
AUTOCIRCUIT-RL: Reinforcement Learning-Driven LLM for Automated Circuit Topology Generation [6.2730802180534155]
AUTOCIRCUIT-RL is a novel reinforcement learning-based framework for automated analog circuit synthesis. It generates 12% more valid circuits and improves efficiency by 14% compared to the best baselines. It achieves over 60% success in valid circuits with limited training data, demonstrating strong generalization.
arXiv Detail & Related papers (2025-06-03T17:54:30Z)
BHViT: Binarized Hybrid Vision Transformer [53.38894971164072]
Model binarization has made significant progress in enabling real-time and energy-efficient computation for convolutional neural networks (CNN). We propose BHViT, a binarization-friendly hybrid ViT architecture and its full binarization model with the guidance of three important observations. Our proposed algorithm achieves SOTA performance among binary ViT methods.
arXiv Detail & Related papers (2025-03-04T08:35:01Z)
Joint Transmit and Pinching Beamforming for Pinching Antenna Systems (PASS): Optimization-Based or Learning-Based? [89.05848771674773]
A novel pinching antenna system (PASS)-enabled downlink multi-user multiple-input single-output (MISO) framework is proposed. It consists of multiple waveguides, which equip numerous low-cost antennas, named pinching antennas (PAs). The positions of PAs can be reconfigured to both spanning large-scale path and space.
arXiv Detail & Related papers (2025-02-12T18:54:10Z)
LLM-USO: Large Language Model-based Universal Sizing Optimizer [4.223946773134886]
We propose a novel method for knowledge representation to encode circuit design knowledge in a structured text format. This representation enables the systematic reuse of optimization insights for circuits with similar sub-structures. This approach serves to: (i) infuse domain-specific knowledge into the BO process and (ii) facilitate knowledge transfer across circuits, mirroring the cognitive strategies of expert designers.
arXiv Detail & Related papers (2025-02-04T23:08:03Z)
TransPlace: Transferable Circuit Global Placement via Graph Neural Network [24.43651668384556]
This study presents TransPlace, a global placement framework that learns to place complexity of mixed-size cells in continuous space. Compared to state-of-the-art placement methods, TransPlace, trained on a few high-quality placements, can place unseen circuits with a 1.2x speedup.
arXiv Detail & Related papers (2025-01-10T02:33:15Z)
Co-Designing Binarized Transformer and Hardware Accelerator for Efficient End-to-End Edge Deployment [3.391499691517567]
Transformer models have revolutionized AI tasks, but their large size hinders real-world deployment on resource-constrained and latency-critical edge devices. We propose a co-design method for efficient end-to-end edge deployment of Transformers from three aspects: algorithm, hardware, and joint optimization. Experimental results show our co-design achieves up to 2.14-49.37x throughput gains and 3.72-88.53x better energy efficiency over state-of-the-art Transformer accelerators.
arXiv Detail & Related papers (2024-07-16T12:36:10Z)
BDC-Occ: Binarized Deep Convolution Unit For Binarized Occupancy Network [55.21288428359509]
Existing 3D occupancy networks demand significant hardware resources, hindering their deployment on edge devices. We propose a novel binarized deep convolution (BDC) unit that effectively enhances performance while increasing the number of binarized convolutional layers. Our BDC-Occ model is created by applying the proposed BDC unit to binarize the existing 3D occupancy networks.
arXiv Detail & Related papers (2024-05-27T10:44:05Z)
CktGNN: Circuit Graph Neural Network for Electronic Design Automation [67.29634073660239]
This paper presents a Circuit Graph Neural Network (CktGNN) that simultaneously automates the circuit topology generation and device sizing. We introduce Open Circuit Benchmark (OCB), an open-sourced dataset that contains 10K distinct operational amplifiers. Our work paves the way toward a learning-based open-sourced design automation for analog circuits.
arXiv Detail & Related papers (2023-08-31T02:20:25Z)
Single entanglement connection architecture between multi-layer bipartite Hardware Efficient Ansatz [18.876952671920133]
We propose a single entanglement connection architecture (SECA) for a bipartite hardware efficient ansatz. Our results indicate the superiority of SECA over the common full entanglement connection architecture (FECA) in terms of computational performance.
arXiv Detail & Related papers (2023-07-23T13:36:30Z)
Quantum chip design optimization and automation in superconducting coupler architecture [0.0]
Superconducting coupler architecture demonstrates great potential for scalable and high-performance quantum processors. How to design efficiently and automatically 'Qubit-Coupler-Qubit (QCQ)' of high performance from the layout perspective remains obscure. We acquire the crucial zero-coupling condition that is only dependent on the geometric design of the layout. We propose an optimal layout design procedure to reach the very upper bound, leading to efficient and high-performance layout design.
arXiv Detail & Related papers (2022-12-28T09:01:15Z)
Applications of Deep Learning to the Design of Enhanced Wireless Communication Systems [0.0]
Deep learning (DL)-based systems are able to handle increasingly complex tasks for which no tractable models are available. This thesis aims at comparing different approaches to unlock the full potential of DL in the physical layer.
arXiv Detail & Related papers (2022-05-02T21:02:14Z)
Efficient Few-Shot Object Detection via Knowledge Inheritance [62.36414544915032]
Few-shot object detection (FSOD) aims at learning a generic detector that can adapt to unseen tasks with scarce training samples. We present an efficient pretrain-transfer framework (PTF) baseline with no computational increment. We also propose an adaptive length re-scaling (ALR) strategy to alleviate the vector length inconsistency between the predicted novel weights and the pretrained base weights.
arXiv Detail & Related papers (2022-03-23T06:24:31Z)
TransKD: Transformer Knowledge Distillation for Efficient Semantic Segmentation [49.794142076551026]
Transformer-based Knowledge Distillation (TransKD) framework learns compact student transformers by distilling both feature maps and patch embeddings of large teacher transformers. Experiments on Cityscapes, ACDC, NYUv2, and Pascal VOC2012 datasets show that TransKD outperforms state-of-the-art distillation frameworks.
arXiv Detail & Related papers (2022-02-27T16:34:10Z)
Machine Learning Framework for Quantum Sampling of Highly-Constrained, Continuous Optimization Problems [101.18253437732933]
We develop a generic, machine learning-based framework for mapping continuous-space inverse design problems into surrogate unconstrained binary optimization problems. We showcase the framework's performance on two inverse design problems by optimizing (i) thermal emitter topologies for thermophotovoltaic applications and (ii) diffractive meta-gratings for highly efficient beam steering.
arXiv Detail & Related papers (2021-05-06T02:22:23Z)
GCN-RL Circuit Designer: Transferable Transistor Sizing with Graph Neural Networks and Reinforcement Learning [19.91205976441355]
We present GCN-RL Circuit Designer, leveraging reinforcement learning (RL) to transfer the knowledge between different technology nodes and topologies. Our learning-based optimization consistently achieves the highest Figures of Merit (FoM) on four different circuits.
arXiv Detail & Related papers (2020-04-30T17:58:07Z)
Efficient Crowd Counting via Structured Knowledge Transfer [122.30417437707759]
Crowd counting is an application-oriented task and its inference efficiency is crucial for real-world applications. We propose a novel Structured Knowledge Transfer framework to generate a lightweight but still highly effective student network. Our models obtain at least 6.5$\times$ speed-up on an Nvidia 1080 GPU and even achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-03-23T08:05:41Z)

This list is automatically generated from the titles and abstracts of the papers on this site.