Related papers: A Hybrid Approach combining ANN-based and Conventional Demapping in Communication for Efficient FPGA-Implementation

A Hybrid Approach combining ANN-based and Conventional Demapping in Communication for Efficient FPGA-Implementation

URL: http://arxiv.org/abs/2304.05042v1
Date: Tue, 11 Apr 2023 07:58:01 GMT
Title: A Hybrid Approach combining ANN-based and Conventional Demapping in Communication for Efficient FPGA-Implementation
Authors: Jonas Ney, Bilal Hammoud, Norbert Wehn
Abstract summary: Autoencoder (AE) refers to the concept of replacing parts of the transmitter and receiver by artificial neural networks (ANNs) We propose a novel approach for efficient ANN-based remapping on FPGAs, which combines the adaptability of the AE with the efficiency of conventional demapping algorithms. Our work opens a door for the practical application of ANN-based communication algorithms on FPGAs.
Score: 6.072680828922663
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In communication systems, Autoencoder (AE) refers to the concept of replacing parts of the transmitter and receiver by artificial neural networks (ANNs) to train the system end-to-end over a channel model. This approach aims to improve communication performance, especially for varying channel conditions, with the cost of high computational complexity for training and inference. Field-programmable gate arrays (FPGAs) have been shown to be a suitable platform for energy-efficient ANN implementation. However, the high number of operations and the large model size of ANNs limit the performance on resource-constrained devices, which is critical for low latency and high-throughput communication systems. To tackle his challenge, we propose a novel approach for efficient ANN-based remapping on FPGAs, which combines the adaptability of the AE with the efficiency of conventional demapping algorithms. After adaption to channel conditions, the channel characteristics, implicitly learned by the ANN, are extracted to enable the use of optimized conventional demapping algorithms for inference. We validate the hardware efficiency of our approach by providing FPGA implementation results and by comparing the communication performance to that of conventional systems. Our work opens a door for the practical application of ANN-based communication algorithms on FPGAs.

Related papers

FPGA-based Acceleration for Convolutional Neural Networks: A Comprehensive Review [3.7810245817090906]
Convolutional Neural Networks (CNNs) are fundamental to deep learning, driving applications across various domains.<n>This paper provides a comprehensive review of FPGA-based hardware accelerators specifically designed for CNNs.
arXiv Detail & Related papers (2025-05-04T04:03:37Z)
Communication-Efficient Federated Learning by Quantized Variance Reduction for Heterogeneous Wireless Edge Networks [55.467288506826755]
Federated learning (FL) has been recognized as a viable solution for local-privacy-aware collaborative model training in wireless edge networks. Most existing communication-efficient FL algorithms fail to reduce the significant inter-device variance. We propose a novel communication-efficient FL algorithm, named FedQVR, which relies on a sophisticated variance-reduced scheme.
arXiv Detail & Related papers (2025-01-20T04:26:21Z)
FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression [55.992528247880685]
Decentralized training faces significant challenges regarding system design and efficiency. We present FusionLLM, a decentralized training system designed and implemented for training large deep neural networks (DNNs) We show that our system and method can achieve 1.45 - 9.39x speedup compared to baseline methods while ensuring convergence.
arXiv Detail & Related papers (2024-10-16T16:13:19Z)
Efficient Edge AI: Deploying Convolutional Neural Networks on FPGA with the Gemmini Accelerator [0.5714074111744111]
We present and end-to-end workflow for deployment of CNNs on Field Programmable Gate Arrays (FPGAs) using the Gemmini accelerator. We were able to achieve real-time performance by deploying a YOLOv7 model on a Xilinx ZCU102 FPGA with an energy efficiency of 36.5 GOP/s/W.
arXiv Detail & Related papers (2024-08-14T09:24:00Z)
CNN-Based Equalization for Communications: Achieving Gigabit Throughput with a Flexible FPGA Hardware Architecture [6.081142345739704]
We present a high-performance FPGA implementation of an ANN-based equalizer, which meets the throughput requirements of modern optical communication systems. The implementation is based on a cross-layer design approach featuring optimizations from the algorithm down to the hardware architecture. The corresponding FPGA implementation achieves a throughput of more than 40 GBd, outperforming a high-performance graphics processing unit.
arXiv Detail & Related papers (2024-04-22T09:13:47Z)
Reconfigurable Distributed FPGA Cluster Design for Deep Learning Accelerators [59.11160990637615]
We propose a distributed system based on lowpower embedded FPGAs designed for edge computing applications. The proposed system can simultaneously execute diverse Neural Network (NN) models, arrange the graph in a pipeline structure, and manually allocate greater resources to the most computationally intensive layers of the NN graph.
arXiv Detail & Related papers (2023-05-24T16:08:55Z)
Deep Deterministic Policy Gradient for End-to-End Communication Systems without Prior Channel Knowledge [8.48741007380969]
End-to-End (E2E) learning-based concept has been recently introduced to jointly optimize both the transmitter and the receiver in wireless communication systems. This paper aims to solve this issue by developing a deep deterministic policy gradient (DDPG)-based framework.
arXiv Detail & Related papers (2023-05-12T13:05:32Z)
Unsupervised ANN-Based Equalizer and Its Trainable FPGA Implementation [5.487336551142519]
We present a novel ANN-based, unsupervised equalizer and its trainable field programmable gate array (FPGA) implementation. As a first step towards a practical communication system, we design an efficient FPGA implementation of our proposed algorithm, which achieves a throughput in the order of Gbit/s.
arXiv Detail & Related papers (2023-04-14T08:17:05Z)
FPGA-based AI Smart NICs for Scalable Distributed AI Training Systems [62.20308752994373]
We propose a new smart network interface card (NIC) for distributed AI training systems using field-programmable gate arrays (FPGAs) Our proposed FPGA-based AI smart NIC enhances overall training performance by 1.6x at 6 nodes, with an estimated 2.5x performance improvement at 32 nodes, compared to the baseline system using conventional NICs.
arXiv Detail & Related papers (2022-04-22T21:57:00Z)
Optimal Power Allocation for Rate Splitting Communications with Deep Reinforcement Learning [61.91604046990993]
This letter introduces a novel framework to optimize the power allocation for users in a Rate Splitting Multiple Access network. In the network, messages intended for users are split into different parts that are a single common part and respective private parts.
arXiv Detail & Related papers (2021-07-01T06:32:49Z)
HALF: Holistic Auto Machine Learning for FPGAs [1.9146960682777232]
Deep Neural Networks (DNNs) are capable of solving complex problems in domains related to embedded systems, such as image and natural language processing. To efficiently implement DNNs on a specific FPGA platform for a given cost criterion, e.g. energy efficiency, an enormous amount of design parameters has to be considered. An automatic, holistic design approach can improve the quality of DNN implementations on FPGA significantly.
arXiv Detail & Related papers (2021-06-28T14:45:47Z)
Data-Driven Random Access Optimization in Multi-Cell IoT Networks with NOMA [78.60275748518589]
Non-orthogonal multiple access (NOMA) is a key technology to enable massive machine type communications (mMTC) in 5G networks and beyond. In this paper, NOMA is applied to improve the random access efficiency in high-density spatially-distributed multi-cell wireless IoT networks. A novel formulation of random channel access management is proposed, in which the transmission probability of each IoT device is tuned to maximize the geometric mean of users' expected capacity.
arXiv Detail & Related papers (2021-01-02T15:21:08Z)
Optimization-driven Deep Reinforcement Learning for Robust Beamforming in IRS-assisted Wireless Communications [54.610318402371185]
Intelligent reflecting surface (IRS) is a promising technology to assist downlink information transmissions from a multi-antenna access point (AP) to a receiver. We minimize the AP's transmit power by a joint optimization of the AP's active beamforming and the IRS's passive beamforming. We propose a deep reinforcement learning (DRL) approach that can adapt the beamforming strategies from past experiences.
arXiv Detail & Related papers (2020-05-25T01:42:55Z)

This list is automatically generated from the titles and abstracts of the papers in this site.