Wireless TokenCom: RL-Based Tokenizer Agreement for Multi-User Wireless Token Communications
- URL: http://arxiv.org/abs/2602.12338v1
- Date: Thu, 12 Feb 2026 19:00:33 GMT
- Title: Wireless TokenCom: RL-Based Tokenizer Agreement for Multi-User Wireless Token Communications
- Authors: Farshad Zeinali, Mahdi Boloursaz Mashhadi, Dusit Niyato, Rahim Tafazolli
- Abstract summary: Token Communications (TokenCom) has recently emerged as an effective new paradigm, where tokens are the unified units of multimodal communications and computations. We investigate a multi-user downlink wireless TokenCom scenario, where the base station transmits video token streams to multiple users.
- Score: 59.84545048095092
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Token Communications (TokenCom) has recently emerged as an effective new paradigm, where tokens are the unified units of multimodal communications and computations, enabling efficient digital semantic- and goal-oriented communications in future wireless networks. To establish a shared semantic latent space, the transmitters/receivers in TokenCom need to agree on an identical tokenizer model and codebook. To this end, an initial Tokenizer Agreement (TA) process is carried out in each communication episode, where the transmitter/receiver cooperate to choose from a set of pre-trained tokenizer models/codebooks available to them both for efficient TokenCom. In this correspondence, we investigate TA in a multi-user downlink wireless TokenCom scenario, where the base station equipped with multiple antennas transmits video token streams to multiple users. We formulate the corresponding mixed-integer non-convex problem, and propose a hybrid reinforcement learning (RL) framework that integrates a deep Q-network (DQN) for joint tokenizer agreement and sub-channel assignment, with a deep deterministic policy gradient (DDPG) for beamforming. Simulation results show that the proposed framework outperforms baseline methods in terms of semantic quality and resource efficiency, while reducing the freezing events in video transmission by 68% compared to the conventional H.265-based scheme.
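The hybrid action described in the abstract (a discrete choice of tokenizer and sub-channel, plus a continuous beamforming vector) can be illustrated with a minimal Python sketch. This is not the paper's implementation: it covers a single user only, and the function names, the flat action-index encoding, the epsilon-greedy rule, and the total-power constraint are all assumptions chosen for illustration.

```python
import math
import random


def decode_discrete_action(index, num_tokenizers, num_subchannels):
    """Map a flat DQN-style action index to (tokenizer_id, subchannel_id).

    Hypothetical encoding: index = tokenizer_id * num_subchannels + subchannel_id.
    """
    if not 0 <= index < num_tokenizers * num_subchannels:
        raise ValueError("action index out of range")
    return divmod(index, num_subchannels)


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the highest-Q action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def scale_to_power_budget(beam, max_power):
    """Scale complex beamforming weights so that total power <= max_power.

    Stands in for the projection a DDPG-style actor output might need;
    the power constraint itself is an assumption, not taken from the paper.
    """
    power = sum(abs(w) ** 2 for w in beam)
    if power == 0 or power <= max_power:
        return list(beam)
    scale = math.sqrt(max_power / power)
    return [w * scale for w in beam]


# Usage: 3 candidate tokenizers, 4 sub-channels, 2 transmit antennas.
rng = random.Random(0)
q_values = [0.0] * 12
q_values[5] = 1.0                               # pretend index 5 has the best Q-value
action = epsilon_greedy(q_values, 0.0, rng)     # epsilon = 0, so purely greedy
tokenizer_id, subchannel_id = decode_discrete_action(action, 3, 4)
beam = scale_to_power_budget([3 + 4j, 0j], max_power=1.0)
print(tokenizer_id, subchannel_id, beam)
```

Flattening the two discrete choices into one index keeps the value network's output a single vector, while the continuous beamforming weights are handled separately. A multi-user version would need a larger or factored discrete action space.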
Related papers
- Video TokenCom: Textual Intent-Guided Multi-Rate Video Token Communications with UEP-Based Adaptive Source-Channel Coding [24.169863403324314]
Token Communication (TokenCom) is a new paradigm, motivated by the recent success of Large AI Models (LAMs) and Multimodal Large Language Models (MLLMs). We propose a novel Video TokenCom framework for textual intent-guided multi-rate video communication.
arXiv Detail & Related papers (2026-03-02T23:36:38Z)
- Context-Aware Iterative Token Detection and Masked Transmission for Wireless Token Communication [20.850802765685145]
We propose a context-aware token communication framework that uses a shared contextual probability model between the transmitter (Tx) and receiver (Rx). We introduce a context-aware masking strategy that skips the transmission of highly predictable tokens to reduce the transmission rate.
arXiv Detail & Related papers (2026-01-25T10:10:51Z)
- Token Communication in the Era of Large Models: An Information Bottleneck-Based Approach [55.861432910722186]
UniToCom is a unified token communication paradigm that treats tokens as the fundamental units for both processing and wireless transmission. We propose a generative information bottleneck (GenIB) principle, which facilitates the learning of tokens that preserve essential information. We employ a causal Transformer-based multimodal large language model (MLLM) at the receiver to unify the processing of both discrete and continuous tokens.
arXiv Detail & Related papers (2025-07-02T14:03:01Z)
- Modeling and Performance Analysis for Semantic Communications Based on Empirical Results [53.805458017074294]
We propose an Alpha-Beta-Gamma (ABG) formula to model the relationship between the end-to-end measurement and SNR. For image reconstruction tasks, the proposed ABG formula fits commonly used DL networks, such as SCUNet and Vision Transformer, well. To the best of our knowledge, this is the first theoretical expression between end-to-end performance metrics and SNR for semantic communications.
arXiv Detail & Related papers (2025-04-29T06:07:50Z)
- Token Communications: A Large Model-Driven Framework for Cross-modal Context-aware Semantic Communications [78.80966346820553]
We introduce token communications (TokCom), a large model-driven framework to leverage cross-modal context information in generative semantic communications (GenSC). In this paper, we introduce the potential opportunities and challenges of leveraging context in GenSC, explore how to integrate GFM/MLLMs-based token processing into semantic communication systems, and present the key principles for efficient TokCom at various layers in future wireless networks.
arXiv Detail & Related papers (2025-02-17T18:14:18Z)
- Semantic Communication for Cooperative Perception using HARQ [51.148203799109304]
We leverage an importance map to distill critical semantic information, introducing a cooperative perception semantic communication framework.
To counter the challenges posed by time-varying multipath fading, our approach incorporates orthogonal frequency-division multiplexing (OFDM) along with channel estimation and equalization strategies.
We introduce a novel semantic error detection method that is integrated with our semantic communication framework in the spirit of hybrid automatic repeat request (HARQ).
arXiv Detail & Related papers (2024-08-29T08:53:26Z)
- Over-the-Air Split Machine Learning in Wireless MIMO Networks [56.27831295707334]
In split machine learning (ML), different partitions of a neural network (NN) are executed by different computing nodes.
To ease the communication burden, over-the-air computation (OAC) can efficiently implement all or part of the computation concurrently with communication.
arXiv Detail & Related papers (2022-10-07T15:39:11Z)
- End-to-End Learning for Uplink MU-SIMO Joint Transmitter and Non-Coherent Receiver Design in Fading Channels [11.182920270301304]
A novel end-to-end learning approach, namely JTRD-Net, is proposed for uplink multiuser single-input multiple-output (MU-SIMO) joint transmitter and non-coherent receiver design (JTRD) in fading channels.
The transmitter side is modeled as a group of parallel linear layers, which are responsible for multiuser waveform design.
The non-coherent receiver is formed by a deep feed-forward neural network (DFNN) so as to provide multiuser detection (MUD) capabilities.
arXiv Detail & Related papers (2021-05-04T02:47:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.