Token Communication in the Era of Large Models: An Information Bottleneck-Based Approach
- URL: http://arxiv.org/abs/2507.01728v1
- Date: Wed, 02 Jul 2025 14:03:01 GMT
- Title: Token Communication in the Era of Large Models: An Information Bottleneck-Based Approach
- Authors: Hao Wei, Wanli Ni, Wen Wang, Wenjun Xu, Dusit Niyato, Ping Zhang
- Abstract summary: UniToCom is a unified token communication paradigm that treats tokens as the fundamental units for both processing and wireless transmission. We propose a generative information bottleneck (GenIB) principle, which facilitates the learning of tokens that preserve essential information. We employ a causal Transformer-based multimodal large language model (MLLM) at the receiver to unify the processing of both discrete and continuous tokens.
- Score: 55.861432910722186
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This letter proposes UniToCom, a unified token communication paradigm that treats tokens as the fundamental units for both processing and wireless transmission. Specifically, to enable efficient token representations, we propose a generative information bottleneck (GenIB) principle, which facilitates the learning of tokens that preserve essential information while supporting reliable generation across multiple modalities. In this way, GenIB-based tokenization improves communication efficiency and reduces computational complexity. Additionally, we develop $\sigma$-GenIB to address the challenge of variance collapse in autoregressive modeling, maintaining representational diversity and stability. Moreover, we employ a causal Transformer-based multimodal large language model (MLLM) at the receiver to unify the processing of both discrete and continuous tokens under the next-token prediction paradigm. Simulation results validate the effectiveness and superiority of the proposed UniToCom over baselines under dynamic channel conditions. By integrating token processing with MLLMs, UniToCom enables scalable and generalizable communication in support of multimodal understanding and generation, providing a potential solution for next-generation intelligent communications.
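The GenIB idea lends itself to a compact sketch. The snippet below is a minimal, hypothetical PyTorch rendering of an information-bottleneck-style token loss: a distortion term that keeps tokens useful for regeneration, a rate term that compresses them, and an illustrative variance floor standing in for the variance-collapse safeguard that $\sigma$-GenIB provides; the exact formulation in the paper may differ, and all names here are assumptions.

```python
import torch
import torch.nn.functional as F

def genib_style_loss(mu, log_var, recon, target, beta=1e-3, var_floor=0.1):
    """Information-bottleneck-style objective for latent tokens (sketch).

    mu, log_var : encoder outputs parameterizing q(z|x), shape (B, T, D)
    recon, target: decoder reconstruction and the original input
    beta         : weight of the rate (compression) term
    var_floor    : illustrative variance floor discouraging collapse
    """
    # Distortion term: keep the information needed to regenerate the source.
    distortion = F.mse_loss(recon, target)
    # Rate term: KL(q(z|x) || N(0, I)) compresses the token representation.
    rate = -0.5 * torch.mean(1 + log_var - mu.pow(2) - log_var.exp())
    # Variance regularizer (assumption, not the paper's formula): penalize
    # token variances that shrink below a floor, preserving diversity.
    var_penalty = F.relu(var_floor - log_var.exp()).mean()
    return distortion + beta * rate + var_penalty
```

In a sketch like this, `beta` trades compression against fidelity, which is where the claimed gains in communication efficiency would come from.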
Related papers
- Token Reduction Should Go Beyond Efficiency in Generative Models -- From Vision, Language to Multimodality [29.531450446701175]
This paper argues that token reduction should transcend its traditional efficiency-oriented role in the era of large generative models. We argue that token reduction can facilitate deeper multimodal integration and alignment, maintain coherence over long inputs, and enhance training stability. We outline promising future directions, including algorithm design, reinforcement learning-guided token reduction, token optimization for in-context learning, and broader ML and scientific domains.
arXiv Detail & Related papers (2025-05-23T11:30:30Z)
- Enhancing Latent Computation in Transformers with Latent Tokens [48.371764897314]
Augmenting large language models with auxiliary tokens has emerged as a promising strategy for enhancing model performance. We introduce a lightweight method termed latent tokens; these are dummy tokens that may be non-interpretable in natural language. The proposed latent tokens can be seamlessly integrated with a pre-trained Transformer, trained in a parameter-efficient manner, and applied flexibly at inference time.
arXiv Detail & Related papers (2025-05-19T02:35:53Z)
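The latent-token mechanism summarized above can be approximated in a few lines of PyTorch. The wrapper below is a hedged sketch, assuming the backbone accepts input embeddings directly; the class and argument names are hypothetical, not from the paper.

```python
import torch
import torch.nn as nn

class LatentTokenWrapper(nn.Module):
    """Prepend learnable 'latent' tokens to a frozen Transformer's input.

    Only the latent token embeddings are trained, so the method stays
    parameter-efficient; the pre-trained backbone's weights are untouched.
    """
    def __init__(self, backbone: nn.Module, d_model: int, num_latent: int = 8):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad_(False)  # keep the pre-trained model frozen
        self.latent = nn.Parameter(torch.randn(1, num_latent, d_model) * 0.02)

    def forward(self, token_embeddings: torch.Tensor) -> torch.Tensor:
        # token_embeddings: (batch, seq_len, d_model); backbone is assumed
        # to consume embeddings directly rather than token ids.
        batch = token_embeddings.size(0)
        latent = self.latent.expand(batch, -1, -1)
        return self.backbone(torch.cat([latent, token_embeddings], dim=1))
```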
- Token Communication-Driven Multimodal Large Models in Resource-Constrained Multiuser Networks [7.137830911253685]
Multimodal large models pose challenges for deploying intelligent applications at the resource-constrained wireless edge. These constraints manifest as limited bandwidth, computational capacity, and stringent latency requirements. We propose a token communication paradigm that facilitates the decentralized proliferation of such models across user devices and edge infrastructure.
arXiv Detail & Related papers (2025-05-06T14:17:05Z)
- Token Communications: A Unified Framework for Cross-modal Context-aware Semantic Communications [78.80966346820553]
We introduce token communications (TokCom), a large model-driven framework to leverage cross-modal context information in generative semantic communications (GenSC). In this paper, we introduce the potential opportunities and challenges of leveraging context in GenSC, explore how to integrate GFM/MLLM-based token processing into semantic communication systems, and present the key principles for efficient TokCom at various layers in future wireless networks.
arXiv Detail & Related papers (2025-02-17T18:14:18Z)
- Semantic Equitable Clustering: A Simple and Effective Strategy for Clustering Vision Tokens [57.37893387775829]
We introduce a fast and balanced clustering method, named Semantic Equitable Clustering (SEC). SEC clusters tokens based on their global semantic relevance in an efficient, straightforward manner. We also propose a versatile vision backbone, SECViT, which can serve as a vision-language connector.
arXiv Detail & Related papers (2024-05-22T04:49:00Z)
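As a rough illustration of equitable token clustering, the helper below ranks tokens by similarity to a global summary vector and splits them into roughly equal-sized groups. This is a simplified reading of the idea, not the published SEC algorithm, and the function name is hypothetical.

```python
import torch
import torch.nn.functional as F

def equitable_token_clusters(tokens: torch.Tensor, num_clusters: int):
    """Toy equal-size token clustering by global semantic relevance.

    tokens: (seq_len, d_model). Rank tokens by cosine similarity to a
    global summary (the mean token), then split the ranking into
    roughly equal-sized index groups.
    """
    global_query = tokens.mean(dim=0, keepdim=True)                 # (1, d)
    relevance = F.cosine_similarity(tokens, global_query)           # (seq_len,)
    order = relevance.argsort(descending=True)
    return order.chunk(num_clusters)  # tuple of index groups, one per cluster
```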
- Adaptive Semantic Token Selection for AI-native Goal-oriented Communications [11.92172357956248]
We propose a novel design for AI-native goal-oriented communications.
We exploit transformer neural networks under dynamic inference constraints on bandwidth and computation.
We show that our model improves over state-of-the-art token selection mechanisms.
arXiv Detail & Related papers (2024-04-25T13:49:50Z)
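Adaptive token selection of the kind described above can be sketched as a learned scorer plus a run-time budget. The module below is a hypothetical illustration, not the paper's exact mechanism; `keep_ratio` stands in for the dynamic bandwidth/computation constraint.

```python
import torch
import torch.nn as nn

class AdaptiveTokenSelector(nn.Module):
    """Keep only the highest-scoring tokens, with a budget set at run time."""

    def __init__(self, d_model: int):
        super().__init__()
        self.scorer = nn.Linear(d_model, 1)  # learned per-token importance

    def forward(self, tokens: torch.Tensor, keep_ratio: float) -> torch.Tensor:
        # tokens: (batch, seq_len, d_model); keep_ratio in (0, 1] reflects
        # the bandwidth/compute currently available.
        scores = self.scorer(tokens).squeeze(-1)                   # (batch, seq_len)
        k = max(1, int(tokens.size(1) * keep_ratio))
        top_idx = scores.topk(k, dim=1).indices.sort(dim=1).values # keep original order
        gather_idx = top_idx.unsqueeze(-1).expand(-1, -1, tokens.size(-1))
        return tokens.gather(1, gather_idx)                        # (batch, k, d_model)
```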
- Large AI Model Empowered Multimodal Semantic Communications [48.73159237649128]
We propose a Large AI Model-based Multimodal SC (LAMMSC) framework.
We first present the Conditional-based Multimodal Alignment (MMA) that enables the transformation between multimodal and unimodal data.
Then, a personalized LLM-based Knowledge Base (LKB) is proposed, which allows users to perform personalized semantic extraction or recovery.
Finally, we apply the Generative adversarial network-based channel Estimation (CGE) for estimating the wireless channel state information.
arXiv Detail & Related papers (2023-09-03T19:24:34Z)
- Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
VAEs with a strong auto-regressive decoder tend to ignore the latent variables.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z)
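For readers unfamiliar with discrete latent bottlenecks, the module below shows one standard concrete form, vector quantization with a straight-through estimator, which keeps the decoder dependent on the latent. It is a generic sketch; the cited paper's implicit latent feature matching may work differently.

```python
import torch
import torch.nn as nn

class DiscreteLatentBottleneck(nn.Module):
    """Vector-quantization bottleneck: snap each latent to its nearest code."""

    def __init__(self, num_codes: int, d_latent: int):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, d_latent)

    def forward(self, z: torch.Tensor):
        # z: (batch, d_latent); find the nearest codebook vector per example.
        dists = torch.cdist(z, self.codebook.weight)   # (batch, num_codes)
        codes = dists.argmin(dim=1)                    # (batch,)
        z_q = self.codebook(codes)
        # Straight-through estimator so gradients still reach the encoder.
        z_q = z + (z_q - z).detach()
        return z_q, codes
```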