Related papers: Conquering High Packet-Loss Erasure: MoE Swin Transformer-Based Video Semantic Communication

URL: http://arxiv.org/abs/2508.01205v1
Date: Sat, 02 Aug 2025 05:41:52 GMT
Title: Conquering High Packet-Loss Erasure: MoE Swin Transformer-Based Video Semantic Communication
Authors: Lei Teng, Senran Fan, Chen Dong, Haotai Liang, Zhicheng Bao, Xiaodong Xu, Rui Meng, Ping Zhang
Abstract summary: A packet-loss-resistant MoE Swin Transformer-based Video Semantic Communication (MSTVSC) system is proposed in this paper.
Score: 11.845717685362814
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Semantic communication with joint semantic-channel coding robustly transmits diverse data modalities but faces challenges in mitigating semantic information loss due to packet drops in packet-based systems. Under current protocols, packets with errors are discarded, preventing the receiver from utilizing erroneous semantic data for robust decoding. To address this issue, a packet-loss-resistant MoE Swin Transformer-based Video Semantic Communication (MSTVSC) system is proposed in this paper. Semantic vectors are encoded by MSTVSC and transmitted through upper-layer protocol packetization. To investigate the impact of the packetization, a theoretical analysis of the packetization strategy is provided. To mitigate the semantic loss caused by packet loss, a 3D CNN at the receiver recovers missing information using un-lost semantic data and a packet-loss mask matrix. Semantic-level interleaving is employed to reduce concentrated semantic loss from packet drops. To improve compression, a common-individual decomposition approach is adopted, with downsampling applied to individual information to minimize redundancy. The model is made lightweight for practical deployment. Extensive simulations and comparisons demonstrate strong performance, achieving an MS-SSIM greater than 0.6 and a PSNR exceeding 20 dB at a 90% packet loss rate.
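The abstract names two receiver-side aids, semantic-level interleaving and a packet-loss mask matrix, without giving their exact form. The following is a minimal sketch of the general idea, not the paper's implementation: it assumes a simple round-robin interleaver over a 1-D vector of semantic symbols, and a binary mask (1 = received, 0 = erased). The function names `contiguous`, `interleave`, `loss_mask`, `longest_gap`, and `receiver_input` are hypothetical and chosen only for illustration.

```python
import numpy as np

def contiguous(n_symbols, n_packets):
    """Baseline packetization: consecutive symbols share a packet."""
    per_packet = -(-n_symbols // n_packets)  # ceiling division
    return np.arange(n_symbols) // per_packet

def interleave(n_symbols, n_packets):
    """Assumed round-robin interleaver: symbol i goes to packet i % n_packets."""
    return np.arange(n_symbols) % n_packets

def loss_mask(assignment, lost_packets):
    """Binary mask: 1 where the symbol's packet arrived, 0 where it was dropped."""
    return (~np.isin(assignment, lost_packets)).astype(np.uint8)

def longest_gap(mask):
    """Length of the longest run of consecutive erased symbols."""
    longest = run = 0
    for received in mask:
        run = 0 if received else run + 1
        longest = max(longest, run)
    return longest

def receiver_input(symbols, mask):
    """Zero-fill erased symbols and stack the mask as a second channel,
    so a recovery network can tell erased positions from true zeros."""
    return np.stack([symbols * mask, mask.astype(symbols.dtype)])

if __name__ == "__main__":
    n_symbols, n_packets, lost = 64, 8, [2]
    for name, scheme in (("contiguous", contiguous), ("interleaved", interleave)):
        mask = loss_mask(scheme(n_symbols, n_packets), lost)
        print(name, "erased:", int((mask == 0).sum()),
              "longest gap:", longest_gap(mask))
```

Both schemes lose the same number of symbols when a packet is dropped; the interleaved layout only spreads the erasures out so that each missing symbol keeps received neighbours, which is the property the abstract describes as reducing concentrated semantic loss.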

Related papers

Send Less, Perceive More: Masked Quantized Point Cloud Communication for Loss-Tolerant Collaborative Perception [38.10779821259225]
We introduce QPoint2Comm, a quantized point-cloud communication framework that dramatically reduces bandwidth. QPoint2Comm directly communicates quantized point-cloud indices using a shared codebook. We employ a masked training strategy that simulates random packet loss, allowing the model to maintain strong performance even under severe transmission failures.
arXiv Detail & Related papers (2026-02-25T08:00:48Z)
Large Speech Model Enabled Semantic Communication [58.027223937172955]
We present a Large Speech Model enabled Semantic Communication (LargeSC) system. We exploit the rich semantic knowledge embedded in large models and enable adaptive transmission over lossy channels. The system supports bandwidths ranging from 550 bps to 2.06 kbps and outperforms conventional baselines in speech quality under high packet loss rates.
arXiv Detail & Related papers (2025-12-04T11:58:08Z)
Prediction-Powered Communication with Distortion Guarantees [65.37485275954224]
We study a prediction-powered communication setting, in which devices communicate under zero-delay constraints with strict distortion guarantees. We propose two zero-delay compression algorithms leveraging online conformal prediction to provide per-sequence guarantees on the distortion of reconstructed sequences. Experiments on semantic text compression validate the approach, showing significant bit rate reductions.
arXiv Detail & Related papers (2025-09-29T07:19:39Z)
Generative Feature Imputing - A Technique for Error-resilient Semantic Communication [46.46641562787869]
This paper proposes a novel framework, termed generative feature imputing, which comprises three key techniques. First, we introduce a spatial error concentration packetization strategy that spatially concentrates feature distortions by encoding feature elements based on their channel mappings. Second, we propose a generative feature imputing method that utilizes a diffusion model to efficiently reconstruct missing features caused by packet losses. Third, we develop a semantic-aware power allocation scheme that enables unequal error protection by allocating transmission power according to the semantic importance of each packet.
arXiv Detail & Related papers (2025-08-25T12:19:48Z)
Distributed Training under Packet Loss [8.613477072763404]
Leveraging unreliable connections will reduce latency but may sacrifice model accuracy and convergence once packets are dropped. We introduce a principled, end-to-end solution that preserves accuracy and convergence guarantees under genuine packet loss. This work bridges the gap between communication-efficient protocols and the accuracy and guarantees demanded by modern large-model training.
arXiv Detail & Related papers (2025-07-02T11:07:20Z)
MIETT: Multi-Instance Encrypted Traffic Transformer for Encrypted Traffic Classification [59.96233305733875]
Classifying traffic is essential for detecting security threats and optimizing network management. We propose a Multi-Instance Encrypted Traffic Transformer (MIETT) to capture both token-level and packet-level relationships. MIETT achieves results across five datasets, demonstrating its effectiveness in classifying encrypted traffic and understanding complex network behaviors.
arXiv Detail & Related papers (2024-12-19T12:52:53Z)
Synchronous Multi-modal Semantic Communication System with Packet-level Coding [20.397350999784276]
We propose a Synchronous Multimodal Semantic Communication System (SyncSC) with Packet-Level Coding. To achieve semantic and time synchronization, 3D Morphable Model (3DMM) coefficients and text are transmitted as semantics. To protect semantic packets under the erasure channel, we propose a packet-level Forward Error Correction (FEC) method, called PacSC, that maintains a certain visual quality performance even at high packet loss rates.
arXiv Detail & Related papers (2024-08-08T15:42:00Z)
Diffusion-Driven Semantic Communication for Generative Models with Bandwidth Constraints [66.63250537475973]
This paper introduces a diffusion-driven semantic communication framework with advanced VAE-based compression for bandwidth-constrained generative models. Our experimental results demonstrate significant improvements in pixel-level metrics like peak signal-to-noise ratio (PSNR) and semantic metrics like learned perceptual image patch similarity (LPIPS).
arXiv Detail & Related papers (2024-07-26T02:34:25Z)
Secure Semantic Communication via Paired Adversarial Residual Networks [59.468221305630784]
This letter explores the positive side of adversarial attacks for security-aware semantic communication systems. A pair of matching pluggable modules is installed: one after the semantic transmitter and the other before the semantic receiver. The proposed scheme is capable of fooling the eavesdropper while maintaining high-quality semantic communication.
arXiv Detail & Related papers (2024-07-02T08:32:20Z)
Latent Diffusion Model-Enabled Low-Latency Semantic Communication in the Presence of Semantic Ambiguities and Wireless Channel Noises [18.539501941328393]
This paper develops a latent diffusion model-enabled SemCom system to handle outliers in source data. A lightweight single-layer latent space transformation adapter completes one-shot learning at the transmitter. An end-to-end consistency distillation strategy is used to distill the diffusion models trained in latent space.
arXiv Detail & Related papers (2024-06-09T23:39:31Z)
A Transformer-Based Framework for Payload Malware Detection and Classification [0.0]
Techniques such as Deep Packet Inspection (DPI) have been introduced to allow IDSs to analyze the content of network packets. In this paper, we propose a revolutionary DPI algorithm based on transformers adapted for the purpose of detecting malicious traffic.
arXiv Detail & Related papers (2024-03-27T03:25:45Z)
Spatiotemporal Attention-based Semantic Compression for Real-time Video Recognition [117.98023585449808]
We propose a spatiotemporal attention-based autoencoder (STAE) architecture to evaluate the importance of frames and pixels in each frame. We develop a lightweight decoder that leverages a combined 3D-2D CNN to reconstruct missing information. Experimental results show that ViT_STAE can compress the video dataset HMDB51 by 104x with only 5% accuracy loss.
arXiv Detail & Related papers (2023-05-22T07:47:27Z)
Is Semantic Communications Secure? A Tale of Multi-Domain Adversarial Attacks [70.51799606279883]
We introduce test-time adversarial attacks on deep neural networks (DNNs) for semantic communications. We show that it is possible to change the semantics of the transferred information even when the reconstruction loss remains low.
arXiv Detail & Related papers (2022-12-20T17:13:22Z)
Reducing Redundancy in the Bottleneck Representation of the Autoencoders [98.78384185493624]
Autoencoders are a type of unsupervised neural network that can be used to solve various tasks. We propose a scheme to explicitly penalize feature redundancies in the bottleneck representation. We tested our approach across different tasks: dimensionality reduction using three different datasets, image compression using the MNIST dataset, and image denoising using Fashion MNIST.
arXiv Detail & Related papers (2022-02-09T18:48:02Z)
Packet-Loss-Tolerant Split Inference for Delay-Sensitive Deep Learning in Lossy Wireless Networks [4.932130498861988]
In distributed inference, computational tasks are offloaded from the IoT device to other devices or the edge server via lossy IoT networks. Narrow-band and lossy IoT networks cause non-negligible packet losses and retransmissions, resulting in non-negligible communication latency. We propose a split inference with no retransmissions (SI-NR) method that achieves high accuracy without any retransmissions, even when packet loss occurs.
arXiv Detail & Related papers (2021-04-28T08:28:22Z)

This list is automatically generated from the titles and abstracts of the papers on this site.