Related papers: Multi-View Wireless Sensing via Conditional Generative Learning: Framework and Model Design

Multi-View Wireless Sensing via Conditional Generative Learning: Framework and Model Design

URL: http://arxiv.org/abs/2505.12664v1
Date: Mon, 19 May 2025 03:27:24 GMT
Title: Multi-View Wireless Sensing via Conditional Generative Learning: Framework and Model Design
Authors: Ziqing Xing, Zhaoyang Zhang, Zirui Chen, Hongning Ruan, Zhaohui Yang,
Abstract summary: We incorporate physical knowledge into learning-based high-precision target sensing using the multi-view channel state information (CSI) between multiple base stations (BSs) and user equipment (UEs)<n>We design a bipartite neural network architecture, the first part of which uses an elaborately designed encoder to fuse the latent target features embedded in the multi-view CSI.<n>The second part uses them as conditioning inputs of a powerful generative model to guide the target's reconstruction.
Score: 16.839582722091635
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this paper, we incorporate physical knowledge into learning-based high-precision target sensing using the multi-view channel state information (CSI) between multiple base stations (BSs) and user equipment (UEs). Such kind of multi-view sensing problem can be naturally cast into a conditional generation framework. To this end, we design a bipartite neural network architecture, the first part of which uses an elaborately designed encoder to fuse the latent target features embedded in the multi-view CSI, and then the second uses them as conditioning inputs of a powerful generative model to guide the target's reconstruction. Specifically, the encoder is designed to capture the physical correlation between the CSI and the target, and also be adaptive to the numbers and positions of BS-UE pairs. Therein the view-specific nature of CSI is assimilated by introducing a spatial positional embedding scheme, which exploits the structure of electromagnetic(EM)-wave propagation channels. Finally, a conditional diffusion model with a weighted loss is employed to generate the target's point cloud from the fused features. Extensive numerical results demonstrate that the proposed generative multi-view (Gen-MV) sensing framework exhibits excellent flexibility and significant performance improvement on the reconstruction quality of target's shape and EM properties.

Related papers

AuxDet: Auxiliary Metadata Matters for Omni-Domain Infrared Small Target Detection [58.67129770371016]
We propose a novel IRSTD framework that reimagines the IRSTD paradigm by incorporating textual metadata for scene-aware optimization.<n>AuxDet consistently outperforms state-of-the-art methods, validating the critical role of auxiliary information in improving robustness and accuracy.
arXiv Detail & Related papers (2025-05-21T07:02:05Z)
Channel Fingerprint Construction for Massive MIMO: A Deep Conditional Generative Approach [65.47969413708344]
We introduce the concept of CF twins and design a conditional generative diffusion model (CGDM)<n>We employ a variational inference technique to derive the evidence lower bound (ELBO) for the log-marginal distribution of the observed fine-grained CF conditioned on the coarse-grained CF.<n>We show that the proposed approach exhibits significant improvement in reconstruction performance compared to the baselines.
arXiv Detail & Related papers (2025-05-12T01:36:06Z)
Towards Efficient Real-Time Video Motion Transfer via Generative Time Series Modeling [7.3949576464066]
We propose a deep learning framework designed to significantly optimize bandwidth for motion-transfer-enabled video applications.<n>To capture complex motion effectively, we utilize the First Order Motion Model (FOMM), which encodes dynamic objects by detecting keypoints.<n>We validate our results across three datasets for video animation and reconstruction using the following metrics: Mean Absolute Error, Joint Embedding Predictive Architecture Embedding Distance, Structural Similarity Index, and Average Pair-wise Displacement.
arXiv Detail & Related papers (2025-04-07T22:21:54Z)
PCE-GAN: A Generative Adversarial Network for Point Cloud Attribute Quality Enhancement based on Optimal Transport [56.56430888985025]
We propose a generative adversarial network for point cloud quality enhancement (PCE-GAN)<n>The generator consists of a local feature extraction (LFE) unit, a global spatial correlation (GSC) unit and a feature squeeze unit.<n>The discriminator computes the deviation between the probability distributions of the enhanced point cloud and the original point cloud, guiding the generator to achieve high quality reconstruction.
arXiv Detail & Related papers (2025-02-26T07:34:33Z)
Model-driven deep neural network for enhanced direction finding with commodity 5G gNodeB [30.94668439883861]
Current wireless networks heavily rely on pure model-driven techniques to achieve positioning functionality.<n>Here we reformulate the direction finding or angle-of-arrival (AoA) estimation problem as an image recovery task of the spatial spectrum.<n>We show that the proposed MoD-DNN framework enables effective spectrum calibration and accurate AoA estimation.
arXiv Detail & Related papers (2024-12-14T02:09:36Z)
Predicting Transonic Flowfields in Non-Homogeneous Unstructured Grids Using Autoencoder Graph Convolutional Networks [0.0]
This paper focuses on addressing challenges posed by non-homogeneous unstructured grids, commonly used in Computational Fluid Dynamics (CFD) The core of our approach centers on geometric deep learning, specifically the utilization of graph convolutional network (GCN) The novel Autoencoder GCN architecture enhances prediction accuracy by propagating information to distant nodes and emphasizing influential points.
arXiv Detail & Related papers (2024-05-07T15:18:21Z)
CFPFormer: Feature-pyramid like Transformer Decoder for Segmentation and Detection [1.837431956557716]
We propose Cross Feature Pyramid Transformer decoder (CFPFormer), a novel decoder block that integrates feature pyramids and transformers.<n>Our work is capable of capturing long-range dependencies and effectively up-sample feature maps.<n>With a ResNet50 backbone, our method achieves 92.02% Dice Score, highlighting the efficacy of our methods.
arXiv Detail & Related papers (2024-04-23T18:46:07Z)
Agent-driven Generative Semantic Communication with Cross-Modality and Prediction [57.335922373309074]
We propose a novel agent-driven generative semantic communication framework based on reinforcement learning. In this work, we develop an agent-assisted semantic encoder with cross-modality capability, which can track the semantic changes, channel condition, to perform adaptive semantic extraction and sampling. The effectiveness of the designed models has been verified using the UA-DETRAC dataset, demonstrating the performance gains of the overall A-GSC framework.
arXiv Detail & Related papers (2024-04-10T13:24:27Z)
Multiscale Graph Neural Network Autoencoders for Interpretable Scientific Machine Learning [0.0]
The goal of this work is to address two limitations in autoencoder-based models: latent space interpretability and compatibility with unstructured meshes. This is accomplished here with the development of a novel graph neural network (GNN) autoencoding architecture with demonstrations on complex fluid flow applications.
arXiv Detail & Related papers (2023-02-13T08:47:11Z)
Vision Transformer with Convolutions Architecture Search [72.70461709267497]
We propose an architecture search method-Vision Transformer with Convolutions Architecture Search (VTCAS) The high-performance backbone network searched by VTCAS introduces the desirable features of convolutional neural networks into the Transformer architecture. It enhances the robustness of the neural network for object recognition, especially in the low illumination indoor scene.
arXiv Detail & Related papers (2022-03-20T02:59:51Z)
RRNet: Relational Reasoning Network with Parallel Multi-scale Attention for Salient Object Detection in Optical Remote Sensing Images [82.1679766706423]
Salient object detection (SOD) for optical remote sensing images (RSIs) aims at locating and extracting visually distinctive objects/regions from the optical RSIs. We propose a relational reasoning network with parallel multi-scale attention for SOD in optical RSIs. Our proposed RRNet outperforms the existing state-of-the-art SOD competitors both qualitatively and quantitatively.
arXiv Detail & Related papers (2021-10-27T07:18:32Z)
Model Inspired Autoencoder for Unsupervised Hyperspectral Image Super-Resolution [25.878793557013207]
This paper focuses on hyperspectral image (HSI) super-resolution that aims to fuse a low-spatial-resolution HSI and a high-spatial-resolution multispectral image. Existing deep learning-based approaches are mostly supervised that rely on a large number of labeled training samples. We make the first attempt to design a model inspired deep network for HSI super-resolution in an unsupervised manner.
arXiv Detail & Related papers (2021-10-22T05:15:16Z)
PC-RGNN: Point Cloud Completion and Graph Neural Network for 3D Object Detection [57.49788100647103]
LiDAR-based 3D object detection is an important task for autonomous driving. Current approaches suffer from sparse and partial point clouds of distant and occluded objects. In this paper, we propose a novel two-stage approach, namely PC-RGNN, dealing with such challenges by two specific solutions.
arXiv Detail & Related papers (2020-12-18T18:06:43Z)

This list is automatically generated from the titles and abstracts of the papers in this site.