Human Silhouette and Skeleton Video Synthesis through Wi-Fi signals
- URL: http://arxiv.org/abs/2203.05864v1
- Date: Fri, 11 Mar 2022 11:40:34 GMT
- Title: Human Silhouette and Skeleton Video Synthesis through Wi-Fi signals
- Authors: Danilo Avola, Marco Cascio, Luigi Cinque, Alessio Fagioli and Gian Luca Foresti
- Abstract summary: This paper presents a novel two-branch generative neural network that effectively maps radio data into visual features.
Once trained, the proposed method synthesizes human silhouette and skeleton videos using exclusively Wi-Fi signals.
- Score: 24.313281453214614
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The increasing availability of wireless access points (APs) is leading
towards human sensing applications based on Wi-Fi signals as support or
alternative tools to the widespread visual sensors, where the signals make it possible to
address well-known vision-related problems such as illumination changes or
occlusions. Indeed, using image synthesis techniques to translate radio
frequencies to the visible spectrum can become essential to obtain otherwise
unavailable visual data. This domain-to-domain translation is feasible because
both objects and people affect electromagnetic waves, causing variations in radio and
optical frequencies. In the literature, models capable of inferring
radio-to-visual feature mappings have gained momentum in the last few years
since frequency changes can be observed in the radio domain through the channel
state information (CSI) of Wi-Fi APs, enabling signal-based feature extraction,
e.g., amplitude. On this account, this paper presents a novel two-branch
generative neural network that effectively maps radio data into visual
features, following a teacher-student design that exploits a cross-modality
supervision strategy. The latter conditions signal-based features in the visual
domain to completely replace visual data. Once trained, the proposed method
synthesizes human silhouette and skeleton videos using exclusively Wi-Fi
signals. The approach is evaluated on publicly available data, where it obtains
remarkable results for both silhouette and skeleton video generation,
demonstrating the effectiveness of the proposed cross-modality supervision
strategy.
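As a rough illustration of two ideas named in the abstract, the sketch below extracts amplitude features from complex CSI samples and computes a simple feature-matching loss between a student (radio) branch and a teacher (visual) branch. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the array layout (packets by subcarriers), the sample values, and the use of mean squared error as the supervision signal are all hypothetical.

```python
import numpy as np


def csi_amplitude(csi):
    """Return the amplitude (magnitude) of each complex CSI sample."""
    return np.abs(np.asarray(csi))


def feature_matching_loss(student_feat, teacher_feat):
    """Mean squared error between student and teacher feature vectors.

    Assumption: MSE stands in for whatever cross-modality supervision
    loss the paper actually uses.
    """
    s = np.asarray(student_feat, dtype=float)
    t = np.asarray(teacher_feat, dtype=float)
    return float(np.mean((s - t) ** 2))


# Hypothetical CSI matrix: 2 packets x 3 subcarriers of complex gains.
csi = np.array([[3 + 4j, 1 + 0j, 0 + 2j],
                [0 + 0j, -5 + 12j, 1 - 1j]])

amp = csi_amplitude(csi)                            # signal-based features
loss = feature_matching_loss([1.0, 2.0], [1.0, 4.0])  # toy feature vectors
```

In a teacher-student setup of this kind, the visual branch would only be needed during training; at inference time the radio branch would run alone on amplitude features such as `amp`.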
Related papers
- ViFi-ReID: A Two-Stream Vision-WiFi Multimodal Approach for Person Re-identification [3.3743041904085125]
Person re-identification (ReID) plays a vital role in safety inspections, personnel counting, and more.
Most current ReID approaches primarily extract features from images, which are easily affected by objective conditions.
We leverage widely available routers as sensing devices by capturing gait information from pedestrians through the Channel State Information (CSI) in WiFi signals.
arXiv Detail & Related papers (2024-10-13T15:34:11Z)
- Physical-Layer Semantic-Aware Network for Zero-Shot Wireless Sensing [74.12670841657038]
Device-free wireless sensing has recently attracted significant interest due to its potential to support a wide range of immersive human-machine interactive applications.
Data heterogeneity in wireless signals and data privacy regulation of distributed sensing have been considered as the major challenges that hinder the wide applications of wireless sensing in large area networking systems.
We propose a novel zero-shot wireless sensing solution that allows models constructed in one or a limited number of locations to be directly transferred to other locations without any labeled data.
arXiv Detail & Related papers (2023-12-08T13:50:30Z)
- DensePose From WiFi [86.61881052177228]
We develop a deep neural network that maps the phase and amplitude of WiFi signals to UV coordinates within 24 human regions.
Our model can estimate the dense pose of multiple subjects, with comparable performance to image-based approaches.
arXiv Detail & Related papers (2022-12-31T16:48:43Z)
- Multi-task Learning Approach for Modulation and Wireless Signal Classification for 5G and Beyond: Edge Deployment via Model Compression [1.218340575383456]
Future communication networks must address the scarce spectrum to accommodate growth of heterogeneous wireless devices.
We exploit the potential of a deep-neural-network-based multi-task learning framework to simultaneously learn modulation and signal classification tasks.
We provide a comprehensive heterogeneous wireless signals dataset for public use.
arXiv Detail & Related papers (2022-02-26T14:51:02Z)
- Self-Supervised Radio-Visual Representation Learning for 6G Sensing [1.9766522384767227]
In future 6G cellular networks, a joint communication and sensing protocol will allow the network to perceive the environment.
We propose to combine radio and vision to automatically learn a radio-only sensing model with minimal human intervention.
arXiv Detail & Related papers (2021-11-01T12:23:47Z)
- Unsupervised Person Re-Identification with Wireless Positioning under Weak Scene Labeling [131.18390399368997]
We propose to explore unsupervised person re-identification with both visual data and wireless positioning trajectories under weak scene labeling.
Specifically, we propose a novel unsupervised multimodal training framework (UMTF), which models the complementarity of visual data and wireless information.
Our UMTF contains a multimodal data association strategy (MMDA) and a multimodal graph neural network (MMGN).
arXiv Detail & Related papers (2021-10-29T08:25:44Z)
- Audio-visual Representation Learning for Anomaly Events Detection in Crowds [119.72951028190586]
This paper attempts to exploit multi-modal learning for modeling the audio and visual signals simultaneously.
We conduct the experiments on SHADE dataset, a synthetic audio-visual dataset in surveillance scenes.
We find introducing audio signals effectively improves the performance of anomaly events detection and outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2021-10-28T02:42:48Z)
- Integrating Sensing and Communication in Cellular Networks via NR Sidelink [7.42576783544779]
We discuss a common issue related to sidelink-based RF-sensing, which is its angle and rotation dependence.
We propose a graph-based encoder to capture spatio-temporal features of the data and four approaches for multi-angle learning.
arXiv Detail & Related papers (2021-09-15T12:41:31Z)
- Semantics-aware Adaptive Knowledge Distillation for Sensor-to-Vision Action Recognition [131.6328804788164]
We propose a framework, named Semantics-aware Adaptive Knowledge Distillation Networks (SAKDN), to enhance action recognition in the vision-sensor modality (videos).
The SAKDN uses multiple wearable-sensors as teacher modalities and uses RGB videos as student modality.
arXiv Detail & Related papers (2020-09-01T03:38:31Z)
- Vision Meets Wireless Positioning: Effective Person Re-identification with Recurrent Context Propagation [120.18969251405485]
Existing person re-identification methods rely on the visual sensor to capture the pedestrians.
Mobile phones can be sensed by WiFi and cellular networks in the form of a wireless positioning signal.
We propose a novel recurrent context propagation module that enables information to propagate between visual data and wireless positioning data.
arXiv Detail & Related papers (2020-08-10T14:19:15Z)