MoCap2Radar: A Spatiotemporal Transformer for Synthesizing Micro-Doppler Radar Signatures from Motion Capture
- URL: http://arxiv.org/abs/2511.11462v1
- Date: Fri, 14 Nov 2025 16:35:14 GMT
- Title: MoCap2Radar: A Spatiotemporal Transformer for Synthesizing Micro-Doppler Radar Signatures from Motion Capture
- Authors: Kevin Chen, Kenneth W. Parker, Anish Arora
- Abstract summary: We present a pure machine learning process for synthesizing radar spectrograms from Motion-Capture (MoCap) data. We formulate MoCap-to-spectrogram translation as a windowed sequence-to-sequence task using a transformer-based model. Experiments show that the proposed approach produces visually and quantitatively plausible Doppler radar spectrograms.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a pure machine learning process for synthesizing radar spectrograms from Motion-Capture (MoCap) data. We formulate MoCap-to-spectrogram translation as a windowed sequence-to-sequence task using a transformer-based model that jointly captures spatial relations among MoCap markers and temporal dynamics across frames. Real-world experiments show that the proposed approach produces visually and quantitatively plausible Doppler radar spectrograms and achieves good generalizability. Ablation experiments show that the learned model includes both the ability to convert multi-part motion into Doppler signatures and an understanding of the spatial relations between different parts of the human body. The result is an interesting example of using transformers for time-series signal processing. It is especially applicable to edge computing and Internet of Things (IoT) radars. It also suggests the ability to augment scarce radar datasets using more abundant MoCap data for training higher-level applications. Finally, it requires far less computation than physics-based methods for generating radar data.
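The abstract's windowed formulation and its spectrogram target can be illustrated with a short, self-contained sketch. This is not the paper's transformer model; it is only a plain-Python short-time Fourier transform showing how overlapping windows of a time series become a micro-Doppler-style spectrogram (time x frequency magnitudes). All function names and parameter values here are illustrative assumptions.

```python
import cmath
import math

def stft_spectrogram(signal, win_len=64, hop=32):
    """Magnitude spectrogram via a naive short-time DFT.

    The overlapping-window loop mirrors the windowed
    sequence-to-sequence setup described in the abstract.
    """
    frames = []
    for start in range(0, len(signal) - win_len + 1, hop):
        win = signal[start:start + win_len]
        # Hann window to reduce spectral leakage at frame edges
        win = [x * 0.5 * (1 - math.cos(2 * math.pi * n / (win_len - 1)))
               for n, x in enumerate(win)]
        spectrum = []
        for k in range(win_len // 2):  # keep non-negative frequency bins
            s = sum(x * cmath.exp(-2j * math.pi * k * n / win_len)
                    for n, x in enumerate(win))
            spectrum.append(abs(s))
        frames.append(spectrum)
    return frames  # list of frames: time x frequency

# Synthetic radar return whose Doppler frequency oscillates,
# loosely imitating the periodic micro-motion of a swinging limb
fs = 1024.0
sig = [math.cos(2 * math.pi *
                (100 + 50 * math.sin(2 * math.pi * 2 * t / fs)) * t / fs)
       for t in range(1024)]

spec = stft_spectrogram(sig)
print(len(spec), len(spec[0]))  # → 31 32
```

A learned model like the one in the paper would replace the DFT with a transformer that maps each window of MoCap marker coordinates to the corresponding spectrogram frame; this sketch only fixes the input/output shapes such a windowed pipeline operates on.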
Related papers
- RadarGen: Automotive Radar Point Cloud Generation from Cameras [64.69976771710057]
We present RadarGen, a diffusion model for synthesizing realistic automotive radar point clouds from multi-view camera imagery. RadarGen adapts efficient image-latent diffusion to the radar domain by representing radar measurements in bird's-eye-view form. We show that RadarGen captures characteristic radar measurement distributions and reduces the gap to perception models trained on real data.
arXiv Detail & Related papers (2025-12-19T18:57:33Z)
- Simulate Any Radar: Attribute-Controllable Radar Simulation via Waveform Parameter Embedding [12.285004244174917]
SA-Radar is a radar simulation approach that enables controllable and efficient generation of radar cubes conditioned on customizable radar attributes. We design ICFAR-Net, a 3D U-Net conditioned on radar attributes encoded via waveform parameters, which captures signal variations induced by different radar configurations. Our framework also supports simulation in novel sensor viewpoints and edited scenes, showcasing its potential as a general-purpose radar data engine for autonomous driving applications.
arXiv Detail & Related papers (2025-06-03T17:58:28Z)
- Resource-Efficient Beam Prediction in mmWave Communications with Multimodal Realistic Simulation Framework [57.994965436344195]
Beamforming is a key technology in millimeter-wave (mmWave) communications that improves signal transmission by optimizing directionality and intensity. Multimodal sensing-aided beam prediction has gained significant attention, using various sensing data to predict user locations or network conditions. Despite its promising potential, the adoption of multimodal sensing-aided beam prediction is hindered by high computational complexity, high costs, and limited datasets.
arXiv Detail & Related papers (2025-04-07T15:38:25Z)
- EvMic: Event-based Non-contact sound recovery from effective spatial-temporal modeling [69.96729022219117]
When sound waves hit an object, they induce vibrations that produce high-frequency and subtle visual changes. Recent advances in event camera hardware show good potential for its application in visual sound recovery. We propose a novel pipeline for non-contact sound recovery, fully utilizing spatial-temporal information from the event stream.
arXiv Detail & Related papers (2025-04-03T08:51:17Z)
- Mask-RadarNet: Enhancing Transformer With Spatial-Temporal Semantic Context for Radar Object Detection in Autonomous Driving [11.221694136475554]
We propose a model called Mask-RadarNet to fully utilize the hierarchical semantic features from the input radar data. Mask-RadarNet exploits the combination of interleaved convolution and attention operations to replace the traditional architecture in transformer-based models. With relatively lower computational complexity and fewer parameters, the proposed Mask-RadarNet achieves higher recognition accuracy for object detection in autonomous driving.
arXiv Detail & Related papers (2024-12-20T06:39:40Z)
- Radon Implicit Field Transform (RIFT): Learning Scenes from Radar Signals [9.170594803531866]
Implicit Neural Representations (INRs) offer compact and continuous representations with minimal radar data. RIFT consists of two components: a classical forward model for radar and an INR-based scene representation. With only a 10% data footprint, our RIFT model achieves up to a 188% improvement in scene reconstruction.
arXiv Detail & Related papers (2024-10-16T16:59:37Z)
- Radar Fields: Frequency-Space Neural Scene Representations for FMCW Radar [62.51065633674272]
We introduce Radar Fields - a neural scene reconstruction method designed for active radar imagers.
Our approach unites an explicit, physics-informed sensor model with an implicit neural geometry and reflectance model to directly synthesize raw radar measurements.
We validate the effectiveness of the method across diverse outdoor scenarios, including urban scenes with dense vehicles and infrastructure.
arXiv Detail & Related papers (2024-05-07T20:44:48Z)
- Multimodal Transformers for Wireless Communications: A Case Study in Beam Prediction [7.727175654790777]
We present a multimodal transformer deep learning framework for sensing-assisted beam prediction.
We employ a convolutional neural network to extract the features from a sequence of images, point clouds, and radar raw data sampled over time.
Experimental results show that our solution trained on image and GPS data achieves the best distance-based accuracy of predicted beams, at 78.44%.
arXiv Detail & Related papers (2023-09-21T06:29:38Z)
- UnLoc: A Universal Localization Method for Autonomous Vehicles using LiDAR, Radar and/or Camera Input [51.150605800173366]
UnLoc is a novel unified neural modeling approach for localization with multi-sensor input in all weather conditions.
Our method is extensively evaluated on Oxford Radar RobotCar, ApolloSouthBay and Perth-WA datasets.
arXiv Detail & Related papers (2023-07-03T04:10:55Z)
- FMNet: Latent Feature-wise Mapping Network for Cleaning up Noisy Micro-Doppler Spectrogram [2.9849405664643585]
Noisy surroundings cause uninterpretable motion patterns on the micro-Doppler spectrogram.
Radar returns often suffer from multipath, clutter, and interference.
We propose a latent feature-wise mapping strategy, called Feature Mapping Network (FMNet), to transform measured spectrograms.
arXiv Detail & Related papers (2021-07-09T19:20:41Z)
- TransMOT: Spatial-Temporal Graph Transformer for Multiple Object Tracking [74.82415271960315]
We propose a solution named TransMOT to efficiently model the spatial and temporal interactions among objects in a video.
TransMOT is not only more computationally efficient than the traditional Transformer, but it also achieves better tracking accuracy.
The proposed method is evaluated on multiple benchmark datasets including MOT15, MOT16, MOT17, and MOT20.
arXiv Detail & Related papers (2021-04-01T01:49:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.