MITO: A Millimeter-Wave Dataset and Simulator for Non-Line-of-Sight Perception
- URL: http://arxiv.org/abs/2502.10259v3
- Date: Tue, 11 Mar 2025 18:31:32 GMT
- Title: MITO: A Millimeter-Wave Dataset and Simulator for Non-Line-of-Sight Perception
- Authors: Laura Dodds, Tara Boroushaki, Cusuh Ham, Fadel Adib
- Abstract summary: We present MITO, the first millimeter-wave (mmWave) dataset of diverse, everyday objects. We generate 550 high-resolution mmWave images in line-of-sight and non-line-of-sight (NLOS) conditions, as well as RGB-D images, segmentation masks, and raw mmWave signals.
- Score: 4.794643874201285
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The ability to observe the world is fundamental to reasoning and making informed decisions on how to interact with the environment. However, optical perception can often be disrupted due to common occurrences, such as occlusions, which can pose challenges to existing vision systems. We present MITO, the first millimeter-wave (mmWave) dataset of diverse, everyday objects, collected using a UR5 robotic arm with two mmWave radars operating at different frequencies and an RGB-D camera. Unlike visible light, mmWave signals can penetrate common occlusions (e.g., cardboard boxes, fabric, plastic), but each mmWave frame has much lower resolution than typical cameras. To capture higher-resolution mmWave images, we leverage the robot's mobility and fuse frames over the synthesized aperture. MITO captures over 24 million mmWave frames and uses them to generate 550 high-resolution mmWave (synthetic aperture) images in line-of-sight and non-line-of-sight (NLOS), as well as RGB-D images, segmentation masks, and raw mmWave signals, taken from 76 different objects. We develop an open-source simulation tool that can be used to generate synthetic mmWave images for any 3D triangle mesh. Finally, we demonstrate the utility of our dataset and simulator for enabling broader NLOS perception by developing benchmarks for NLOS segmentation and classification.
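The frame-fusion step the abstract describes — coherently combining many low-resolution radar frames captured along the robot's path into one high-resolution image — is the classic synthetic-aperture backprojection technique. The sketch below is a minimal illustration of that idea, assuming a stepped-frequency radar model; the function names, array shapes, and parameters are illustrative and are not MITO's actual API.

```python
import numpy as np

C = 3e8  # speed of light (m/s)

def backproject(antenna_positions, signals, freqs, voxels):
    """Coherently sum phase-compensated returns at each voxel.

    antenna_positions: (A, 3) radar positions along the robot's trajectory
    signals:           (A, F) complex frequency-domain measurements per position
    freqs:             (F,)   stepped-frequency bins (Hz)
    voxels:            (V, 3) 3D points at which to form the image
    """
    image = np.zeros(len(voxels), dtype=complex)
    for pos, sig in zip(antenna_positions, signals):
        # One-way range from this antenna position to every voxel
        d = np.linalg.norm(voxels - pos, axis=1)
        # Conjugate round-trip phase for every (voxel, frequency) pair;
        # returns from a true scatterer add in phase across the aperture
        phase = np.exp(1j * 4 * np.pi * freqs[None, :] * d[:, None] / C)
        image += (sig[None, :] * phase).sum(axis=1)
    return np.abs(image)
```

Because returns from a real scatterer stay phase-aligned across the whole aperture while everything else decoheres, the effective aperture is the robot's motion span rather than the physical antenna, which is what yields the resolution gain over a single frame.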
Related papers
- One Snapshot is All You Need: A Generalized Method for mmWave Signal Generation [15.790309349652196]
We propose mmGen, a framework tailored for full-scene mmWave signal generation.
By constructing physical signal transmission models, mmGen synthesizes human-reflected and environment-reflected mmWave signals.
We conduct extensive experiments using a prototype system with commercial mmWave devices and Kinect sensors.
arXiv Detail & Related papers (2025-03-27T03:24:10Z) - Multi-modal Multi-platform Person Re-Identification: Benchmark and Method [58.59888754340054]
MP-ReID is a novel dataset designed specifically for multi-modality and multi-platform ReID.
This benchmark compiles data from 1,930 identities across diverse modalities, including RGB, infrared, and thermal imaging.
We introduce Uni-Prompt ReID, a framework with specific-designed prompts, tailored for cross-modality and cross-platform scenarios.
arXiv Detail & Related papers (2025-03-21T12:27:49Z) - Towards Weather-Robust 3D Human Body Reconstruction: Millimeter-Wave Radar-Based Dataset, Benchmark, and Multi-Modal Fusion [13.082760040398147]
3D human reconstruction from RGB images achieves decent results in good weather conditions but degrades dramatically in rough weather.
mmWave radars have been employed to reconstruct 3D human joints and meshes in rough weather.
We design ImmFusion, the first mmWave-RGB fusion solution to robustly reconstruct 3D human bodies in various weather conditions.
arXiv Detail & Related papers (2024-09-07T15:06:30Z) - Enabling Visual Recognition at Radio Frequency [13.399148413043411]
PanoRadar is a novel RF imaging system that brings RF resolution close to that of LiDAR.
Results enable, for the first time, a variety of visual recognition tasks at radio frequency.
Our results demonstrate PanoRadar's robust performance across 12 buildings.
arXiv Detail & Related papers (2024-05-29T20:52:59Z) - VFMM3D: Releasing the Potential of Image by Vision Foundation Model for Monocular 3D Object Detection [80.62052650370416]
Monocular 3D object detection holds significant importance across various applications, including autonomous driving and robotics.
In this paper, we present VFMM3D, an innovative framework that leverages the capabilities of Vision Foundation Models (VFMs) to accurately transform single-view images into LiDAR point cloud representations.
arXiv Detail & Related papers (2024-04-15T03:12:12Z) - Differentiable Radio Frequency Ray Tracing for Millimeter-Wave Sensing [29.352303349003165]
We propose DiffSBR, a differentiable framework for mmWave-based 3D reconstruction.
DiffSBR incorporates a differentiable ray tracing engine to simulate radar point clouds from virtual 3D models.
Experiments using various radar hardware validate DiffSBR's capability for fine-grained 3D reconstruction.
arXiv Detail & Related papers (2023-11-22T06:13:39Z) - Diffusion Models for Interferometric Satellite Aperture Radar [73.01013149014865]
Probabilistic Diffusion Models (PDMs) have recently emerged as a very promising class of generative models.
Here, we leverage PDMs to generate several radar-based satellite image datasets.
We show that PDMs succeed in generating images with complex and realistic structures, but that sampling time remains an issue.
arXiv Detail & Related papers (2023-08-31T16:26:17Z) - Multi-Modal Dataset Acquisition for Photometrically Challenging Object [56.30027922063559]
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects.
We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets.
arXiv Detail & Related papers (2023-08-21T10:38:32Z) - Point Cloud-based Proactive Link Quality Prediction for Millimeter-wave Communications [2.559190942797394]
This study proposes a point cloud-based method for mmWave link quality prediction.
Our proposed method can predict future large attenuation of mmWave received signal strength and throughput.
arXiv Detail & Related papers (2023-01-02T16:51:40Z) - mmBody Benchmark: 3D Body Reconstruction Dataset and Analysis for Millimeter Wave Radar [10.610455816814985]
Millimeter Wave (mmWave) Radar is gaining popularity as it can work in adverse environments like smoke, rain, snow, poor lighting, etc.
Prior work has explored the possibility of reconstructing 3D skeletons or meshes from the noisy and sparse mmWave Radar signals.
This dataset consists of synchronized and calibrated mmWave radar point clouds and RGB(D) images in different scenes and skeleton/mesh annotations for humans in the scenes.
arXiv Detail & Related papers (2022-09-12T08:00:31Z) - Learning 6D Pose Estimation from Synthetic RGBD Images for Robotic Applications [0.6299766708197883]
The proposed pipeline can efficiently generate large amounts of photo-realistic RGBD images for the object of interest.
We develop a real-time two-stage 6D pose estimation approach by integrating the object detector YOLO-V4-tiny and the 6D pose estimation algorithm PVN3D.
The resulting network shows competitive performance compared to state-of-the-art methods when evaluated on LineMod dataset.
arXiv Detail & Related papers (2022-08-30T14:17:15Z) - xView3-SAR: Detecting Dark Fishing Activity Using Synthetic Aperture Radar Imagery [52.67592123500567]
Unsustainable fishing practices worldwide pose a major threat to marine resources and ecosystems.
It is now possible to automate detection of dark vessels day or night, under all-weather conditions.
xView3-SAR consists of nearly 1,000 analysis-ready SAR images from the Sentinel-1 mission.
arXiv Detail & Related papers (2022-06-02T06:53:45Z) - Simple and Effective Synthesis of Indoor 3D Scenes [78.95697556834536]
We study the problem of synthesizing immersive 3D indoor scenes from one or more images.
Our aim is to generate high-resolution images and videos from novel viewpoints.
We propose an image-to-image GAN that maps directly from reprojections of incomplete point clouds to full high-resolution RGB-D images.
arXiv Detail & Related papers (2022-04-06T17:54:46Z) - Towards Scale Consistent Monocular Visual Odometry by Learning from the Virtual World [83.36195426897768]
We propose VRVO, a novel framework for retrieving the absolute scale from virtual data.
We first train a scale-aware disparity network using both monocular real images and stereo virtual data.
The resulting scale-consistent disparities are then integrated with a direct VO system.
arXiv Detail & Related papers (2022-03-11T01:51:54Z) - TUM-VIE: The TUM Stereo Visual-Inertial Event Dataset [50.8779574716494]
Event cameras are bio-inspired vision sensors which measure per pixel brightness changes.
They offer numerous benefits over traditional, frame-based cameras, including low latency, high dynamic range, high temporal resolution and low power consumption.
To foster the development of 3D perception and navigation algorithms with event cameras, we present the TUM-VIE dataset.
arXiv Detail & Related papers (2021-08-16T19:53:56Z) - Interaction-free imaging of multi-pixel objects [58.720142291102135]
Quantum imaging is well-suited to study sensitive samples which require low-light conditions, like biological tissues.
In this context, interaction-free measurements (IFM) allow us to infer the presence of an opaque object without the photon interacting with the sample.
Here we extend the IFM imaging schemes to multi-pixel, semi-transparent objects, by encoding the information about the pixels into an internal degree of freedom.
arXiv Detail & Related papers (2021-06-08T06:49:19Z) - Removing Diffraction Image Artifacts in Under-Display Camera via Dynamic
Skip Connection Network [80.67717076541956]
Under-Display Camera (UDC) systems provide a true bezel-less and notch-free viewing experience on smartphones.
In a typical UDC system, the pixel array attenuates and diffracts the incident light on the camera, resulting in significant image quality degradation.
In this work, we aim to analyze and tackle the aforementioned degradation problems.
arXiv Detail & Related papers (2021-04-19T18:41:45Z) - Generative Modelling of BRDF Textures from Flash Images [50.660026124025265]
We learn a latent space for easy capture, semantic editing, consistent, and efficient reproduction of visual material appearance.
In a second step, conditioned on the material code, our method produces an infinite and diverse spatial field of BRDF model parameters.
arXiv Detail & Related papers (2021-02-23T18:45:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.