Vision-Aided Beam Tracking: Explore the Proper Use of Camera Images with Deep Learning
- URL: http://arxiv.org/abs/2109.14686v1
- Date: Wed, 29 Sep 2021 19:47:01 GMT
- Title: Vision-Aided Beam Tracking: Explore the Proper Use of Camera Images with Deep Learning
- Authors: Yu Tian, Chenwei Wang
- Abstract summary: We investigate the problem of wireless beam tracking on mmWave bands with the assistance of camera images.
In particular, based on the beam indices the user has already used and the camera images taken along the trajectory, we predict the optimal beam indices for the next few time slots.
We develop a deep learning approach and investigate various model components to achieve the best performance.
- Score: 14.081623882445392
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We investigate the problem of wireless beam tracking on mmWave bands with the assistance of camera images. In particular, based on the beam indices the user has already used and the camera images taken along the trajectory, we predict the optimal beam indices for the next few time slots. To resolve this problem, we first reformulate the "ViWi" dataset in [1] to remove its image repetition problem. Then we develop a deep learning approach and investigate various model components to achieve the best performance. Finally, we explore whether, when, and how to use the images for better beam prediction. To answer this question, we split the dataset into three clusters, namely (LOS, light NLOS, serious NLOS)-like, based on the standard deviation of each beam-index sequence. With experiments we demonstrate that using the images indeed helps beam tracking, especially when the user is in serious NLOS, and that the solution relies on a carefully designed training dataset. Generally speaking, including NLOS-like data in the training set does not benefit beam tracking for a user in LOS, but including light NLOS-like data in the training set does benefit beam tracking for a user in serious NLOS.
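The abstract describes two concrete steps: grouping trajectories by the standard deviation of their beam-index sequences, and predicting the next beam indices from past beam indices plus camera frames. The sketch below illustrates both under stated assumptions: the clustering thresholds, the beam-codebook size of 128, the three-slot prediction horizon, and the toy CNN+GRU architecture are illustrative choices, not values from the paper, and this is not the authors' model.

```python
# Minimal sketch of the two ideas stated in the abstract; thresholds, sizes, and
# architecture are assumptions for illustration, not the authors' implementation.
import numpy as np
import torch
import torch.nn as nn


def cluster_by_beam_std(beam_sequences, light_thr=2.0, serious_thr=8.0):
    """Assign each trajectory to a (LOS, light NLOS, serious NLOS)-like cluster
    from the standard deviation of its beam-index sequence.

    beam_sequences: iterable of 1-D integer arrays (one beam index per time slot).
    The two thresholds are hypothetical; the paper chooses the split empirically.
    """
    labels = []
    for seq in beam_sequences:
        std = np.std(np.asarray(seq, dtype=float))
        if std < light_thr:
            labels.append("LOS-like")
        elif std < serious_thr:
            labels.append("light-NLOS-like")
        else:
            labels.append("serious-NLOS-like")
    return labels


class BeamTracker(nn.Module):
    """Fuse per-frame image features with the past beam-index sequence and
    predict a beam index for each of the next `horizon` time slots."""

    def __init__(self, num_beams=128, feat_dim=64, hidden=128, horizon=3):
        super().__init__()
        self.img_encoder = nn.Sequential(  # tiny stand-in for the image branch
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        self.beam_embed = nn.Embedding(num_beams, feat_dim)
        self.rnn = nn.GRU(2 * feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, horizon * num_beams)
        self.num_beams, self.horizon = num_beams, horizon

    def forward(self, images, beams):
        # images: (B, T, 3, H, W) camera frames; beams: (B, T) integer beam indices
        B, T = beams.shape
        img_feat = self.img_encoder(images.flatten(0, 1)).view(B, T, -1)
        x = torch.cat([img_feat, self.beam_embed(beams)], dim=-1)
        _, h = self.rnn(x)                # h: (1, B, hidden), summary of the past
        logits = self.head(h[-1])         # (B, horizon * num_beams)
        return logits.view(B, self.horizon, self.num_beams)


# Example usage with random data (batch of 2 trajectories, 8 past time slots):
# model = BeamTracker()
# logits = model(torch.randn(2, 8, 3, 64, 64), torch.randint(0, 128, (2, 8)))
# predicted = logits.argmax(dim=-1)      # (2, 3) beam indices for the next slots
```

Training such a model with a per-slot cross-entropy loss on each cluster separately, with and without the image branch, would be one way to probe the whether/when/how question the abstract raises.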
Related papers
- Transientangelo: Few-Viewpoint Surface Reconstruction Using Single-Photon Lidar [8.464054039931245]
Lidar captures 3D scene geometry by emitting pulses of light to a target and recording the speed-of-light time delay of the reflected light.
However, conventional lidar systems do not output the raw, captured waveforms of backscattered light.
We develop new regularization strategies that improve robustness to photon noise, enabling accurate surface reconstruction with as few as 10 photons per pixel.
arXiv Detail & Related papers (2024-08-22T08:12:09Z)
- Shelf-Supervised Cross-Modal Pre-Training for 3D Object Detection [52.66283064389691]
State-of-the-art 3D object detectors are often trained on massive labeled datasets.
Recent works demonstrate that self-supervised pre-training with unlabeled data can improve detection accuracy with limited labels.
We propose a shelf-supervised approach for generating zero-shot 3D bounding boxes from paired RGB and LiDAR data.
arXiv Detail & Related papers (2024-06-14T15:21:57Z)
- Using Motion Cues to Supervise Single-Frame Body Pose and Shape Estimation in Low Data Regimes [93.69730589828532]
When enough annotated training data is available, supervised deep-learning algorithms excel at estimating human body pose and shape using a single camera.
We show that, when little such annotated data is available, easy-to-obtain unannotated videos can be used instead to provide the required supervisory signals.
arXiv Detail & Related papers (2024-02-05T05:37:48Z)
- Joint 3D Shape and Motion Estimation from Rolling Shutter Light-Field Images [2.0277446818410994]
We propose an approach to address the problem of 3D reconstruction of scenes from a single image captured by a light-field camera equipped with a rolling shutter sensor.
Our method leverages the 3D information cues present in the light-field and the motion information provided by the rolling shutter effect.
We present a generic model for the imaging process of this sensor and a two-stage algorithm that minimizes the re-projection error.
arXiv Detail & Related papers (2023-11-02T15:08:18Z)
- HPointLoc: Point-based Indoor Place Recognition using Synthetic RGB-D Images [58.720142291102135]
We present a novel dataset, named HPointLoc, specially designed for exploring the capabilities of visual place recognition in indoor environments.
The dataset is based on the popular Habitat simulator, in which indoor scenes can be generated using both its own sensor data and open datasets.
arXiv Detail & Related papers (2022-12-30T12:20:56Z)
- Noise Self-Regression: A New Learning Paradigm to Enhance Low-Light Images Without Task-Related Data [86.68013790656762]
We propose Noise SElf-Regression (NoiSER), which enhances low-light images without access to any task-related data.
NoiSER is highly competitive in enhancement quality, yet with a much smaller model size, and much lower training and inference cost.
arXiv Detail & Related papers (2022-11-09T06:18:18Z)
- When the Sun Goes Down: Repairing Photometric Losses for All-Day Depth Estimation [47.617222712429026]
We show how to use a combination of three techniques to allow the existing photometric losses to work for both day and nighttime images.
First, we introduce a per-pixel neural intensity transformation to compensate for the light changes that occur between successive frames.
Second, we predict a per-pixel residual flow map that we use to correct the reprojection correspondences induced by the estimated ego-motion and depth.
arXiv Detail & Related papers (2022-06-28T09:29:55Z)
- LiDARCap: Long-range Marker-less 3D Human Motion Capture with LiDAR Point Clouds [58.402752909624716]
Existing motion capture datasets are largely short-range and cannot yet meet the needs of long-range applications.
We propose LiDARHuman26M, a new human motion capture dataset captured by LiDAR at a much longer range to overcome this limitation.
Our dataset also includes the ground truth human motions acquired by the IMU system and the synchronous RGB images.
arXiv Detail & Related papers (2022-03-28T12:52:45Z)
- Stereo Matching by Self-supervision of Multiscopic Vision [65.38359887232025]
We propose a new self-supervised framework for stereo matching utilizing multiple images captured at aligned camera positions.
A cross photometric loss, an uncertainty-aware mutual-supervision loss, and a new smoothness loss are introduced to optimize the network.
Our model obtains better disparity maps than previous unsupervised methods on the KITTI dataset.
arXiv Detail & Related papers (2021-04-09T02:58:59Z)
- Learned Camera Gain and Exposure Control for Improved Visual Feature Detection and Matching [12.870196901446208]
We explore a data-driven approach to account for environmental lighting changes, improving the quality of images for use in visual odometry (VO) or visual simultaneous localization and mapping (SLAM).
We train a deep convolutional neural network model to predictively adjust camera gain and exposure time parameters.
We demonstrate through extensive real-world experiments that our network can anticipate and compensate for dramatic lighting changes.
arXiv Detail & Related papers (2021-02-08T16:46:09Z)
- Learning Collision-Free Space Detection from Stereo Images: Homography Matrix Brings Better Data Augmentation [16.99302954185652]
It remains an open challenge to train deep convolutional neural networks (DCNNs) using only a small quantity of training samples.
This paper explores an effective training data augmentation approach that can be employed to improve the overall DCNN performance.
arXiv Detail & Related papers (2020-12-14T19:14:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.