VOMTC: Vision Objects for Millimeter and Terahertz Communications
- URL: http://arxiv.org/abs/2409.09330v1
- Date: Sat, 14 Sep 2024 06:18:51 GMT
- Title: VOMTC: Vision Objects for Millimeter and Terahertz Communications
- Authors: Sunwoo Kim, Yongjun Ahn, Daeyoung Park, Byonghyo Shim
- Abstract summary: We propose a large-scale vision dataset referred to as Vision Objects for Millimeter and Terahertz Communications (VOMTC).
The VOMTC dataset consists of 20,232 pairs of RGB and depth images obtained from a camera attached to the base station (BS).
We show that the beamforming technique exploiting the VOMTC-trained object detector outperforms conventional beamforming techniques.
- Score: 29.670122146586614
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in sensing and computer vision (CV) technologies have opened the door for the application of deep learning (DL)-based CV technologies in the realm of 6G wireless communications. For the successful application of this emerging technology, it is crucial to have a qualified vision dataset tailored for wireless applications (e.g., RGB images containing wireless devices such as laptops and cell phones). An aim of this paper is to propose a large-scale vision dataset referred to as Vision Objects for Millimeter and Terahertz Communications (VOMTC). The VOMTC dataset consists of 20,232 pairs of RGB and depth images obtained from a camera attached to the base station (BS), with each pair labeled with three representative object categories (person, cell phone, and laptop) and bounding boxes of the objects. Through experimental studies of the VOMTC dataset, we show that the beamforming technique exploiting the VOMTC-trained object detector outperforms conventional beamforming techniques.
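As a rough illustration of how a detected bounding box could inform beam selection, the sketch below maps the horizontal center of a box to an azimuth angle and then to a beam index. It is not the procedure from the paper: the pinhole camera model, the camera field of view, the uniform linear array with half-wavelength spacing, the DFT codebook, and the function names `bbox_to_azimuth` and `dft_beam_index` are all assumptions made for illustration.

```python
import math


def bbox_to_azimuth(x_min, x_max, image_width, horizontal_fov_deg):
    """Azimuth (radians) of a bounding-box center under an assumed pinhole model.

    Zero means straight ahead of the camera; positive means right of center.
    """
    center_x = 0.5 * (x_min + x_max)
    # Focal length in pixels, derived from the assumed horizontal field of view.
    focal_px = (image_width / 2) / math.tan(math.radians(horizontal_fov_deg) / 2)
    return math.atan((center_x - image_width / 2) / focal_px)


def dft_beam_index(azimuth_rad, num_antennas):
    """Index of the nearest beam in an assumed N-point DFT codebook.

    Assumes a uniform linear array with half-wavelength spacing, so the
    spatial frequency of a direction theta is 0.5 * sin(theta) and beam k
    is centered at spatial frequency k / N (modulo 1).
    """
    spatial_freq = 0.5 * math.sin(azimuth_rad)
    return round(spatial_freq * num_antennas) % num_antennas


if __name__ == "__main__":
    # Hypothetical detection: a cell phone box in a 640-pixel-wide image.
    angle = bbox_to_azimuth(400, 480, image_width=640, horizontal_fov_deg=90)
    print(math.degrees(angle), dft_beam_index(angle, num_antennas=16))
```

In a real system the camera and antenna array would also need to be calibrated against each other, and the depth image could help separate overlapping objects; neither is modeled here.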
Related papers
- PCB-Vision: A Multiscene RGB-Hyperspectral Benchmark Dataset of Printed Circuit Boards [11.658030498915535]
'PCB-Vision' is a pioneering RGB-hyperspectral printed circuit board (PCB) benchmark dataset, comprising 53 RGB images of high spatial resolution paired with their corresponding high spectral resolution hyperspectral data cubes in the visible and near-infrared (VNIR) range.
We provide extensive statistical investigations on the proposed dataset together with the performance of several state-of-the-art (SOTA) models, including U-Net, Attention U-Net, Residual U-Net, LinkNet, and DeepLabv3+.
arXiv Detail & Related papers (2024-01-12T12:00:26Z)
- Federated Multi-View Synthesizing for Metaverse [52.59476179535153]
The metaverse is expected to provide immersive entertainment, education, and business applications.
Virtual reality (VR) transmission over wireless networks is data- and computation-intensive.
We have developed a novel multi-view synthesizing framework that can efficiently provide synthesizing, storage, and communication resources for wireless content delivery in the metaverse.
arXiv Detail & Related papers (2023-12-18T13:51:56Z)
- Multi-Modal Dataset Acquisition for Photometrically Challenging Object [56.30027922063559]
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects.
We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets.
arXiv Detail & Related papers (2023-08-21T10:38:32Z)
- Vehicle Cameras Guide mmWave Beams: Approach and Real-World V2V Demonstration [13.117333069558812]
Accurately aligning millimeter-wave (mmWave) and terahertz (THz) narrow beams is essential to satisfy reliability and high data rates of 5G and beyond wireless communication systems.
We develop a deep learning solution for V2V scenarios to predict future beams using images from a 360 camera attached to the vehicle.
arXiv Detail & Related papers (2023-08-20T20:43:11Z)
- DensePose From WiFi [86.61881052177228]
We develop a deep neural network that maps the phase and amplitude of WiFi signals to UV coordinates within 24 human regions.
Our model can estimate the dense pose of multiple subjects, with comparable performance to image-based approaches.
arXiv Detail & Related papers (2022-12-31T16:48:43Z)
- A Strong Transfer Baseline for RGB-D Fusion in Vision Transformers [0.0]
We propose a recipe for transferring pretrained ViTs in RGB-D domains for single-view 3D object recognition.
We show that our adapted ViTs score up to 95.1% top-1 accuracy on the Washington RGB-D dataset, achieving new state-of-the-art results on this benchmark.
arXiv Detail & Related papers (2022-10-03T12:08:09Z)
- Global Context Vision Transformers [78.5346173956383]
We propose global context vision transformer (GC ViT), a novel architecture that enhances parameter and compute utilization for computer vision.
We address the lack of inductive bias in ViTs and propose to leverage modified fused inverted residual blocks in our architecture.
Our proposed GC ViT achieves state-of-the-art results across image classification, object detection and semantic segmentation tasks.
arXiv Detail & Related papers (2022-06-20T18:42:44Z)
- A Survey on RGB-D Datasets [69.73803123972297]
This paper reviewed and categorized image datasets that include depth information.
We gathered 203 datasets that contain accessible data and grouped them into three categories: scene/objects, body, and medical.
arXiv Detail & Related papers (2022-01-15T05:35:19Z)
- Network-Aware 5G Edge Computing for Object Detection: Augmenting Wearables to "See" More, Farther and Faster [18.901994926291465]
This paper presents a detailed simulation and evaluation of 5G wireless offloading for object detection within a powerful, new smart wearable called VIS4ION.
The current VIS4ION system is an instrumented book-bag with high-resolution cameras, vision processing and haptic and audio feedback.
The paper considers uploading the camera data to a mobile edge cloud to perform real-time object detection and transmitting the detection results back to the wearable.
arXiv Detail & Related papers (2021-12-25T07:09:00Z)
- Applying Deep-Learning-Based Computer Vision to Wireless Communications: Methodologies, Opportunities, and Challenges [100.45137961106069]
Deep learning (DL) has seen great success in the computer vision (CV) field.
This article introduces ideas about applying DL-based CV in wireless communications.
arXiv Detail & Related papers (2020-06-10T11:37:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.