UVG-VPC: Voxelized Point Cloud Dataset for Visual Volumetric Video-based Coding
- URL: http://arxiv.org/abs/2504.05888v1
- Date: Tue, 08 Apr 2025 10:27:53 GMT
- Title: UVG-VPC: Voxelized Point Cloud Dataset for Visual Volumetric Video-based Coding
- Authors: Guillaume Gautier, Alexandre Mercat, Louis Fréneau, Mikko Pitkänen, Jarno Vanne
- Abstract summary: This paper presents a new open dataset called UVG-VPC for the development, evaluation, and validation of MPEG Visual Volumetric Video-based Coding (V3C) technology. The dataset is distributed under its own non-commercial license.
- Score: 42.999580283729614
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Point cloud compression has become a crucial factor in immersive visual media processing and streaming. This paper presents a new open dataset called UVG-VPC for the development, evaluation, and validation of MPEG Visual Volumetric Video-based Coding (V3C) technology. The dataset is distributed under its own non-commercial license. It consists of 12 point cloud test video sequences of diverse characteristics with respect to the motion, RGB texture, 3D geometry, and surface occlusion of the points. Each sequence is 10 seconds long and comprises 250 frames captured at 25 frames per second. The sequences are voxelized with a geometry precision of 9 to 12 bits, and the voxel color attributes are represented as 8-bit RGB values. The dataset also includes associated normals that make it more suitable for evaluating point cloud compression solutions. The main objective of releasing the UVG-VPC dataset is to foster the development of V3C technologies and thereby shape the future of this field.
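To make the voxelization concrete, the sketch below quantizes raw float coordinates to an N-bit integer voxel grid, the operation behind the dataset's 9-to-12-bit geometry precision. This is a minimal illustration using numpy; the function name and the random stand-in frame are assumptions, not part of the dataset tooling.

```python
import numpy as np

def voxelize(points, bits=10):
    """Quantize float XYZ coordinates to an N-bit integer voxel grid.

    points : (N, 3) float array of raw world coordinates.
    bits   : geometry precision; UVG-VPC uses 9 to 12 bits per axis.
    Returns unique integer voxel coordinates in [0, 2**bits - 1].
    """
    grid_size = 2 ** bits
    mins = points.min(axis=0)
    extent = (points.max(axis=0) - mins).max()  # cubic bounding box
    # Scale into the grid and round down to integer voxel indices.
    voxels = np.floor((points - mins) / extent * (grid_size - 1)).astype(np.int64)
    # Merge points that fall into the same voxel.
    return np.unique(voxels, axis=0)

# Hypothetical usage on one frame of a sequence:
frame = np.random.rand(100_000, 3) * 2.0   # stand-in for a captured frame
vox = voxelize(frame, bits=10)
print(vox.shape, vox.min(), vox.max())
```

In a full pipeline, the 8-bit RGB attributes of all points falling into the same voxel would typically be averaged as well.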
Related papers
- ViVo: A Dataset for Volumetric Video Reconstruction and Compression [13.827241444266308]
We propose a new dataset, ViVo, for VolumetrIc VideO reconstruction and compression.
The dataset is faithful to real-world volumetric video production and is the first dataset to extend the definition of diversity.
To demonstrate the use of this database, we have benchmarked three state-of-the-art 3D reconstruction methods and two volumetric video compression algorithms.
arXiv Detail & Related papers (2025-05-31T13:30:21Z)
- Mixed Signals: A Diverse Point Cloud Dataset for Heterogeneous LiDAR V2X Collaboration [56.75198775820637]
Vehicle-to-everything (V2X) collaborative perception has emerged as a promising solution to address the limitations of single-vehicle perception systems.
To address these gaps, we present Mixed Signals, a comprehensive V2X dataset featuring 45.1k point clouds and 240.6k bounding boxes.
Our dataset provides precisely aligned point clouds and bounding box annotations across 10 classes, ensuring reliable data for perception training.
arXiv Detail & Related papers (2025-02-19T23:53:00Z)
- Color Enhancement for V-PCC Compressed Point Cloud via 2D Attribute Map Optimization [8.21390074063036]
Video-based point cloud compression (V-PCC) converts dynamic point cloud data into video sequences.
This paper introduces a framework designed to enhance the color quality of V-PCC compressed point clouds.
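As a rough illustration of the attribute maps that V-PCC hands to a 2D video codec, the sketch below orthographically projects voxel colors onto a single image plane, keeping the nearest point per pixel. Real V-PCC patch generation segments the cloud into many patches and handles occupancy explicitly; the function name and parameters here are illustrative assumptions.

```python
import numpy as np

def project_attribute_map(voxels, colors, axis=2, size=1024):
    """Orthographically project voxel colors onto a 2D attribute map.

    voxels : (N, 3) int voxel coordinates (must fit within `size`).
    colors : (N, 3) uint8 RGB values, one per voxel.
    Keeps the nearest point per pixel along `axis`, mimicking a single
    projection plane; real V-PCC builds many patches instead.
    """
    u, v = [a for a in range(3) if a != axis]
    depth = np.full((size, size), np.iinfo(np.int64).max, dtype=np.int64)
    amap = np.zeros((size, size, 3), dtype=np.uint8)
    for p, c in zip(voxels, colors):
        x, y, d = p[u], p[v], p[axis]
        if d < depth[y, x]:          # keep the closest (non-occluded) point
            depth[y, x] = d
            amap[y, x] = c
    return amap  # this image is what the 2D video codec compresses
```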
arXiv Detail & Related papers (2024-12-19T01:58:00Z)
- BVI-CR: A Multi-View Human Dataset for Volumetric Video Compression [14.109939177281069]
BVI-CR contains 18 multi-view RGB-D captures and their corresponding textured polygonal meshes.
Each video sequence contains 10 views at 1080p resolution, with durations between 10 and 15 seconds at 30 fps.
Results show the great potential of neural representation based methods in volumetric video compression.
arXiv Detail & Related papers (2024-11-17T23:22:48Z)
- ViDSOD-100: A New Dataset and a Baseline Model for RGB-D Video Salient Object Detection [51.16181295385818]
We first collect an annotated RGB-D video salient object detection (ViDSOD-100) dataset, which contains 100 videos with a total of 9,362 frames.
All frames in each video are manually annotated with high-quality saliency annotations.
We propose a new baseline model, named attentive triple-fusion network (ATF-Net), for RGB-D salient object detection.
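As a rough sketch of the triple-fusion idea, the module below fuses RGB, depth, and motion feature maps with learned per-branch attention weights. This is a simplified stand-in written in PyTorch; the layer sizes and the single 1x1-convolution attention head are assumptions, not the published ATF-Net architecture.

```python
import torch
import torch.nn as nn

class AttentiveTripleFusion(nn.Module):
    """Illustrative attention-weighted fusion of three feature branches
    (RGB appearance, depth, motion); a toy stand-in for ATF-Net."""
    def __init__(self, channels=64):
        super().__init__()
        # Predict one attention logit per branch from the stacked features.
        self.attn = nn.Conv2d(3 * channels, 3, kernel_size=1)

    def forward(self, rgb, depth, motion):
        stack = torch.cat([rgb, depth, motion], dim=1)   # (B, 3C, H, W)
        w = torch.softmax(self.attn(stack), dim=1)       # (B, 3, H, W)
        return (w[:, 0:1] * rgb + w[:, 1:2] * depth
                + w[:, 2:3] * motion)                    # (B, C, H, W)

# Hypothetical usage with three same-shaped feature maps:
feats = [torch.randn(1, 64, 56, 56) for _ in range(3)]
fused = AttentiveTripleFusion(64)(*feats)
```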
arXiv Detail & Related papers (2024-06-18T12:09:43Z)
- 4DSR-GCN: 4D Video Point Cloud Upsampling using Graph Convolutional Networks [29.615723135027096]
We propose a new solution for upscaling and restoration of time-varying 3D video point clouds after they have been compressed.
Our model consists of a specifically designed Graph Convolutional Network (GCN) that combines Dynamic Edge Convolution and Graph Attention Networks.
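A minimal sketch of such a block, combining PyTorch Geometric's DynamicEdgeConv with a graph attention layer, is given below. It assumes torch_geometric (with torch-cluster) is installed; the feature widths and neighborhood size k are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn
from torch_geometric.nn import DynamicEdgeConv, GATConv, knn_graph

class EdgeGATBlock(nn.Module):
    """Sketch of a block combining Dynamic Edge Convolution with graph
    attention, in the spirit of 4DSR-GCN (sizes and k are assumptions)."""
    def __init__(self, in_ch=3, hid=64, k=16):
        super().__init__()
        self.k = k
        # DynamicEdgeConv rebuilds a kNN graph in feature space on the fly;
        # its MLP sees concatenated [x_i, x_j - x_i], hence 2 * in_ch inputs.
        self.edge = DynamicEdgeConv(
            nn.Sequential(nn.Linear(2 * in_ch, hid), nn.ReLU(),
                          nn.Linear(hid, hid)), k=k)
        self.gat = GATConv(hid, hid, heads=1)

    def forward(self, pos, batch=None):
        x = self.edge(pos, batch)                  # dynamic edge features
        edge_index = knn_graph(x, k=self.k, batch=batch)
        return self.gat(x, edge_index)             # attention over neighbors

pos = torch.rand(2048, 3)                          # one decompressed frame
feats = EdgeGATBlock()(pos)
```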
arXiv Detail & Related papers (2023-06-01T18:43:16Z)
- Towards Scalable Neural Representation for Diverse Videos [68.73612099741956]
Implicit neural representations (INR) have gained increasing attention in representing 3D scenes and images.
Existing INR-based methods are limited to encoding a handful of short videos with redundant visual content.
This paper focuses on developing neural representations for encoding long and/or a large number of videos with diverse visual content.
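For orientation, the toy sketch below shows the basic INR idea the paper scales up: a coordinate MLP overfitted to one video, mapping normalized (t, x, y) to RGB, so the network weights become the encoding. The architecture and sizes are arbitrary assumptions.

```python
import torch
import torch.nn as nn

class VideoINR(nn.Module):
    """Minimal implicit neural representation of a video: a coordinate
    MLP mapping (t, x, y) in [0, 1]^3 to RGB. Toy illustration only;
    widths and depth are arbitrary."""
    def __init__(self, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid())  # RGB in [0, 1]

    def forward(self, coords):                   # coords: (N, 3)
        return self.net(coords)

# "Encoding" the video = overfitting the MLP to its pixels:
model = VideoINR()
coords = torch.rand(4096, 3)                     # sampled (t, x, y)
target = torch.rand(4096, 3)                     # their RGB values (stand-in)
loss = nn.functional.mse_loss(model(coords), target)
loss.backward()                                  # one training step (optimizer omitted)
```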
arXiv Detail & Related papers (2023-03-24T16:32:19Z)
- GQE-Net: A Graph-based Quality Enhancement Network for Point Cloud Color Attribute [51.4803148196217]
We propose a graph-based quality enhancement network (GQE-Net) to reduce color distortion in point clouds.
GQE-Net uses geometry information as an auxiliary input and graph convolution blocks to extract local features efficiently.
Experimental results show that our method achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-03-24T02:33:45Z)
- Efficient dynamic point cloud coding using Slice-Wise Segmentation [10.850101961203748]
MPEG recently developed a video-based point cloud compression (V-PCC) standard for dynamic point cloud coding.
Patch generation and self-occluded points in the 3D-to-2D projection are the main causes of missing data in V-PCC.
This paper proposes a new method that introduces overlapping slicing to decrease the number of patches generated and the amount of data lost.
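A minimal numpy sketch of overlapping slicing is shown below: the cloud is split along its longest axis into slices whose boundaries share an overlap band, so points near a cut appear in both neighboring slices. The slice count and overlap fraction are illustrative; the paper's actual segmentation may differ.

```python
import numpy as np

def overlapping_slices(points, n_slices=8, overlap=0.1):
    """Split a point cloud into slices along its longest axis, with
    neighboring slices sharing an overlap band (fraction of slice
    width). Illustrates the overlapping-slicing idea only."""
    extents = points.max(axis=0) - points.min(axis=0)
    axis = int(np.argmax(extents))               # slice along longest axis
    lo, hi = points[:, axis].min(), points[:, axis].max()
    width = (hi - lo) / n_slices
    margin = overlap * width
    slices = []
    for i in range(n_slices):
        a = lo + i * width - (margin if i > 0 else 0)
        b = lo + (i + 1) * width + (margin if i < n_slices - 1 else 0)
        mask = (points[:, axis] >= a) & (points[:, axis] <= b)
        slices.append(points[mask])
    return slices
```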
arXiv Detail & Related papers (2022-08-17T04:23:45Z)
- VPFNet: Improving 3D Object Detection with Virtual Point based LiDAR and Stereo Data Fusion [62.24001258298076]
VPFNet is a new architecture that cleverly aligns and aggregates the point cloud and image data at the 'virtual' points.
Our VPFNet achieves 83.21% moderate 3D AP and 91.86% moderate BEV AP on the KITTI test set, ranking 1st since May 21st, 2021.
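The basic alignment step behind this kind of LiDAR-image fusion, projecting 3D points into the image with the camera intrinsics and sampling pixel features there, can be sketched as below. This simplification leaves out VPFNet's virtual-point generation and aggregation; all names and shapes are assumptions.

```python
import numpy as np

def sample_image_at_points(points, image, K):
    """Project 3D points into the image with intrinsics K and sample
    RGB there: the basic alignment step for fusing LiDAR and image
    data at (virtual) points.

    points : (N, 3) in camera coordinates (assumed z > 0).
    image  : (H, W, 3) array. K : (3, 3) camera intrinsics.
    """
    uvw = points @ K.T                            # pinhole projection
    uv = (uvw[:, :2] / uvw[:, 2:3]).round().astype(int)
    h, w = image.shape[:2]
    valid = ((uv[:, 0] >= 0) & (uv[:, 0] < w)
             & (uv[:, 1] >= 0) & (uv[:, 1] < h) & (points[:, 2] > 0))
    rgb = np.zeros((len(points), 3), dtype=image.dtype)
    rgb[valid] = image[uv[valid, 1], uv[valid, 0]]
    return rgb                                    # per-point image features
```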
arXiv Detail & Related papers (2021-11-29T08:51:20Z)