CULTURE3D: A Large-Scale and Diverse Dataset of Cultural Landmarks and Terrains for Gaussian-Based Scene Rendering
- URL: http://arxiv.org/abs/2501.06927v3
- Date: Wed, 09 Jul 2025 14:35:04 GMT
- Title: CULTURE3D: A Large-Scale and Diverse Dataset of Cultural Landmarks and Terrains for Gaussian-Based Scene Rendering
- Authors: Xinyi Zheng, Steve Zhang, Weizhe Lin, Aaron Zhang, Walterio W. Mayol-Cuevas, Yunze Liu, Junxiao Shen,
- Abstract summary: Current state-of-the-art 3D reconstruction models face limitations in building extra-large scale outdoor scenes. In this paper, we present an extra-large fine-grained dataset with 10 billion points composed of 41,006 drone-captured high-resolution aerial images. Compared to existing datasets, ours offers significantly larger scale and higher detail, uniquely suited for fine-grained 3D applications.
- Score: 12.299096433876676
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current state-of-the-art 3D reconstruction models face limitations in building extra-large scale outdoor scenes, primarily due to the lack of sufficiently large-scale and detailed datasets. In this paper, we present an extra-large fine-grained dataset with 10 billion points composed of 41,006 drone-captured high-resolution aerial images, covering 20 diverse and culturally significant scenes from worldwide locations such as Cambridge University main buildings, the Pyramids, and the Forbidden City Palace. Compared to existing datasets, ours offers significantly larger scale and higher detail, uniquely suited for fine-grained 3D applications. Each scene contains an accurate spatial layout and comprehensive structural information, supporting detailed 3D reconstruction tasks. By reconstructing environments using these detailed images, our dataset supports multiple applications, including outputs in the widely adopted COLMAP format, establishing a novel benchmark for evaluating state-of-the-art large-scale Gaussian Splatting methods. The dataset's flexibility encourages innovations and supports model plug-ins, paving the way for future 3D breakthroughs. All datasets and code will be open-sourced for community use.
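The abstract states that reconstructions are released in the widely adopted COLMAP format. As a rough illustration of what consuming such output involves, below is a minimal sketch of a parser for COLMAP's documented `points3D.txt` text export. The record layout follows COLMAP's public format specification; the function name and dict layout are our own choices, not part of this dataset's tooling.

```python
def load_colmap_points3d(path):
    """Parse a COLMAP points3D.txt file into a list of point records.

    Each non-comment line holds: POINT3D_ID X Y Z R G B ERROR
    followed by a variable-length track of (IMAGE_ID, POINT2D_IDX) pairs.
    """
    points = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and header comments
            tok = line.split()
            points.append({
                "id": int(tok[0]),
                "xyz": (float(tok[1]), float(tok[2]), float(tok[3])),
                "rgb": (int(tok[4]), int(tok[5]), int(tok[6])),
                "error": float(tok[7]),
                # remaining tokens come in (image_id, point2d_idx) pairs
                "track": [(int(tok[i]), int(tok[i + 1]))
                          for i in range(8, len(tok), 2)],
            })
    return points
```

At the scale reported here (10 billion points), the binary `points3D.bin` variant would be preferable in practice; the text format above is shown only because it is human-readable.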
Related papers
- Towards Scalable Spatial Intelligence via 2D-to-3D Data Lifting [64.64738535860351]
We present a scalable pipeline that converts single-view images into comprehensive, scale- and appearance-realistic 3D representations. Our method bridges the gap between the vast repository of imagery and the increasing demand for spatial scene understanding. By automatically generating authentic, scale-aware 3D data from images, we significantly reduce data collection costs and open new avenues for advancing spatial intelligence.
arXiv Detail & Related papers (2025-07-24T14:53:26Z) - HuSc3D: Human Sculpture dataset for 3D object reconstruction [0.0]
HuSc3D is a novel dataset specifically designed for rigorous benchmarking of 3D reconstruction models under realistic acquisition challenges. Our dataset features six highly detailed, fully white sculptures characterized by intricate perforations and minimal textural and color variation.
arXiv Detail & Related papers (2025-06-09T10:47:02Z) - DeepWheel: Generating a 3D Synthetic Wheel Dataset for Design and Performance Evaluation [3.3148826359547523]
This study proposes a synthetic design-performance dataset generation framework using generative AI.
The framework first generates 2D rendered images using Stable Diffusion, and then reconstructs the 3D geometry through 2.5D depth estimation.
The final dataset, named DeepWheel, consists of over 6,000 photo-realistic images and 900 structurally analyzed 3D models.
arXiv Detail & Related papers (2025-04-15T16:20:00Z) - Textured Mesh Saliency: Bridging Geometry and Texture for Human Perception in 3D Graphics [50.23625950905638]
We present a new dataset for textured mesh saliency, created through an innovative eye-tracking experiment in a six degrees of freedom (6-DOF) VR environment. Our proposed model predicts saliency maps for textured mesh surfaces by treating each triangular face as an individual unit and assigning a saliency density value to reflect the importance of each local surface region.
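The summary above describes assigning a saliency density value to each triangular face. A minimal sketch of that data structure follows, assuming raw per-face scores (e.g. aggregated fixation counts) are collected into a dict; the function name and the uniform fallback are our own assumptions, not the paper's implementation.

```python
def normalize_face_saliency(raw_saliency):
    """Normalize raw per-face saliency scores into a density map.

    raw_saliency: dict mapping triangular-face index -> non-negative score.
    Returns a dict with the same keys whose values sum to 1.0, so each
    value can be read as the fraction of attention on that face.
    """
    total = sum(raw_saliency.values())
    if total == 0:
        # No fixations recorded: fall back to a uniform density.
        n = len(raw_saliency)
        return {face: 1.0 / n for face in raw_saliency}
    return {face: score / total for face, score in raw_saliency.items()}
```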
arXiv Detail & Related papers (2024-12-11T08:27:33Z) - Open-Vocabulary High-Resolution 3D (OVHR3D) Data Segmentation and Annotation Framework [1.1280113914145702]
This research aims to design and develop a comprehensive and efficient framework for 3D segmentation tasks. The framework integrates Grounding DINO and the Segment Anything Model, augmented by an enhancement in 2D image rendering via 3D mesh.
arXiv Detail & Related papers (2024-12-09T07:39:39Z) - GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation [65.33726478659304]
We introduce the Geometry-Aware Large Reconstruction Model (GeoLRM), an approach which can predict high-quality assets with 512k Gaussians and 21 input images in only 11 GB GPU memory.
Previous works neglect the inherent sparsity of 3D structure and do not utilize explicit geometric relationships between 3D and 2D images.
GeoLRM tackles these issues by incorporating a novel 3D-aware transformer structure that directly processes 3D points and uses deformable cross-attention mechanisms.
arXiv Detail & Related papers (2024-06-21T17:49:31Z) - MegaScenes: Scene-Level View Synthesis at Scale [69.21293001231993]
Scene-level novel view synthesis (NVS) is fundamental to many vision and graphics applications.
We create a large-scale scene-level dataset from Internet photo collections, called MegaScenes, which contains over 100K structure from motion (SfM) reconstructions from around the world.
We analyze failure cases of state-of-the-art NVS methods and significantly improve generation consistency.
arXiv Detail & Related papers (2024-06-17T17:55:55Z) - Zero-Shot Multi-Object Scene Completion [59.325611678171974]
We present a 3D scene completion method that recovers the complete geometry of multiple unseen objects in complex scenes from a single RGB-D image.
Our method outperforms the current state-of-the-art on both synthetic and real-world datasets.
arXiv Detail & Related papers (2024-03-21T17:59:59Z) - Den-SOFT: Dense Space-Oriented Light Field DataseT for 6-DOF Immersive Experience [28.651514326042648]
We have built a custom mobile multi-camera large-space dense light field capture system.
Our aim is to contribute to the development of popular 3D scene reconstruction algorithms.
The collected dataset is much denser than existing datasets.
arXiv Detail & Related papers (2024-03-15T02:39:44Z) - GauU-Scene: A Scene Reconstruction Benchmark on Large Scale 3D Reconstruction Dataset Using Gaussian Splatting [5.968501319323899]
We introduce a novel large-scale scene reconstruction benchmark using the newly developed 3D representation approach, Gaussian Splatting.
U-Scene encompasses over one and a half square kilometres, featuring a comprehensive RGB dataset and LiDAR ground truth.
This dataset offers a unique blend of urban and academic environments for advanced spatial analysis.
arXiv Detail & Related papers (2024-01-25T09:22:32Z) - Pyramid Diffusion for Fine 3D Large Scene Generation [56.00726092690535]
Diffusion models have shown remarkable results in generating 2D images and small-scale 3D objects.
Their application to the synthesis of large-scale 3D scenes has been rarely explored.
We introduce a framework, the Pyramid Discrete Diffusion model (PDD), which employs scale-varied diffusion models to progressively generate high-quality outdoor scenes.
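The coarse-to-fine idea behind scale-varied diffusion can be illustrated with the upsampling step that bridges two scales. Below is a minimal sketch, using nearest-neighbour replication on a nested-list occupancy grid; this is a generic illustration of pyramid upsampling, not PDD's actual architecture, and the function name is our own.

```python
def upsample_occupancy(grid, factor=2):
    """Nearest-neighbour upsampling of a 3D occupancy grid (nested lists).

    Each coarse voxel is replicated into a factor**3 block of fine voxels,
    producing the initialization that a finer-scale model would then refine.
    """
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    return [
        [
            [grid[x // factor][y // factor][z // factor]
             for z in range(nz * factor)]
            for y in range(ny * factor)
        ]
        for x in range(nx * factor)
    ]
```

In a pyramid scheme, a generation loop would alternate this upsampling with a denoising pass at each resolution, so that coarse structure is fixed early and only detail is synthesized at the expensive fine scales.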
arXiv Detail & Related papers (2023-11-20T11:24:21Z) - Multi-Modal Dataset Acquisition for Photometrically Challenging Object [56.30027922063559]
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects.
We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets.
arXiv Detail & Related papers (2023-08-21T10:38:32Z) - AutoDecoding Latent 3D Diffusion Models [95.7279510847827]
We present a novel approach to the generation of static and articulated 3D assets that has a 3D autodecoder at its core.
The 3D autodecoder framework embeds properties learned from the target dataset in the latent space.
We then identify the appropriate intermediate volumetric latent space, and introduce robust normalization and de-normalization operations.
arXiv Detail & Related papers (2023-07-07T17:59:14Z) - UniG3D: A Unified 3D Object Generation Dataset [75.49544172927749]
UniG3D is a unified 3D object generation dataset constructed by employing a universal data transformation pipeline on ShapeNet datasets.
This pipeline converts each raw 3D model into comprehensive multi-modal data representation.
The selection of data sources for our dataset is based on their scale and quality.
arXiv Detail & Related papers (2023-06-19T07:03:45Z) - MobileBrick: Building LEGO for 3D Reconstruction on Mobile Devices [78.20154723650333]
High-quality 3D ground-truth shapes are critical for 3D object reconstruction evaluation.
We introduce a novel multi-view RGBD dataset captured using a mobile device.
We obtain precise 3D ground-truth shape without relying on high-end 3D scanners.
arXiv Detail & Related papers (2023-03-03T14:02:50Z) - ROAD: Learning an Implicit Recursive Octree Auto-Decoder to Efficiently
Encode 3D Shapes [32.267066838654834]
We present a novel implicit representation to efficiently and accurately encode large datasets of complex 3D shapes.
Our implicit Recursive Octree Auto-Decoder (ROAD) learns a hierarchically structured latent space enabling state-of-the-art reconstruction results at a compression ratio above 99%.
arXiv Detail & Related papers (2022-12-12T19:09:47Z) - A Real World Dataset for Multi-view 3D Reconstruction [28.298548207213468]
We present a dataset of 371 3D models of everyday tabletop objects along with their 320,000 real world RGB and depth images.
We primarily focus on learned multi-view 3D reconstruction due to the lack of an appropriate real-world benchmark for the task, and demonstrate that our dataset can fill that gap.
arXiv Detail & Related papers (2022-03-22T00:15:54Z) - Ground material classification for UAV-based photogrammetric 3D data: A 2D-3D Hybrid Approach [1.3359609092684614]
In recent years, photogrammetry has been widely used in many areas to create 3D virtual data representing the physical environment.
These cutting-edge technologies have caught the US Army and Navy's attention for the purpose of rapid 3D battlefield reconstruction, virtual training, and simulations.
arXiv Detail & Related papers (2021-09-24T22:29:26Z) - Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges [52.624157840253204]
We present an urban-scale photogrammetric point cloud dataset with nearly three billion richly annotated points.
Our dataset consists of large areas from three UK cities, covering about 7.6 km2 of the city landscape.
We evaluate the performance of state-of-the-art algorithms on our dataset and provide a comprehensive analysis of the results.
arXiv Detail & Related papers (2020-09-07T14:47:07Z) - HoliCity: A City-Scale Data Platform for Learning Holistic 3D Structures [39.2984574045825]
This dataset has 6,300 real-world panoramas that are accurately aligned with a CAD model of downtown London covering an area of 20 km2.
The ultimate goal of this dataset is to support real applications for city reconstruction, mapping, and augmented reality.
arXiv Detail & Related papers (2020-08-07T17:34:47Z) - LayoutMP3D: Layout Annotation of Matterport3D [59.11106101006007]
We consider the Matterport3D dataset with their originally provided depth map ground truths and further release our annotations for layout ground truths from a subset of Matterport3D.
Our dataset provides both the layout and depth information, which enables the opportunity to explore the environment by integrating both cues.
arXiv Detail & Related papers (2020-03-30T14:40:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.