WildScenes: A Benchmark for 2D and 3D Semantic Segmentation in
Large-scale Natural Environments
- URL: http://arxiv.org/abs/2312.15364v1
- Date: Sat, 23 Dec 2023 22:27:40 GMT
- Authors: Kavisha Vidanapathirana, Joshua Knights, Stephen Hausler, Mark Cox,
Milad Ramezani, Jason Jooste, Ethan Griffiths, Shaheer Mohamed, Sridha
Sridharan, Clinton Fookes and Peyman Moghadam
- Abstract summary: We introduce WildScenes, a bi-modal benchmark dataset consisting of multiple large-scale traversals in natural environments.
The data is trajectory-centric with accurate localization and globally aligned point clouds.
We introduce benchmarks on 2D and 3D semantic segmentation and evaluate a variety of recent deep-learning techniques.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent progress in semantic scene understanding has primarily been enabled by
the availability of semantically annotated bi-modal (camera and lidar) datasets
in urban environments. However, such annotated datasets are also needed for
natural, unstructured environments to enable semantic perception for
applications, including conservation, search and rescue, environment
monitoring, and agricultural automation. Therefore, we introduce WildScenes, a
bi-modal benchmark dataset consisting of multiple large-scale traversals in
natural environments, including semantic annotations in high-resolution 2D
images and dense 3D lidar point clouds, and accurate 6-DoF pose information.
The data is (1) trajectory-centric with accurate localization and globally
aligned point clouds, (2) calibrated and synchronized to support bi-modal
inference, and (3) collected across different natural environments over 6 months
to support research on domain adaptation. Our 3D semantic labels are obtained via
an efficient automated process that transfers the human-annotated 2D labels
from multiple views into 3D point clouds, thus circumventing the need for
expensive and time-consuming human annotation in 3D. We introduce benchmarks on
2D and 3D semantic segmentation and evaluate a variety of recent deep-learning
techniques to demonstrate the challenges in semantic segmentation in natural
environments. We propose train-val-test splits for standard benchmarks as well
as domain adaptation benchmarks and utilize an automated split generation
technique to ensure the balance of class label distributions. The data,
evaluation scripts and pretrained models will be released upon acceptance at
https://csiro-robotics.github.io/WildScenes.
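The 2D-to-3D label transfer described in the abstract can be sketched as follows. This is a minimal illustration, assuming a pinhole camera model with known intrinsics, known 6-DoF world-to-camera poses, and a per-point majority vote across views; the function and variable names are hypothetical, not the authors' implementation.

```python
import numpy as np

def transfer_labels(points_w, cams, num_classes, ignore_label=255):
    """Vote-based 2D->3D label transfer (illustrative sketch).

    points_w : (N, 3) lidar points in the world frame
    cams     : list of (K, T_cw, mask) where K is a 3x3 intrinsic matrix,
               T_cw a 4x4 world-to-camera transform, and mask an (H, W)
               array of human-annotated 2D class labels
    """
    votes = np.zeros((points_w.shape[0], num_classes), dtype=np.int64)
    pts_h = np.hstack([points_w, np.ones((points_w.shape[0], 1))])  # homogeneous

    for K, T_cw, mask in cams:
        pc = (T_cw @ pts_h.T).T[:, :3]           # points in the camera frame
        in_front = pc[:, 2] > 0.1                # keep points ahead of the camera
        uv = (K @ pc.T).T
        uv = uv[:, :2] / uv[:, 2:3]              # perspective divide
        u, v = uv[:, 0].astype(int), uv[:, 1].astype(int)
        h, w = mask.shape
        valid = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        labels = mask[v[valid], u[valid]]        # read 2D labels at projections
        keep = labels != ignore_label
        idx = np.flatnonzero(valid)[keep]
        votes[idx, labels[keep]] += 1            # one vote per observing view

    out = votes.argmax(axis=1)
    out[votes.sum(axis=1) == 0] = ignore_label   # unobserved points stay unlabeled
    return out
```

A real pipeline would additionally handle occlusion (e.g. via depth buffering), which this sketch omits for brevity.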
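The automated split generation mentioned in the abstract aims to keep class-label distributions balanced across train, validation, and test. A simple greedy variant of that idea, not the authors' actual algorithm, could assign whole sequences to splits while minimizing deviation from the global class histogram:

```python
import numpy as np

def greedy_splits(seq_hists, ratios=(0.7, 0.15, 0.15)):
    """Greedily assign sequences to train/val/test so that each split's
    class histogram stays close to the global distribution (illustrative).

    seq_hists : dict mapping sequence id -> per-class point counts (1-D array)
    ratios    : desired fraction of labeled points per split
    """
    total = sum(seq_hists.values())                # global per-class counts
    targets = [r * total for r in ratios]          # desired counts per split
    accum = [np.zeros_like(total, dtype=float) for _ in ratios]
    assign = {}
    # Place the largest sequences first so small ones can fine-tune the balance.
    for sid in sorted(seq_hists, key=lambda s: -seq_hists[s].sum()):
        h = seq_hists[sid]
        # Choose the split whose histogram deviates least after adding h.
        costs = [np.abs((acc + h) - tgt).sum() for acc, tgt in zip(accum, targets)]
        best = int(np.argmin(costs))
        accum[best] += h
        assign[sid] = best
    return assign  # sequence id -> 0 (train), 1 (val), 2 (test)
```

Assigning whole sequences, rather than individual frames, avoids leaking near-duplicate views of the same scene between splits.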
Related papers
- Syn-to-Real Unsupervised Domain Adaptation for Indoor 3D Object Detection [50.448520056844885]
We propose a novel framework for syn-to-real unsupervised domain adaptation in indoor 3D object detection.
Our adaptation results from synthetic dataset 3D-FRONT to real-world datasets ScanNetV2 and SUN RGB-D demonstrate remarkable mAP25 improvements of 9.7% and 9.1% over Source-Only baselines.
arXiv Detail & Related papers (2024-06-17T08:18:41Z) - ALSTER: A Local Spatio-Temporal Expert for Online 3D Semantic
Reconstruction [62.599588577671796]
We propose an online 3D semantic segmentation method that incrementally reconstructs a 3D semantic map from a stream of RGB-D frames.
Unlike offline methods, ours is directly applicable to scenarios with real-time constraints, such as robotics or mixed reality.
arXiv Detail & Related papers (2023-11-29T20:30:18Z) - DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance
Fields [73.97131748433212]
This paper introduces a novel approach capable of generating infinite, high-quality 3D-consistent 2D annotations alongside 3D point cloud segmentations.
We leverage the strong semantic prior within a 3D generative model to train a semantic decoder.
Once trained, the decoder efficiently generalizes across the latent space, enabling the generation of infinite data.
arXiv Detail & Related papers (2023-11-18T21:58:28Z) - Navya3DSeg -- Navya 3D Semantic Segmentation Dataset & split generation
for autonomous vehicles [63.20765930558542]
3D semantic data are useful for core perception tasks such as obstacle detection and ego-vehicle localization.
We propose a new dataset, Navya 3D Segmentation (Navya3DSeg), with a diverse label space corresponding to a large-scale, production-grade operational domain.
It contains 23 labeled sequences and 25 supplementary sequences without labels, designed to explore self-supervised and semi-supervised semantic segmentation benchmarks on point clouds.
arXiv Detail & Related papers (2023-02-16T13:41:19Z) - 3D-PL: Domain Adaptive Depth Estimation with 3D-aware Pseudo-Labeling [37.315964084413174]
We develop a domain adaptation framework via generating reliable pseudo ground truths of depth from real data to provide direct supervisions.
Specifically, we propose two mechanisms for pseudo-labeling: 1) 2D-based pseudo-labels via measuring the consistency of depth predictions when images are with the same content but different styles; 2) 3D-aware pseudo-labels via a point cloud completion network that learns to complete the depth values in the 3D space.
arXiv Detail & Related papers (2022-09-19T17:54:17Z) - Collaborative Propagation on Multiple Instance Graphs for 3D Instance
Segmentation with Single-point Supervision [63.429704654271475]
We propose a novel weakly supervised method RWSeg that only requires labeling one object with one point.
With these sparse weak labels, we introduce a unified framework with two branches to propagate semantic and instance information.
Specifically, we propose a Cross-graph Competing Random Walks (CRW) algorithm that encourages competition among different instance graphs.
arXiv Detail & Related papers (2022-08-10T02:14:39Z) - Ego2HandsPose: A Dataset for Egocentric Two-hand 3D Global Pose
Estimation [0.0]
Ego2HandsPose is the first dataset that enables color-based two-hand 3D tracking in unseen domains.
We develop a set of parametric fitting algorithms to enable 1) 3D hand pose annotation using a single image, 2) automatic conversion from 2D to 3D hand poses and 3) accurate two-hand tracking with temporal consistency.
arXiv Detail & Related papers (2022-06-10T07:50:45Z) - Improving Semi-Supervised and Domain-Adaptive Semantic Segmentation with
Self-Supervised Depth Estimation [94.16816278191477]
We present a framework for semi-supervised and domain-adaptive semantic segmentation.
It is enhanced by self-supervised monocular depth estimation trained only on unlabeled image sequences.
We validate the proposed model on the Cityscapes dataset.
arXiv Detail & Related papers (2021-08-28T01:33:38Z) - H3D: Benchmark on Semantic Segmentation of High-Resolution 3D Point
Clouds and textured Meshes from UAV LiDAR and Multi-View-Stereo [4.263987603222371]
This paper introduces a 3D dataset which is unique in three ways.
It depicts the village of Hessigheim (Germany) henceforth referred to as H3D.
It is designed to promote research in the field of 3D data analysis on the one hand and to evaluate and rank emerging approaches on the other.
arXiv Detail & Related papers (2021-02-10T09:33:48Z) - Weakly Supervised Semantic Segmentation in 3D Graph-Structured Point
Clouds of Wild Scenes [36.07733308424772]
The deficiency of 3D segmentation labels is one of the main obstacles to effective point cloud segmentation.
We propose a novel deep graph convolutional network-based framework for large-scale semantic scene segmentation in point clouds with sole 2D supervision.
arXiv Detail & Related papers (2020-04-26T23:02:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.