MCD: Diverse Large-Scale Multi-Campus Dataset for Robot Perception
- URL: http://arxiv.org/abs/2403.11496v1
- Date: Mon, 18 Mar 2024 06:00:38 GMT
- Title: MCD: Diverse Large-Scale Multi-Campus Dataset for Robot Perception
- Authors: Thien-Minh Nguyen, Shenghai Yuan, Thien Hoang Nguyen, Pengyu Yin, Haozhi Cao, Lihua Xie, Maciej Wozniak, Patric Jensfelt, Marko Thiel, Justin Ziegenbein, Noel Blunder
- Abstract summary: We introduce a comprehensive dataset named MCD, featuring a wide range of sensing modalities, high-accuracy ground truth, and diverse challenging environments.
MCD comprises both CCS (Classical Cylindrical Spinning) and NRE (Non-Repetitive Epicyclic) lidars, high-quality IMUs (Inertial Measurement Units), cameras, and UWB (Ultra-WideBand) sensors.
In a pioneering effort, we introduce semantic annotations of 29 classes over 59k sparse NRE lidar scans across three domains, thus providing a novel challenge to existing semantic segmentation research.
- Score: 24.67682960590225
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Perception plays a crucial role in various robot applications. However, existing well-annotated datasets are biased towards autonomous driving scenarios, while unlabelled SLAM datasets are quickly over-fitted, and often lack environment and domain variations. To expand the frontier of these fields, we introduce a comprehensive dataset named MCD (Multi-Campus Dataset), featuring a wide range of sensing modalities, high-accuracy ground truth, and diverse challenging environments across three Eurasian university campuses. MCD comprises both CCS (Classical Cylindrical Spinning) and NRE (Non-Repetitive Epicyclic) lidars, high-quality IMUs (Inertial Measurement Units), cameras, and UWB (Ultra-WideBand) sensors. Furthermore, in a pioneering effort, we introduce semantic annotations of 29 classes over 59k sparse NRE lidar scans across three domains, thus providing a novel challenge to existing semantic segmentation research upon this largely unexplored lidar modality. Finally, we propose, for the first time to the best of our knowledge, continuous-time ground truth based on optimization-based registration of lidar-inertial data on large survey-grade prior maps, which are also publicly released, each several times the size of existing ones. We conduct a rigorous evaluation of numerous state-of-the-art algorithms on MCD, report their performance, and highlight the challenges awaiting solutions from the research community.
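The abstract's two main technical contributions can be made concrete with short sketches. First, continuous-time ground truth means a pose can be queried at any sensor timestamp rather than only at discrete keyframes. Below is a minimal sketch of such a query using piecewise interpolation (spherical for rotation, linear for translation); the file name, column layout, and function are illustrative assumptions, not MCD's actual data schema or the paper's spline-based formulation.

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

# Hypothetical ground-truth table: one row per stamp,
# columns [t, x, y, z, qx, qy, qz, qw] (assumed layout, not MCD's schema).
gt = np.loadtxt("ground_truth.csv", delimiter=",")
times, trans, quats = gt[:, 0], gt[:, 1:4], gt[:, 4:8]
slerp = Slerp(times, Rotation.from_quat(quats))

def pose_at(t):
    """Query an SE(3) pose at an arbitrary timestamp t."""
    T = np.eye(4)
    T[:3, :3] = slerp([t]).as_matrix()[0]  # rotation: spherical interpolation
    T[:3, 3] = [np.interp(t, times, trans[:, i]) for i in range(3)]  # translation: linear
    return T
```

Second, the 29-class annotations over sparse NRE lidar scans would typically be scored with mean intersection-over-union (mIoU), the standard point-cloud segmentation metric; the ignore-label convention below is an assumption, not MCD's published protocol.

```python
import numpy as np

def miou(pred, gt_labels, num_classes=29, ignore_index=-1):
    """Mean IoU over per-point labels, skipping ignored points and absent classes."""
    mask = gt_labels != ignore_index
    pred, gt_labels = pred[mask], gt_labels[mask]
    ious = []
    for c in range(num_classes):
        union = np.sum((pred == c) | (gt_labels == c))
        if union > 0:
            ious.append(np.sum((pred == c) & (gt_labels == c)) / union)
    return float(np.mean(ious))
```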
Related papers
- RADAR: Robust Two-stage Modality-incomplete Industrial Anomaly Detection [61.71770293720491]
We propose a novel two-stage Robust modAlity-incomplete fusing and Detecting frAmewoRk, abbreviated as RADAR.
Our bootstrapping philosophy is to enhance two stages in MIIAD, improving the robustness of the Multimodal Transformer.
Our experimental results demonstrate that the proposed RADAR significantly surpasses conventional MIAD methods in terms of effectiveness and robustness.
arXiv Detail & Related papers (2024-10-02T16:47:55Z)
- Uni$^2$Det: Unified and Universal Framework for Prompt-Guided Multi-dataset 3D Detection [64.08296187555095]
Uni$^2$Det is a framework for unified and universal multi-dataset training on 3D detection.
We introduce multi-stage prompting modules for multi-dataset 3D detection.
Results on zero-shot cross-dataset transfer validate the generalization capability of our proposed method.
arXiv Detail & Related papers (2024-09-30T17:57:50Z)
- M3BAT: Unsupervised Domain Adaptation for Multimodal Mobile Sensing with Multi-Branch Adversarial Training [5.128670847334003]
Multimodal mobile sensing has been used extensively to infer health and well-being, behavior, and context.
However, the distribution of data in the training set often differs from that of the real-world deployment environment.
We propose M3BAT, an unsupervised domain adaptation approach for multimodal mobile sensing with multi-branch adversarial training (a generic sketch of the adversarial component follows the entry).
arXiv Detail & Related papers (2024-04-26T13:09:35Z)
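Details of M3BAT aside, multi-branch adversarial training for domain adaptation generally pairs a shared encoder with domain-discriminator branches trained through a gradient reversal layer. The sketch below illustrates that generic pattern in PyTorch; the layer sizes, branch count, and class names are invented for illustration and are not the authors' architecture.

```python
import torch
from torch import nn
from torch.autograd import Function

class GradReverse(Function):
    """Identity in the forward pass; negated, scaled gradient in the
    backward pass -- the usual adversarial domain-adaptation trick."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lambd * grad_out, None

class MultiBranchDA(nn.Module):
    def __init__(self, in_dim=64, n_classes=5, n_domains=3, n_branches=2):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU())
        self.task_head = nn.Linear(128, n_classes)
        self.domain_heads = nn.ModuleList(
            nn.Linear(128, n_domains) for _ in range(n_branches))

    def forward(self, x, lambd=1.0):
        z = self.encoder(x)
        # Each adversarial branch tries to classify the domain; the reversed
        # gradient pushes the encoder toward domain-invariant features.
        domain_logits = [h(GradReverse.apply(z, lambd)) for h in self.domain_heads]
        return self.task_head(z), domain_logits
```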
- RethinkingTMSC: An Empirical Study for Target-Oriented Multimodal Sentiment Classification [70.9087014537896]
Target-oriented Multimodal Sentiment Classification (TMSC) has gained significant attention among scholars.
To investigate the causes of this problem, we perform extensive empirical evaluation and in-depth analysis of the datasets.
arXiv Detail & Related papers (2023-10-14T14:52:37Z)
- Enhancing Human-like Multi-Modal Reasoning: A New Challenging Dataset and Comprehensive Framework [51.44863255495668]
Multimodal reasoning is a critical component in the pursuit of artificial intelligence systems that exhibit human-like intelligence.
We present the Multi-Modal Reasoning (COCO-MMR) dataset, a novel collection of open-ended questions.
We propose innovative techniques, including multi-hop cross-modal attention and sentence-level contrastive learning, to enhance the image and text encoders (a generic contrastive-loss sketch follows the entry).
arXiv Detail & Related papers (2023-07-24T08:58:25Z)
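Sentence-level contrastive learning, as named in the entry above, is commonly implemented as a symmetric InfoNCE objective over matched image/sentence embedding pairs. The sketch below shows that common formulation; the paper's exact objective may differ.

```python
import torch
import torch.nn.functional as F

def sentence_contrastive_loss(img_emb, sent_emb, temperature=0.07):
    """Symmetric InfoNCE: matched (image, sentence) pairs are positives,
    all other in-batch pairs serve as negatives."""
    img = F.normalize(img_emb, dim=-1)
    sent = F.normalize(sent_emb, dim=-1)
    logits = img @ sent.t() / temperature            # (B, B) similarity matrix
    targets = torch.arange(img.size(0), device=img.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```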
- Perturbation-Based Two-Stage Multi-Domain Active Learning [31.073745612552926]
We propose a perturbation-based two-stage multi-domain active learning (P2S-MDAL) method incorporated into the well-regarded ASP-MTL model.
P2S-MDAL involves allocating budgets for domains and establishing regions for diversity selection.
A perturbation metric is introduced to evaluate the robustness of the model's shared feature extractor (an illustrative sketch follows the entry).
arXiv Detail & Related papers (2023-06-19T04:58:32Z)
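One plausible reading of the perturbation metric mentioned above is the average feature drift of the shared extractor under small input noise; the sketch below implements that reading and is not the paper's exact definition.

```python
import torch

@torch.no_grad()
def perturbation_score(encoder, x, sigma=0.01, n_trials=8):
    """Mean feature drift under Gaussian input perturbations;
    lower drift suggests a more robust shared feature extractor."""
    base = encoder(x)
    drift = 0.0
    for _ in range(n_trials):
        noisy = x + sigma * torch.randn_like(x)
        drift += (encoder(noisy) - base).norm(dim=-1).mean()
    return (drift / n_trials).item()
```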
- GAMUS: A Geometry-aware Multi-modal Semantic Segmentation Benchmark for Remote Sensing Data [27.63411386396492]
This paper introduces a new benchmark dataset for multi-modal semantic segmentation based on RGB-Height (RGB-H) data.
The proposed benchmark consists of 1) a large-scale dataset including co-registered RGB and nDSM pairs and pixel-wise semantic labels; 2) a comprehensive evaluation and analysis of existing multi-modal fusion strategies for both convolutional and Transformer-based networks on remote sensing data.
arXiv Detail & Related papers (2023-05-24T09:03:18Z)
- Unsupervised Domain Adaptive Learning via Synthetic Data for Person Re-identification [101.1886788396803]
Person re-identification (re-ID) has gained increasing attention due to its widespread applications in video surveillance.
Unfortunately, the mainstream deep learning methods still need a large quantity of labeled data to train models.
In this paper, we develop a data collector to automatically generate synthetic re-ID samples in a computer game, and construct a data labeler to simultaneously annotate them.
arXiv Detail & Related papers (2021-09-12T15:51:41Z)
- The Multimodal Sentiment Analysis in Car Reviews (MuSe-CaR) Dataset: Collection, Insights and Improvements [14.707930573950787]
We present MuSe-CaR, a first-of-its-kind multimodal dataset.
The data is publicly available, as it recently served as the test bed for the 1st Multimodal Sentiment Analysis Challenge.
arXiv Detail & Related papers (2021-01-15T10:40:37Z)
- X-ModalNet: A Semi-Supervised Deep Cross-Modal Network for Classification of Remote Sensing Data [69.37597254841052]
We propose a novel cross-modal deep-learning framework called X-ModalNet.
X-ModalNet generalizes well, owing to label propagation on an updatable graph constructed from high-level features at the top of the network (a minimal label-propagation sketch follows the entry).
We evaluate X-ModalNet on two multi-modal remote sensing datasets (HSI-MSI and HSI-SAR) and achieve a significant improvement in comparison with several state-of-the-art methods.
arXiv Detail & Related papers (2020-06-24T15:29:41Z)
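Propagating labels on a graph built from high-level features, as the X-ModalNet entry describes, is classically done with label spreading. The sketch below is that classic algorithm (Zhou et al.) as a minimal stand-in, not X-ModalNet's actual updatable-graph procedure.

```python
import numpy as np

def propagate_labels(W, Y, alpha=0.9, n_iters=50):
    """Label spreading on an affinity graph.
    W: (n, n) symmetric affinities built from high-level features.
    Y: (n, c) one-hot rows for labelled nodes, zero rows otherwise."""
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))            # D^{-1/2} W D^{-1/2}
    F_cur = Y.astype(float).copy()
    for _ in range(n_iters):
        F_cur = alpha * (S @ F_cur) + (1 - alpha) * Y
    return F_cur.argmax(axis=1)                # predicted class per node
```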
This list is automatically generated from the titles and abstracts of the papers on this site.