Shared Manifold Learning Using a Triplet Network for Multiple Sensor
Translation and Fusion with Missing Data
- URL: http://arxiv.org/abs/2210.17311v1
- Date: Tue, 25 Oct 2022 20:22:09 GMT
- Title: Shared Manifold Learning Using a Triplet Network for Multiple Sensor
Translation and Fusion with Missing Data
- Authors: Aditya Dutt, Alina Zare, and Paul Gader
- Abstract summary: We propose a Contrastive learning based MultiModal Alignment Network (CoMMANet) to align data from different sensors into a shared and discriminative manifold.
The proposed architecture uses a multimodal triplet autoencoder to cluster the latent space in such a way that samples of the same classes from each heterogeneous modality are mapped close to each other.
- Score: 2.452410403088629
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Heterogeneous data fusion can enhance the robustness and accuracy of an
algorithm on a given task. However, because the modalities differ in nature,
aligning the sensors and embedding their information into discriminative and
compact representations is challenging. In this paper, we
propose a Contrastive learning based MultiModal Alignment Network (CoMMANet) to
align data from different sensors into a shared and discriminative manifold
where class information is preserved. The proposed architecture uses a
multimodal triplet autoencoder to cluster the latent space in such a way that
samples of the same classes from each heterogeneous modality are mapped close
to each other. Since all the modalities exist in a shared manifold, a unified
classification framework is proposed. The resulting latent space
representations are fused to perform more robust and accurate classification.
In a missing sensor scenario, the latent space of one sensor is easily and
efficiently predicted using another sensor's latent space, thereby allowing
sensor translation. We conducted extensive experiments on a manually labeled
multimodal dataset containing hyperspectral data from AVIRIS-NG and NEON, and
LiDAR (light detection and ranging) data from NEON. Lastly, the model is
validated on two benchmark datasets: Berlin Dataset (hyperspectral and
synthetic aperture radar) and MUUFL Gulfport Dataset (hyperspectral and LiDAR).
Comparisons with other methods demonstrate the superiority of the proposed
approach: we achieved a mean overall accuracy of 94.3% on the MUUFL dataset and
the best overall accuracy of 71.26% on the Berlin dataset, outperforming other
state-of-the-art approaches.
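As a rough illustration of the core idea described above (modality-specific encoders pulled into a shared, class-discriminative latent space with a cross-modal triplet loss, plus a regressor that predicts one sensor's latent code from another's for the missing-sensor case), here is a minimal PyTorch sketch. The layer sizes, the margin, the MSE-based translation head, and the simple sum of losses are illustrative assumptions, not the authors' CoMMANet implementation (which also includes reconstruction decoders omitted here).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityEncoder(nn.Module):
    """Maps one sensor's features into the shared latent space (sizes are illustrative)."""
    def __init__(self, in_dim: int, latent_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )

    def forward(self, x):
        # L2-normalise so distances in the shared manifold are comparable across modalities
        return F.normalize(self.net(x), dim=-1)

# Hypothetical input sizes: a 224-band hyperspectral pixel and a 16-D LiDAR descriptor
enc_hsi = ModalityEncoder(in_dim=224)
enc_lidar = ModalityEncoder(in_dim=16)

# Simple regressor that predicts the HSI latent code from the LiDAR latent code,
# standing in for "sensor translation" when the HSI sensor is missing.
translate_lidar_to_hsi = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))

triplet = nn.TripletMarginLoss(margin=0.5)  # margin is an assumed value

def training_step(x_hsi, x_lidar_pos, x_lidar_neg):
    """One cross-modal triplet step: anchor from HSI, positive/negative from LiDAR.
    The positive shares the anchor's class label; the negative does not."""
    z_anchor = enc_hsi(x_hsi)
    z_pos = enc_lidar(x_lidar_pos)
    z_neg = enc_lidar(x_lidar_neg)

    # Pull same-class samples of different modalities together, push others apart.
    align_loss = triplet(z_anchor, z_pos, z_neg)

    # Translation loss: learn to predict the HSI code from the matching LiDAR code.
    z_hsi_pred = F.normalize(translate_lidar_to_hsi(z_pos), dim=-1)
    trans_loss = F.mse_loss(z_hsi_pred, z_anchor.detach())

    return align_loss + trans_loss

# Example with random stand-in features (batch of 32 pixels)
loss = training_step(torch.randn(32, 224), torch.randn(32, 16), torch.randn(32, 16))
```

At test time the two latent codes can be fused (e.g., concatenated) and fed to a single classifier; when the HSI sensor is missing, the output of the translation head can stand in for its latent code.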
Related papers
- Uni$^2$Det: Unified and Universal Framework for Prompt-Guided Multi-dataset 3D Detection [64.08296187555095]
Uni$^2$Det is a framework for unified and universal multi-dataset training on 3D detection.
We introduce multi-stage prompting modules for multi-dataset 3D detection.
Results on zero-shot cross-dataset transfer validate the generalization capability of our proposed method.
arXiv Detail & Related papers (2024-09-30T17:57:50Z)
- Multi-Space Alignments Towards Universal LiDAR Segmentation [50.992103482269016]
M3Net is a one-of-a-kind framework for fulfilling multi-task, multi-dataset, multi-modality LiDAR segmentation.
We first combine large-scale driving datasets acquired by different types of sensors from diverse scenes.
We then conduct alignments in three spaces, namely data, feature, and label spaces, during the training.
arXiv Detail & Related papers (2024-05-02T17:59:57Z)
- DeepHeteroIoT: Deep Local and Global Learning over Heterogeneous IoT Sensor Data [9.531834233076934]
We propose a novel deep learning model that incorporates both a Convolutional Neural Network and a Bi-directional Gated Recurrent Unit to learn local and global features, respectively; a minimal sketch of this pattern is given below.
In particular, the model achieves an average absolute improvement of 3.37% in accuracy and 2.85% in F1-score across datasets.
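The entry above pairs a convolutional front end (local patterns) with a bidirectional GRU (global temporal context). The following is a minimal sketch of that generic pattern for univariate sensor time series; the layer sizes, pooling, and single-channel input are assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class CnnBiGruClassifier(nn.Module):
    """CNN for local features, followed by a bidirectional GRU for global context.
    All hyperparameters below are illustrative, not taken from the paper."""
    def __init__(self, n_classes: int, hidden: int = 64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.gru = nn.GRU(input_size=64, hidden_size=hidden,
                          batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                  # x: (batch, time) raw sensor readings
        h = self.cnn(x.unsqueeze(1))       # (batch, 64, time/2) local features
        h = h.transpose(1, 2)              # (batch, time/2, 64) for the GRU
        _, hn = self.gru(h)                # hn: (2, batch, hidden), both directions
        global_feat = torch.cat([hn[0], hn[1]], dim=-1)
        return self.head(global_feat)

# Example: classify 8 random length-128 sensor traces into 5 hypothetical classes
model = CnnBiGruClassifier(n_classes=5)
logits = model(torch.randn(8, 128))        # -> shape (8, 5)
```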
arXiv Detail & Related papers (2024-03-29T06:24:07Z)
- MergeOcc: Bridge the Domain Gap between Different LiDARs for Robust Occupancy Prediction [8.993992124170624]
MergeOcc is developed to simultaneously handle different LiDARs by leveraging multiple datasets.
The effectiveness of MergeOcc is validated through experiments on two prominent datasets for autonomous vehicles.
arXiv Detail & Related papers (2024-03-13T13:23:05Z)
- GDTM: An Indoor Geospatial Tracking Dataset with Distributed Multimodal Sensors [9.8714071146137]
GDTM is a nine-hour dataset for multimodal object tracking with distributed multimodal sensors and reconfigurable sensor node placements.
Our dataset enables the exploration of several research problems, such as optimizing architectures for processing multimodal data.
arXiv Detail & Related papers (2024-02-21T21:24:57Z)
- Multimodal Dataset from Harsh Sub-Terranean Environment with Aerosol Particles for Frontier Exploration [55.41644538483948]
This paper introduces a multimodal dataset from the harsh and unstructured underground environment with aerosol particles.
It contains synchronized raw data measurements from all onboard sensors in Robot Operating System (ROS) format.
The focus of this paper is not only to capture both temporal and spatial data diversity but also to present the impact of harsh conditions on the captured data.
arXiv Detail & Related papers (2023-04-27T20:21:18Z)
- MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training [58.07391711548269]
Masked Voxel Jigsaw and Reconstruction (MV-JAR) is a method for LiDAR-based self-supervised pre-training.
arXiv Detail & Related papers (2023-03-23T17:59:02Z)
- Navya3DSeg -- Navya 3D Semantic Segmentation Dataset & split generation for autonomous vehicles [63.20765930558542]
3D semantic data are useful for core perception tasks such as obstacle detection and ego-vehicle localization.
We propose a new dataset, Navya 3D Segmentation (Navya3DSeg), with a diverse label space corresponding to a large-scale, production-grade operational domain.
It contains 23 labeled sequences and 25 supplementary sequences without labels, designed to explore self-supervised and semi-supervised semantic segmentation benchmarks on point clouds.
arXiv Detail & Related papers (2023-02-16T13:41:19Z)
- Multimodal Remote Sensing Benchmark Datasets for Land Cover Classification with A Shared and Specific Feature Learning Model [36.993630058695345]
We propose a shared and specific feature learning (S2FL) model to decompose multimodal RS data into modality-shared and modality-specific components; a rough sketch of this decomposition is given below.
To better assess multimodal baselines and the newly proposed S2FL model, three multimodal RS benchmark datasets are released and used for land cover classification: Houston2013 (hyperspectral and multispectral data), Berlin (hyperspectral and synthetic aperture radar (SAR) data), and Augsburg (hyperspectral, SAR, and digital surface model (DSM) data).
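The shared/specific split described above could look roughly like the sketch below: each modality gets a "shared" encoder (aligned across modalities) and a "specific" encoder, and all codes are fused for classification. The MLP encoders, the simple MSE alignment term, the loss weight, and the input dimensions are assumptions for illustration, not the published S2FL formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def mlp(in_dim, out_dim):
    # Tiny stand-in encoder; the real S2FL model uses a different subspace formulation.
    return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, out_dim))

class SharedSpecificModel(nn.Module):
    """Each modality gets a shared and a specific encoder; shared codes are pulled
    together across modalities, and all codes are fused for classification."""
    def __init__(self, dim_hsi: int, dim_sar: int, n_classes: int, code: int = 32):
        super().__init__()
        self.shared_hsi, self.spec_hsi = mlp(dim_hsi, code), mlp(dim_hsi, code)
        self.shared_sar, self.spec_sar = mlp(dim_sar, code), mlp(dim_sar, code)
        self.classifier = nn.Linear(4 * code, n_classes)

    def forward(self, x_hsi, x_sar):
        sh_h, sp_h = self.shared_hsi(x_hsi), self.spec_hsi(x_hsi)
        sh_s, sp_s = self.shared_sar(x_sar), self.spec_sar(x_sar)
        logits = self.classifier(torch.cat([sh_h, sp_h, sh_s, sp_s], dim=-1))
        # Alignment term: the shared components of the two modalities should agree.
        align = F.mse_loss(sh_h, sh_s)
        return logits, align

# Hypothetical pixel-level inputs: 244-band hyperspectral and 4-channel SAR features
model = SharedSpecificModel(dim_hsi=244, dim_sar=4, n_classes=8)
logits, align = model(torch.randn(16, 244), torch.randn(16, 4))
loss = F.cross_entropy(logits, torch.randint(0, 8, (16,))) + 0.1 * align
```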
arXiv Detail & Related papers (2021-05-21T08:14:21Z)
- DecAug: Augmenting HOI Detection via Decomposition [54.65572599920679]
Current algorithms suffer from insufficient training samples and category imbalance within datasets.
We propose an efficient and effective data augmentation method called DecAug for HOI detection.
Experiments show that our method brings up to 3.3 mAP and 1.6 mAP improvements on the V-COCO and HICO-DET datasets.
arXiv Detail & Related papers (2020-10-02T13:59:05Z)