Camera-LiDAR Fusion with Latent Contact for Place Recognition in Challenging Cross-Scenes
- URL: http://arxiv.org/abs/2310.10371v1
- Date: Mon, 16 Oct 2023 13:06:55 GMT
- Title: Camera-LiDAR Fusion with Latent Contact for Place Recognition in Challenging Cross-Scenes
- Authors: Yan Pan, Jiapeng Xie, Jiajie Wu, Bo Zhou
- Abstract summary: This paper introduces a novel three-channel place descriptor, which consists of a cascade of image, point cloud, and fusion branches.
Experiments on the KITTI, NCLT, USVInland, and the campus dataset demonstrate that the proposed place descriptor stands as the state-of-the-art approach.
- Score: 5.957306851772919
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although significant progress has been made, achieving place recognition in
environments with perspective changes, seasonal variations, and scene
transformations remains challenging. Relying solely on perception information
from a single sensor is insufficient to address these issues. Recognizing the
complementarity between cameras and LiDAR, multi-modal fusion methods have
attracted attention. To address the information waste in existing multi-modal
fusion works, this paper introduces a novel three-channel place descriptor,
which consists of a cascade of image, point cloud, and fusion branches.
Specifically, the fusion-based branch employs a dual-stage pipeline, leveraging
the correlation between the two modalities with latent contacts, thereby
facilitating information interaction and fusion. Extensive experiments on the
KITTI, NCLT, USVInland, and the campus dataset demonstrate that the proposed
place descriptor stands as the state-of-the-art approach, confirming its
robustness and generality in challenging scenarios.
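To make the cascade described in the abstract more concrete, the following PyTorch snippet is a minimal, purely illustrative sketch and is not the authors' implementation: the module names, feature dimensions, stand-in linear encoders, and the cross-attention fusion step used here are assumptions chosen only to show how an image branch, a point-cloud branch, and a two-stage fusion branch could be combined into a single place descriptor.

```python
# Illustrative sketch only: a generic three-branch (image / point-cloud / fusion)
# place descriptor with a two-stage cross-attention fusion step. All names,
# dimensions, and backbone choices are hypothetical, not taken from the paper.
import torch
import torch.nn as nn


class CrossAttentionFusion(nn.Module):
    """Two-stage fusion: each modality attends to the other, then the refined
    token sets are pooled and merged into one fused descriptor."""

    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.img_to_pc = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.pc_to_img = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, img_tokens: torch.Tensor, pc_tokens: torch.Tensor) -> torch.Tensor:
        # Stage 1: cross-modal interaction (image queries point cloud and vice versa).
        img_refined, _ = self.img_to_pc(img_tokens, pc_tokens, pc_tokens)
        pc_refined, _ = self.pc_to_img(pc_tokens, img_tokens, img_tokens)
        # Stage 2: pool each refined token set and merge into a single fused vector.
        fused = torch.cat([img_refined.mean(dim=1), pc_refined.mean(dim=1)], dim=-1)
        return self.proj(fused)


class ThreeBranchPlaceDescriptor(nn.Module):
    """Cascade of image, point-cloud, and fusion branches; the final place
    descriptor concatenates the three branch outputs."""

    def __init__(self, dim: int = 256):
        super().__init__()
        # Stand-in encoders: in practice these would be pretrained image and
        # point-cloud backbones producing token sequences of width `dim`.
        self.img_encoder = nn.Linear(512, dim)
        self.pc_encoder = nn.Linear(64, dim)
        self.fusion = CrossAttentionFusion(dim)

    def forward(self, img_feats: torch.Tensor, pc_feats: torch.Tensor) -> torch.Tensor:
        img_tokens = self.img_encoder(img_feats)         # (B, N_img, dim)
        pc_tokens = self.pc_encoder(pc_feats)            # (B, N_pc, dim)
        img_desc = img_tokens.mean(dim=1)                # image-branch descriptor
        pc_desc = pc_tokens.mean(dim=1)                  # point-cloud-branch descriptor
        fused_desc = self.fusion(img_tokens, pc_tokens)  # fusion-branch descriptor
        desc = torch.cat([img_desc, pc_desc, fused_desc], dim=-1)
        return nn.functional.normalize(desc, dim=-1)     # L2-normalize for retrieval


if __name__ == "__main__":
    model = ThreeBranchPlaceDescriptor()
    img = torch.randn(2, 196, 512)   # e.g. 196 image patch features per frame
    pc = torch.randn(2, 1024, 64)    # e.g. 1024 point features per scan
    print(model(img, pc).shape)      # torch.Size([2, 768])
```

In a real system the linear stand-ins would be replaced by pretrained image and point-cloud backbones, and the resulting descriptor would be matched against a database of stored descriptors (e.g. by cosine similarity) to decide whether a place has been revisited.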
Related papers
- Joint Fusion and Encoding: Advancing Multimodal Retrieval from the Ground Up [26.32353412029717]
Information retrieval is indispensable for today's Internet applications.
Traditional semantic matching techniques often fall short in capturing fine-grained cross-modal interactions.
We introduce a unified retrieval framework that fuses visual and textual cues from the ground up.
arXiv Detail & Related papers (2025-02-27T11:41:55Z)
- Two in One Go: Single-stage Emotion Recognition with Decoupled Subject-context Transformer [78.35816158511523]
We present a single-stage emotion recognition approach, employing a Decoupled Subject-Context Transformer (DSCT) for simultaneous subject localization and emotion classification.
We evaluate our single-stage framework on two widely used context-aware emotion recognition datasets, CAER-S and EMOTIC.
arXiv Detail & Related papers (2024-04-26T07:30:32Z)
- From Text to Pixels: A Context-Aware Semantic Synergy Solution for Infrared and Visible Image Fusion [66.33467192279514]
We introduce a text-guided multi-modality image fusion method that leverages the high-level semantics from textual descriptions to integrate semantics from infrared and visible images.
Our method not only produces visually superior fusion results but also achieves a higher detection mAP than existing methods, setting a new state of the art.
arXiv Detail & Related papers (2023-12-31T08:13:47Z)
- Mutual Information-driven Triple Interaction Network for Efficient Image Dehazing [54.168567276280505]
We propose a novel Mutual Information-driven Triple interaction Network (MITNet) for image dehazing.
The first stage, named amplitude-guided haze removal, aims to recover the amplitude spectrum of the hazy images for haze removal.
The second stage, named phase-guided structure refinement, learns the transformation and refinement of the phase spectrum.
arXiv Detail & Related papers (2023-08-14T08:23:58Z)
- Robust Saliency-Aware Distillation for Few-shot Fine-grained Visual Recognition [57.08108545219043]
Recognizing novel sub-categories with scarce samples is an essential and challenging research topic in computer vision.
Existing literature addresses this challenge by employing local-based representation approaches.
This article proposes a novel model, Robust Saliency-aware Distillation (RSaD), for few-shot fine-grained visual recognition.
arXiv Detail & Related papers (2023-05-12T00:13:17Z)
- Multimodal Hyperspectral Image Classification via Interconnected Fusion [12.41850641917384]
An Interconnected Fusion (IF) framework is proposed to explore the relationships across HSI and LiDAR modalities comprehensively.
Experiments have been conducted on three widely used datasets: Trento, MUUFL, and Houston.
arXiv Detail & Related papers (2023-04-02T09:46:13Z)
- FER-former: Multi-modal Transformer for Facial Expression Recognition [14.219492977523682]
A novel multifarious supervision-steering Transformer for Facial Expression Recognition is proposed in this paper.
Our approach features multi-granularity embedding integration, hybrid self-attention scheme, and heterogeneous domain-steering supervision.
Experiments on popular benchmarks demonstrate the superiority of the proposed FER-former over the existing state-of-the-arts.
arXiv Detail & Related papers (2023-03-23T02:29:53Z)
- CLIP-Driven Fine-grained Text-Image Person Re-identification [50.94827165464813]
TIReID aims to retrieve the image corresponding to the given text query from a pool of candidate images.
We propose a CLIP-driven Fine-grained information excavation framework (CFine) to fully utilize the powerful knowledge of CLIP for TIReID.
arXiv Detail & Related papers (2022-10-19T03:43:12Z)
- Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection [65.30079184700755]
This study addresses the issue of fusing infrared and visible images that appear differently for object detection.
Previous approaches discover commonalities underlying the two modalities and fuse in the common space either by iterative optimization or deep networks.
This paper proposes a bilevel optimization formulation for the joint problem of fusion and detection, and then unrolls it into a target-aware Dual Adversarial Learning (TarDAL) network for fusion and a commonly used detection network.
arXiv Detail & Related papers (2022-03-30T11:44:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.