Multi-modal Multi-platform Person Re-Identification: Benchmark and Method
- URL: http://arxiv.org/abs/2503.17096v2
- Date: Mon, 24 Mar 2025 03:49:35 GMT
- Title: Multi-modal Multi-platform Person Re-Identification: Benchmark and Method
- Authors: Ruiyang Ha, Songyi Jiang, Bin Li, Bikang Pan, Yihang Zhu, Junjie Zhang, Xiatian Zhu, Shaogang Gong, Jingya Wang,
- Abstract summary: MP-ReID is a novel dataset designed specifically for multi-modality and multi-platform ReID.<n>This benchmark compiles data from 1,930 identities across diverse modalities, including RGB, infrared, and thermal imaging.<n>We introduce Uni-Prompt ReID, a framework with specific-designed prompts, tailored for cross-modality and cross-platform scenarios.
- Score: 58.59888754340054
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conventional person re-identification (ReID) research is often limited to single-modality sensor data from static cameras, which fails to address the complexities of real-world scenarios where multi-modal signals are increasingly prevalent. For instance, consider an urban ReID system integrating stationary RGB cameras, nighttime infrared sensors, and UAVs equipped with dynamic tracking capabilities. Such systems face significant challenges due to variations in camera perspectives, lighting conditions, and sensor modalities, hindering effective person ReID. To address these challenges, we introduce the MP-ReID benchmark, a novel dataset designed specifically for multi-modality and multi-platform ReID. This benchmark uniquely compiles data from 1,930 identities across diverse modalities, including RGB, infrared, and thermal imaging, captured by both UAVs and ground-based cameras in indoor and outdoor environments. Building on this benchmark, we introduce Uni-Prompt ReID, a framework with specific-designed prompts, tailored for cross-modality and cross-platform scenarios. Our method consistently outperforms state-of-the-art approaches, establishing a robust foundation for future research in complex and dynamic ReID environments. Our dataset are available at:https://mp-reid.github.io/.
Related papers
- Towards Global Localization using Multi-Modal Object-Instance Re-Identification [23.764646800085977]
We propose a novel re-identification transformer architecture that integrates multimodal RGB and depth information.
We demonstrate improvements in ReID across scenes that are cluttered or have varying illumination conditions.
We also develop a ReID-based localization framework that enables accurate camera localization and pose identification across different viewpoints.
arXiv Detail & Related papers (2024-09-18T14:15:10Z) - MTMMC: A Large-Scale Real-World Multi-Modal Camera Tracking Benchmark [63.878793340338035]
Multi-target multi-camera tracking is a crucial task that involves identifying and tracking individuals over time using video streams from multiple cameras.
Existing datasets for this task are either synthetically generated or artificially constructed within a controlled camera network setting.
We present MTMMC, a real-world, large-scale dataset that includes long video sequences captured by 16 multi-modal cameras in two different environments.
arXiv Detail & Related papers (2024-03-29T15:08:37Z) - An Open-World, Diverse, Cross-Spatial-Temporal Benchmark for Dynamic Wild Person Re-Identification [58.5877965612088]
Person re-identification (ReID) has made great strides thanks to the data-driven deep learning techniques.
The existing benchmark datasets lack diversity, and models trained on these data cannot generalize well to dynamic wild scenarios.
We develop a new Open-World, Diverse, Cross-Spatial-Temporal dataset named OWD with several distinct features.
arXiv Detail & Related papers (2024-03-22T11:21:51Z) - Cross-Modality Perturbation Synergy Attack for Person Re-identification [66.48494594909123]
Cross-modality person re-identification (ReID) systems are based on RGB images.<n>Main challenge in cross-modality ReID lies in effectively dealing with visual differences between different modalities.<n>Existing attack methods have primarily focused on the characteristics of the visible image modality.<n>This study proposes a universal perturbation attack specifically designed for cross-modality ReID.
arXiv Detail & Related papers (2024-01-18T15:56:23Z) - Bi-directional Adapter for Multi-modal Tracking [67.01179868400229]
We propose a novel multi-modal visual prompt tracking model based on a universal bi-directional adapter.
We develop a simple but effective light feature adapter to transfer modality-specific information from one modality to another.
Our model achieves superior tracking performance in comparison with both the full fine-tuning methods and the prompt learning-based methods.
arXiv Detail & Related papers (2023-12-17T05:27:31Z) - LCPR: A Multi-Scale Attention-Based LiDAR-Camera Fusion Network for
Place Recognition [11.206532393178385]
We present a novel neural network named LCPR for robust multimodal place recognition.
Our method can effectively utilize multi-view camera and LiDAR data to improve the place recognition performance.
arXiv Detail & Related papers (2023-11-06T15:39:48Z) - Dynamic Enhancement Network for Partial Multi-modality Person
Re-identification [52.70235136651996]
We design a novel dynamic enhancement network (DENet), which allows missing arbitrary modalities while maintaining the representation ability of multiple modalities.
Since the missing state might be changeable, we design a dynamic enhancement module, which dynamically enhances modality features according to the missing state in an adaptive manner.
arXiv Detail & Related papers (2023-05-25T06:22:01Z) - Cross Vision-RF Gait Re-identification with Low-cost RGB-D Cameras and
mmWave Radars [15.662787088335618]
This work studies the problem of cross-modal human re-identification (ReID)
We propose the first-of-its-kind vision-RF system for cross-modal multi-person ReID at the same time.
Our proposed system is able to achieve 92.5% top-1 accuracy and 97.5% top-5 accuracy out of 56 volunteers.
arXiv Detail & Related papers (2022-07-16T10:34:25Z) - Cross-Modal Object Tracking: Modality-Aware Representations and A
Unified Benchmark [8.932487291107812]
In many visual systems, visual tracking often bases on RGB image sequences, in which some targets are invalid in low-light conditions.
We propose a new algorithm, which learns the modality-aware target representation to mitigate the appearance gap between RGB and NIR modalities in the tracking process.
We will release the dataset for free academic usage, dataset download link and code will be released soon.
arXiv Detail & Related papers (2021-11-08T03:58:55Z) - Event-based Stereo Visual Odometry [42.77238738150496]
We present a solution to the problem of visual odometry from the data acquired by a stereo event-based camera rig.
We seek to maximize thetemporal consistency of stereo event-based data while using a simple and efficient representation.
arXiv Detail & Related papers (2020-07-30T15:53:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.