Object Re-identification via Spatial-temporal Fusion Networks and Causal Identity Matching
- URL: http://arxiv.org/abs/2408.05558v2
- Date: Thu, 22 Aug 2024 11:25:31 GMT
- Title: Object Re-identification via Spatial-temporal Fusion Networks and Causal Identity Matching
- Authors: Hye-Geun Kim, Yong-Hyuk Moon, Yeong-Jun Cho,
- Abstract summary: We introduce a novel ReID framework that leverages a spatial-temporal fusion network and causal identity matching (CIM)
Our framework estimates camera network topology using a proposed adaptive Parzen window and combines appearance features with spatial-temporal cues within the fusion network.
This approach has demonstrated outstanding performance across several datasets, including VeRi776, Vehicle-3I, and Market-1501, achieving up to 99.70% rank-1 accuracy and 95.5% mAP.
- Score: 4.123763595394021
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object re-identification (ReID) in large camera networks faces numerous challenges. First, the similar appearances of objects degrade ReID performance, a challenge that needs to be addressed by existing appearance-based ReID methods. Second, most ReID studies are performed in laboratory settings and do not consider real-world scenarios. To overcome these challenges, we introduce a novel ReID framework that leverages a spatial-temporal fusion network and causal identity matching (CIM). Our framework estimates camera network topology using a proposed adaptive Parzen window and combines appearance features with spatial-temporal cues within the fusion network. This approach has demonstrated outstanding performance across several datasets, including VeRi776, Vehicle-3I, and Market-1501, achieving up to 99.70% rank-1 accuracy and 95.5% mAP. Furthermore, the proposed CIM approach, which dynamically assigns gallery sets based on camera network topology, has further improved ReID accuracy and robustness in real-world settings, evidenced by a 94.95% mAP and a 95.19% F1 score on the Vehicle-3I dataset. The experimental results support the effectiveness of incorporating spatial-temporal information and CIM for real-world ReID scenarios, regardless of the data domain (e.g., vehicle, person).
Related papers
- Towards Global Localization using Multi-Modal Object-Instance Re-Identification [23.764646800085977]
We propose a novel re-identification transformer architecture that integrates multimodal RGB and depth information.
We demonstrate improvements in ReID across scenes that are cluttered or have varying illumination conditions.
We also develop a ReID-based localization framework that enables accurate camera localization and pose identification across different viewpoints.
arXiv Detail & Related papers (2024-09-18T14:15:10Z) - Robust Ensemble Person Re-Identification via Orthogonal Fusion with Occlusion Handling [4.431087385310259]
Occlusion remains one of the major challenges in person reidentification (ReID)
We propose a deep ensemble model that harnesses both CNN and Transformer architectures to generate robust feature representations.
arXiv Detail & Related papers (2024-03-29T18:38:59Z) - Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z) - Spatial-temporal Vehicle Re-identification [3.7748602100709534]
We propose a spatial-temporal vehicle ReID framework that estimates reliable camera network topology.
Based on the proposed methods, we performed superior performance on the public dataset (VeRi776) by 99.64% of rank-1 accuracy.
arXiv Detail & Related papers (2023-09-03T13:07:38Z) - Spatial-Temporal Graph Enhanced DETR Towards Multi-Frame 3D Object Detection [54.041049052843604]
We present STEMD, a novel end-to-end framework that enhances the DETR-like paradigm for multi-frame 3D object detection.
First, to model the inter-object spatial interaction and complex temporal dependencies, we introduce the spatial-temporal graph attention network.
Finally, it poses a challenge for the network to distinguish between the positive query and other highly similar queries that are not the best match.
arXiv Detail & Related papers (2023-07-01T13:53:14Z) - Object Re-Identification from Point Clouds [3.6308236424346694]
We provide the first large-scale study of object ReID from point clouds and establish its performance relative to image ReID.
To our knowledge, we are the first to study object re-identification from real point cloud observations.
arXiv Detail & Related papers (2023-05-17T13:43:03Z) - Fusing Local Similarities for Retrieval-based 3D Orientation Estimation
of Unseen Objects [70.49392581592089]
We tackle the task of estimating the 3D orientation of previously-unseen objects from monocular images.
We follow a retrieval-based strategy and prevent the network from learning object-specific features.
Our experiments on the LineMOD, LineMOD-Occluded, and T-LESS datasets show that our method yields a significantly better generalization to unseen objects than previous works.
arXiv Detail & Related papers (2022-03-16T08:53:00Z) - PSE-Match: A Viewpoint-free Place Recognition Method with Parallel
Semantic Embedding [9.265785042748158]
PSE-Match is a viewpoint-free place recognition method based on parallel semantic analysis of isolated semantic attributes from 3D point-cloud models.
PSE-Match incorporates a divergence place learning network to capture different semantic attributes parallelly through the spherical harmonics domain.
arXiv Detail & Related papers (2021-08-01T22:16:40Z) - Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z) - Identity-Aware Attribute Recognition via Real-Time Distributed Inference
in Mobile Edge Clouds [53.07042574352251]
We design novel models for pedestrian attribute recognition with re-ID in an MEC-enabled camera monitoring system.
We propose a novel inference framework with a set of distributed modules, by jointly considering the attribute recognition and person re-ID.
We then devise a learning-based algorithm for the distributions of the modules of the proposed distributed inference framework.
arXiv Detail & Related papers (2020-08-12T12:03:27Z) - Dynamic Refinement Network for Oriented and Densely Packed Object
Detection [75.29088991850958]
We present a dynamic refinement network that consists of two novel components, i.e., a feature selection module (FSM) and a dynamic refinement head (DRH)
Our FSM enables neurons to adjust receptive fields in accordance with the shapes and orientations of target objects, whereas the DRH empowers our model to refine the prediction dynamically in an object-aware manner.
We perform quantitative evaluations on several publicly available benchmarks including DOTA, HRSC2016, SKU110K, and our own SKU110K-R dataset.
arXiv Detail & Related papers (2020-05-20T11:35:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.