Looking Twice for Partial Clues: Weakly-supervised Part-Mentored
Attention Network for Vehicle Re-Identification
- URL: http://arxiv.org/abs/2107.08228v1
- Date: Sat, 17 Jul 2021 12:19:12 GMT
- Title: Looking Twice for Partial Clues: Weakly-supervised Part-Mentored
Attention Network for Vehicle Re-Identification
- Authors: Lisha Tang, Yi Wang, Lap-Pui Chau
- Abstract summary: We propose a weakly supervised Part-Mentored Attention Network (PMANet) composed of a Part Attention Network (PANet) for vehicle part localization with self-attention and a Part-Mentored Network (PMNet) for mentoring the global and local feature aggregation.
Our approach outperforms recent state-of-the-art methods by an average of 2.63% in CMC@1 on VehicleID and 2.2% in mAP on VeRi776.
- Score: 18.539658212171062
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vehicle re-identification (Re-ID) aims to retrieve images of the same vehicle
across different cameras. Two key challenges lie in the subtle inter-instance
discrepancy caused by near-duplicate identities and the large intra-instance
variance caused by different views. Since the holistic appearance suffers from
viewpoint variation and distortion, part-level feature learning has been
introduced to enhance vehicle description. However, existing approaches to
localize and amplify significant parts often fail to handle spatial
misalignment as well as occlusion and require expensive annotations. In this
paper, we propose a weakly supervised Part-Mentored Attention Network (PMANet)
composed of a Part Attention Network (PANet) for vehicle part localization with
self-attention and a Part-Mentored Network (PMNet) for mentoring the global and
local feature aggregation. Firstly, PANet is introduced to predict a foreground
mask and pinpoint $K$ prominent vehicle parts only with weak identity
supervision. Secondly, we propose a PMNet to learn global and part-level
features with multi-scale attention and aggregate them in $K$ main-partial
tasks via part transfer. Like humans who first differentiate objects with
general information and then observe salient parts for more detailed clues,
PANet and PMNet construct a two-stage attention structure to perform a
coarse-to-fine search among identities. Finally, we address this Re-ID issue as
a multi-task problem, including global feature learning, identity
classification, and part transfer. We adopt Homoscedastic Uncertainty to learn
the optimal weighting of different losses. Comprehensive experiments are
conducted on two benchmark datasets. Our approach outperforms recent
state-of-the-art methods by an average of 2.63% in CMC@1 on VehicleID and 2.2% in
mAP on VeRi776. Results on occluded test sets also demonstrate the
generalization ability of PMANet.
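The abstract mentions learning the loss weights with homoscedastic uncertainty but gives no formula. The following is a minimal sketch of one commonly used simplified form of that idea, in which each task loss L_i is scaled by exp(-s_i) and regularized by s_i, where s_i stands for a learnable log-variance. The function names, the three example loss values, and the exact form of the objective are assumptions for illustration; the paper's actual formulation may differ.

```python
import math


def uncertainty_weighted_loss(losses, log_vars):
    """Combine task losses as sum_i( exp(-s_i) * L_i + s_i ).

    s_i is a learnable log-variance per task (hypothetical parameter name).
    A large s_i down-weights task i but pays a penalty of s_i itself.
    """
    if len(losses) != len(log_vars):
        raise ValueError("losses and log_vars must have the same length")
    return sum(math.exp(-s) * loss + s for loss, s in zip(losses, log_vars))


def grad_log_var(loss, log_var):
    """Derivative of exp(-s) * L + s with respect to s."""
    return 1.0 - math.exp(-log_var) * loss


def optimal_log_var(loss):
    """For a fixed positive loss L, the minimizer of exp(-s) * L + s is s = log(L)."""
    return math.log(loss)


if __name__ == "__main__":
    # Hypothetical values for three task losses (e.g. global, identity, part transfer).
    task_losses = [2.0, 0.5, 1.0]
    equal = uncertainty_weighted_loss(task_losses, [0.0, 0.0, 0.0])
    tuned = uncertainty_weighted_loss(
        task_losses, [optimal_log_var(loss) for loss in task_losses]
    )
    print(equal, tuned)
```

In training, the log-variances would be updated by gradient descent together with the network weights; `grad_log_var` shows the gradient each one would receive for a fixed loss value.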
Related papers
- AdaFPP: Adapt-Focused Bi-Propagating Prototype Learning for Panoramic Activity Recognition [51.24321348668037]
Panoramic Activity Recognition (PAR) aims to identify multi-granularity behaviors performed by multiple persons in panoramic scenes.
Previous methods rely on manually annotated detection boxes in training and inference, hindering further practical deployment.
We propose a novel Adapt-Focused bi-Propagating Prototype learning (AdaFPP) framework to jointly recognize individual, group, and global activities in panoramic activity scenes.
arXiv Detail & Related papers (2024-05-04T01:53:22Z)
- V2X-AHD: Vehicle-to-Everything Cooperation Perception via Asymmetric Heterogenous Distillation Network [13.248981195106069]
We propose a multi-view vehicle-road cooperation perception system, vehicle-to-everything cooperative perception (V2X-AHD).
According to the study, V2X-AHD effectively improves the accuracy of 3D object detection and reduces the number of network parameters.
arXiv Detail & Related papers (2023-10-10T13:12:03Z)
- Learning Cross-modality Information Bottleneck Representation for Heterogeneous Person Re-Identification [61.49219876388174]
Visible-Infrared person re-identification (VI-ReID) is an important and challenging task in intelligent video surveillance.
Existing methods mainly focus on learning a shared feature space to reduce the modality discrepancy between visible and infrared modalities.
We present a novel mutual information and modality consensus network, namely CMInfoNet, to extract modality-invariant identity features.
arXiv Detail & Related papers (2023-08-29T06:55:42Z)
- Multi-query Vehicle Re-identification: Viewpoint-conditioned Network, Unified Dataset and New Metric [30.344288906037345]
We propose a more realistic and easily accessible task, called multi-query vehicle Re-ID.
We design a novel viewpoint-conditioned network (VCNet), which adaptively combines the complementary information from different vehicle viewpoints.
Second, we create a unified benchmark dataset, taken by 6142 cameras from a real-life transportation surveillance system.
Third, we design a new evaluation metric, called mean cross-scene precision (mCSP), which measures the ability of cross-scene recognition.
arXiv Detail & Related papers (2023-05-25T06:22:03Z)
- Domain Camera Adaptation and Collaborative Multiple Feature Clustering for Unsupervised Person Re-ID [5.212394574743209]
Unsupervised person re-identification (re-ID) has drawn much attention due to its open-world scenario settings where limited data is available.
Existing supervised methods often fail to generalize well on unseen domains, while unsupervised methods mostly lack multi-granularity information and are prone to confirmation bias.
In this paper, we aim at finding better feature representations on the unseen target domain from two aspects: 1) performing unsupervised domain adaptation on the labeled source domain and 2) mining potential similarities on the unlabeled target domain.
arXiv Detail & Related papers (2022-08-18T03:56:48Z)
- Towards Generalizable Person Re-identification with a Bi-stream Generative Model [81.0989316825134]
We propose a Bi-stream Generative Model (BGM) to learn fine-grained representations that fuse camera-invariant global features and pedestrian-aligned local features.
Our method outperforms the state-of-the-art methods on the large-scale generalizable re-ID benchmarks.
arXiv Detail & Related papers (2022-06-19T09:18:25Z)
- Quality-aware Part Models for Occluded Person Re-identification [77.24920810798505]
Occlusion poses a major challenge for person re-identification (ReID).
Existing approaches typically rely on outside tools to infer visible body parts, which may be suboptimal in terms of both computational efficiency and ReID accuracy.
We propose a novel method named Quality-aware Part Models (QPM) for occlusion-robust ReID.
arXiv Detail & Related papers (2022-01-01T03:51:09Z)
- Discovering Discriminative Geometric Features with Self-Supervised Attention for Vehicle Re-Identification and Beyond [23.233398760777494]
We are the first to successfully learn discriminative geometric features for vehicle ReID based on self-supervised attention.
We implement an end-to-end trainable deep network architecture consisting of three branches.
We conduct comprehensive experiments on three benchmark datasets for vehicle ReID, i.e., VeRi-776, CityFlow-ReID, and VehicleID, and demonstrate our state-of-the-art performance.
arXiv Detail & Related papers (2020-10-19T04:43:56Z)
- Parsing-based View-aware Embedding Network for Vehicle Re-Identification [138.11983486734576]
We propose a parsing-based view-aware embedding network (PVEN) to achieve the view-aware feature alignment and enhancement for vehicle ReID.
The experiments conducted on three datasets show that our model outperforms state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2020-04-10T13:06:09Z)
- Generative Partial Multi-View Clustering [133.36721417531734]
We propose a generative partial multi-view clustering model, named GP-MVC, to address the incomplete multi-view problem.
First, multi-view encoder networks are trained to learn common low-dimensional representations, followed by a clustering layer to capture the consistent cluster structure across multiple views.
Second, view-specific generative adversarial networks are developed to generate the missing data of one view conditioned on the shared representation given by other views.
arXiv Detail & Related papers (2020-03-29T17:48:27Z)
- Deep Attention Aware Feature Learning for Person Re-Identification [22.107332426681072]
We propose to incorporate the attention learning as additional objectives in a person ReID network without changing the original structure.
We have tested its performance on two typical networks (TriNet and Bag of Tricks) and observed significant performance improvement on five widely used datasets.
arXiv Detail & Related papers (2020-03-01T16:27:14Z)