Discovering Discriminative Geometric Features with Self-Supervised
Attention for Vehicle Re-Identification and Beyond
- URL: http://arxiv.org/abs/2010.09221v2
- Date: Tue, 19 Jan 2021 06:26:52 GMT
- Title: Discovering Discriminative Geometric Features with Self-Supervised
Attention for Vehicle Re-Identification and Beyond
- Authors: Ming Li, Xinming Huang, Ziming Zhang
- Abstract summary: We are the first to successfully learn discriminative geometric features for vehicle ReID based on self-supervised attention.
We implement an end-to-end trainable deep network architecture consisting of three branches.
We conduct comprehensive experiments on three benchmark datasets for vehicle ReID, i.e. VeRi-776, CityFlow-ReID, and VehicleID, and demonstrate state-of-the-art performance.
- Score: 23.233398760777494
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the literature of vehicle re-identification (ReID), intensive manual
labels such as landmarks, critical parts or semantic segmentation masks are
often required to improve performance. Such extra information helps to
detect local geometric features as part of representation learning for
vehicles. In contrast, in this paper we aim to address the challenge of
automatically learning to detect geometric features as landmarks with no
extra labels. To the best of our knowledge, we are the first to
successfully learn discriminative geometric features for vehicle ReID based on
self-supervised attention. Specifically, we implement an end-to-end trainable
deep network architecture consisting of three branches: (1) a global branch as
the backbone for image feature extraction, (2) an attentional branch for producing
attention masks, and (3) a self-supervised branch for regularizing the
attention learning with rotated images to locate geometric features.
We conduct comprehensive experiments on three benchmark datasets for vehicle
ReID, i.e. VeRi-776, CityFlow-ReID, and VehicleID, and demonstrate
state-of-the-art performance. We also show the
good generalization of our approach to other ReID tasks such as person ReID and
multi-target multi-camera (MTMC) vehicle tracking. Our demo code is
attached in the supplementary file.
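The self-supervised branch regularizes attention by training on rotated copies of each image, so the network must localize orientation-sensitive geometric features to predict the rotation. A minimal NumPy sketch of this rotation pretext task (illustrative only, not the authors' implementation; the function name and toy image are assumptions):

```python
import numpy as np

def make_rotation_batch(image):
    """Build the four rotated views of an image (0, 90, 180, 270 degrees)
    with their rotation labels. In a rotation pretext task, a classifier
    trained to predict these labels is forced to attend to
    orientation-sensitive parts of the object (e.g. wheels, lights)."""
    views = [np.rot90(image, k=k) for k in range(4)]
    labels = np.arange(4)  # 0 -> 0 deg, 1 -> 90 deg, 2 -> 180 deg, 3 -> 270 deg
    return views, labels

# Toy 8x8 single-channel "image", just to illustrate shapes and labels.
img = np.arange(64).reshape(8, 8)
views, targets = make_rotation_batch(img)
```

In the paper's architecture this pretext loss is one of three jointly optimized objectives, alongside the global ReID feature loss and the attention-mask branch.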
Related papers
- Multi-query Vehicle Re-identification: Viewpoint-conditioned Network,
Unified Dataset and New Metric [30.344288906037345]
We propose a more realistic and easily accessible task, called multi-query vehicle Re-ID.
We design a novel viewpoint-conditioned network (VCNet), which adaptively combines the complementary information from different vehicle viewpoints.
Second, we create a unified benchmark dataset, taken by 6142 cameras from a real-life transportation surveillance system.
Third, we design a new evaluation metric, called mean cross-scene precision (mCSP), which measures the ability of cross-scene recognition.
arXiv Detail & Related papers (2023-05-25T06:22:03Z) - ConMAE: Contour Guided MAE for Unsupervised Vehicle Re-Identification [8.950873153831735]
Considering that the Masked Autoencoder (MAE) has shown excellent performance in self-supervised learning, this work designs a Contour Guided Masked Autoencoder for Unsupervised Vehicle Re-Identification (ConMAE).
arXiv Detail & Related papers (2023-02-11T12:10:25Z) - Unleash the Potential of Image Branch for Cross-modal 3D Object
Detection [67.94357336206136]
We present a new cross-modal 3D object detector, namely UPIDet, which aims to unleash the potential of the image branch from two aspects.
First, UPIDet introduces a new 2D auxiliary task called normalized local coordinate map estimation.
Second, we discover that the representational capability of the point cloud backbone can be enhanced through the gradients backpropagated from the training objectives of the image branch.
arXiv Detail & Related papers (2023-01-22T08:26:58Z) - JPerceiver: Joint Perception Network for Depth, Pose and Layout
Estimation in Driving Scenes [75.20435924081585]
JPerceiver can simultaneously estimate scale-aware depth and VO as well as BEV layout from a monocular video sequence.
It exploits the cross-view geometric transformation (CGT) to propagate the absolute scale from the road layout to depth and VO.
Experiments on Argoverse, Nuscenes and KITTI show the superiority of JPerceiver over existing methods on all the above three tasks.
arXiv Detail & Related papers (2022-07-16T10:33:59Z) - Looking Twice for Partial Clues: Weakly-supervised Part-Mentored
Attention Network for Vehicle Re-Identification [18.539658212171062]
We propose a Part-Mentored Attention Network (PMANet) for vehicle part localization with self-attention, together with a Part-Mentored Network (PMNet) for mentoring the global and local feature aggregation.
Our approach outperforms recent state-of-the-art methods by an average of 2.63% in CMC@1 on VehicleID and 2.2% in mAP on VeRi776.
arXiv Detail & Related papers (2021-07-17T12:19:12Z) - Pluggable Weakly-Supervised Cross-View Learning for Accurate Vehicle
Re-Identification [53.6218051770131]
Cross-view consistent feature representation is key for accurate vehicle ReID.
Existing approaches resort to supervised cross-view learning using extensive extra viewpoints annotations.
We present a pluggable Weakly-supervised Cross-View Learning (WCVL) module for vehicle ReID.
arXiv Detail & Related papers (2021-03-09T11:51:09Z) - Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data
Augmentation [77.60050239225086]
We propose an effective training data generation process by fitting a 3D car model with dynamic parts to vehicles in real images.
Our approach is fully automatic without any human interaction.
We present a multi-task network for VUS parsing and a multi-stream network for VHI parsing.
arXiv Detail & Related papers (2020-12-15T03:03:38Z) - Improving Point Cloud Semantic Segmentation by Learning 3D Object
Detection [102.62963605429508]
Point cloud semantic segmentation plays an essential role in autonomous driving.
Current 3D semantic segmentation networks focus on convolutional architectures that perform well on well-represented classes.
We propose a novel Detection Aware 3D Semantic Segmentation (DASS) framework that explicitly leverages localization features from an auxiliary 3D object detection task.
arXiv Detail & Related papers (2020-09-22T14:17:40Z) - Orientation-aware Vehicle Re-identification with Semantics-guided Part
Attention Network [33.712450134663236]
We propose a dedicated Semantics-guided Part Attention Network (SPAN) to robustly predict part attention masks for different views of vehicles.
With the help of part attention masks, we can extract discriminative features in each part separately.
Then we introduce Co-occurrence Part-attentive Distance Metric (CPDM) which places greater emphasis on co-occurrence vehicle parts.
arXiv Detail & Related papers (2020-08-26T07:33:09Z) - Parsing-based View-aware Embedding Network for Vehicle Re-Identification [138.11983486734576]
We propose a parsing-based view-aware embedding network (PVEN) to achieve the view-aware feature alignment and enhancement for vehicle ReID.
The experiments conducted on three datasets show that our model outperforms state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2020-04-10T13:06:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.