Camera Alignment and Weighted Contrastive Learning for Domain Adaptation
in Video Person ReID
- URL: http://arxiv.org/abs/2211.03626v1
- Date: Mon, 7 Nov 2022 15:32:56 GMT
- Title: Camera Alignment and Weighted Contrastive Learning for Domain Adaptation
in Video Person ReID
- Authors: Djebril Mekhazni, Maximilien Dufau, Christian Desrosiers, Marco
Pedersoli, Eric Granger
- Abstract summary: Systems for person re-identification (ReID) can achieve a high accuracy when trained on large fully-labeled image datasets.
The domain shift associated with diverse operational capture conditions (e.g., camera viewpoints and lighting) may translate to a significant decline in performance.
This paper focuses on unsupervised domain adaptation (UDA) for video-based ReID.
- Score: 17.90248359024435
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Systems for person re-identification (ReID) can achieve a high accuracy when
trained on large fully-labeled image datasets. However, the domain shift
typically associated with diverse operational capture conditions (e.g., camera
viewpoints and lighting) may translate to a significant decline in performance.
This paper focuses on unsupervised domain adaptation (UDA) for video-based ReID,
a relevant scenario that is less explored in the literature. In this
scenario, the ReID model must adapt to a complex target domain defined by a
network of diverse video cameras based on tracklet information. State-of-the-art
methods cluster unlabeled target data, yet domain shifts across target cameras
(sub-domains) can lead to poor initialization of clustering methods that
propagates noise across epochs, thus preventing the ReID model from accurately
associating samples of the same identity. In this paper, a UDA method is introduced
for video person ReID that leverages knowledge of video tracklets, and of the
distribution of frames captured over target cameras to improve the performance
of CNN backbones trained using pseudo-labels. Our method relies on an
adversarial approach, where a camera-discriminator network is introduced to
extract discriminant camera-independent representations, facilitating the
subsequent clustering. In addition, a weighted contrastive loss is proposed to
leverage the confidence of clusters, and mitigate the risk of incorrect
identity associations. Experimental results obtained on three challenging
video-based person ReID datasets - PRID2011, iLIDS-VID, and MARS - indicate
that our proposed method can outperform related state-of-the-art methods. Our
code is available at: https://github.com/dmekhazni/CAWCL-ReID
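
The abstract does not give the form of the weighted contrastive loss, so the following is only a minimal sketch of the general idea: a cluster-level contrastive (InfoNCE-style) loss in which each sample's term is scaled by a confidence weight for its pseudo-label. The function name, the use of cluster centroids as contrast targets, the temperature value, and the way weights are normalized are all assumptions made for illustration, not the authors' formulation.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def weighted_contrastive_loss(features, pseudo_labels, centroids, weights,
                              temperature=0.1):
    """Hypothetical confidence-weighted contrastive loss (illustrative only).

    features      : list of L2-normalized feature vectors, one per sample
    pseudo_labels : cluster index assigned to each sample by clustering
    centroids     : list of L2-normalized cluster centroids
    weights       : per-sample confidence in the pseudo-label (assumed >= 0)
    """
    total, weight_sum = 0.0, 0.0
    for f, y, w in zip(features, pseudo_labels, weights):
        # Similarity of the sample to every cluster centroid.
        logits = [dot(f, c) / temperature for c in centroids]
        # Numerically stable log-sum-exp over all clusters.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy toward the assigned cluster, scaled by confidence.
        total += w * (log_denom - logits[y])
        weight_sum += w
    # Normalize by total weight; return 0 when no sample carries weight.
    return total / weight_sum if weight_sum > 0 else 0.0


# Toy example: the third sample sits on cluster 0 but is pseudo-labeled 1.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
pseudo_labels = [0, 1, 1]
centroids = [[1.0, 0.0], [0.0, 1.0]]

uniform = weighted_contrastive_loss(features, pseudo_labels, centroids,
                                    [1.0, 1.0, 1.0])
downweighted = weighted_contrastive_loss(features, pseudo_labels, centroids,
                                         [1.0, 1.0, 0.1])
print(uniform > downweighted)  # the suspect sample contributes less
```

In this toy setting, lowering the weight of the likely mislabeled sample reduces its share of the loss, which is the intuition behind using cluster confidence to limit the effect of incorrect identity associations.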
Related papers
- Adaptive Face Recognition Using Adversarial Information Network [57.29464116557734]
Face recognition models often degenerate when training data are different from testing data.
We propose a novel adversarial information network (AIN) to address it.
arXiv Detail & Related papers (2023-05-23T02:14:11Z)
- Simplifying Open-Set Video Domain Adaptation with Contrastive Learning [16.72734794723157]
Unsupervised video domain adaptation methods have been proposed to adapt a predictive model from a labelled dataset to an unlabelled dataset.
We address a more realistic scenario, called open-set video domain adaptation (OUVDA), where the target dataset contains "unknown" semantic categories that are not shared with the source.
We propose a video-oriented temporal contrastive loss that enables our method to better cluster the feature space by exploiting the freely available temporal information in video data.
arXiv Detail & Related papers (2023-01-09T13:16:50Z)
- Camera-Tracklet-Aware Contrastive Learning for Unsupervised Vehicle Re-Identification [4.5471611558189124]
We propose camera-tracklet-aware contrastive learning (CTACL) using the multi-camera tracklet information without vehicle identity labels.
The proposed CTACL divides an unlabelled domain, i.e., entire vehicle images, into multiple camera-level images and conducts contrastive learning.
We demonstrate the effectiveness of our approach on video-based and image-based vehicle Re-ID datasets.
arXiv Detail & Related papers (2021-09-14T02:12:54Z)
- Joint Noise-Tolerant Learning and Meta Camera Shift Adaptation for Unsupervised Person Re-Identification [60.36551512902312]
Unsupervised person re-identification (re-ID) aims to learn discriminative models with unlabeled data.
One popular method is to obtain pseudo-labels by clustering and use them to optimize the model.
In this paper, we propose a unified framework to solve both problems.
arXiv Detail & Related papers (2021-03-08T09:13:06Z)
- Learn by Guessing: Multi-Step Pseudo-Label Refinement for Person Re-Identification [0.0]
A promising approach relies on the use of unsupervised learning as part of the pipeline.
In this work, we propose a multi-step pseudo-label refinement method to select the best possible clusters.
We surpass the state of the art for UDA Re-ID by 3.4% on the Market1501-DukeMTMC datasets.
arXiv Detail & Related papers (2021-01-04T20:00:33Z)
- Camera-aware Proxies for Unsupervised Person Re-Identification [60.26031011794513]
This paper tackles the purely unsupervised person re-identification (Re-ID) problem that requires no annotations.
We propose to split each single cluster into multiple proxies and each proxy represents the instances coming from the same camera.
Based on the camera-aware proxies, we design both intra- and inter-camera contrastive learning components for our Re-ID model.
arXiv Detail & Related papers (2020-12-19T12:37:04Z)
- Identity-Aware Attribute Recognition via Real-Time Distributed Inference in Mobile Edge Clouds [53.07042574352251]
We design novel models for pedestrian attribute recognition with re-ID in an MEC-enabled camera monitoring system.
We propose a novel inference framework with a set of distributed modules, by jointly considering the attribute recognition and person re-ID.
We then devise a learning-based algorithm for the distributions of the modules of the proposed distributed inference framework.
arXiv Detail & Related papers (2020-08-12T12:03:27Z)
- Unsupervised Learning of Video Representations via Dense Trajectory Clustering [86.45054867170795]
This paper addresses the task of unsupervised learning of representations for action recognition in videos.
We first propose to adapt two top-performing objectives in this class: instance recognition and local aggregation.
We observe promising performance, but qualitative analysis shows that the learned representations fail to capture motion patterns.
arXiv Detail & Related papers (2020-06-28T22:23:03Z)
- High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification [84.43394420267794]
We propose a novel framework by learning high-order relation and topology information for discriminative features and robust alignment.
Our framework significantly outperforms the state of the art by 6.5% mAP on the Occluded-Duke dataset.
arXiv Detail & Related papers (2020-03-18T12:18:35Z)
- Dual-Triplet Metric Learning for Unsupervised Domain Adaptation in Video-Based Face Recognition [8.220945563455848]
A new deep domain adaptation (DA) method is proposed to adapt the CNN embedding of a Siamese network using unlabeled tracklets captured with new video cameras.
The proposed metric learning technique is used to train deep Siamese networks under different training scenarios.
arXiv Detail & Related papers (2020-02-11T05:06:30Z)