Large-Scale Unsupervised Person Re-Identification with Contrastive
Learning
- URL: http://arxiv.org/abs/2105.07914v1
- Date: Mon, 17 May 2021 14:55:08 GMT
- Title: Large-Scale Unsupervised Person Re-Identification with Contrastive
Learning
- Authors: Weiquan Huang, Yan Bai, Qiuyu Ren, Xinbo Zhao, Ming Feng and Yin Wang
- Abstract summary: Most existing unsupervised and domain adaptation ReID methods utilize only the public datasets in their experiments.
Inspired by the recent progress of large-scale self-supervised image classification using contrastive learning, we propose to learn ReID representation from large-scale unlabeled surveillance video alone.
- Score: 17.04597303816259
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing public person Re-Identification~(ReID) datasets are small in modern
terms because of labeling difficulty. Although unlabeled surveillance video is
abundant and relatively easy to obtain, it is unclear how to leverage this
footage to learn meaningful ReID representations. In particular, most existing
unsupervised and domain adaptation ReID methods utilize only the public
datasets in their experiments, with labels removed. In addition, due to small
data sizes, these methods usually rely on fine-tuning on the unlabeled training
data in the testing domain to achieve good performance. Inspired by the recent
progress of large-scale self-supervised image classification using contrastive
learning, we propose to learn ReID representation from large-scale unlabeled
surveillance video alone. Assisted by off-the-shelf pedestrian detection tools,
we apply the contrastive loss at both the image and the tracklet levels.
Together with a principal component analysis step that uses freely available
camera labels, our evaluation on a large-scale unlabeled dataset shows far
superior performance among unsupervised methods that do not use any training
data in the testing domain. Furthermore, the accuracy improves with the data
size and therefore our method has great potential with even larger and more
diversified datasets.
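The image- and tracklet-level contrastive losses mentioned in the abstract are not spelled out on this page; the following is a minimal, hedged sketch in PyTorch of an InfoNCE-style objective in which two augmented crops of the same detection (image level) or two frames sampled from the same tracklet (tracklet level) form the positive pair. Function and variable names are illustrative assumptions, not the authors' code, and the camera-aware PCA step is omitted because the abstract does not specify how the camera labels enter the projection.

# Minimal sketch, not the authors' implementation: an InfoNCE-style loss where
# row i of `anchors` and row i of `positives` come from the same detection
# (image level: two augmented crops) or the same tracklet (tracklet level:
# two different frames). All other rows in the batch act as negatives.
import torch
import torch.nn.functional as F

def info_nce_loss(anchors: torch.Tensor, positives: torch.Tensor,
                  temperature: float = 0.07) -> torch.Tensor:
    """anchors, positives: (N, D) embeddings from the ReID backbone."""
    a = F.normalize(anchors, dim=1)
    p = F.normalize(positives, dim=1)
    logits = a @ p.t() / temperature                    # (N, N) cosine similarities
    targets = torch.arange(a.size(0), device=a.device)  # diagonal entries are the positives
    return F.cross_entropy(logits, targets)

# Usage sketch: feats_q from one crop/frame, feats_k from its paired crop/frame.
# loss = info_nce_loss(feats_q, feats_k)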
Related papers
- Weakly Semi-supervised Tool Detection in Minimally Invasive Surgery
Videos [11.61305113932032]
Surgical tool detection is essential for analyzing and evaluating minimally invasive surgery videos.
Large image datasets with instance-level labels are often limited because of the burden of annotation.
In this work, we propose to strike a balance between the extremely costly annotation burden and detection performance.
arXiv Detail & Related papers (2024-01-05T13:05:02Z)
- Domain Adaptive Multiple Instance Learning for Instance-level Prediction of Pathological Images [45.132775668689604]
We propose a new task setting to improve the classification performance of the target dataset without increasing annotation costs.
In order to combine the supervisory information of both methods effectively, we propose a method to create pseudo-labels with high confidence.
arXiv Detail & Related papers (2023-04-07T08:31:06Z)
- Debiased Pseudo Labeling in Self-Training [77.83549261035277]
Deep neural networks achieve remarkable performances on a wide range of tasks with the aid of large-scale labeled datasets.
To mitigate the requirement for labeled data, self-training is widely used in both academia and industry by pseudo labeling on readily-available unlabeled data.
We propose Debiased, in which the generation and utilization of pseudo labels are decoupled by two independent heads.
arXiv Detail & Related papers (2022-02-15T02:14:33Z)
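The Debiased entry above mentions decoupling the generation and utilization of pseudo labels with two independent heads; below is a hedged sketch of that idea (module and variable names are assumptions, not the paper's code): one head only produces pseudo labels, while a second head is the one actually optimized on them.

# Hedged sketch of decoupled pseudo-labeling (illustrative, not the paper's code):
# `pl_head` only generates pseudo labels; `cls_head` is the head trained on them,
# so errors in pseudo-label utilization do not feed back into pseudo-label generation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecoupledHeads(nn.Module):
    def __init__(self, backbone: nn.Module, feat_dim: int, num_classes: int):
        super().__init__()
        self.backbone = backbone
        self.pl_head = nn.Linear(feat_dim, num_classes)   # pseudo-label generation
        self.cls_head = nn.Linear(feat_dim, num_classes)  # pseudo-label utilization

    def forward(self, x):
        feats = self.backbone(x)
        return self.pl_head(feats), self.cls_head(feats)

def unlabeled_loss(model: DecoupledHeads, x_unlabeled, threshold: float = 0.95):
    pl_logits, cls_logits = model(x_unlabeled)
    with torch.no_grad():                                 # pseudo labels carry no gradient
        probs = F.softmax(pl_logits, dim=1)
        conf, pseudo = probs.max(dim=1)
        mask = (conf >= threshold).float()                # keep only confident predictions
    per_sample = F.cross_entropy(cls_logits, pseudo, reduction="none")
    return (per_sample * mask).mean()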
- Semi-weakly Supervised Contrastive Representation Learning for Retinal Fundus Images [0.2538209532048867]
We propose a semi-weakly supervised contrastive learning framework for representation learning using semi-weakly annotated images.
We empirically validate the transfer learning performance of SWCL on seven public retinal fundus datasets.
arXiv Detail & Related papers (2021-08-04T15:50:09Z)
- Unsupervised Noisy Tracklet Person Re-identification [100.85530419892333]
We present a novel selective tracklet learning (STL) approach that can train discriminative person re-id models from unlabelled tracklet data.
This avoids the tedious and costly process of exhaustively labelling person image/tracklet true matching pairs across camera views.
Our method is particularly robust against arbitrary noise in raw tracklets and is therefore scalable to learning discriminative models from unconstrained tracking data.
arXiv Detail & Related papers (2021-01-16T07:31:00Z)
- Labelling unlabelled videos from scratch with multi-modal self-supervision [82.60652426371936]
Unsupervised labelling of a video dataset does not come for free from strong feature encoders.
We propose a novel clustering method that allows pseudo-labelling of a video dataset without any human annotations.
An extensive analysis shows that the resulting clusters have high semantic overlap to ground truth human labels.
arXiv Detail & Related papers (2020-06-24T12:28:17Z)
- Don't Wait, Just Weight: Improving Unsupervised Representations by Learning Goal-Driven Instance Weights [92.16372657233394]
Self-supervised learning techniques can boost performance by learning useful representations from unlabelled data.
We show that by learning Bayesian instance weights for the unlabelled data, we can improve the downstream classification accuracy.
Our method, BetaDataWeighter, is evaluated using the popular self-supervised rotation prediction task on STL-10 and Visual Decathlon.
arXiv Detail & Related papers (2020-06-22T15:59:32Z)
- Laplacian Denoising Autoencoder [114.21219514831343]
We propose to learn data representations with a novel type of denoising autoencoder.
The noisy input data is generated by corrupting latent clean data in the gradient domain.
Experiments on several visual benchmarks demonstrate that better representations can be learned with the proposed approach.
arXiv Detail & Related papers (2020-03-30T16:52:39Z)
- Evolving Losses for Unsupervised Video Representation Learning [91.2683362199263]
We present a new method to learn video representations from large-scale unlabeled video data.
The proposed unsupervised representation learning results in a single RGB network and outperforms previous methods.
arXiv Detail & Related papers (2020-02-26T16:56:07Z)