Unsupervised Person Re-Identification with Multi-Label Learning Guided
Self-Paced Clustering
- URL: http://arxiv.org/abs/2103.04580v1
- Date: Mon, 8 Mar 2021 07:30:13 GMT
- Authors: Qing Li, Xiaojiang Peng, Yu Qiao, Qi Hao
- Abstract summary: Unsupervised person re-identification (Re-ID) has drawn increasing research attention recently.
In this paper, we address unsupervised person Re-ID with a conceptually novel yet simple framework, termed Multi-label Learning guided self-paced Clustering (MLC).
MLC mainly learns discriminative features with three crucial modules, namely a multi-scale network, a multi-label learning module, and a self-paced clustering module.
- Score: 48.31017226618255
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although unsupervised person re-identification (Re-ID) has drawn increasing
research attention recently, it remains challenging to learn discriminative
features without annotations across disjoint camera views. In this paper, we
address unsupervised person Re-ID with a conceptually novel yet simple
framework, termed Multi-label Learning guided self-paced Clustering (MLC).
MLC mainly learns discriminative features with three crucial modules, namely a
multi-scale network, a multi-label learning module, and a self-paced clustering
module. Specifically, the multi-scale network generates multi-granularity
person features in both global and local views. The multi-label learning module
leverages a memory feature bank and assigns each image a multi-label
vector based on its similarities to the bank entries. After
multi-label training for several epochs, the self-paced clustering joins in
training and assigns a pseudo label for each image. The benefits of our MLC
come from three aspects: i) the multi-scale person features enable better
similarity measurement, ii) the multi-label assignment over the whole
dataset ensures that every image contributes to training, and iii) the self-paced
clustering removes some noisy samples for better feature learning. Extensive
experiments on three popular large-scale Re-ID benchmarks demonstrate that our
MLC outperforms previous state-of-the-art methods and significantly improves
the performance of unsupervised person Re-ID.
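To make the two mechanisms in the abstract concrete, the sketch below shows (a) assigning a multi-label vector by thresholding cosine similarities between an image feature and a memory feature bank, and (b) a self-paced selection step that keeps only the samples closest to their cluster centroids, dropping the noisiest ones. This is a minimal illustration: the function names, the similarity threshold, and the keep ratio are assumptions for exposition, not the paper's actual implementation.

```python
import numpy as np

def multi_label_assign(feat, bank, threshold=0.6):
    """Assign a binary multi-label vector to one image feature by
    comparing it against every entry of the memory feature bank."""
    # L2-normalize so the dot product becomes cosine similarity
    feat = feat / np.linalg.norm(feat)
    bank_n = bank / np.linalg.norm(bank, axis=1, keepdims=True)
    sims = bank_n @ feat
    # bank entries whose similarity exceeds the threshold share a label
    return (sims >= threshold).astype(np.float32)

def self_paced_select(features, labels, centroids, keep_ratio=0.8):
    """Self-paced selection: keep the fraction of samples closest to
    their assigned cluster centroid; the rest are treated as noisy."""
    dists = np.linalg.norm(features - centroids[labels], axis=1)
    k = int(len(features) * keep_ratio)
    # easiest (closest) samples are kept for the next training round
    return np.argsort(dists)[:k]
```

In this reading, multi-label training covers the whole dataset (every image gets at least its own label via the bank), while the self-paced step progressively filters cluster assignments, which matches the "every image can be trained" and "removes some noisy samples" claims above.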
Related papers
- Semi-supervised Semantic Segmentation for Remote Sensing Images via Multi-scale Uncertainty Consistency and Cross-Teacher-Student Attention [59.19580789952102]
This paper proposes a novel semi-supervised Multi-Scale Uncertainty and Cross-Teacher-Student Attention (MUCA) model for RS image semantic segmentation tasks.
MUCA constrains the consistency among feature maps at different layers of the network by introducing a multi-scale uncertainty consistency regularization.
MUCA also utilizes a Cross-Teacher-Student attention mechanism to guide the student network toward more discriminative feature representations.
arXiv Detail & Related papers (2025-01-18T11:57:20Z)
- Multi-View Factorizing and Disentangling: A Novel Framework for Incomplete Multi-View Multi-Label Classification [9.905528765058541]
We propose a novel framework for incomplete multi-view multi-label classification (iMvMLC).
Our method factorizes multi-view representations into two independent sets of factors: view-consistent and view-specific.
Our framework innovatively decomposes consistent representation learning into three key sub-objectives.
arXiv Detail & Related papers (2025-01-11T12:19:20Z)
- Towards Generalized Multi-stage Clustering: Multi-view Self-distillation [10.368796552760571]
Existing multi-stage clustering methods independently learn the salient features from multiple views and then perform the clustering task.
This paper proposes a novel multi-stage deep MVC framework where multi-view self-distillation (DistilMVC) is introduced to distill dark knowledge of label distribution.
arXiv Detail & Related papers (2023-10-29T03:35:34Z)
- Reliable Representations Learning for Incomplete Multi-View Partial Multi-Label Classification [78.15629210659516]
In this paper, we propose an incomplete multi-view partial multi-label classification network named RANK.
We break through the view-level weights inherent in existing methods and propose a quality-aware sub-network to dynamically assign quality scores to each view of each sample.
Our model not only handles complete multi-view multi-label datasets but also works on datasets with missing instances and labels.
arXiv Detail & Related papers (2023-03-30T03:09:25Z)
- Object-Aware Self-supervised Multi-Label Learning [9.496981642855769]
We propose an Object-Aware Self-Supervision (OASS) method to obtain more fine-grained representations for multi-label learning.
The proposed method can be leveraged to efficiently generate Class-Specific Instances (CSI) in a proposal-free fashion.
Experiments on the VOC2012 dataset for multi-label classification demonstrate the effectiveness of the proposed method against state-of-the-art counterparts.
arXiv Detail & Related papers (2022-05-14T10:14:08Z)
- Learning Self-Supervised Low-Rank Network for Single-Stage Weakly and Semi-Supervised Semantic Segmentation [119.009033745244]
This paper presents a Self-supervised Low-Rank Network (SLRNet) for single-stage weakly supervised semantic segmentation (WSSS) and semi-supervised semantic segmentation (SSSS).
SLRNet uses cross-view self-supervision, that is, it simultaneously predicts several attentive LR representations from different views of an image to learn precise pseudo-labels.
Experiments on the Pascal VOC 2012, COCO, and L2ID datasets demonstrate that SLRNet outperforms both state-of-the-art WSSS and SSSS methods under a variety of settings.
arXiv Detail & Related papers (2022-03-19T09:19:55Z)
- Multi-level Second-order Few-shot Learning [111.0648869396828]
We propose a Multi-level Second-order (MlSo) few-shot learning network for supervised or unsupervised few-shot image classification and few-shot action recognition.
We leverage so-called power-normalized second-order base learner streams combined with features that express multiple levels of visual abstraction.
We demonstrate respectable results on standard datasets such as Omniglot, mini-ImageNet, tiered-ImageNet, Open MIC, fine-grained datasets such as CUB Birds, Stanford Dogs and Cars, and action recognition datasets such as HMDB51, UCF101, and mini-MIT.
arXiv Detail & Related papers (2022-01-15T19:49:00Z)
- Self-supervised Discriminative Feature Learning for Multi-view Clustering [12.725701189049403]
We propose self-supervised discriminative feature learning for multi-view clustering (SDMVC).
Concretely, deep autoencoders are applied to learn embedded features for each view independently.
Experiments on various types of multi-view datasets show that SDMVC achieves state-of-the-art performance.
arXiv Detail & Related papers (2021-03-28T07:18:39Z)
- Spatial-Temporal Multi-Cue Network for Continuous Sign Language Recognition [141.24314054768922]
We propose a spatial-temporal multi-cue (STMC) network to solve the vision-based sequence learning problem.
To validate its effectiveness, we perform experiments on three large-scale CSLR benchmarks.
arXiv Detail & Related papers (2020-02-08T15:38:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.