Uncertainty-Aware Multi-Shot Knowledge Distillation for Image-Based
Object Re-Identification
- URL: http://arxiv.org/abs/2001.05197v2
- Date: Tue, 21 Jan 2020 17:21:07 GMT
- Title: Uncertainty-Aware Multi-Shot Knowledge Distillation for Image-Based
Object Re-Identification
- Authors: Xin Jin, Cuiling Lan, Wenjun Zeng, Zhibo Chen
- Abstract summary: We propose exploiting the multi-shots of the same identity to guide the feature learning of each individual image.
It consists of a teacher network (T-net) that learns the comprehensive features from multiple images of the same object, and a student network (S-net) that takes a single image as input.
We validate the effectiveness of our approach on the popular vehicle re-id and person re-id datasets.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object re-identification (re-id) aims to identify a specific object across
times or camera views, with the person re-id and vehicle re-id as the most
widely studied applications. Re-id is challenging because of the variations in
viewpoints, (human) poses, and occlusions. Multi-shots of the same object can
cover diverse viewpoints/poses and thus provide more comprehensive information.
In this paper, we propose exploiting the multi-shots of the same identity to
guide the feature learning of each individual image. Specifically, we design an
Uncertainty-aware Multi-shot Teacher-Student (UMTS) Network. It consists of a
teacher network (T-net) that learns the comprehensive features from multiple
images of the same object, and a student network (S-net) that takes a single
image as input. In particular, we take into account the data dependent
heteroscedastic uncertainty for effectively transferring the knowledge from the
T-net to S-net. To the best of our knowledge, we are the first to make use of
multi-shots of an object in a teacher-student learning manner for effectively
boosting the single image based re-id. We validate the effectiveness of our
approach on the popular vehicle re-id and person re-id datasets. At inference,
the S-net alone significantly outperforms the baselines and achieves
state-of-the-art performance.
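The abstract does not spell out how the data-dependent heteroscedastic uncertainty enters the transfer loss, but a common formulation of uncertainty-weighted feature distillation down-weights the teacher-student feature distance by a predicted per-sample log-variance and penalizes large predicted variance. The following is a minimal sketch of that idea; the function name, the NumPy implementation, and the assumption that the network predicts a per-sample log-variance are illustrative, not the authors' actual code:

```python
import numpy as np

def uncertainty_distillation_loss(f_teacher, f_student, log_var):
    """Heteroscedastic-uncertainty-weighted distillation loss (sketch).

    f_teacher, f_student: (N, D) feature matrices from T-net and S-net.
    log_var: (N,) predicted per-sample log-variance.

    The exp(-log_var) factor shrinks the penalty on samples the model
    deems uncertain; the +log_var term keeps it from declaring every
    sample maximally uncertain to drive the first term to zero.
    """
    sq_dist = np.sum((f_teacher - f_student) ** 2, axis=1)  # per-sample squared distance
    return float(np.mean(np.exp(-log_var) * sq_dist + log_var))
```

With `log_var = 0` this reduces to the plain mean squared feature distance, so the uncertainty weighting can be seen as a learned, per-sample relaxation of ordinary feature distillation.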
Related papers
- Synthesizing Efficient Data with Diffusion Models for Person Re-Identification Pre-Training [51.87027943520492]
We present a novel paradigm Diffusion-ReID to efficiently augment and generate diverse images based on known identities.
Benefiting from our proposed paradigm, we first create a new large-scale person Re-ID dataset Diff-Person, which consists of over 777K images from 5,183 identities.
arXiv Detail & Related papers (2024-06-10T06:26:03Z)
- Learning Transferable Pedestrian Representation from Multimodal Information Supervision [174.5150760804929]
VAL-PAT is a novel framework that learns transferable representations to enhance various pedestrian analysis tasks with multimodal information.
We first perform pre-training on LUPerson-TA dataset, where each image contains text and attribute annotations.
We then transfer the learned representations to various downstream tasks, including person reID, person attribute recognition and text-based person search.
arXiv Detail & Related papers (2023-04-12T01:20:58Z)
- Learning Invariance from Generated Variance for Unsupervised Person Re-identification [15.096776375794356]
We propose to replace traditional data augmentation with a generative adversarial network (GAN).
A 3D mesh guided person image generator is proposed to disentangle a person image into id-related and id-unrelated features.
By jointly training the generative and the contrastive modules, our method achieves new state-of-the-art unsupervised person ReID performance on mainstream large-scale benchmarks.
arXiv Detail & Related papers (2023-01-02T15:40:14Z)
- Feature Disentanglement Learning with Switching and Aggregation for Video-based Person Re-Identification [9.068045610800667]
In video person re-identification (Re-ID), the network must consistently extract features of the target person from successive frames.
Existing methods tend to focus only on how to use temporal information, which often leads to networks being fooled by similar appearances and identical backgrounds.
We propose a Disentanglement and Switching and Aggregation Network (DSANet), which segregates the features representing identity and features based on camera characteristics, and pays more attention to ID information.
arXiv Detail & Related papers (2022-12-16T04:27:56Z)
- Semantic-Aware Generation for Self-Supervised Visual Representation Learning [116.5814634936371]
We advocate for Semantic-aware Generation (SaGe) to facilitate richer semantics rather than details to be preserved in the generated image.
SaGe complements the target network with view-specific features and thus alleviates the semantic degradation brought by intensive data augmentations.
We execute SaGe on ImageNet-1K and evaluate the pre-trained models on five downstream tasks, including nearest neighbor test, linear classification, and fine-grained image recognition.
arXiv Detail & Related papers (2021-11-25T16:46:13Z)
- Pose-driven Attention-guided Image Generation for Person Re-Identification [39.605062525247135]
We propose an end-to-end pose-driven generative adversarial network to generate multiple poses of a person.
A semantic-consistency loss is proposed to preserve the semantic information of the person during pose transfer.
We show that by incorporating the proposed approach in a person re-identification framework, realistic pose transferred images and state-of-the-art re-identification results can be achieved.
arXiv Detail & Related papers (2021-04-28T14:02:24Z)
- Person image generation with semantic attention network for person re-identification [9.30413920076019]
We propose a novel person pose-guided image generation method, which is called the semantic attention network.
The network consists of several semantic attention blocks, where each block attends to preserve and update the pose code and the clothing textures.
Compared with other methods, our network better characterizes body shape while simultaneously preserving clothing attributes.
arXiv Detail & Related papers (2020-08-18T12:18:51Z)
- Exploit Clues from Views: Self-Supervised and Regularized Learning for Multiview Object Recognition [66.87417785210772]
This work investigates the problem of multiview self-supervised learning (MV-SSL).
A novel surrogate task for self-supervised learning is proposed by pursuing "object invariant" representation.
Experiments show that the recognition and retrieval results using view-invariant prototype embedding (VISPE) outperform other self-supervised learning methods.
arXiv Detail & Related papers (2020-03-28T07:06:06Z)
- Intra-Camera Supervised Person Re-Identification [87.88852321309433]
We propose a novel person re-identification paradigm based on an idea of independent per-camera identity annotation.
This eliminates the most time-consuming and tedious inter-camera identity labelling process.
We formulate a Multi-tAsk mulTi-labEl (MATE) deep learning method for Intra-Camera Supervised (ICS) person re-id.
arXiv Detail & Related papers (2020-02-12T15:26:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.