Multi-view Information Integration and Propagation for Occluded Person Re-identification
- URL: http://arxiv.org/abs/2311.03828v3
- Date: Thu, 14 Dec 2023 04:34:37 GMT
- Title: Multi-view Information Integration and Propagation for Occluded Person Re-identification
- Authors: Neng Dong, Shuanglin Yan, Hao Tang, Jinhui Tang, Liyan Zhang
- Abstract summary: Occluded person re-identification (re-ID) presents a challenging task due to occlusion perturbations.
Most current solutions only capture information from a single image, disregarding the rich complementary information available in multiple images depicting the same pedestrian.
We propose a novel framework called Multi-view Information Integration and Propagation (MVI$^{2}$P).
- Score: 36.91680117072686
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Occluded person re-identification (re-ID) presents a challenging task due to
occlusion perturbations. Although great efforts have been made to prevent the
model from being disturbed by occlusion noise, most current solutions only
capture information from a single image, disregarding the rich complementary
information available in multiple images depicting the same pedestrian. In this
paper, we propose a novel framework called Multi-view Information Integration
and Propagation (MVI$^{2}$P). Specifically, recognizing the potential of
multi-view images to effectively characterize the occluded target pedestrian,
we integrate their feature maps to create a comprehensive representation.
During this process, to avoid introducing occlusion noise, we develop a
CAMs-aware Localization module that selectively integrates information
contributing to the identification. Additionally, considering the divergence in
the discriminative nature of different images, we design a probability-aware
Quantification module that preferentially integrates highly reliable information.
Moreover, as multiple images with the same identity are not accessible in the
testing stage, we devise an Information Propagation (IP) mechanism to distill
knowledge from the comprehensive representation to that of a single occluded
image. Extensive experiments and analyses have unequivocally demonstrated the
effectiveness and superiority of the proposed MVI$^{2}$P. The code will be
released at \url{https://github.com/nengdong96/MVIIP}.
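Since the abstract outlines three concrete mechanisms, a minimal sketch may help make them concrete. Everything below is a hypothetical reading of the abstract: the function names, the 0.5 CAM threshold, the softmax-confidence weighting, and the MSE distillation loss are all assumptions, not the authors' released implementation (refer to the linked repository for the official code).
```python
# Hypothetical sketch of the three MVI^2P mechanisms described in the
# abstract; shapes, thresholds, and loss choices are assumptions.
import torch
import torch.nn.functional as F

def cams_aware_masks(feature_maps, classifier_weight, labels):
    """CAMs-aware Localization (assumed form): keep only regions whose
    class activation map supports the identification.
    feature_maps: (V, C, H, W) -- V views of the same pedestrian.
    classifier_weight: (num_ids, C) weights of a linear ID classifier.
    labels: (V,) identity label per view."""
    w = classifier_weight[labels]                          # (V, C)
    cams = F.relu(torch.einsum("vchw,vc->vhw", feature_maps, w))
    cams = cams / (cams.amax(dim=(1, 2), keepdim=True) + 1e-6)
    return (cams > 0.5).float()                            # assumed threshold

def probability_aware_weights(logits, labels):
    """Probability-aware Quantification (assumed form): weight each view by
    the softmax probability of the true identity, so more reliable views
    dominate the integrated representation."""
    conf = F.softmax(logits, dim=1)[torch.arange(len(labels)), labels]
    return conf / conf.sum()

def integrate_views(feature_maps, masks, weights):
    """Fuse masked multi-view feature maps into one comprehensive map."""
    masked = feature_maps * masks.unsqueeze(1)   # suppress occlusion noise
    return torch.einsum("vchw,v->chw", masked, weights)

def information_propagation_loss(single_feat, multi_feat):
    """IP mechanism, assumed here to be a simple distillation loss: pull the
    single-image feature toward the detached multi-view teacher feature."""
    return F.mse_loss(single_feat, multi_feat.detach())
```
At test time only the single-image branch would run, matching the abstract's point that multiple images of the same identity are unavailable during inference.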
Related papers
- Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm [31.06269858216316]
We propose Infinite-ID, an ID-semantics decoupling paradigm for identity-preserved personalization.
We introduce identity-enhanced training, incorporating an additional image cross-attention module to capture sufficient ID information.
We also introduce a feature interaction mechanism that combines a mixed attention module with an AdaIN-mean operation to seamlessly merge the two streams.
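A hedged guess at the AdaIN-mean operation mentioned above: classic AdaIN re-normalizes content features with a style image's channel statistics, and a "-mean" variant plausibly transfers only the channel means. The function below is illustrative, not the paper's definition.
```python
# Illustrative guess at an "AdaIN-mean" merge; not the paper's definition.
import torch

def adain_mean(content, style):
    """Shift the channel means of `content` toward those of `style`.
    Classic AdaIN also rescales by the style std; a "-mean" variant
    plausibly transfers only the mean. Tensors: (B, C, H, W)."""
    c_mean = content.mean(dim=(2, 3), keepdim=True)
    s_mean = style.mean(dim=(2, 3), keepdim=True)
    return content - c_mean + s_mean
```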
arXiv Detail & Related papers (2024-03-18T13:39:53Z)
- OMG: Occlusion-friendly Personalized Multi-concept Generation in Diffusion Models [47.63060402915307]
OMG is a framework designed to seamlessly integrate multiple concepts within a single image.
OMG exhibits superior performance in multi-concept personalization.
LoRA models on civitai.com can be exploited directly.
arXiv Detail & Related papers (2024-03-16T17:30:15Z)
- Dynamic Patch-aware Enrichment Transformer for Occluded Person Re-Identification [14.219232629274186]
We present an end-to-end solution known as the Dynamic Patch-aware Enrichment Transformer (DPEFormer).
This model automatically and dynamically distinguishes human body information from occlusions.
To ensure that DPSM and the entire DPEFormer can effectively learn with only identity labels, we also propose a Realistic Occlusion Augmentation (ROA) strategy.
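The summary does not spell out ROA, so the sketch below shows only a generic occlusion-augmentation pattern inferred from the name: paste a random occluder crop onto the image so the model must learn from the remaining visible regions. All names and parameters are hypothetical.
```python
# Generic occlusion augmentation inferred from the name "ROA"; the
# actual DPEFormer strategy is not specified in this summary.
import random
import torch

def occlusion_augment(img, occluder_bank, max_frac=0.5):
    """Paste a random occluder crop onto a (C, H, W) image tensor so the
    model must rely on the remaining visible body regions.
    occluder_bank: list of (C, h, w) tensors cropped from other scenes."""
    _, H, W = img.shape
    occ = random.choice(occluder_bank)
    h = min(occ.shape[1], int(H * max_frac))
    w = min(occ.shape[2], int(W * max_frac))
    top = random.randint(0, H - h)
    left = random.randint(0, W - w)
    out = img.clone()
    out[:, top:top + h, left:left + w] = occ[:, :h, :w]
    return out
```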
arXiv Detail & Related papers (2024-02-16T03:53:30Z)
- Unified Multi-Modal Image Synthesis for Missing Modality Imputation [23.681228202899984]
We propose a novel unified multi-modal image synthesis method for missing modality imputation.
The proposed method is effective in handling various synthesis tasks and shows superior performance compared to previous methods.
arXiv Detail & Related papers (2023-04-11T16:59:15Z)
- Multi-Stage Spatio-Temporal Aggregation Transformer for Video Person Re-identification [78.08536797239893]
We propose a novel Multi-Stage Spatial-Temporal Aggregation Transformer (MSTAT) with two newly designed proxy embedding modules.
MSTAT consists of three stages that encode the attribute-associated, identity-associated, and attribute-identity-associated information from the video clips.
We show that MSTAT can achieve state-of-the-art accuracies on various standard benchmarks.
arXiv Detail & Related papers (2023-01-02T05:17:31Z)
- Occluded Person Re-Identification via Relational Adaptive Feature Correction Learning [8.015703163954639]
Occluded person re-identification (Re-ID) in images captured by multiple cameras is challenging because the target person is occluded by pedestrians or objects.
Most existing methods rely on off-the-shelf pose or parsing networks to generate pseudo labels, which are prone to error.
We propose a novel Occlusion Correction Network (OCNet) that corrects features through relational-weight learning and obtains diverse and representative features without using external networks.
arXiv Detail & Related papers (2022-12-09T07:48:47Z)
- Person Image Synthesis via Denoising Diffusion Model [116.34633988927429]
We show how denoising diffusion models can be applied for high-fidelity person image synthesis.
Our results on two large-scale benchmarks and a user study demonstrate the photorealism of our proposed approach under challenging scenarios.
arXiv Detail & Related papers (2022-11-22T18:59:50Z)
- Learning Enriched Features for Fast Image Restoration and Enhancement [166.17296369600774]
This paper pursues the holistic goal of maintaining spatially precise, high-resolution representations throughout the entire network.
We learn an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
Our approach achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement.
arXiv Detail & Related papers (2022-04-19T17:59:45Z)
- Fully Unsupervised Diversity Denoising with Convolutional Variational Autoencoders [81.30960319178725]
We propose DivNoising, a denoising approach based on fully convolutional variational autoencoders (VAEs).
First, we introduce a principled way of formulating the unsupervised denoising problem within the VAE framework by explicitly incorporating imaging noise models into the decoder.
We show that such a noise model can either be measured, bootstrapped from noisy data, or co-learned during training.
arXiv Detail & Related papers (2020-06-10T21:28:13Z)
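The DivNoising entry describes folding an explicit imaging noise model into the VAE decoder; the sketch below assumes a Gaussian noise model for illustration. The loss form and all arguments are assumptions, not the authors' implementation.
```python
# Sketch of DivNoising's idea of an explicit noise model in the VAE
# decoder, assuming a Gaussian noise model for illustration only.
import math
import torch

def divnoising_loss(decoded_signal, noisy_obs, mu, logvar, noise_std=0.1):
    """Reconstruction term: negative log-likelihood of the noisy pixels
    given the decoded clean signal under a Gaussian noise model,
    plus the usual VAE KL term."""
    var = noise_std ** 2
    nll = (0.5 * (noisy_obs - decoded_signal) ** 2 / var).sum() \
          + 0.5 * noisy_obs.numel() * math.log(2 * math.pi * var)
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return nll + kl
```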