PSDiff: Diffusion Model for Person Search with Iterative and
Collaborative Refinement
- URL: http://arxiv.org/abs/2309.11125v2
- Date: Wed, 13 Mar 2024 12:04:18 GMT
- Title: PSDiff: Diffusion Model for Person Search with Iterative and
Collaborative Refinement
- Authors: Chengyou Jia, Minnan Luo, Zhuohang Dang, Guang Dai, Xiaojun Chang, and
Jingdong Wang
- Abstract summary: We present a novel Person Search framework based on the Diffusion model, PSDiff.
PSDiff formulates person search as a dual denoising process from noisy boxes and ReID embeddings to ground truths.
Following the new paradigm, we further design a new Collaborative Denoising Layer (CDL) to optimize detection and ReID sub-tasks in an iterative and collaborative way.
- Score: 59.6260680005195
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dominant Person Search methods aim to localize and recognize query persons in
a unified network, which jointly optimizes two sub-tasks, i.e., pedestrian
detection and Re-IDentification (ReID). Despite significant progress, current
methods face two primary challenges: 1) the pedestrian candidates learned
within detectors are suboptimal for the ReID task; 2) the potential for
collaboration between the two sub-tasks is overlooked. To address these issues, we
present a novel Person Search framework based on the Diffusion model, PSDiff.
PSDiff formulates person search as a dual denoising process from noisy
boxes and ReID embeddings to ground truths. Distinct from the conventional
Detection-to-ReID approach, our denoising paradigm discards prior pedestrian
candidates generated by detectors, thereby avoiding the local optimum problem
of the ReID task. Following the new paradigm, we further design a new
Collaborative Denoising Layer (CDL) to optimize detection and ReID sub-tasks in
an iterative and collaborative way, which makes two sub-tasks mutually
beneficial. Extensive experiments on the standard benchmarks show that PSDiff
achieves state-of-the-art performance with fewer parameters and elastic
computing overhead.
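The "dual denoising" formulation can be illustrated with a minimal sketch of the forward (noising) half of such a process, which turns ground-truth boxes and ReID embeddings into the noisy inputs a denoiser would learn to refine. The abstract does not specify the noise schedule, the box parameterization, or the network, so everything here is an assumption: a standard DDPM-style Gaussian corruption with a cosine schedule, a box given as normalized (cx, cy, w, h), and the hypothetical helper names `alpha_bar`, `add_noise`, and `noisy_training_pair`. The Collaborative Denoising Layer itself is not sketched.

```python
import math
import random


def alpha_bar(t, num_steps, s=0.008):
    """Cumulative signal rate of a cosine schedule (assumed, not from the paper).

    Returns 1.0 at t = 0 (no noise) and approaches 0 at t = num_steps.
    """
    def f(u):
        return math.cos((u / num_steps + s) / (1 + s) * math.pi / 2) ** 2

    return f(t) / f(0)


def add_noise(x0, t, num_steps, rng):
    """Sample x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps, with eps ~ N(0, I)."""
    a = alpha_bar(t, num_steps)
    return [
        math.sqrt(a) * v + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
        for v in x0
    ]


def noisy_training_pair(gt_box, gt_embedding, t, num_steps=1000, seed=0):
    """Corrupt a ground-truth box and ReID embedding at the same timestep t.

    A denoiser would be trained to map this pair back to the ground truths;
    that network is outside the scope of this sketch.
    """
    rng = random.Random(seed)
    noisy_box = add_noise(gt_box, t, num_steps, rng)
    noisy_emb = add_noise(gt_embedding, t, num_steps, rng)
    return noisy_box, noisy_emb


if __name__ == "__main__":
    box = [0.5, 0.5, 0.2, 0.6]         # normalized (cx, cy, w, h), assumed format
    emb = [0.1, -0.3, 0.7, 0.0]        # toy 4-dimensional ReID embedding
    for step in (0, 250, 1000):
        print(step, noisy_training_pair(box, emb, step))
```

At `t = 0` the pair is returned unchanged, and as `t` approaches `num_steps` both the box and the embedding approach pure Gaussian noise, which is the starting point for iterative refinement at inference time under this assumed setup.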
Related papers
- OIMNet++: Prototypical Normalization and Localization-aware Learning for
Person Search [34.460973847554364]
We address the task of person search, that is, localizing and re-identifying query persons from a set of raw scene images.
Recent approaches are typically built upon OIMNet, a pioneering work on person search that learns joint person representations for performing both detection and person re-identification tasks.
We introduce a novel normalization layer, dubbed ProtoNorm, that calibrates features from pedestrian proposals, while considering a long-tail distribution of person IDs.
arXiv Detail & Related papers (2022-07-21T06:34:03Z)
- ReAct: Temporal Action Detection with Relational Queries [84.76646044604055]
This work aims at advancing temporal action detection (TAD) using an encoder-decoder framework with action queries.
We first propose a relational attention mechanism in the decoder, which guides the attention among queries based on their relations.
Lastly, we propose to predict the localization quality of each action query at inference in order to distinguish high-quality queries.
arXiv Detail & Related papers (2022-07-14T17:46:37Z)
- Benchmarking Deep Models for Salient Object Detection [67.07247772280212]
We construct a general SALient Object Detection (SALOD) benchmark to conduct a comprehensive comparison among several representative SOD methods.
In the above experiments, we find that existing loss functions are usually specialized for some metrics but yield inferior results on the others.
We propose a novel Edge-Aware (EA) loss that promotes deep networks to learn more discriminative features by integrating both pixel- and image-level supervision signals.
arXiv Detail & Related papers (2022-02-07T03:43:16Z)
- Subtask-dominated Transfer Learning for Long-tail Person Search [12.311100923753449]
Person search unifies person detection and person re-identification (Re-ID) to locate query persons from panoramic gallery images.
One major challenge comes from the imbalanced long-tail person identity distributions.
We propose a Subtask-dominated Transfer Learning (STL) method to solve this problem.
arXiv Detail & Related papers (2021-12-01T14:34:48Z)
- Decoupled and Memory-Reinforced Networks: Towards Effective Feature
Learning for One-Step Person Search [65.51181219410763]
One-step methods have been developed to handle pedestrian detection and identification sub-tasks using a single network.
There are two major challenges in the current one-step approaches.
We propose a decoupled and memory-reinforced network (DMRNet) to overcome these problems.
arXiv Detail & Related papers (2021-02-22T06:19:45Z)
- Multi-object Tracking with a Hierarchical Single-branch Network [31.680667324595557]
We propose an online multi-object tracking framework based on a hierarchical single-branch network.
Our novel iHOIM loss function unifies the objectives of the two sub-tasks and encourages better detection performance.
Experimental results on MOT16 and MOT20 datasets show that we can achieve state-of-the-art tracking performance.
arXiv Detail & Related papers (2021-01-06T12:14:58Z)
- Ensemble and Random Collaborative Representation-Based Anomaly Detector
for Hyperspectral Imagery [133.83048723991462]
We propose a novel ensemble and random collaborative representation-based detector (ERCRD) for hyperspectral anomaly detection (HAD).
Our experiments on four real hyperspectral datasets exhibit the accuracy and efficiency of this proposed ERCRD method compared with ten state-of-the-art HAD methods.
arXiv Detail & Related papers (2021-01-06T11:23:51Z)
- Diverse Knowledge Distillation for End-to-End Person Search [81.4926655119318]
Person search aims to localize and identify a specific person from a gallery of images.
Recent methods can be categorized into two groups, i.e., two-step and end-to-end approaches.
We propose a simple yet strong end-to-end network with diverse knowledge distillation to break the bottleneck.
arXiv Detail & Related papers (2020-12-21T09:04:27Z)