Pose-Aided Video-based Person Re-Identification via Recurrent Graph
Convolutional Network
- URL: http://arxiv.org/abs/2209.11582v1
- Date: Fri, 23 Sep 2022 13:20:33 GMT
- Title: Pose-Aided Video-based Person Re-Identification via Recurrent Graph
Convolutional Network
- Authors: Honghu Pan, Qiao Liu, Yongyong Chen, Yunqi He, Yuan Zheng, Feng Zheng,
Zhenyu He
- Abstract summary: We propose to learn the discriminative pose feature beyond the appearance feature for video retrieval.
To learn the pose feature, we first detect the pedestrian pose in each frame through an off-the-shelf pose detector.
We then exploit a recurrent graph convolutional network (RGCN) to learn the node embeddings of the temporal pose graph.
- Score: 41.861537712563816
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing methods for video-based person re-identification (ReID) mainly learn
the appearance feature of a given pedestrian via a feature extractor and a
feature aggregator.
However, appearance models can fail when different pedestrians have
similar appearances.
Considering that different pedestrians have different walking postures and
body proportions, we propose to learn the discriminative pose feature beyond
the appearance feature for video retrieval.
Specifically, we implement a two-branch architecture to separately learn the
appearance feature and pose feature, and then concatenate them together for
inference.
To learn the pose feature, we first detect the pedestrian pose in each frame
through an off-the-shelf pose detector, and construct a temporal graph using
the pose sequence.
We then exploit a recurrent graph convolutional network (RGCN) to learn the
node embeddings of the temporal pose graph, which devises a global information
propagation mechanism to simultaneously achieve the neighborhood aggregation of
intra-frame nodes and message passing among inter-frame graphs.
Finally, we propose a dual-attention method consisting of node-attention and
time-attention to obtain the temporal graph representation from the node
embeddings, where the self-attention mechanism is employed to learn the
importance of each node and each frame.
We verify the proposed method on three video-based ReID datasets, i.e., MARS,
DukeMTMC and iLIDS-VID; the experimental results demonstrate that the learned
pose feature can effectively improve the performance of existing appearance
models.
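The pipeline described in the abstract (temporal pose graph, recurrent graph update, dual-attention pooling, concatenation with the appearance feature) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the toy 3-joint skeleton, the `mix` blending factor, and the query-vector attention scores are all assumptions made for illustration, and the update rule below is only a stand-in for the RGCN's global information propagation mechanism, which the abstract does not specify.

```python
import math

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def weighted_sum(vectors, weights):
    dim = len(vectors[0])
    return [sum(w * vec[d] for vec, w in zip(vectors, weights)) for d in range(dim)]

def recurrent_graph_step(frame_feats, prev_state, neighbors, mix=0.5):
    """Assumed update: average each joint with its skeleton neighbors
    (intra-frame), then blend with that joint's state from the previous
    frame (inter-frame). `mix` is a hypothetical blending factor."""
    new_state = []
    for n, feat in enumerate(frame_feats):
        group = [feat] + [frame_feats[j] for j in neighbors[n]]
        agg = [sum(vals) / len(group) for vals in zip(*group)]
        if prev_state is not None:
            agg = [mix * a + (1 - mix) * p for a, p in zip(agg, prev_state[n])]
        new_state.append(agg)
    return new_state

def dual_attention_pool(node_embeddings, node_query, time_query):
    """node_embeddings[t][n] is the embedding of joint n in frame t.
    Scores come from dot products with hypothetical query vectors, a
    simplification of the self-attention mentioned in the abstract."""
    frame_vectors = []
    for frame in node_embeddings:
        # Node-attention: weight the joints within one frame.
        node_weights = softmax([dot(h, node_query) for h in frame])
        frame_vectors.append(weighted_sum(frame, node_weights))
    # Time-attention: weight the frames of the sequence.
    time_weights = softmax([dot(f, time_query) for f in frame_vectors])
    return weighted_sum(frame_vectors, time_weights)

def fuse(appearance_feature, pose_feature):
    # Two-branch fusion: concatenate the features for inference.
    return list(appearance_feature) + list(pose_feature)

# Toy data: 2 frames, 3 joints in a chain, (x, y) coordinates per joint.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
poses = [
    [[0.0, 0.0], [0.0, 1.0], [0.0, 2.0]],
    [[0.1, 0.0], [0.1, 1.0], [0.1, 2.0]],
]
state, embeddings = None, []
for frame in poses:
    state = recurrent_graph_step(frame, state, neighbors)
    embeddings.append(state)

pose_feature = dual_attention_pool(
    embeddings, node_query=[0.0, 0.0], time_query=[0.0, 0.0]
)
video_feature = fuse([1.0, 2.0, 3.0], pose_feature)
```

With all-zero query vectors the attention weights are uniform, so the pooling reduces to plain averaging; in a trained model the queries (or a full self-attention layer) would be learned so that informative joints and frames receive larger weights.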
Related papers
- Occlusion Resilient 3D Human Pose Estimation [52.49366182230432]
Occlusions remain one of the key challenges in 3D body pose estimation from single-camera video sequences.
We demonstrate the effectiveness of this approach compared to state-of-the-art techniques that infer poses from single-camera sequences.
arXiv Detail & Related papers (2024-02-16T19:29:43Z)
- DiffPose: SpatioTemporal Diffusion Model for Video-Based Human Pose
Estimation [16.32910684198013]
We present DiffPose, a novel diffusion architecture that formulates video-based human pose estimation as a conditional heatmap generation problem.
We show two unique characteristics of DiffPose on the pose estimation task: (i) the ability to combine multiple sets of pose estimates to improve prediction accuracy, particularly for challenging joints, and (ii) the ability to adjust the number of iterative steps for feature refinement without retraining the model.
arXiv Detail & Related papers (2023-07-31T14:00:23Z)
- Bipartite Graph Reasoning GANs for Person Pose and Facial Image
Synthesis [201.39323496042527]
We present a novel bipartite graph reasoning Generative Adversarial Network (BiGraphGAN) for two challenging tasks: person pose and facial image synthesis.
The proposed graph generator consists of two novel blocks that aim to model the pose-to-pose and pose-to-image relations, respectively.
arXiv Detail & Related papers (2022-11-12T18:27:00Z)
- PGGANet: Pose Guided Graph Attention Network for Person
Re-identification [0.0]
Person re-identification (ReID) aims at retrieving a person from images captured by different cameras.
It has been shown that combining local features with the global feature of a person image helps to give robust feature representations for person retrieval.
We propose a pose guided graph attention network, a multi-branch architecture consisting of one branch for global feature, one branch for mid-granular body features and one branch for fine-granular key point features.
arXiv Detail & Related papers (2021-11-29T09:47:39Z)
- Keypoint Message Passing for Video-based Person Re-Identification [106.41022426556776]
Video-based person re-identification (re-ID) is an important technique in visual surveillance systems which aims to match video snippets of people captured by different cameras.
Existing methods are mostly based on convolutional neural networks (CNNs), whose building blocks either process local neighbor pixels at a time, or, when 3D convolutions are used to model temporal information, suffer from the misalignment problem caused by person movement.
In this paper, we propose to overcome the limitations of normal convolutions with a human-oriented graph method. Specifically, features located at person joint keypoints are extracted and connected as a spatial-temporal graph.
arXiv Detail & Related papers (2021-11-16T08:01:16Z)
- Spatial-Temporal Correlation and Topology Learning for Person
Re-Identification in Videos [78.45050529204701]
We propose a novel framework to pursue discriminative and robust representation by modeling cross-scale spatial-temporal correlation.
CTL utilizes a CNN backbone and a key-points estimator to extract semantic local features from the human body.
It explores a context-reinforced topology to construct multi-scale graphs by considering both global contextual information and physical connections of the human body.
arXiv Detail & Related papers (2021-04-15T14:32:12Z)
- Graph-based Person Signature for Person Re-Identifications [17.181807593574764]
We propose a new method to effectively aggregate detailed person descriptions (attributes labels) and visual features (body parts and global features) into a graph.
The graph is integrated into a multi-branch multi-task framework for person re-identification.
Our approach achieves competitive results among the state of the art and outperforms other attribute-based or mask-guided methods.
arXiv Detail & Related papers (2021-04-14T10:54:36Z)
- High-Order Information Matters: Learning Relation and Topology for
Occluded Person Re-Identification [84.43394420267794]
We propose a novel framework by learning high-order relation and topology information for discriminative features and robust alignment.
Our framework significantly outperforms the state of the art by 6.5% mAP on the Occluded-Duke dataset.
arXiv Detail & Related papers (2020-03-18T12:18:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.