Learning Multi-Granular Hypergraphs for Video-Based Person
Re-Identification
- URL: http://arxiv.org/abs/2104.14913v1
- Date: Fri, 30 Apr 2021 11:20:02 GMT
- Title: Learning Multi-Granular Hypergraphs for Video-Based Person
Re-Identification
- Authors: Yichao Yan, Jie Qin, Jiaxin Chen, Li Liu, Fan Zhu, Ying Tai, Ling
Shao
- Abstract summary: Video-based person re-identification (re-ID) is an important research topic in computer vision.
We propose a novel graph-based framework, namely Multi-Granular Hypergraph (MGH), to pursue better representational capabilities.
90.0% top-1 accuracy on MARS is achieved using MGH, outperforming state-of-the-art schemes.
- Score: 110.52328716130022
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video-based person re-identification (re-ID) is an important research topic
in computer vision. The key to tackling the challenging task is to exploit both
spatial and temporal clues in video sequences. In this work, we propose a novel
graph-based framework, namely Multi-Granular Hypergraph (MGH), to pursue better
representational capabilities by modeling spatiotemporal dependencies in terms
of multiple granularities. Specifically, hypergraphs with different spatial
granularities are constructed using various levels of part-based features
across the video sequence. In each hypergraph, different temporal granularities
are captured by hyperedges that connect a set of graph nodes (i.e., part-based
features) across different temporal ranges. Two critical issues (misalignment
and occlusion) are explicitly addressed by the proposed hypergraph propagation
and feature aggregation schemes. Finally, we further enhance the overall video
representation by learning more diversified graph-level representations of
multiple granularities based on mutual information minimization. Extensive
experiments on three widely adopted benchmarks clearly demonstrate the
effectiveness of the proposed framework. Notably, 90.0% top-1 accuracy on MARS
is achieved using MGH, outperforming the state of the art. Code is available
at https://github.com/daodaofr/hypergraph_reid.
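To make the abstract's construction concrete, the following is a minimal sketch of how hyperedges of different temporal granularities could group part-based node features, with mean aggregation standing in for the paper's propagation scheme. The function names, the flat node indexing, and the specific temporal ranges are illustrative assumptions, not the authors' implementation (see their repository for that).

```python
import numpy as np

def build_hyperedges(num_frames, num_parts, temporal_ranges=(1, 2, 4)):
    """Build hyperedges for one spatial granularity: each hyperedge groups
    the same body part across a temporal window of length r.  Node (t, p)
    is the feature of part p in frame t, flattened to index t*num_parts+p."""
    hyperedges = []
    for r in temporal_ranges:                      # temporal granularity
        for p in range(num_parts):                 # one part track per edge
            for t0 in range(0, num_frames - r + 1, r):
                hyperedges.append([t * num_parts + p for t in range(t0, t0 + r)])
    return hyperedges

def aggregate(node_feats, hyperedges):
    """Mean-aggregate node features within each hyperedge, then scatter the
    edge messages back to member nodes (one vanilla propagation step)."""
    edge_feats = [node_feats[e].mean(axis=0) for e in hyperedges]
    out = np.zeros_like(node_feats)
    count = np.zeros(len(node_feats))
    for f, e in zip(edge_feats, hyperedges):
        for n in e:
            out[n] += f
            count[n] += 1
    return out / np.maximum(count, 1)[:, None]
```

Longer ranges in `temporal_ranges` let a node receive context from more distant frames, which is one way the multiple temporal granularities described above can coexist in a single hypergraph.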
Related papers
- Hypergraph Transformer for Semi-Supervised Classification [50.92027313775934]
We propose a novel hypergraph learning framework, HyperGraph Transformer (HyperGT)
HyperGT uses a Transformer-based neural network architecture to effectively consider global correlations among all nodes and hyperedges.
It achieves comprehensive hypergraph representation learning by effectively incorporating global interactions while preserving local connectivity patterns.
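The global correlations mentioned above can be sketched by treating node and hyperedge embeddings as one joint token set and running plain self-attention over it; this is a simplified stand-in (single head, no learned projections) for HyperGT's Transformer layers, and the function names are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_self_attention(tokens):
    """Single-head self-attention over a joint token matrix (N+E, D) of
    node and hyperedge embeddings, so every node can attend to every
    hyperedge and vice versa.  Learned Q/K/V projections are omitted."""
    scores = tokens @ tokens.T / np.sqrt(tokens.shape[1])
    return softmax(scores, axis=1) @ tokens
```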
arXiv Detail & Related papers (2023-12-18T17:50:52Z) - Multi-Granularity Graph Pooling for Video-based Person Re-Identification [14.943835935921296]
graph neural networks (GNNs) are introduced to aggregate temporal and spatial features of video samples.
Existing graph-based models, like STGCN, perform mean/max pooling on node features to obtain the graph representation.
We propose the graph pooling network (GPNet) to learn the multi-granularity graph representation for the video retrieval.
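The flat mean/max readout that GPNet aims to improve upon is simple enough to sketch directly; the function below is an illustrative baseline, not GPNet itself.

```python
import numpy as np

def graph_readout(node_feats: np.ndarray, mode: str = "mean") -> np.ndarray:
    """Collapse per-node features (N, D) into one graph vector (D,).
    This is the single-granularity mean/max readout that learned,
    multi-granularity pooling methods are designed to replace."""
    if mode == "mean":
        return node_feats.mean(axis=0)
    if mode == "max":
        return node_feats.max(axis=0)
    raise ValueError(f"unknown readout mode: {mode}")
```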
arXiv Detail & Related papers (2022-09-23T13:26:05Z) - Representing Videos as Discriminative Sub-graphs for Action Recognition [165.54738402505194]
We introduce a new design of sub-graphs to represent and encode the discriminative patterns of each action in the videos.
We present the MUlti-scale Sub-graph LEarning (MUSLE) framework, which builds space-time graphs and clusters them into compact sub-graphs at each scale.
arXiv Detail & Related papers (2022-01-11T16:15:25Z) - A Hierarchical Spatio-Temporal Graph Convolutional Neural Network for
Anomaly Detection in Videos [11.423072255384469]
We propose a Hierarchical Spatio-Temporal Graph Convolutional Neural Network (HSTGCNN) to address these problems.
HSTGCNN is composed of multiple branches that correspond to different levels of graph representations.
High-level graph representations are given larger weights to encode the moving speed and direction of people in low-resolution videos, while low-level graph representations are given larger weights to encode human skeletons in high-resolution videos.
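The branch weighting described above amounts to a convex combination of per-branch representations; the following is a generic sketch of that fusion step, with all names and the fixed-weight formulation being assumptions (HSTGCNN's actual weighting may be learned or input-dependent).

```python
import numpy as np

def fuse_branches(branch_feats, weights):
    """Weighted fusion of per-branch graph representations, e.g. weighting
    high-level (trajectory) branches more for low-resolution input and
    low-level (skeleton) branches more for high-resolution input."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # normalize to a convex combination
    return sum(wi * f for wi, f in zip(w, branch_feats))
```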
arXiv Detail & Related papers (2021-12-08T14:03:33Z) - Residual Enhanced Multi-Hypergraph Neural Network [26.42547421121713]
HyperGraph Neural Network (HGNN) is the de-facto method for hypergraph representation learning.
We propose the Residual enhanced Multi-Hypergraph Neural Network, which can fuse multi-modal information from each hypergraph effectively.
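Since HGNN is named above as the de-facto baseline, a sketch of its normalized incidence-matrix propagation may help; the learnable projection is omitted and the helper name is an assumption.

```python
import numpy as np

def hgnn_propagate(X, H, W=None):
    """One HGNN-style convolution: X' = Dv^-1/2 H W De^-1 H^T Dv^-1/2 X,
    with the learnable projection omitted.  X: (N, D) node features,
    H: (N, E) incidence matrix, W: (E,) hyperedge weights."""
    N, E = H.shape
    if W is None:
        W = np.ones(E)
    De = H.sum(axis=0)                        # hyperedge degrees (E,)
    Dv = H @ W                                # weighted node degrees (N,)
    d = 1.0 / np.sqrt(np.maximum(Dv, 1e-12))  # Dv^-1/2, guarded
    Xn = X * d[:, None]
    edge_msg = (H.T @ Xn) / De[:, None]       # average members per edge (E, D)
    out = (H * W) @ edge_msg                  # redistribute to nodes (N, D)
    return out * d[:, None]
```

A residual multi-hypergraph variant, as the paper's name suggests, would run a step like this per modality's hypergraph and combine the outputs with skip connections.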
arXiv Detail & Related papers (2021-05-02T14:53:32Z) - Spatial-Temporal Correlation and Topology Learning for Person
Re-Identification in Videos [78.45050529204701]
We propose a novel framework to pursue discriminative and robust representation by modeling cross-scale spatial-temporal correlation.
The framework, termed CTL, utilizes a CNN backbone and a key-points estimator to extract semantic local features from the human body.
It explores a context-reinforced topology to construct multi-scale graphs by considering both global contextual information and the physical connections of the human body.
arXiv Detail & Related papers (2021-04-15T14:32:12Z) - Spatial-spectral Hyperspectral Image Classification via Multiple Random
Anchor Graphs Ensemble Learning [88.60285937702304]
This paper proposes a novel spatial-spectral HSI classification method via multiple random anchor graphs ensemble learning (RAGE).
Firstly, the local binary pattern is adopted to extract the more descriptive features on each selected band, which preserves local structures and subtle changes of a region.
Secondly, the adaptive neighbors assignment is introduced in the construction of anchor graph, to reduce the computational complexity.
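The local binary pattern step mentioned above is a standard descriptor and can be sketched directly; this is a basic 8-neighbor variant on a dense grid, not necessarily the exact LBP configuration RAGE uses.

```python
import numpy as np

def lbp_3x3(img):
    """Basic 8-neighbor local binary pattern for a 2-D grayscale image:
    each interior pixel gets a byte whose bits record which of its eight
    neighbors are >= the center value, capturing local texture."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    H, W = img.shape
    out = np.zeros((H - 2, W - 2), dtype=np.uint8)
    center = img[1:-1, 1:-1]
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1 + dy:H - 1 + dy, 1 + dx:W - 1 + dx]
        out |= (neigh >= center).astype(np.uint8) << bit
    return out
```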
arXiv Detail & Related papers (2021-03-25T09:31:41Z) - Multi-Granularity Reference-Aided Attentive Feature Aggregation for
Video-based Person Re-identification [98.7585431239291]
Video-based person re-identification aims at matching the same person across video clips.
In this paper, we propose an attentive feature aggregation module, namely the Multi-Granularity Reference-aided Attentive Feature Aggregation (MG-RAFA) module.
Our framework achieves state-of-the-art performance on three benchmark datasets.
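The core of attentive aggregation across frames is a softmax-weighted average; the following is a minimal sketch of that operation, where the score computation (which in MG-RAFA is reference-aided and multi-granular) is assumed to be given.

```python
import numpy as np

def attentive_aggregate(frame_feats, scores):
    """Aggregate per-frame features (T, D) into one clip vector (D,)
    using softmax-normalized attention scores (T,).  Higher-scored
    frames contribute more to the final representation."""
    a = np.exp(scores - scores.max())   # numerically stable softmax
    a = a / a.sum()
    return (a[:, None] * frame_feats).sum(axis=0)
```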
arXiv Detail & Related papers (2020-03-27T03:49:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.