Learning Spatial Context with Graph Neural Network for Multi-Person Pose
Grouping
- URL: http://arxiv.org/abs/2104.02385v1
- Date: Tue, 6 Apr 2021 09:21:14 GMT
- Title: Learning Spatial Context with Graph Neural Network for Multi-Person Pose
Grouping
- Authors: Jiahao Lin, Gim Hee Lee
- Abstract summary: Bottom-up approaches for image-based multi-person pose estimation consist of two stages: keypoint detection and grouping.
In this work, we formulate the grouping task as a graph partitioning problem, where we learn the affinity matrix with a Graph Neural Network (GNN).
The learned geometry-based affinity is further fused with appearance-based affinity to achieve robust keypoint association.
- Score: 71.59494156155309
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bottom-up approaches for image-based multi-person pose estimation consist of
two stages: (1) keypoint detection and (2) grouping of the detected keypoints
to form person instances. Current grouping approaches rely on learned embedding
from only visual features that completely ignore the spatial configuration of
human poses. In this work, we formulate the grouping task as a graph
partitioning problem, where we learn the affinity matrix with a Graph Neural
Network (GNN). More specifically, we design a Geometry-aware Association GNN
that utilizes spatial information of the keypoints and learns local affinity
from the global context. The learned geometry-based affinity is further fused
with appearance-based affinity to achieve robust keypoint association. Spectral
clustering is used to partition the graph for the formation of the pose
instances. Experimental results on two benchmark datasets show that our
proposed method outperforms existing appearance-only grouping frameworks, which
shows the effectiveness of utilizing spatial context for robust grouping.
Source code is available at: https://github.com/jiahaoLjh/PoseGrouping.
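The grouping step described in the abstract (fuse two affinity matrices, then partition the keypoint graph with spectral clustering) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `fuse_affinity` and `spectral_partition`, the weighted-average fusion rule, the `alpha` weight, and the hand-made toy affinities are all assumptions made for the example. In the paper the geometry-based affinity is produced by a GNN, which is not reproduced here.

```python
import numpy as np


def fuse_affinity(geo, app, alpha=0.5):
    """Blend geometry- and appearance-based affinities.

    The weighted average is an assumed fusion rule for illustration only.
    """
    fused = alpha * geo + (1.0 - alpha) * app
    return 0.5 * (fused + fused.T)  # enforce symmetry


def spectral_partition(affinity, n_groups, n_iter=50):
    """Partition graph nodes with normalized spectral clustering."""
    deg = affinity.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    lap = np.eye(len(affinity)) - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)  # eigenvalues returned in ascending order
    emb = vecs[:, :n_groups]       # eigenvectors of the smallest eigenvalues
    emb = emb / np.maximum(np.linalg.norm(emb, axis=1, keepdims=True), 1e-12)

    # Deterministic farthest-point initialization for k-means
    centers = [emb[0]]
    for _ in range(1, n_groups):
        dist = np.min([np.sum((emb - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(emb[np.argmax(dist)])
    centers = np.array(centers)

    # Plain Lloyd iterations on the spectral embedding
    for _ in range(n_iter):
        sq_dist = ((emb[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(sq_dist, axis=1)
        for k in range(n_groups):
            if np.any(labels == k):
                centers[k] = emb[labels == k].mean(axis=0)
    return labels


# Toy data: 6 detected keypoints, the first 3 from one person, the last 3
# from another. The affinity values are made up for the example.
same_person = np.kron(np.eye(2), np.ones((3, 3)))
geo = np.where(same_person == 1, 0.9, 0.1)  # stand-in for GNN-predicted affinity
app = np.where(same_person == 1, 0.8, 0.2)  # stand-in for appearance affinity
np.fill_diagonal(geo, 1.0)
np.fill_diagonal(app, 1.0)

labels = spectral_partition(fuse_affinity(geo, app), n_groups=2)
print(labels)
```

On this toy input the two blocks of keypoints receive different labels, so each label corresponds to one person instance. In practice the number of groups is not known in advance and would have to be estimated, which this sketch does not attempt.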
Related papers
- Unified and Dynamic Graph for Temporal Character Grouping in Long Videos [31.192044026127032]
Video temporal character grouping locates appearing moments of major characters within a video according to their identities.
Recent works have evolved from unsupervised clustering to graph-based supervised clustering.
We present a unified and dynamic graph (UniDG) framework for temporal character grouping.
arXiv Detail & Related papers (2023-08-27T13:22:55Z) - Image as Set of Points [60.30495338399321]
Context clusters (CoCs) view an image as a set of unorganized points and extract features via a simplified clustering algorithm.
Our CoCs are convolution- and attention-free, and rely only on a clustering algorithm for spatial interaction.
arXiv Detail & Related papers (2023-03-02T18:56:39Z) - Dual Contrastive Attributed Graph Clustering Network [6.796682703663566]
We propose a generic framework called Dual Contrastive Attributed Graph Clustering Network (DCAGC).
In DCAGC, by leveraging Neighborhood Contrast Module, the similarity of the neighbor nodes will be maximized and the quality of the node representation will be improved.
All the modules of DCAGC are trained and optimized in a unified framework, so the learned node representation contains clustering-oriented messages.
arXiv Detail & Related papers (2022-06-16T03:17:01Z) - GROWL: Group Detection With Link Prediction [0.0]
We propose a holistic approach to group detection based on Graph Neural Networks (GNNs).
Our proposed method, GROup detection With Link prediction, demonstrates the effectiveness of a GNN based approach.
Our results show that a GNN based approach can significantly improve accuracy across different camera views.
arXiv Detail & Related papers (2021-11-08T11:52:48Z) - Improving Facial Attribute Recognition by Group and Graph Learning [34.39507051712628]
Exploiting the relationships between attributes is a key challenge for improving facial attribute recognition.
In this work, we are concerned with two types of correlations: spatial and non-spatial relationships.
We propose a unified network called Multi-scale Group and Graph Network.
arXiv Detail & Related papers (2021-05-28T13:36:28Z) - Spatial-spectral Hyperspectral Image Classification via Multiple Random
Anchor Graphs Ensemble Learning [88.60285937702304]
This paper proposes a novel spatial-spectral HSI classification method via multiple random anchor graphs ensemble learning (RAGE).
Firstly, the local binary pattern is adopted to extract the more descriptive features on each selected band, which preserves local structures and subtle changes of a region.
Secondly, the adaptive neighbors assignment is introduced in the construction of anchor graph, to reduce the computational complexity.
arXiv Detail & Related papers (2021-03-25T09:31:41Z) - Structured Graph Learning for Scalable Subspace Clustering: From
Single-view to Multi-view [28.779909990410978]
Graph-based subspace clustering methods have exhibited promising performance.
However, they still suffer from several drawbacks: they incur expensive time overhead, fail to explore explicit clusters, and cannot generalize to unseen data points.
We propose a scalable graph learning framework, seeking to address the above three challenges simultaneously.
arXiv Detail & Related papers (2021-02-16T03:46:11Z) - Differentiable Hierarchical Graph Grouping for Multi-Person Pose
Estimation [95.72606536493548]
Multi-person pose estimation is challenging because it localizes body keypoints for multiple persons simultaneously.
We propose a novel differentiable Hierarchical Graph Grouping (HGG) method to learn the graph grouping in the bottom-up multi-person pose estimation task.
arXiv Detail & Related papers (2020-07-23T08:46:22Z) - Learning to Cluster Faces via Confidence and Connectivity Estimation [136.5291151775236]
We propose a fully learnable clustering framework without requiring a large number of overlapped subgraphs.
Our method significantly improves clustering accuracy and thus performance of the recognition models trained on top, yet it is an order of magnitude more efficient than existing supervised methods.
arXiv Detail & Related papers (2020-04-01T13:39:37Z) - High-Order Information Matters: Learning Relation and Topology for
Occluded Person Re-Identification [84.43394420267794]
We propose a novel framework by learning high-order relation and topology information for discriminative features and robust alignment.
Our framework significantly outperforms state-of-the-art by 6.5% mAP scores on Occluded-Duke dataset.
arXiv Detail & Related papers (2020-03-18T12:18:35Z)