Related papers: Refinement Module based on Parse Graph of Feature Map for Human Pose Estimation

URL: http://arxiv.org/abs/2501.11069v4
Date: Thu, 13 Mar 2025 02:41:37 GMT
Title: Refinement Module based on Parse Graph of Feature Map for Human Pose Estimation
Authors: Shibang Liu, Xuemei Xie, Guangming Shi
Abstract summary: Parse graphs of the human body can be obtained to help humans perform human pose estimation (HPE) better. We design a Refinement Module based on the Parse Graph of feature map (RMPG), which includes two stages: top-down decomposition and bottom-up combination. Our network achieves excellent results on multiple mainstream human pose datasets.
Score: 31.603231536312688
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Parse graphs of the human body can be obtained in the human brain to help humans perform Human Pose Estimation (HPE) better. A parse graph contains a hierarchical structure, like a tree structure, and context relations among nodes. To equip models with such capabilities, many researchers predefine the parse graph of body structure to design HPE frameworks. However, these frameworks struggle to adapt to instances that deviate from the predefined parse graph, and they are often parameter-heavy. Unlike them, we view the feature map holistically, much like the human body. It can be optimized using parse graphs, where nodes' implicit feature representation boosts adaptability, avoiding rigid structural limitations. In this paper, we design the Refinement Module based on the Parse Graph of feature map (RMPG), which includes two stages: top-down decomposition and bottom-up combination. In the first stage, the feature map is constructed into a tree structure through recursive decomposition, with each node representing a sub-feature map, thereby achieving hierarchical modeling of features. In the second stage, context information is calculated and sub-feature maps with context are recursively connected to gradually build a refined feature map. Additionally, we design a hierarchical network with fewer parameters using multiple RMPG modules to model the context relations and hierarchies in the parse graph of body structure for HPE, some of which are supervised to obtain context relations among body parts. Our network achieves excellent results on multiple mainstream human pose datasets, and the effectiveness of RMPG is proven on different methods. The code of RMPG will be made publicly available.
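
The two stages named in the abstract (top-down decomposition, then bottom-up combination) can be illustrated with a toy sketch. This is not the authors' RMPG implementation: the quadrant split, the sibling-mean context, the `alpha` weight, and the names `Node`, `decompose`, and `combine` are all assumptions made for illustration. The real module works on learned feature maps inside a network, whereas this sketch uses a plain 2D grid of floats.

```python
from dataclasses import dataclass, field
from typing import List

Grid = List[List[float]]  # a single-channel "feature map" (assumption)


@dataclass
class Node:
    """One node of the tree: a sub-feature map plus its child nodes."""
    feat: Grid
    children: List["Node"] = field(default_factory=list)


def mean(grid: Grid) -> float:
    return sum(map(sum, grid)) / (len(grid) * len(grid[0]))


def decompose(feat: Grid, depth: int) -> Node:
    """Top-down stage: recursively split the map into four quadrants.

    The quadrant split is a hypothetical choice; children are stored in
    the order top-left, top-right, bottom-left, bottom-right.
    """
    node = Node(feat)
    h, w = len(feat), len(feat[0])
    if depth > 0 and h >= 2 and w >= 2:
        mh, mw = h // 2, w // 2
        for rows in (feat[:mh], feat[mh:]):
            for cols in (slice(0, mw), slice(mw, w)):
                sub = [row[cols] for row in rows]
                node.children.append(decompose(sub, depth - 1))
    return node


def combine(node: Node, alpha: float = 0.1) -> Grid:
    """Bottom-up stage: add sibling context, then stitch children together.

    Context for a child is the average of its siblings' means, scaled by
    alpha (an assumed stand-in for a learned context function).
    """
    if not node.children:
        return [row[:] for row in node.feat]
    refined = [combine(child, alpha) for child in node.children]
    means = [mean(r) for r in refined]
    with_ctx = []
    for i, r in enumerate(refined):
        ctx = (sum(means) - means[i]) / (len(means) - 1)
        with_ctx.append([[v + alpha * ctx for v in row] for row in r])
    tl, tr, bl, br = with_ctx
    top = [a + b for a, b in zip(tl, tr)]       # join quadrants row by row
    bottom = [a + b for a, b in zip(bl, br)]
    return top + bottom


if __name__ == "__main__":
    feature_map = [[float(4 * r + c) for c in range(4)] for r in range(4)]
    tree = decompose(feature_map, depth=2)
    refined_map = combine(tree, alpha=0.1)
    print(len(refined_map), len(refined_map[0]))
```

The refined map has the same shape as the input, and with `alpha=0.0` the sketch reduces to an identity mapping, which makes the effect of the context term easy to isolate.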

Related papers

MapTRv2: An End-to-End Framework for Online Vectorized HD Map Construction [40.07726377230152]
High-definition (HD) map provides abundant and precise static environmental information of the driving scene. We present Map TRansformer, an end-to-end framework for online vectorized HD map construction.
arXiv Detail & Related papers (2023-08-10T17:56:53Z)
Integrating Human Parsing and Pose Network for Human Action Recognition [12.308394270240463]
We introduce human parsing feature map as a novel modality for action recognition. We propose Integrating Human Parsing and Pose Network (IPP-Net) for action recognition. IPP-Net is the first to leverage both skeletons and human parsing feature maps in a dual-branch approach.
arXiv Detail & Related papers (2023-07-16T07:58:29Z)
GrannGAN: Graph annotation generative adversarial networks [72.66289932625742]
We consider the problem of modelling high-dimensional distributions and generating new examples of data with complex relational feature structure coherent with a graph skeleton. The model we propose tackles the problem of generating the data features constrained by the specific graph structure of each data point by splitting the task into two phases. In the first it models the distribution of features associated with the nodes of the given graph, in the second it complements the edge features conditionally on the node features.
arXiv Detail & Related papers (2022-12-01T11:49:07Z)
SPGP: Structure Prototype Guided Graph Pooling [1.3764085113103217]
We propose Structure Prototype Guided Pooling (SPGP) for learning graph-level representations. SPGP formulates graph structures as learnable prototype vectors and computes the affinity between nodes and prototype vectors. Our experimental results show that SPGP outperforms state-of-the-art graph pooling methods on graph classification benchmark datasets.
arXiv Detail & Related papers (2022-09-16T09:33:09Z)
Learning Implicit Feature Alignment Function for Semantic Segmentation [51.36809814890326]
Implicit Feature Alignment function (IFA) is inspired by the rapidly expanding topic of implicit neural representations. We show that IFA implicitly aligns the feature maps at different levels and is capable of producing segmentation maps in arbitrary resolutions. Our method can be combined with improvement on various architectures, and it achieves state-of-the-art accuracy trade-off on common benchmarks.
arXiv Detail & Related papers (2022-06-17T09:40:14Z)
Graph Spectral Embedding using the Geodesic Betweeness Centrality [76.27138343125985]
We introduce the Graph Sylvester Embedding (GSE), an unsupervised graph representation of local similarity, connectivity, and global structure. GSE uses the solution of the Sylvester equation to capture both network structure and neighborhood proximity in a single representation.
arXiv Detail & Related papers (2022-05-07T04:11:23Z)
GraphDCA -- a Framework for Node Distribution Comparison in Real and Synthetic Graphs [72.51835626235368]
We argue that when comparing two graphs, the distribution of node structural features is more informative than global graph statistics. We present GraphDCA - a framework for evaluating similarity between graphs based on the alignment of their respective node representation sets.
arXiv Detail & Related papers (2022-02-08T14:19:19Z)
Compositionality-Aware Graph2Seq Learning [2.127049691404299]
Compositionality in a graph can be associated with the compositionality in the output sequence in many graph2seq tasks. We adopt the multi-level attention pooling (MLAP) architecture, which can aggregate graph representations from multiple levels of information localities. We demonstrate that the model having the MLAP architecture outperforms the previous state-of-the-art model with more than seven times fewer parameters.
arXiv Detail & Related papers (2022-01-28T15:22:39Z)
Hierarchical Graph Networks for 3D Human Pose Estimation [50.600944798627786]
Recent 2D-to-3D human pose estimation works tend to utilize the graph structure formed by the topology of the human skeleton. We argue that this skeletal topology is too sparse to reflect the body structure and suffers from a serious 2D-to-3D ambiguity problem. We propose a novel graph convolution network architecture, Hierarchical Graph Networks, to overcome these weaknesses.
arXiv Detail & Related papers (2021-11-23T15:09:03Z)
Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images [79.70127290464514]
We decompose the task into two stages, i.e. person localization and pose estimation. And we propose three task-specific graph neural networks for effective message passing. Our approach achieves state-of-the-art performance on CMU Panoptic and Shelf datasets.
arXiv Detail & Related papers (2021-09-13T11:44:07Z)
Tree Decomposition Attention for AMR-to-Text Generation [12.342043849587613]
We use a graph's tree decomposition to constrain self-attention in a graph. We apply dynamic programming to derive a forest of tree decompositions, choosing the most structurally similar tree to the AMR. Our system outperforms a self-attentive baseline by 1.6 BLEU and 1.8 chrF++.
arXiv Detail & Related papers (2021-08-27T14:24:25Z)
Multi-Level Graph Encoding with Structural-Collaborative Relation Learning for Skeleton-Based Person Re-Identification [11.303008512400893]
Skeleton-based person re-identification (Re-ID) is an emerging open topic providing great value for safety-critical applications. Existing methods typically extract hand-crafted features or model skeleton dynamics from the trajectory of body joints. We propose a Multi-level Graph encoding approach with Structural-Collaborative Relation learning (MG-SCR) to encode discriminative graph features for person Re-ID.
arXiv Detail & Related papers (2021-06-06T09:09:57Z)
Learning Spatial Context with Graph Neural Network for Multi-Person Pose Grouping [71.59494156155309]
Bottom-up approaches for image-based multi-person pose estimation consist of two stages: keypoint detection and grouping. In this work, we formulate the grouping task as a graph partitioning problem, where we learn the affinity matrix with a Graph Neural Network (GNN). The learned geometry-based affinity is further fused with appearance-based affinity to achieve robust keypoint association.
arXiv Detail & Related papers (2021-04-06T09:21:14Z)
Structural Adapters in Pretrained Language Models for AMR-to-text Generation [59.50420985074769]
Previous work on text generation from graph-structured data relies on pretrained language models (PLMs). We propose StructAdapt, an adapter method to encode graph structure into PLMs.
arXiv Detail & Related papers (2021-03-16T15:06:50Z)
Accurate Learning of Graph Representations with Graph Multiset Pooling [45.72542969364438]
We propose a Graph Multiset Transformer (GMT) that captures the interaction between nodes according to their structural dependencies. Our experimental results show that GMT significantly outperforms state-of-the-art graph pooling methods on graph classification benchmarks.
arXiv Detail & Related papers (2021-02-23T07:45:58Z)
Hierarchical Graph Capsule Network [78.4325268572233]
We propose hierarchical graph capsule network (HGCN) that can jointly learn node embeddings and extract graph hierarchies. To learn the hierarchical representation, HGCN characterizes the part-whole relationship between lower-level capsules (part) and higher-level capsules (whole).
arXiv Detail & Related papers (2020-12-16T04:13:26Z)
HOSE-Net: Higher Order Structure Embedded Network for Scene Graph Generation [20.148175528691905]
This paper presents a novel structure-aware embedding-to-classifier (SEC) module to incorporate both local and global structural information of relationships into the output space. We also propose a hierarchical semantic aggregation (HSA) module to reduce the number of subspaces by introducing higher order structural information. The proposed HOSE-Net achieves the state-of-the-art performance on two popular benchmarks of Visual Genome and VRD.
arXiv Detail & Related papers (2020-08-12T07:58:13Z)
Graph-PCNN: Two Stage Human Pose Estimation with Graph Pose Refinement [54.29252286561449]
We propose a two-stage graph-based and model-agnostic framework, called Graph-PCNN. In the first stage, heatmap regression network is applied to obtain a rough localization result, and a set of proposal keypoints, called guided points, are sampled. In the second stage, for each guided point, different visual feature is extracted by the localization. The relationship between guided points is explored by the graph pose refinement module to get more accurate localization results.
arXiv Detail & Related papers (2020-07-21T04:59:15Z)
Graph Neural Networks with Composite Kernels [60.81504431653264]
We re-interpret node aggregation from the perspective of kernel weighting. We present a framework to consider feature similarity in an aggregation scheme. We propose feature aggregation as the composition of the original neighbor-based kernel and a learnable kernel to encode feature similarities in a feature space.
arXiv Detail & Related papers (2020-05-16T04:44:29Z)
Iterative Context-Aware Graph Inference for Visual Dialog [126.016187323249]
We propose a novel Context-Aware Graph (CAG) neural network. Each node in the graph corresponds to a joint semantic feature, including both object-based (visual) and history-related (textual) context representations.
arXiv Detail & Related papers (2020-04-05T13:09:37Z)
Hierarchical Human Parsing with Typed Part-Relation Reasoning [179.64978033077222]
How to model human structures is the central theme in this task. We seek to simultaneously exploit the representational capacity of deep graph networks and the hierarchical human structures.
arXiv Detail & Related papers (2020-03-10T16:45:41Z)

This list is automatically generated from the titles and abstracts of the papers on this site.