Edge Weight Prediction For Category-Agnostic Pose Estimation
- URL: http://arxiv.org/abs/2411.16665v1
- Date: Mon, 25 Nov 2024 18:53:09 GMT
- Title: Edge Weight Prediction For Category-Agnostic Pose Estimation
- Authors: Or Hirschorn, Shai Avidan
- Abstract summary: Category-Agnostic Pose Estimation (CAPE) localizes keypoints across diverse object categories with a single model.
We introduce EdgeCape, a novel framework that overcomes limitations by predicting the graph's edge weights.
We show that this improves the model's ability to capture global spatial dependencies.
- Score: 12.308036453869033
- Abstract: Category-Agnostic Pose Estimation (CAPE) localizes keypoints across diverse object categories with a single model, using one or a few annotated support images. Recent works have shown that using a pose graph (i.e., treating keypoints as nodes in a graph rather than isolated points) helps handle occlusions and break symmetry. However, these methods assume a static pose graph with equal-weight edges, leading to suboptimal results. We introduce EdgeCape, a novel framework that overcomes these limitations by predicting the graph's edge weights, which optimizes localization. To further leverage structural priors, we propose integrating Markovian Structural Bias, which modulates the self-attention interaction between nodes based on the number of hops between them. We show that this improves the model's ability to capture global spatial dependencies. Evaluated on the MP-100 benchmark, which includes 100 categories and over 20K images, EdgeCape achieves state-of-the-art results in the 1-shot setting and leads among similar-sized methods in the 5-shot setting, significantly improving keypoint localization accuracy. Our code is publicly available.
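The abstract describes modulating self-attention between keypoint nodes according to the number of hops separating them in the pose graph. The following is a minimal sketch of that general idea, not the EdgeCape implementation: the function names (`hop_distances`, `biased_attention`), the linear penalty, and the `decay` parameter are all assumptions made for illustration, and the graph here is unweighted, whereas the paper predicts edge weights.

```python
import math
from collections import deque


def hop_distances(num_nodes, edges):
    """All-pairs hop counts on an undirected skeleton graph, via BFS."""
    adj = [[] for _ in range(num_nodes)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    dist = [[math.inf] * num_nodes for _ in range(num_nodes)]
    for src in range(num_nodes):
        dist[src][src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[src][v] == math.inf:
                    dist[src][v] = dist[src][u] + 1
                    queue.append(v)
    return dist


def biased_attention(logits, hops, decay=0.5):
    """Row-wise softmax of logits minus a hop-proportional penalty.

    Hypothetical bias form: score = logit - decay * hops, with decay > 0.
    Unreachable nodes (infinite hops) receive zero attention.
    """
    out = []
    for row, hop_row in zip(logits, hops):
        scores = [s - decay * h for s, h in zip(row, hop_row)]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


# Toy example: four keypoints connected in a chain 0-1-2-3,
# with uniform (all-zero) attention logits.
hops = hop_distances(4, [(0, 1), (1, 2), (2, 3)])
attn = biased_attention([[0.0] * 4 for _ in range(4)], hops)
```

With uniform logits, each node's attention falls off monotonically with hop distance, so nearby keypoints in the skeleton influence each other more than distant ones.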
Related papers
- ScaleNet: Scale Invariance Learning in Directed Graphs [4.235697905699222]
In node classification with Graph Neural Networks (GNNs), it is actually the ego-graph of the center node that is classified.
We propose the concept of "scaled ego-graphs", replacing undirected single-edges with "scaled-edges", which are ordered sequences of multiple directed edges.
Our scale-invariance-based graph learning outperforms inception models derived from random walks by being simpler, faster, and more accurate.
arXiv Detail & Related papers (2024-11-13T16:42:59Z)
- KGpose: Keypoint-Graph Driven End-to-End Multi-Object 6D Pose Estimation via Point-Wise Pose Voting [0.0]
KGpose is an end-to-end framework for 6D pose estimation of multiple objects.
Our approach combines a keypoint-based method with learnable pose regression through a "keypoint-graph".
arXiv Detail & Related papers (2024-07-12T01:06:00Z)
- CapeX: Category-Agnostic Pose Estimation from Textual Point Explanation [10.951186766576173]
Category-agnostic pose estimation (CAPE) aims to facilitate keypoint localization for diverse object categories.
Our work departs from conventional CAPE methods by adopting a text-based approach instead of the support image.
We validate our novel approach using the MP-100 benchmark, a comprehensive dataset spanning over 100 categories and 18,000 images.
arXiv Detail & Related papers (2024-06-01T09:50:13Z)
- A Graph-Based Approach for Category-Agnostic Pose Estimation [12.308036453869033]
Category-agnostic pose estimation (CAPE) was introduced to enable keypoint localization for arbitrary object categories.
We present a significant departure from conventional CAPE techniques, which treat keypoints as isolated entities, by treating the input pose data as a graph.
Our solution boosts performance by 0.98% under a 1-shot setting, achieving a new state-of-the-art for CAPE.
arXiv Detail & Related papers (2023-11-29T18:44:12Z)
- Bring Your Own View: Graph Neural Networks for Link Prediction with Personalized Subgraph Selection [57.34881616131377]
We introduce a Personalized Subgraph Selector (PS2) as a plug-and-play framework to automatically, personally, and inductively identify optimal subgraphs for different edges.
PS2 is instantiated as a bi-level optimization problem that can be efficiently solved differentiably.
We suggest a brand-new angle on GNNLP training: first identify the optimal subgraphs for edges, then train the inference model using the sampled subgraphs.
arXiv Detail & Related papers (2022-12-23T17:30:19Z)
- Pose for Everything: Towards Category-Agnostic Pose Estimation [93.07415325374761]
Category-Agnostic Pose Estimation (CAPE) aims to create a pose estimation model capable of detecting the pose of any class of object given only a few samples with keypoint definition.
A transformer-based Keypoint Interaction Module (KIM) is proposed to capture both the interactions among different keypoints and the relationship between the support and query images.
We also introduce the Multi-category Pose (MP-100) dataset, a 2D pose dataset of 100 object categories containing over 20K instances, designed for developing CAPE algorithms.
arXiv Detail & Related papers (2022-07-21T09:40:54Z)
- Graph-PCNN: Two Stage Human Pose Estimation with Graph Pose Refinement [54.29252286561449]
We propose a two-stage graph-based and model-agnostic framework, called Graph-PCNN.
In the first stage, a heatmap regression network is applied to obtain a rough localization result, and a set of proposal keypoints, called guided points, is sampled.
In the second stage, for each guided point, different visual feature is extracted by the localization.
The relationship between guided points is explored by the graph pose refinement module to get more accurate localization results.
arXiv Detail & Related papers (2020-07-21T04:59:15Z)
- Sequential Graph Convolutional Network for Active Learning [53.99104862192055]
We propose a novel pool-based Active Learning framework constructed on a sequential Graph Convolution Network (GCN).
With a small number of randomly sampled images as seed labelled examples, we learn the parameters of the graph to distinguish labelled vs unlabelled nodes.
We exploit these characteristics of GCN to select the unlabelled examples which are sufficiently different from labelled ones.
arXiv Detail & Related papers (2020-06-18T00:55:10Z)
- Self-Supervised Tuning for Few-Shot Segmentation [82.32143982269892]
Few-shot segmentation aims at assigning a category label to each image pixel with few annotated samples.
Existing meta-learning methods tend to fail to generate category-specific discriminative descriptors when the visual features extracted from support images are marginalized in the embedding space.
This paper presents an adaptive tuning framework, in which the distribution of latent features across different episodes is dynamically adjusted based on a self-segmentation scheme.
arXiv Detail & Related papers (2020-04-12T03:53:53Z)
- High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification [84.43394420267794]
We propose a novel framework by learning high-order relation and topology information for discriminative features and robust alignment.
Our framework significantly outperforms the state of the art by 6.5% mAP on the Occluded-Duke dataset.
arXiv Detail & Related papers (2020-03-18T12:18:35Z)