A Graph-Based Approach for Category-Agnostic Pose Estimation
- URL: http://arxiv.org/abs/2311.17891v2
- Date: Thu, 11 Jul 2024 13:00:04 GMT
- Title: A Graph-Based Approach for Category-Agnostic Pose Estimation
- Authors: Or Hirschorn, Shai Avidan,
- Abstract summary: Category-agnostic pose estimation (CAPE) was introduced to enable keypoint localization for arbitrary object categories.
We present a significant departure from conventional CAPE techniques, which treat keypoints as isolated entities, by treating the input pose data as a graph.
Our solution boosts performance by 0.98% under a 1-shot setting, achieving a new state-of-the-art for CAPE.
- Score: 12.308036453869033
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditional 2D pose estimation models are limited by their category-specific design, making them suitable only for predefined object categories. This restriction becomes particularly challenging when dealing with novel objects due to the lack of relevant training data. To address this limitation, category-agnostic pose estimation (CAPE) was introduced. CAPE aims to enable keypoint localization for arbitrary object categories using a few-shot single model, requiring minimal support images with annotated keypoints. We present a significant departure from conventional CAPE techniques, which treat keypoints as isolated entities, by treating the input pose data as a graph. We leverage the inherent geometrical relations between keypoints through a graph-based network to break symmetry, preserve structure, and better handle occlusions. We validate our approach on the MP-100 benchmark, a comprehensive dataset comprising over 20,000 images spanning over 100 categories. Our solution boosts performance by 0.98% under a 1-shot setting, achieving a new state-of-the-art for CAPE. Additionally, we enhance the dataset with skeleton annotations. Our code and data are publicly available.
Related papers
- Edge Weight Prediction For Category-Agnostic Pose Estimation [12.308036453869033]
Category-Agnostic Pose Estimation (CAPE) localizes keypoints across diverse object categories with a single model.
We introduce EdgeCape, a novel framework that overcomes limitations by predicting the graph's edge weights.
We show that this improves the model's ability to capture global spatial dependencies.
arXiv Detail & Related papers (2024-11-25T18:53:09Z) - Physically Feasible Semantic Segmentation [58.17907376475596]
State-of-the-art semantic segmentation models are typically optimized in a data-driven fashion.
Our method, Physically Feasible Semantic (PhyFea), extracts explicit physical constraints that govern spatial class relations.
PhyFea yields significant performance improvements in mIoU over each state-of-the-art network we use.
arXiv Detail & Related papers (2024-08-26T22:39:08Z) - SCAPE: A Simple and Strong Category-Agnostic Pose Estimator [6.705257644513057]
Category-Agnostic Pose Estimation (CAPE) aims to localize keypoints on an object of any category given few exemplars in an in-context manner.
We introduce two key modules: a global keypoint feature perceptor to inject global semantic information into support keypoints, and a keypoint attention refiner to enhance inter-node correlation between keypoints.
SCAPE outperforms prior arts by 2.2 and 1.3 PCK under 1-shot and 5-shot settings with faster inference speed and lighter model capacity.
arXiv Detail & Related papers (2024-07-18T13:02:57Z) - CapeX: Category-Agnostic Pose Estimation from Textual Point Explanation [10.951186766576173]
Category-agnostic pose estimation (CAPE) aims to facilitate keypoint localization for diverse object categories.
Our work departs from conventional CAPE methods, by adopting a text-based approach instead of the support image.
We validate our novel approach using the MP-100 benchmark, a comprehensive dataset spanning over 100 categories and 18,000 images.
arXiv Detail & Related papers (2024-06-01T09:50:13Z) - Studying How to Efficiently and Effectively Guide Models with Explanations [52.498055901649025]
'Model guidance' is the idea of regularizing the models' explanations to ensure that they are "right for the right reasons"
We conduct an in-depth evaluation across various loss functions, attribution methods, models, and 'guidance depths' on the PASCAL VOC 2007 and MS COCO 2014 datasets.
Specifically, we guide the models via bounding box annotations, which are much cheaper to obtain than the commonly used segmentation masks.
arXiv Detail & Related papers (2023-03-21T15:34:50Z) - Pose for Everything: Towards Category-Agnostic Pose Estimation [93.07415325374761]
Category-Agnostic Pose Estimation (CAPE) aims to create a pose estimation model capable of detecting the pose of any class of object given only a few samples with keypoint definition.
A transformer-based Keypoint Interaction Module (KIM) is proposed to capture both the interactions among different keypoints and the relationship between the support and query images.
We also introduce Multi-category Pose (MP-100) dataset, which is a 2D pose dataset of 100 object categories containing over 20K instances and is well-designed for developing CAPE algorithms.
arXiv Detail & Related papers (2022-07-21T09:40:54Z) - Semantic keypoint-based pose estimation from single RGB frames [64.80395521735463]
We present an approach to estimating the continuous 6-DoF pose of an object from a single RGB image.
The approach combines semantic keypoints predicted by a convolutional network (convnet) with a deformable shape model.
We show that our approach can accurately recover the 6-DoF object pose for both instance- and class-based scenarios.
arXiv Detail & Related papers (2022-04-12T15:03:51Z) - Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z) - Objectness-Aware Few-Shot Semantic Segmentation [31.13009111054977]
We show how to increase overall model capacity to achieve improved performance.
We introduce objectness, which is class-agnostic and so not prone to overfitting.
Given only one annotated example of an unseen category, experiments show that our method outperforms state-of-art methods with respect to mIoU.
arXiv Detail & Related papers (2020-04-06T19:12:08Z) - High-Order Information Matters: Learning Relation and Topology for
Occluded Person Re-Identification [84.43394420267794]
We propose a novel framework by learning high-order relation and topology information for discriminative features and robust alignment.
Our framework significantly outperforms state-of-the-art by6.5%mAP scores on Occluded-Duke dataset.
arXiv Detail & Related papers (2020-03-18T12:18:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.