Attend to Who You Are: Supervising Self-Attention for Keypoint Detection
and Instance-Aware Association
- URL: http://arxiv.org/abs/2111.12892v1
- Date: Thu, 25 Nov 2021 03:41:41 GMT
- Title: Attend to Who You Are: Supervising Self-Attention for Keypoint Detection
and Instance-Aware Association
- Authors: Sen Yang, Zhicheng Wang, Ze Chen, Yanjie Li, Shoukui Zhang, Zhibin
Quan, Shu-Tao Xia, Yiping Bao, Erjin Zhou, Wankou Yang
- Abstract summary: This paper presents a new Transformer-based method for keypoint detection and instance association.
We propose a novel approach of supervising self-attention for multi-person keypoint detection and instance association.
- Score: 40.78849763751773
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a new method that solves keypoint detection
and instance association with a Transformer. Bottom-up multi-person pose
estimation models need to detect keypoints and learn associative information
between them. We argue that both problems can be solved entirely by a
Transformer. Specifically, the self-attention in a Transformer measures
dependencies between any pair of locations, which can provide association
information for keypoint grouping. However, naive attention patterns are not
explicitly controlled, so there is no guarantee that the keypoints will always
attend to the instances to which they belong. To address this, we propose a
novel approach of supervising self-attention for multi-person keypoint
detection and instance association. By using instance masks to supervise
self-attention to be instance-aware, we can assign the detected keypoints to
their corresponding instances based on the pairwise attention scores, without
the pre-defined offset vector fields or embeddings that CNN-based bottom-up
models rely on. An
additional benefit of our method is that the instance segmentation results of
any number of people can be directly obtained from the supervised attention
matrix, thereby simplifying the pixel assignment pipeline. The experiments on
the COCO multi-person keypoint detection challenge and person instance
segmentation task demonstrate the effectiveness and simplicity of the proposed
method and show a promising way to control self-attention behavior for specific
purposes.
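The abstract does not state the exact loss or grouping rule, so the following is only a minimal sketch of the idea, not the authors' implementation. It assumes a single attention matrix over flattened image locations, a mean-squared-error loss against the normalized instance mask, and a nearest-seed grouping rule; the function names, the loss choice, the seed locations, and the 0.5 threshold are all hypothetical.

```python
import numpy as np

def mask_supervision_loss(attn, keypoint_idx, keypoint_owner, instance_masks):
    """MSE between each keypoint's attention row and its instance's normalized mask.

    Assumed loss for illustration; the abstract only says masks supervise attention.
    """
    losses = []
    for k, owner in zip(keypoint_idx, keypoint_owner):
        target = instance_masks[owner] / instance_masks[owner].sum()
        losses.append(np.mean((attn[k] - target) ** 2))
    return float(np.mean(losses))

def group_keypoints(attn, keypoint_idx, seed_idx):
    """Assign each keypoint to the seed location with the highest pairwise attention."""
    sym = attn + attn.T  # symmetrize so the score does not depend on direction
    return [int(np.argmax(sym[k, seed_idx])) for k in keypoint_idx]

def attention_to_mask(attn, idx, thresh=0.5):
    """Read a binary instance mask off one attention row (hypothetical threshold)."""
    row = attn[idx]
    return row >= thresh * row.max()

# Toy example: 6 flattened locations, two people occupying locations 0-2 and 3-5.
masks = np.array([[1, 1, 1, 0, 0, 0],
                  [0, 0, 0, 1, 1, 1]], dtype=float)
ideal_attn = np.kron(np.eye(2), np.ones((3, 3))) / 3.0  # instance-aware attention
uniform_attn = np.full((6, 6), 1.0 / 6.0)               # uninformative attention
keypoints, owners = [1, 2, 4, 5], [0, 0, 1, 1]

print(mask_supervision_loss(ideal_attn, keypoints, owners, masks))
print(mask_supervision_loss(uniform_attn, keypoints, owners, masks))
print(group_keypoints(ideal_attn, keypoints, seed_idx=[0, 3]))
print(attention_to_mask(ideal_attn, 0))
```

In this toy setup the instance-aware attention matches the masks exactly, while uniform attention is penalized; a real model would compute the attention from learned features and would need some way to pick one seed location per person.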
Related papers
- Keypoint Promptable Re-Identification [76.31113049256375]
Occluded Person Re-Identification (ReID) is a metric learning task that involves matching occluded individuals based on their appearance.
We introduce Keypoint Promptable ReID (KPR), a novel formulation of the ReID problem that explicitly complements the input bounding box with a set of semantic keypoints.
We release custom keypoint labels for four popular ReID benchmarks. Experiments on person retrieval and on pose tracking demonstrate that our method systematically surpasses previous state-of-the-art approaches.
arXiv Detail & Related papers (2024-07-25T15:20:58Z)
- TUSK: Task-Agnostic Unsupervised Keypoints [21.777256048659165]
We propose a novel method to learn Task-agnostic, UnSupervised Keypoints (TUSK) which can deal with multiple instances.
Specifically, we encode semantics into the keypoints by teaching them to reconstruct images from a sparse set of keypoints and their descriptors.
This makes our approach amenable to a wider range of tasks than any previous unsupervised keypoint method.
arXiv Detail & Related papers (2022-06-16T21:56:17Z)
- From Keypoints to Object Landmarks via Self-Training Correspondence: A novel approach to Unsupervised Landmark Discovery [37.78933209094847]
This paper proposes a novel paradigm for the unsupervised learning of object landmark detectors.
We validate our method on a variety of difficult datasets, including LS3D, BBCPose, Human3.6M and PennAction.
arXiv Detail & Related papers (2022-05-31T15:44:29Z)
- Learning to Detect Instance-level Salient Objects Using Complementary Image Labels [55.049347205603304]
We present the first weakly-supervised approach to the salient instance detection problem.
We propose a novel weakly-supervised network with three branches: a Saliency Detection Branch leveraging class consistency information to locate candidate objects; a Boundary Detection Branch exploiting class discrepancy information to delineate object boundaries; and a Centroid Detection Branch using subitizing information to detect salient instance centroids.
arXiv Detail & Related papers (2021-11-19T10:15:22Z)
- Weakly Supervised Keypoint Discovery [27.750244813890262]
We propose a method for keypoint discovery from a 2D image using image-level supervision.
Motivated by the weakly-supervised learning approach, our method exploits image-level supervision to identify discriminative parts.
Our approach achieves state-of-the-art keypoint estimation performance in limited-supervision scenarios.
arXiv Detail & Related papers (2021-09-28T01:26:53Z)
- UKPGAN: A General Self-Supervised Keypoint Detector [43.35270822722044]
UKPGAN is a general self-supervised 3D keypoint detector.
Our keypoints align well with human annotated keypoint labels.
Our model is stable under both rigid and non-rigid transformations.
arXiv Detail & Related papers (2020-11-24T09:08:21Z)
- Weakly-supervised Salient Instance Detection [65.0408760733005]
We present the first weakly-supervised approach to the salient instance detection problem.
We propose a novel weakly-supervised network with three branches: a Saliency Detection Branch leveraging class consistency information to locate candidate objects; a Boundary Detection Branch exploiting class discrepancy information to delineate object boundaries; and a Centroid Detection Branch using subitizing information to detect salient instance centroids.
arXiv Detail & Related papers (2020-09-29T09:47:23Z)
- A Self-Training Approach for Point-Supervised Object Detection and Counting in Crowds [54.73161039445703]
We propose a novel self-training approach that enables a typical object detector to be trained with only point-level annotations.
During training, we utilize the available point annotations to supervise the estimation of the center points of objects.
Experimental results show that our approach significantly outperforms state-of-the-art point-supervised methods on both detection and counting tasks.
arXiv Detail & Related papers (2020-07-25T02:14:42Z)
- Point-Set Anchors for Object Detection, Instance Segmentation and Pose Estimation [85.96410825961966]
We argue that the image features extracted at a central point contain limited information for predicting distant keypoints or bounding box boundaries.
To facilitate inference, we propose to instead perform regression from a set of points placed at more advantageous positions.
We apply this proposed framework, called Point-Set Anchors, to object detection, instance segmentation, and human pose estimation.
arXiv Detail & Related papers (2020-07-06T15:59:56Z)