Improving Multi-Person Pose Tracking with A Confidence Network
- URL: http://arxiv.org/abs/2310.18920v1
- Date: Sun, 29 Oct 2023 06:36:27 GMT
- Title: Improving Multi-Person Pose Tracking with A Confidence Network
- Authors: Zehua Fu, Wenhang Zuo, Zhenghui Hu, Qingjie Liu, Yunhong Wang
- Abstract summary: We develop a novel keypoint confidence network and a tracking pipeline to improve human detection and pose estimation.
Specifically, the keypoint confidence network is designed to determine whether each keypoint is occluded.
In the tracking pipeline, we propose the Bbox-revision module to reduce missing detection and the ID-retrieve module to correct lost trajectories.
- Score: 37.84514614455588
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human pose estimation and tracking are fundamental tasks for understanding
human behaviors in videos. Existing top-down framework-based methods usually
perform three-stage tasks: human detection, pose estimation and tracking.
Although promising results have been achieved, these methods rely heavily on
high-performance detectors and may fail to track persons who are occluded or
missed by the detector. To overcome these problems, in this paper, we develop a novel
keypoint confidence network and a tracking pipeline to improve human detection
and pose estimation in top-down approaches. Specifically, the keypoint
confidence network is designed to determine whether each keypoint is occluded,
and it is incorporated into the pose estimation module. In the tracking
pipeline, we propose the Bbox-revision module to reduce missing detection and
the ID-retrieve module to correct lost trajectories, improving the performance
of the detection stage. Experimental results show that our approach is
universal in human detection and pose estimation, achieving state-of-the-art
performance on both PoseTrack 2017 and 2018 datasets.
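The abstract does not give implementation details, so the following is only a minimal sketch of one way per-keypoint confidence could feed back into detection. It assumes each keypoint carries a confidence in [0, 1] and that a missed detection is recovered by building a padded box around the confident keypoints of an earlier pose. The names `visible_keypoints` and `box_from_pose`, the 0.5 threshold, and the 0.2 margin are all hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Tuple

Keypoint = Tuple[float, float, float]    # (x, y, confidence in [0, 1])
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def visible_keypoints(pose: List[Keypoint], threshold: float = 0.5) -> List[Keypoint]:
    """Keep keypoints whose confidence meets the (assumed) threshold."""
    return [kp for kp in pose if kp[2] >= threshold]


def box_from_pose(pose: List[Keypoint],
                  threshold: float = 0.5,
                  margin: float = 0.2) -> Optional[Box]:
    """Propose a padded bounding box around the confident keypoints.

    Returns None when fewer than two keypoints are confident, since a
    single point does not define a box extent.
    """
    pts = visible_keypoints(pose, threshold)
    if len(pts) < 2:
        return None
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    # Pad each side by a fraction of the tight box size (assumed heuristic).
    return (min(xs) - margin * width,
            min(ys) - margin * height,
            max(xs) + margin * width,
            max(ys) + margin * height)


if __name__ == "__main__":
    previous_pose = [(10.0, 20.0, 0.9), (30.0, 60.0, 0.8), (100.0, 100.0, 0.1)]
    print(box_from_pose(previous_pose))
```

In this sketch the low-confidence keypoint at (100, 100) is ignored, so the proposed box stays tight around the two reliable joints instead of being stretched by a likely occluded one.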
Related papers
- RTracker: Recoverable Tracking via PN Tree Structured Memory [71.05904715104411]
We propose a recoverable tracking framework, RTracker, that uses a tree-structured memory to dynamically associate a tracker and a detector to enable self-recovery.
Specifically, we propose a Positive-Negative Tree-structured memory to chronologically store and maintain positive and negative target samples.
Our core idea is to use the support samples of positive and negative target categories to establish a relative distance-based criterion for a reliable assessment of target loss.
arXiv Detail & Related papers (2024-03-28T08:54:40Z)
- Human Pose-based Estimation, Tracking and Action Recognition with Deep Learning: A Survey [15.920237822185301]
This paper presents a survey of pose-based applications utilizing deep learning, encompassing pose estimation, pose tracking, and action recognition.
Pose estimation involves the determination of human joint positions from images or image sequences.
Pose tracking is an emerging research direction aimed at generating consistent human pose trajectories over time.
Action recognition targets the identification of action types using pose estimation or tracking data.
arXiv Detail & Related papers (2023-10-19T17:59:04Z)
- Bottom-Up 2D Pose Estimation via Dual Anatomical Centers for Small-Scale Persons [75.86463396561744]
In multi-person 2D pose estimation, the bottom-up methods simultaneously predict poses for all persons.
Our method achieves a 38.4% improvement in bounding box precision and a 39.1% improvement in bounding box recall over the state of the art (SOTA).
For the human pose AP evaluation, we achieve a new SOTA (71.0 AP) on the COCO test-dev set with single-scale testing.
arXiv Detail & Related papers (2022-08-25T10:09:10Z)
- Dual networks based 3D Multi-Person Pose Estimation from Monocular Video [42.01876518017639]
Multi-person 3D pose estimation is more challenging than single pose estimation.
Existing top-down and bottom-up approaches to pose estimation suffer from detection errors.
We propose the integration of top-down and bottom-up approaches to exploit their strengths.
arXiv Detail & Related papers (2022-05-02T08:53:38Z)
- PONet: Robust 3D Human Pose Estimation via Learning Orientations Only [116.1502793612437]
We propose a novel Pose Orientation Net (PONet) that is able to robustly estimate 3D pose by learning orientations only.
PONet estimates the 3D orientation of these limbs by taking advantage of the local image evidence to recover the 3D pose.
We evaluate our method on multiple datasets, including Human3.6M, MPII, MPI-INF-3DHP, and 3DPW.
arXiv Detail & Related papers (2021-12-21T12:48:48Z)
- Learning Dynamics via Graph Neural Networks for Human Pose Estimation and Tracking [98.91894395941766]
We propose a novel online approach to learning the pose dynamics, which are independent of pose detections in the current frame.
Specifically, we derive this prediction of dynamics through a graph neural network (GNN) that explicitly accounts for both spatial-temporal and visual information.
Experiments on PoseTrack 2017 and PoseTrack 2018 datasets demonstrate that the proposed method achieves results superior to the state of the art on both human pose estimation and tracking tasks.
arXiv Detail & Related papers (2021-06-07T16:36:50Z)
- Monocular 3D Multi-Person Pose Estimation by Integrating Top-Down and Bottom-Up Networks [33.974241749058585]
Multi-person pose estimation can cause human detection to be erroneous and human-joints grouping to be unreliable.
Existing top-down methods rely on human detection and thus suffer from these problems.
We propose the integration of top-down and bottom-up approaches to exploit their strengths.
arXiv Detail & Related papers (2021-04-05T07:05:21Z)
- Detecting Invisible People [58.49425715635312]
We re-purpose tracking benchmarks and propose new metrics for the task of detecting invisible objects.
We demonstrate that current detection and tracking systems perform dramatically worse on this task.
Second, we build dynamic models that explicitly reason in 3D, making use of observations produced by state-of-the-art monocular depth estimation networks.
arXiv Detail & Related papers (2020-12-15T16:54:45Z)
- Uncertainty-Aware Voxel based 3D Object Detection and Tracking with von-Mises Loss [13.346392746224117]
Uncertainty helps us tackle the error in the perception system and improve robustness.
We propose a method for improving target tracking performance by adding uncertainty regression to the SECOND detector.
arXiv Detail & Related papers (2020-11-04T21:53:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.