HumanBench: Towards General Human-centric Perception with Projector
Assisted Pretraining
- URL: http://arxiv.org/abs/2303.05675v1
- Date: Fri, 10 Mar 2023 02:57:07 GMT
- Title: HumanBench: Towards General Human-centric Perception with Projector
Assisted Pretraining
- Authors: Shixiang Tang, Cheng Chen, Qingsong Xie, Meilin Chen, Yizhou Wang,
Yuanzheng Ci, Lei Bai, Feng Zhu, Haiyang Yang, Li Yi, Rui Zhao, Wanli Ouyang
- Abstract summary: It is desirable to have a general pretrained model for versatile human-centric downstream tasks.
We propose HumanBench, built on existing datasets, to evaluate on a common ground the generalization abilities of different pretraining methods.
Our PATH achieves new state-of-the-art results on 17 downstream datasets and on-par results on the other 2 datasets.
- Score: 75.1086193340286
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Human-centric perceptions include a variety of vision tasks, which have
widespread industrial applications, including surveillance, autonomous driving,
and the metaverse. It is desirable to have a general pretrained model for
versatile human-centric downstream tasks. This paper forges ahead along this
path from the aspects of both benchmark and pretraining methods. Specifically,
we propose \textbf{HumanBench}, built on existing datasets, to comprehensively
evaluate on a common ground the generalization abilities of different
pretraining methods on 19 datasets from 6 diverse downstream tasks, including
person ReID, pose estimation, human parsing, pedestrian attribute recognition,
pedestrian detection, and crowd counting. To learn both coarse-grained and
fine-grained knowledge in human bodies, we further propose a \textbf{P}rojector
\textbf{A}ssis\textbf{T}ed \textbf{H}ierarchical pretraining method
(\textbf{PATH}) to learn diverse knowledge at different granularity levels.
Comprehensive evaluations on HumanBench show that our PATH achieves new
state-of-the-art results on 17 downstream datasets and on-par results on the
other 2 datasets. The code will be made publicly available at
\href{https://github.com/OpenGVLab/HumanBench}{https://github.com/OpenGVLab/HumanBench}.
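The released code is only linked above, but the gist of projector-assisted pretraining, a shared backbone whose features pass through lightweight task-specific projectors before reaching task heads, can be sketched as follows. This is a minimal, hypothetical PyTorch sketch: the backbone, projector shapes, task names, and class counts are placeholders, not the actual PATH implementation.

```python
import torch
import torch.nn as nn

class ProjectorAssistedModel(nn.Module):
    """Shared backbone with per-task projectors and heads (illustrative only)."""

    def __init__(self, feat_dim=768, proj_dim=256, task_out_dims=None):
        super().__init__()
        task_out_dims = task_out_dims or {"reid": 751, "parsing": 20}
        # Stand-in for the shared image encoder that is transferred downstream.
        self.backbone = nn.Sequential(
            nn.Flatten(),
            nn.Linear(3 * 64 * 64, feat_dim),
            nn.GELU(),
        )
        # One lightweight projector per task: it maps the shared features into a
        # task-specific space so tasks at different granularities (e.g. image-level
        # ReID vs. pixel-level parsing) do not compete for one representation.
        self.projectors = nn.ModuleDict({
            name: nn.Sequential(nn.Linear(feat_dim, proj_dim), nn.GELU())
            for name in task_out_dims
        })
        # Task-specific prediction heads (toy single-label classifiers here).
        self.heads = nn.ModuleDict({
            name: nn.Linear(proj_dim, dim) for name, dim in task_out_dims.items()
        })

    def forward(self, images, task):
        shared = self.backbone(images)             # task-agnostic features
        projected = self.projectors[task](shared)  # task-adapted features
        return self.heads[task](projected)


# Toy pretraining step: mix mini-batches from several tasks and sum the losses.
model = ProjectorAssistedModel()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

batches = {
    "reid": (torch.randn(4, 3, 64, 64), torch.randint(0, 751, (4,))),
    "parsing": (torch.randn(4, 3, 64, 64), torch.randint(0, 20, (4,))),
}
loss = sum(criterion(model(x, task), y) for task, (x, y) in batches.items())
loss.backward()
optimizer.step()
```

In a setup like this, the projectors and heads absorb task-specific signals during pretraining, while the shared backbone is what would be carried over to downstream tasks.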
Related papers
- Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption [64.07607726562841]
Existing multi-person human reconstruction approaches mainly focus on recovering accurate poses or avoiding penetration.
In this work, we tackle the task of reconstructing closely interactive humans from a monocular video.
We propose to leverage knowledge from proxemic behavior and physics to compensate for the lack of visual information.
arXiv Detail & Related papers (2024-04-17T11:55:45Z)
- A Unified Framework for Human-centric Point Cloud Video Understanding [23.91555808792291]
Human-centric Point Cloud Video Understanding (PVU) is an emerging field focused on extracting and interpreting human-related features from sequences of human point clouds.
We propose a unified framework to make full use of the prior knowledge and explore the inherent features in the data itself for generalized human-centric point cloud video understanding.
Our method achieves state-of-the-art performance on various human-related tasks, including action recognition and 3D pose estimation.
arXiv Detail & Related papers (2024-03-29T07:53:06Z)
- Learning Human Action Recognition Representations Without Real Humans [66.61527869763819]
We present a benchmark that leverages real-world videos with humans removed and synthetic data containing virtual humans to pre-train a model.
We then evaluate the transferability of the representation learned on this data to a diverse set of downstream action recognition benchmarks.
Our approach outperforms previous baselines by up to 5%.
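As a rough illustration of such a transferability evaluation (not necessarily the benchmark's actual protocol), one common recipe is to freeze the pretrained encoder and train only a linear probe on each downstream action recognition dataset; the encoder, input sizes, and class count below are placeholders.

```python
import torch
import torch.nn as nn

# Placeholder for an encoder pretrained on privacy-preserving data
# (real videos with humans removed plus synthetic virtual humans).
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 56 * 56, 512))
for p in encoder.parameters():
    p.requires_grad = False  # keep the learned representation fixed

# Linear probe: a single classifier trained on the downstream benchmark;
# its accuracy serves as a proxy for how transferable the features are.
probe = nn.Linear(512, 101)  # 101 = hypothetical number of action classes
optimizer = torch.optim.SGD(probe.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

clips = torch.randn(4, 3, 8, 56, 56)   # dummy video clips (B, C, T, H, W)
labels = torch.randint(0, 101, (4,))   # dummy action labels
with torch.no_grad():
    feats = encoder(clips)             # frozen features
loss = criterion(probe(feats), labels)
loss.backward()
optimizer.step()
```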
arXiv Detail & Related papers (2023-11-10T18:38:14Z)
- Deep Learning Technique for Human Parsing: A Survey and Outlook [5.236995853909988]
In this survey, we comprehensively review three core sub-tasks: single human parsing, multiple human parsing, and video human parsing.
We put forward a transformer-based human parsing framework, providing a high-performance baseline for follow-up research.
We point out a set of under-investigated open issues in this field and suggest new directions for future study.
arXiv Detail & Related papers (2023-01-01T12:39:57Z) - Video-based Pose-Estimation Data as Source for Transfer Learning in
Human Activity Recognition [71.91734471596433]
Human Activity Recognition (HAR) using on-body devices identifies specific human actions in unconstrained environments.
Previous works demonstrated that transfer learning is a good strategy for addressing scenarios with scarce data.
This paper proposes using datasets intended for human-pose estimation as a source for transfer learning.
arXiv Detail & Related papers (2022-12-02T18:19:36Z) - TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild [77.59069361196404]
TRiPOD is a novel method for predicting body dynamics based on graph attentional networks.
To incorporate a real-world challenge, we learn an indicator representing whether an estimated body joint is visible/invisible at each frame.
Our evaluation shows that TRiPOD outperforms all prior work and state-of-the-art specifically designed for each of the trajectory and pose forecasting tasks.
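As a hedged sketch of how such a visibility indicator could be used (the exact TRiPOD formulation may differ), a per-joint visibility mask can gate the forecasting loss so that invisible joints do not contribute to the error:

```python
import torch

def masked_pose_loss(pred, target, visibility):
    """L2 forecasting error counted only on joints flagged as visible.

    pred, target: (batch, frames, joints, 3) predicted / ground-truth positions.
    visibility:   (batch, frames, joints) 1.0 where a joint is visible, else 0.0.
    Illustrative only; shapes and weighting are assumptions, not TRiPOD's code.
    """
    err = ((pred - target) ** 2).sum(dim=-1)           # per-joint squared error
    masked = err * visibility                           # drop invisible joints
    return masked.sum() / visibility.sum().clamp(min=1.0)

pred = torch.randn(2, 10, 17, 3)
target = torch.randn(2, 10, 17, 3)
vis = (torch.rand(2, 10, 17) > 0.3).float()
print(masked_pose_loss(pred, target, vis))
```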
arXiv Detail & Related papers (2021-04-08T20:01:00Z) - Few-Shot Visual Grounding for Natural Human-Robot Interaction [0.0]
We propose a software architecture that segments a target object from a crowded scene, indicated verbally by a human user.
At the core of our system, we employ a multi-modal deep neural network for visual grounding.
We evaluate the performance of the proposed model on real RGB-D data collected from public scene datasets.
arXiv Detail & Related papers (2021-03-17T15:24:02Z) - Whole-Body Human Pose Estimation in the Wild [88.09875133989155]
COCO-WholeBody extends the COCO dataset with whole-body annotations.
It is the first benchmark that has manual annotations on the entire human body.
A single-network model, named ZoomNet, is devised to take into account the hierarchical structure of the full human body.
arXiv Detail & Related papers (2020-07-23T08:35:26Z)