Joint Multi-Person Body Detection and Orientation Estimation via One
Unified Embedding
- URL: http://arxiv.org/abs/2210.15586v1
- Date: Thu, 27 Oct 2022 16:22:50 GMT
- Title: Joint Multi-Person Body Detection and Orientation Estimation via One
Unified Embedding
- Authors: Huayi Zhou, Fei Jiang, Jiaxin Si, Hongtao Lu
- Abstract summary: We propose a single-stage end-to-end trainable framework for tackling the HBOE problem with multi-persons.
By integrating the prediction of bounding boxes and direction angles in one embedding, our method can jointly estimate the location and orientation of all bodies in one image.
- Score: 24.96237908232171
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human body orientation estimation (HBOE) is widely used in various
applications, including robotics, surveillance, pedestrian analysis and
autonomous driving. Although many approaches have been addressing the HBOE
problem from specific under-controlled scenes to challenging in-the-wild
environments, they assume human instances are already detected and take a well
cropped sub-image as the input. This setting is less efficient and prone to
errors in real applications, such as crowds of people. In this paper, we propose
a single-stage end-to-end trainable framework for tackling the HBOE problem
with multi-persons. By integrating the prediction of bounding boxes and
direction angles in one embedding, our method can jointly estimate the location
and orientation of all bodies in one image directly. Our key idea is to
integrate the HBOE task into the multi-scale anchor channel predictions of
persons for concurrently benefiting from engaged intermediate features.
Therefore, our approach can naturally adapt to difficult instances involving
low resolution and occlusion as in object detection. We validated the
efficiency and effectiveness of our method in the recently presented benchmark
MEBOW with extensive experiments. In addition, we completed the ambiguous instances
ignored by the MEBOW dataset and provided corresponding weak body-orientation
labels to keep the dataset complete and consistent for multi-person
studies. Our work is available at
https://github.com/hnuzhy/JointBDOE.
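The idea of predicting boxes and direction angles in one embedding can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the six-value layout `[cx, cy, w, h, obj_logit, ori_logit]`, the sigmoid mapping of the orientation channel onto [0, 360) degrees, and the names `BodyPrediction`, `decode_embedding` and `decode_image` are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class BodyPrediction:
    box: Tuple[float, float, float, float]  # (cx, cy, w, h), assumed to be in pixels
    score: float                            # objectness confidence in (0, 1)
    orientation_deg: float                  # body orientation in [0, 360)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def decode_embedding(embedding: Sequence[float],
                     conf_threshold: float = 0.5) -> Optional[BodyPrediction]:
    """Decode one hypothetical per-anchor embedding.

    Assumed layout: [cx, cy, w, h, obj_logit, ori_logit]. The box and the
    orientation are read from the same vector, so one forward pass yields both.
    """
    cx, cy, w, h, obj_logit, ori_logit = embedding
    score = sigmoid(obj_logit)
    if score < conf_threshold:
        return None  # low-confidence anchors are dropped
    # Assumption: the orientation channel is squashed to (0, 1) and scaled to degrees.
    orientation = (sigmoid(ori_logit) * 360.0) % 360.0
    return BodyPrediction((cx, cy, w, h), score, orientation)


def decode_image(embeddings: Sequence[Sequence[float]],
                 conf_threshold: float = 0.5) -> List[BodyPrediction]:
    """Decode all anchor embeddings of one image, keeping confident ones."""
    predictions = []
    for embedding in embeddings:
        prediction = decode_embedding(embedding, conf_threshold)
        if prediction is not None:
            predictions.append(prediction)
    return predictions
```

Because the location and the angle come from the same vector, no separate cropping stage is needed in this sketch. A real detector would also apply non-maximum suppression across anchors and scales, which is omitted here.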
Related papers
- Unified Domain Generalization and Adaptation for Multi-View 3D Object Detection [14.837853049121687]
3D object detection leveraging multi-view cameras has demonstrated its practical and economical value in challenging vision tasks.
Typical supervised learning approaches face challenges in achieving satisfactory adaptation toward unseen and unlabeled target datasets.
We propose Unified Domain Generalization and Adaptation (UDGA), a practical solution to mitigate those drawbacks.
arXiv Detail & Related papers (2024-10-29T18:51:49Z)
- PBADet: A One-Stage Anchor-Free Approach for Part-Body Association [30.6652836585336]
PBADet is a one-stage, anchor-free approach for part-body association detection.
Our design is inherently versatile and capable of managing multiple parts-to-body associations.
arXiv Detail & Related papers (2024-02-12T17:18:51Z)
- Generalizable Person Search on Open-world User-Generated Video Content [93.72028298712118]
Person search is a challenging task that involves retrieving individuals from a large set of un-cropped scene images.
Existing person search applications are mostly trained and deployed in the same-origin scenarios.
We propose a generalizable framework on both feature-level and data-level generalization to facilitate downstream tasks in arbitrary scenarios.
arXiv Detail & Related papers (2023-10-16T04:59:50Z)
- Multi-view Tracking Using Weakly Supervised Human Motion Prediction [60.972708589814125]
We argue that an even more effective approach is to predict people's motion over time and infer their presence in individual frames from it.
This makes it possible to enforce consistency both over time and across views of a single temporal frame.
We validate our approach on the PETS2009 and WILDTRACK datasets and demonstrate that it outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-10-19T17:58:23Z)
- InsPose: Instance-Aware Networks for Single-Stage Multi-Person Pose Estimation [37.80984212500406]
We present a simple yet effective solution by employing instance-aware dynamic networks.
Specifically, we propose an instance-aware module to adaptively adjust (part of) the network parameters for each instance.
Our solution can significantly increase the capacity and adaptive-ability of the network for recognizing various poses, while maintaining a compact end-to-end trainable pipeline.
arXiv Detail & Related papers (2021-07-19T15:56:09Z)
- A Global to Local Double Embedding Method for Multi-person Pose Estimation [10.05687757555923]
We present a novel method to simplify the pipeline by implementing person detection and joints detection simultaneously.
We propose a Double Embedding (DE) method to complete the multi-person pose estimation task in a global-to-local way.
We achieve competitive results on the MSCOCO, MPII and CrowdPose benchmarks, demonstrating the effectiveness and generalization ability of our method.
arXiv Detail & Related papers (2021-02-15T03:13:38Z)
- Self-supervised Human Detection and Segmentation via Multi-view Consensus [116.92405645348185]
We propose a multi-camera framework in which geometric constraints are embedded in the form of multi-view consistency during training.
We show that our approach outperforms state-of-the-art self-supervised person detection and segmentation techniques on images that visually depart from those of standard benchmarks.
arXiv Detail & Related papers (2020-12-09T15:47:21Z)
- AdaFuse: Adaptive Multiview Fusion for Accurate Human Pose Estimation in the Wild [77.43884383743872]
We present AdaFuse, an adaptive multiview fusion method to enhance the features in occluded views.
We extensively evaluate the approach on three public datasets including Human3.6M, Total Capture and CMU Panoptic.
We also create a large scale synthetic dataset Occlusion-Person, which allows us to perform numerical evaluation on the occluded joints.
arXiv Detail & Related papers (2020-10-26T03:19:46Z)
- DIRV: Dense Interaction Region Voting for End-to-End Human-Object Interaction Detection [53.40028068801092]
We propose a novel one-stage HOI detection approach based on a new concept called interaction region for the HOI problem.
Unlike previous methods, our approach concentrates on the densely sampled interaction regions across different scales for each human-object pair.
In order to compensate for the detection flaws of a single interaction region, we introduce a novel voting strategy.
arXiv Detail & Related papers (2020-10-02T13:57:58Z)
- Unsupervised Domain Adaptation in Person re-ID via k-Reciprocal Clustering and Large-Scale Heterogeneous Environment Synthesis [76.46004354572956]
We introduce an unsupervised domain adaptation approach for person re-identification.
Experimental results show that the proposed ktCUDA and SHRED approach achieves an average improvement of +5.7 mAP in re-identification performance.
arXiv Detail & Related papers (2020-01-14T17:43:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.