UET-Headpose: A sensor-based top-view head pose dataset
- URL: http://arxiv.org/abs/2111.07039v1
- Date: Sat, 13 Nov 2021 04:54:20 GMT
- Title: UET-Headpose: A sensor-based top-view head pose dataset
- Authors: Linh Nguyen Viet, Tuan Nguyen Dinh, Hoang Nguyen Viet, Duc Tran Minh,
Long Tran Quoc
- Abstract summary: We introduce a cost-efficient, easy-to-set-up approach to collecting head pose images.
This method uses an absolute orientation sensor instead of a depth camera, so it can be set up quickly and at low cost.
We also introduce a full-range model called FSANet-Wide, which significantly improves head pose estimation results on the UET-Headpose dataset.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Head pose estimation is a challenging task that aims to predict a
three-dimensional orientation vector (yaw, pitch, roll), serving many
applications in human-robot interaction and customer behavior analysis. Previous
research has proposed precise methods for collecting head pose data,
but those methods require either expensive devices such as depth cameras or a
complex laboratory setup. In this research, we introduce a cost-efficient,
easy-to-set-up approach to collecting head pose images, producing the
UET-Headpose dataset with top-view head pose data. This method uses an
absolute orientation sensor instead of a depth camera, so it can be set up
quickly and at low cost while still ensuring good results. Experiments show
that the dataset's distribution differs from that of existing datasets such as
the CMU Panoptic Dataset \cite{CMU}. Besides the UET-Headpose dataset and
other head pose datasets, we also introduce a full-range model called
FSANet-Wide, which significantly outperforms prior head pose estimation
results on the UET-Headpose dataset, especially on top-view images. The model
is also very lightweight and accepts small input images.
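The core of the collection method is reading an absolute orientation sensor instead of a depth camera to obtain ground-truth angles. As a minimal illustrative sketch (not the authors' actual pipeline), such sensors typically report orientation as a unit quaternion, which must be converted to the yaw/pitch/roll labels used for head pose supervision. The (w, x, y, z) quaternion convention and Z-Y-X rotation order below are assumptions; a real rig would also need calibration between the sensor frame and the camera frame.

```python
import math

def quaternion_to_euler(w, x, y, z):
    """Convert a unit quaternion (w, x, y, z) to (yaw, pitch, roll) in degrees.

    Assumes intrinsic Z-Y-X rotations; axis conventions depend on the
    specific sensor and must be calibrated against the camera frame.
    """
    # Roll: rotation about the x-axis
    roll = math.atan2(2 * (w * x + y * z), 1 - 2 * (x * x + y * y))
    # Pitch: rotation about the y-axis, clamped to avoid asin domain
    # errors from floating-point noise near +/-90 degrees
    s = max(-1.0, min(1.0, 2 * (w * y - z * x)))
    pitch = math.asin(s)
    # Yaw: rotation about the z-axis
    yaw = math.atan2(2 * (w * z + x * y), 1 - 2 * (y * y + z * z))
    return tuple(math.degrees(a) for a in (yaw, pitch, roll))

# Identity quaternion: head facing straight ahead
print(quaternion_to_euler(1.0, 0.0, 0.0, 0.0))  # (0.0, 0.0, 0.0)
```

Because the sensor reports orientation in an absolute (world-fixed) frame, each captured image can be stamped with these angles directly, which is what removes the need for depth-based pose fitting.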
Related papers
- CameraHMR: Aligning People with Perspective [54.05758012879385]
We address the challenge of accurate 3D human pose and shape estimation from monocular images.
Existing training datasets containing real images with pseudo ground truth (pGT) use SMPLify to fit SMPL to sparse 2D joint locations.
We make two contributions that improve pGT accuracy.
arXiv Detail & Related papers (2024-11-12T19:12:12Z)
- DVMNet: Computing Relative Pose for Unseen Objects Beyond Hypotheses [59.51874686414509]
Current approaches approximate the continuous pose representation with a large number of discrete pose hypotheses.
We present a Deep Voxel Matching Network (DVMNet) that eliminates the need for pose hypotheses and computes the relative object pose in a single pass.
Our method delivers more accurate relative pose estimates for novel objects at a lower computational cost compared to state-of-the-art methods.
arXiv Detail & Related papers (2024-03-20T15:41:32Z)
- FoVA-Depth: Field-of-View Agnostic Depth Estimation for Cross-Dataset Generalization [57.98448472585241]
We propose a method to train a stereo depth estimation model on the widely available pinhole data.
We show strong generalization ability of our approach on both indoor and outdoor datasets.
arXiv Detail & Related papers (2024-01-24T20:07:59Z)
- ILSH: The Imperial Light-Stage Head Dataset for Human Head View Synthesis [42.81410101705251]
The Imperial Light-Stage Head dataset is a novel dataset designed to support view-synthesis academic challenges for human heads.
This paper details the setup of a light-stage specifically designed to capture high-resolution (4K) human head images.
In addition to the data collection, we address the split of the dataset into train, validation, and test sets.
arXiv Detail & Related papers (2023-10-06T00:32:36Z)
- Instant Multi-View Head Capture through Learnable Registration [62.70443641907766]
Existing methods for capturing datasets of 3D heads in dense semantic correspondence are slow.
We introduce TEMPEH to directly infer 3D heads in dense correspondence from calibrated multi-view images.
Predicting one head takes about 0.3 seconds with a median reconstruction error of 0.26 mm, 64% lower than the current state-of-the-art.
arXiv Detail & Related papers (2023-06-12T21:45:18Z)
- CHSEL: Producing Diverse Plausible Pose Estimates from Contact and Free Space Data [11.005988216563528]
We propose a novel method for estimating the set of plausible poses of a rigid object from a set of points with volumetric information.
Our approach has three key attributes: 1) It considers volumetric information, which allows us to account for known free space; 2) It uses a novel differentiable volumetric cost function to take advantage of powerful gradient-based optimization tools; and 3) It uses methods from the Quality Diversity (QD) literature to produce a diverse set of high-quality poses.
arXiv Detail & Related papers (2023-05-14T01:43:10Z)
- A Simple Baseline for Direct 2D Multi-Person Head Pose Estimation with Full-range Angles [24.04477340811483]
Existing head pose estimation (HPE) mainly focuses on single person with pre-detected frontal heads.
We argue that these single-person methods are fragile and inefficient for Multi-Person Head Pose Estimation (MPHPE).
In this paper, we focus on the full-range MPHPE problem, and propose a direct end-to-end simple baseline named DirectMHP.
arXiv Detail & Related papers (2023-02-02T14:08:49Z)
- Learning 3D Human Pose Estimation from Dozens of Datasets using a Geometry-Aware Autoencoder to Bridge Between Skeleton Formats [80.12253291709673]
We propose a novel affine-combining autoencoder (ACAE) method to perform dimensionality reduction on the number of landmarks.
Our approach scales to an extreme multi-dataset regime, where we use 28 3D human pose datasets to supervise one model.
arXiv Detail & Related papers (2022-12-29T22:22:49Z)
- Object detection and Autoencoder-based 6D pose estimation for highly cluttered Bin Picking [14.076644545879939]
We propose a framework for pose estimation in highly cluttered scenes with small objects.
In this work, we compare synthetic data generation approaches for object detection and pose estimation.
We introduce a pose filtering algorithm that determines the most accurate estimated poses.
arXiv Detail & Related papers (2021-06-15T11:01:07Z)
- WHENet: Real-time Fine-Grained Estimation for Wide Range Head Pose [1.8275108630751844]
We present an end-to-end head-pose estimation network designed to predict Euler angles across the full range of head yaw from a single RGB image.
Our network builds on multi-loss approaches with changes to loss functions and training strategies adapted to wide range estimation.
arXiv Detail & Related papers (2020-05-20T20:53:01Z)
- Single Image Depth Estimation Trained via Depth from Defocus Cues [105.67073923825842]
Estimating depth from a single RGB image is a fundamental task in computer vision.
In this work, we rely, instead of different views, on depth from focus cues.
We present results that are on par with supervised methods on KITTI and Make3D datasets and outperform unsupervised learning approaches.
arXiv Detail & Related papers (2020-01-14T20:22:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.