Economical Quaternion Extraction from a Human Skeletal Pose Estimate
using 2-D Cameras
- URL: http://arxiv.org/abs/2303.08657v3
- Date: Thu, 14 Sep 2023 04:26:01 GMT
- Title: Economical Quaternion Extraction from a Human Skeletal Pose Estimate
using 2-D Cameras
- Authors: Sriram Radhakrishna, Adithya Balasubramanyam
- Abstract summary: The proposed algorithm extracts a quaternion from a 2-D frame capturing an image of a human object at a sub-fifty millisecond latency.
The algorithm seeks to bypass the funding barrier and improve accessibility for robotics researchers involved in designing control systems.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we present a novel algorithm to extract a quaternion from a
two dimensional camera frame for estimating a contained human skeletal pose.
The problem of pose estimation is usually tackled using stereo
cameras and inertial measurement units to obtain depth and Euclidean
distances for points in 3D space. However, the use of these
devices comes with a high signal processing latency as well as a significant
monetary cost. By making use of MediaPipe, a framework for building perception
pipelines for human pose estimation, the proposed algorithm extracts a
quaternion from a 2-D frame capturing an image of a human subject at sub-fifty-millisecond
latency. It is also suited to deployment at the edge, where only a
single camera frame and generally low computational resources are available,
especially for use cases involving last-minute detection and reaction by
autonomous robots. The algorithm seeks to bypass the funding barrier and
improve accessibility for robotics researchers involved in designing control
systems.
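The paper's exact extraction procedure is not reproduced here, but the core operation it describes — turning a pair of pose landmarks (e.g. from MediaPipe) into an orientation quaternion for the limb segment between them — can be sketched as a shortest-arc rotation from a reference bone direction to the observed direction. The function name, the reference axis, and the use of numpy below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def shortest_arc_quaternion(u, v):
    """Return the unit quaternion (w, x, y, z) rotating direction u onto direction v."""
    u = u / np.linalg.norm(u)
    v = v / np.linalg.norm(v)
    axis = np.cross(u, v)          # rotation axis (unnormalized), zero if u ∥ v
    w = 1.0 + np.dot(u, v)         # scalar part before normalization
    q = np.array([w, *axis])
    n = np.linalg.norm(q)
    if n < 1e-8:
        # u and v are antiparallel: rotate 180° about any axis perpendicular to u
        perp = np.cross(u, [1.0, 0.0, 0.0])
        if np.linalg.norm(perp) < 1e-8:
            perp = np.cross(u, [0.0, 1.0, 0.0])
        return np.array([0.0, *(perp / np.linalg.norm(perp))])
    return q / n

# Hypothetical usage: landmarks for a shoulder and elbow (e.g. MediaPipe world
# landmarks), compared against an assumed rest-pose bone direction.
shoulder = np.array([0.0, 0.0, 0.0])
elbow = np.array([0.0, -0.3, 0.1])
rest_direction = np.array([0.0, -1.0, 0.0])   # assumed reference: arm hanging down
q = shortest_arc_quaternion(rest_direction, elbow - shoulder)
```

The shortest-arc construction is a common convention for deriving a bone's rotation from a single direction vector; it leaves the twist about the bone axis undetermined, which a full skeletal solver would need to resolve separately.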
Related papers
- SDGE: Stereo Guided Depth Estimation for 360$^\circ$ Camera Sets [65.64958606221069]
Multi-camera systems are often used in autonomous driving to achieve a 360$^\circ$ perception.
These 360$^\circ$ camera sets often have limited or low-quality overlap regions, making multi-view stereo methods infeasible for the entire image.
We propose the Stereo Guided Depth Estimation (SGDE) method, which enhances depth estimation of the full image by explicitly utilizing multi-view stereo results on the overlap.
arXiv Detail & Related papers (2024-02-19T02:41:37Z) - MonoPIC -- A Monocular Low-Latency Pedestrian Intention Classification
Framework for IoT Edges Using ID3 Modelled Decision Trees [0.0]
We propose an algorithm that classifies the intent of a single arbitrarily chosen pedestrian in a two dimensional frame into logic states.
This bypasses the need to employ any relatively high latency deep-learning algorithms.
The model was able to achieve an average testing accuracy of 83.56% with a reliable variance of 0.0042 while operating with an average latency of 48 milliseconds.
arXiv Detail & Related papers (2023-04-01T02:42:24Z) - Scene-Aware 3D Multi-Human Motion Capture from a Single Camera [83.06768487435818]
We consider the problem of estimating the 3D position of multiple humans in a scene as well as their body shape and articulation from a single RGB video recorded with a static camera.
We leverage recent advances in computer vision using large-scale pre-trained models for a variety of modalities, including 2D body joints, joint angles, normalized disparity maps, and human segmentation masks.
In particular, we estimate the scene depth and unique person scale from normalized disparity predictions using the 2D body joints and joint angles.
arXiv Detail & Related papers (2023-01-12T18:01:28Z) - P-STMO: Pre-Trained Spatial Temporal Many-to-One Model for 3D Human Pose
Estimation [78.83305967085413]
This paper introduces a novel Pre-trained Spatial Temporal Many-to-One (P-STMO) model for 2D-to-3D human pose estimation task.
Our method outperforms state-of-the-art methods with fewer parameters and less computational overhead.
arXiv Detail & Related papers (2022-03-15T04:00:59Z) - CNN-based Omnidirectional Object Detection for HermesBot Autonomous
Delivery Robot with Preliminary Frame Classification [53.56290185900837]
We propose an algorithm for optimizing a neural network for object detection using preliminary binary frame classification.
An autonomous mobile robot with 6 rolling-shutter cameras on the perimeter providing a 360-degree field of view was used as the experimental setup.
arXiv Detail & Related papers (2021-10-22T15:05:37Z) - Real-time, low-cost multi-person 3D pose estimation [8.093696116585717]
Three-dimensional pose estimation traditionally requires advanced equipment, such as multiple linked intensity cameras or high-resolution time-of-flight cameras to produce depth images.
Here, we demonstrate that computational imaging methods can achieve accurate pose estimation and overcome the apparent limitations of time-of-flight sensors designed for much simpler tasks.
This work opens up promising real-life applications in scenarios that were previously restricted by the advanced hardware requirements and cost of time-of-flight technology.
arXiv Detail & Related papers (2021-10-11T12:42:00Z) - Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose
Estimation [61.98690211671168]
We propose a Multi-level Attention-Decoder Network (MAED) to model multi-level attentions in a unified framework.
With the training set of 3DPW, MAED outperforms previous state-of-the-art methods by 6.2, 7.2, and 2.4 mm of PA-MPJPE.
arXiv Detail & Related papers (2021-09-06T09:06:17Z) - Real-Time Human Pose Estimation on a Smart Walker using Convolutional
Neural Networks [4.076099054649463]
We present a novel approach to patient monitoring and data-driven human-in-the-loop control in the context of smart walkers.
It is able to extract a complete and compact body representation in real-time and from inexpensive sensors.
Despite promising results, more data should be collected on users with impairments to assess its performance as a rehabilitation tool in real-world scenarios.
arXiv Detail & Related papers (2021-06-28T14:11:48Z) - PLUME: Efficient 3D Object Detection from Stereo Images [95.31278688164646]
Existing methods tackle the problem in two steps: first, depth estimation is performed and a pseudo-LiDAR point cloud representation is computed from the depth estimates; then object detection is performed in 3D space.
We propose a model that unifies these two tasks in the same metric space.
Our approach achieves state-of-the-art performance on the challenging KITTI benchmark, with significantly reduced inference time compared with existing methods.
arXiv Detail & Related papers (2021-01-17T05:11:38Z) - Graph and Temporal Convolutional Networks for 3D Multi-person Pose
Estimation in Monocular Videos [33.974241749058585]
We propose a novel framework integrating graph convolutional networks (GCNs) and temporal convolutional networks (TCNs) to robustly estimate camera-centric multi-person 3D poses.
In particular, we introduce a human-joint GCN, which employs the 2D pose estimator's confidence scores to improve the pose estimation results.
The two GCNs work together to estimate the spatial frame-wise 3D poses and can make use of both visible joint and bone information in the target frame to estimate the occluded or missing human-part information.
arXiv Detail & Related papers (2020-12-22T03:01:19Z) - Three-dimensional Human Tracking of a Mobile Robot by Fusion of Tracking
Results of Two Cameras [0.860255319568951]
OpenPose is used for human detection.
A new stereo vision framework is proposed to cope with the problems.
The effectiveness of the proposed framework and the method is verified through target-tracking experiments.
arXiv Detail & Related papers (2020-07-03T06:46:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.