Related papers: Benchmarking Monocular 3D Dog Pose Estimation Using In-The-Wild Motion Capture Data

URL: http://arxiv.org/abs/2406.14412v1
Date: Thu, 20 Jun 2024 15:33:39 GMT
Title: Benchmarking Monocular 3D Dog Pose Estimation Using In-The-Wild Motion Capture Data
Authors: Moira Shooter, Charles Malleson, Adrian Hilton
Abstract summary: We introduce a new benchmark analysis focusing on 3D canine pose estimation from monocular in-the-wild images. A multi-modal dataset 3DDogs-Lab was captured indoors, featuring various dog breeds trotting on a walkway. We create 3DDogs-Wild, a naturalised version of the dataset where the optical markers are in-painted and the subjects are placed in diverse environments. We show that using the 3DDogs-Wild to train the models leads to improved performance when evaluating on in-the-wild data.
Score: 17.042955091063444
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: We introduce a new benchmark analysis focusing on 3D canine pose estimation from monocular in-the-wild images. A multi-modal dataset 3DDogs-Lab was captured indoors, featuring various dog breeds trotting on a walkway. It includes data from optical marker-based mocap systems, RGBD cameras, IMUs, and a pressure mat. While providing high-quality motion data, the presence of optical markers and limited background diversity make the captured video less representative of real-world conditions. To address this, we created 3DDogs-Wild, a naturalised version of the dataset where the optical markers are in-painted and the subjects are placed in diverse environments, enhancing its utility for training RGB image-based pose detectors. We show that using the 3DDogs-Wild to train the models leads to improved performance when evaluating on in-the-wild data. Additionally, we provide a thorough analysis using various pose estimation models, revealing their respective strengths and weaknesses. We believe that our findings, coupled with the datasets provided, offer valuable insights for advancing 3D animal pose estimation.

Related papers

Towards Scalable Spatial Intelligence via 2D-to-3D Data Lifting [64.64738535860351]
We present a scalable pipeline that converts single-view images into comprehensive, scale- and appearance-realistic 3D representations. Our method bridges the gap between the vast repository of imagery and the increasing demand for spatial scene understanding. By automatically generating authentic, scale-aware 3D data from images, we significantly reduce data collection costs and open new avenues for advancing spatial intelligence.
arXiv Detail & Related papers (2025-07-24T14:53:26Z)
L3D-Pose: Lifting Pose for 3D Avatars from a Single Camera in the Wild [15.174438063000453]
3D pose estimation provides a more comprehensive solution by incorporating depth, yet creating 3D pose datasets for animals is challenging due to their dynamic and unpredictable behaviours in natural settings. We propose a framework with systematically synthesized datasets for lifting poses from 2D to 3D and then utilize this to re-target motion from wild settings onto arbitrary avatars.
arXiv Detail & Related papers (2025-01-02T10:04:12Z)
Generative Zoo [41.65977386204797]
We introduce a pipeline that samples a diverse set of poses and shapes for a variety of mammalian quadrupeds and generates realistic images with corresponding ground-truth pose and shape parameters. We train a 3D pose and shape regressor on GenZoo, which achieves state-of-the-art performance on a real-world animal pose and shape estimation benchmark.
arXiv Detail & Related papers (2024-12-11T04:57:53Z)
Multi-Modal Dataset Acquisition for Photometrically Challenging Object [56.30027922063559]
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects. We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets.
arXiv Detail & Related papers (2023-08-21T10:38:32Z)
Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task [48.555440807415664]
We present the first high-diversity challenging Roadside Perception 3D dataset, Rope3D, from a novel view. The dataset consists of 50k images and over 1.5M 3D objects in various scenes. We propose to leverage the geometry constraint to solve the inherent ambiguities caused by various sensors and viewpoints.
arXiv Detail & Related papers (2022-03-25T12:13:23Z)
MetaPose: Fast 3D Pose from Multiple Views without 3D Supervision [72.5863451123577]
We show how to train a neural model that can perform accurate 3D pose and camera estimation. Our method outperforms both classical bundle adjustment and weakly-supervised monocular 3D baselines.
arXiv Detail & Related papers (2021-08-10T18:39:56Z)
Evaluation of deep lift pose models for 3D rodent pose estimation based on geometrically triangulated data [1.84316002191515]
Behavior is typically studied in terms of pose changes, which are ideally captured in three dimensions. This requires triangulation over a multi-camera system that views the animal from different angles. Here we propose using lift-pose models that allow for robust 3D pose estimation of freely moving rodents from a single camera view.
arXiv Detail & Related papers (2021-06-24T13:08:33Z)
Heuristic Weakly Supervised 3D Human Pose Estimation [13.82540778667711]
We present a heuristic weakly supervised 3D human pose (HW-HuP) solution to estimate 3D poses when no ground-truth 3D pose data is available. We show that HW-HuP meaningfully improves upon state-of-the-art models in two practical settings where 3D pose data can hardly be obtained: human poses in bed, and infant poses in the wild.
arXiv Detail & Related papers (2021-05-23T18:40:29Z)
AcinoSet: A 3D Pose Estimation Dataset and Baseline Models for Cheetahs in the Wild [51.35013619649463]
We present an extensive dataset of free-running cheetahs in the wild, called AcinoSet. The dataset contains 119,490 frames of multi-view synchronized high-speed video footage, camera calibration files and 7,588 human-annotated frames. The resulting 3D trajectories, human-checked 3D ground truth, and an interactive tool to inspect the data are also provided.
arXiv Detail & Related papers (2021-03-24T15:54:11Z)
SelfPose: 3D Egocentric Pose Estimation from a Headset Mounted Camera [97.0162841635425]
We present a solution to egocentric 3D body pose estimation from monocular images captured from downward-looking fish-eye cameras installed on the rim of a head-mounted VR device. This unusual viewpoint leads to images with a unique visual appearance, with severe self-occlusions and perspective distortions. We propose an encoder-decoder architecture with a novel multi-branch decoder designed to account for the varying uncertainty in 2D predictions.
arXiv Detail & Related papers (2020-11-02T16:18:06Z)
Exploring Severe Occlusion: Multi-Person 3D Pose Estimation with Gated Convolution [34.301501457959056]
We propose a temporal regression network with a gated convolution module to transform 2D joints to 3D. A simple yet effective localization approach is also conducted to transform the normalized pose to the global trajectory. Our proposed method outperforms most state-of-the-art 2D-to-3D pose estimation methods.
arXiv Detail & Related papers (2020-10-31T04:35:24Z)
RGBD-Dog: Predicting Canine Pose from RGBD Sensors [25.747221533627464]
We focus on the problem of 3D canine pose estimation from RGBD images. We generate a dataset of synthetic RGBD images from this data. A stacked hourglass network is trained to predict 3D joint locations.
arXiv Detail & Related papers (2020-04-16T17:34:45Z)
Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image Synthesis [72.34794624243281]
We propose a self-supervised learning framework to disentangle variations from unlabeled video frames. Our differentiable formalization, bridging the representation gap between the 3D pose and spatial part maps, allows us to operate on videos with diverse camera movements.
arXiv Detail & Related papers (2020-04-09T07:55:01Z)

This list is automatically generated from the titles and abstracts of the papers on this site.