3D-MuPPET: 3D Multi-Pigeon Pose Estimation and Tracking
- URL: http://arxiv.org/abs/2308.15316v3
- Date: Fri, 15 Dec 2023 14:40:00 GMT
- Title: 3D-MuPPET: 3D Multi-Pigeon Pose Estimation and Tracking
- Authors: Urs Waldmann, Alex Hoi Hang Chan, Hemal Naik, Máté Nagy, Iain D.
Couzin, Oliver Deussen, Bastian Goldluecke, Fumihiro Kano
- Abstract summary: We present 3D-MuPPET, a framework to estimate and track 3D poses of up to 10 pigeons at interactive speed using multiple camera views.
For identity matching, we first dynamically match 2D detections to global identities in the first frame, then use a 2D tracker to maintain IDs across views in subsequent frames.
We show that 3D-MuPPET also works outdoors without requiring additional annotations from natural environments.
- Score: 14.52333427647304
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Markerless methods for animal posture tracking have been rapidly developing
recently, but frameworks and benchmarks for tracking large animal groups in 3D
are still lacking. To overcome this gap in the literature, we present
3D-MuPPET, a framework to estimate and track 3D poses of up to 10 pigeons at
interactive speed using multiple camera views. We train a pose estimator to
infer 2D keypoints and bounding boxes of multiple pigeons, then triangulate the
keypoints to 3D. For identity matching of individuals in all views, we first
dynamically match 2D detections to global identities in the first frame, then
use a 2D tracker to maintain IDs across views in subsequent frames. We achieve
comparable accuracy to a state-of-the-art 3D pose estimator in terms of median
error and Percentage of Correct Keypoints. Additionally, we benchmark the
inference speed of 3D-MuPPET, with up to 9.45 fps in 2D and 1.89 fps in 3D, and
perform quantitative tracking evaluation, which yields encouraging results.
Finally, we showcase two novel applications for 3D-MuPPET. First, we train a
model with data of single pigeons and achieve comparable results in 2D and 3D
posture estimation for up to 5 pigeons. Second, we show that 3D-MuPPET also
works outdoors without requiring additional annotations from natural environments.
Both use cases simplify the domain shift to new species and environments,
greatly reducing the annotation effort needed for 3D posture tracking. To the
best of our knowledge, we are the first to present a framework for 2D/3D
animal posture and trajectory tracking that works in both indoor and outdoor
environments for up to 10 individuals. We hope that the framework can open up
new opportunities for studying animal collective behaviour and encourage
further developments in 3D multi-animal posture tracking.
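To make the triangulation step from the abstract concrete, below is a minimal sketch of lifting a keypoint observed in several calibrated views to a 3D point via the Direct Linear Transform (DLT). The projection matrices and keypoint arrays are illustrative placeholders, not the paper's actual pipeline or calibration data.

```python
# Minimal multi-view DLT triangulation sketch (illustrative, not the
# paper's implementation).
import numpy as np

def triangulate_dlt(proj_mats, points_2d):
    """Triangulate one 3D point from N >= 2 calibrated views.

    proj_mats : list of (3, 4) camera projection matrices P = K [R | t]
    points_2d : (N, 2) array, the same keypoint observed in each view
    """
    rows = []
    for P, (u, v) in zip(proj_mats, points_2d):
        # Each observation contributes two linear constraints on X:
        # u * (P[2] @ X) - P[0] @ X = 0 and v * (P[2] @ X) - P[1] @ X = 0
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The homogeneous solution is the right singular vector associated
    # with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]  # de-homogenize to (x, y, z)
```

In a full pipeline, such a routine would run once per keypoint, per pigeon, per frame, using the 2D keypoints produced by the pose estimator.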
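The first-frame identity matching described in the abstract can be illustrated as a linear assignment problem. The cost used below, Euclidean distance between predicted 2D positions of global identities and detection centers, is an assumed stand-in and not necessarily the paper's exact matching criterion.

```python
# Hedged sketch of assigning per-view 2D detections to global identities
# via the Hungarian algorithm (illustrative cost, not the paper's exact
# criterion).
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_detections_to_identities(identity_centers, detection_centers):
    """
    identity_centers  : (K, 2) predicted 2D positions of K global IDs
    detection_centers : (M, 2) centers of M detections in one view
    Returns a list of (identity_index, detection_index) pairs.
    """
    # Pairwise Euclidean distances form the assignment cost matrix.
    cost = np.linalg.norm(
        identity_centers[:, None, :] - detection_centers[None, :, :],
        axis=-1,
    )
    ids, dets = linear_sum_assignment(cost)  # optimal assignment
    return list(zip(ids, dets))
```

After this one-time matching, a 2D tracker can carry the assigned IDs forward in each view, as the abstract describes.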
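The abstract reports accuracy as median error and Percentage of Correct Keypoints (PCK). A minimal sketch of a PCK computation follows; the threshold convention (an absolute distance here) varies across papers and is an assumption.

```python
# Minimal PCK metric sketch: a predicted keypoint counts as correct when
# its distance to ground truth falls below a threshold (threshold
# convention assumed, not taken from the paper).
import numpy as np

def pck(pred, gt, threshold):
    """
    pred, gt  : (N, K, D) arrays of N instances, K keypoints, D dims (2 or 3)
    threshold : scalar distance below which a keypoint is 'correct'
    """
    errors = np.linalg.norm(pred - gt, axis=-1)  # (N, K) per-keypoint error
    return float((errors < threshold).mean())
```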
Related papers
- TAPVid-3D: A Benchmark for Tracking Any Point in 3D [63.060421798990845]
We introduce a new benchmark, TAPVid-3D, for evaluating the task of Tracking Any Point in 3D.
This benchmark will serve as a guidepost to improve our ability to understand precise 3D motion and surface deformation from monocular video.
arXiv Detail & Related papers (2024-07-08T13:28:47Z)
- SpatialTracker: Tracking Any 2D Pixels in 3D Space [71.58016288648447]
We propose to estimate point trajectories in 3D space to mitigate the issues caused by image projection.
Our method, named SpatialTracker, lifts 2D pixels to 3D using monocular depth estimators.
Tracking in 3D allows us to leverage as-rigid-as-possible (ARAP) constraints while simultaneously learning a rigidity embedding that clusters pixels into different rigid parts.
arXiv Detail & Related papers (2024-04-05T17:59:25Z)
- 3D-POP -- An automated annotation approach to facilitate markerless 2D-3D tracking of freely moving birds with marker-based motion capture [1.1083289076967897]
We propose a method that uses a motion capture (mo-cap) system to obtain a large amount of annotated data on animal movement and posture in a semi-automatic manner.
Our method is novel in that it extracts the 3D positions of morphological keypoints in reference to the positions of markers attached to the animals.
Using this method, we obtained, and offer here, a new dataset - 3D-POP with approximately 300k annotated frames (4 million instances) in the form of videos.
arXiv Detail & Related papers (2023-03-23T11:03:18Z)
- CameraPose: Weakly-Supervised Monocular 3D Human Pose Estimation by Leveraging In-the-wild 2D Annotations [25.05308239278207]
We present CameraPose, a weakly-supervised framework for 3D human pose estimation from a single image.
By adding a camera parameter branch, any in-the-wild 2D annotations can be fed into our pipeline to boost the training diversity.
We also introduce a refinement network module with confidence-guided loss to further improve the quality of noisy 2D keypoints extracted by 2D pose estimators.
arXiv Detail & Related papers (2023-01-08T05:07:41Z)
- Gait Recognition in the Wild with Dense 3D Representations and A Benchmark [86.68648536257588]
Existing studies for gait recognition are dominated by 2D representations like the silhouette or skeleton of the human body in constrained scenes.
This paper aims to explore dense 3D representations for gait recognition in the wild.
We build the first large-scale 3D representation-based gait recognition dataset, named Gait3D.
arXiv Detail & Related papers (2022-04-06T03:54:06Z)
- AcinoSet: A 3D Pose Estimation Dataset and Baseline Models for Cheetahs in the Wild [51.35013619649463]
We present an extensive dataset of free-running cheetahs in the wild, called AcinoSet.
The dataset contains 119,490 frames of multi-view synchronized high-speed video footage, camera calibration files and 7,588 human-annotated frames.
The resulting 3D trajectories, human-checked 3D ground truth, and an interactive tool to inspect the data are also provided.
arXiv Detail & Related papers (2021-03-24T15:54:11Z)
- Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z)
- F-Siamese Tracker: A Frustum-based Double Siamese Network for 3D Single Object Tracking [12.644452175343059]
A main challenge in 3D single object tracking is how to reduce search space for generating appropriate 3D candidates.
Instead of relying on 3D proposals, we produce 2D region proposals which are then extruded into 3D viewing frustums.
We perform an online accuracy validation on the 3D frustum to generate a refined point cloud search space.
arXiv Detail & Related papers (2020-10-22T08:01:17Z)
- Exemplar Fine-Tuning for 3D Human Model Fitting Towards In-the-Wild 3D Human Pose Estimation [107.07047303858664]
Large-scale human datasets with 3D ground-truth annotations are difficult to obtain in the wild.
We address this problem by augmenting existing 2D datasets with high-quality 3D pose fits.
The resulting annotations are sufficient to train from scratch 3D pose regressor networks that outperform the current state-of-the-art on in-the-wild benchmarks.
arXiv Detail & Related papers (2020-04-07T20:21:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.