3D-POP -- An automated annotation approach to facilitate markerless
2D-3D tracking of freely moving birds with marker-based motion capture
- URL: http://arxiv.org/abs/2303.13174v1
- Date: Thu, 23 Mar 2023 11:03:18 GMT
- Title: 3D-POP -- An automated annotation approach to facilitate markerless
2D-3D tracking of freely moving birds with marker-based motion capture
- Authors: Hemal Naik, Alex Hoi Hang Chan, Junran Yang, Mathilde Delacoux, Iain
D. Couzin, Fumihiro Kano, Máté Nagy
- Abstract summary: We propose a method that uses a motion capture (mo-cap) system to obtain a large amount of annotated data on animal movement and posture in a semi-automatic manner.
Our method is novel in that it extracts the 3D positions of morphological keypoints in reference to the positions of markers attached to the animals.
Using this method, we obtained, and offer here, a new dataset - 3D-POP with approximately 300k annotated frames (4 million instances) in the form of videos.
- Score: 1.1083289076967897
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Recent advances in machine learning and computer vision are revolutionizing
the field of animal behavior by enabling researchers to track the poses and
locations of freely moving animals without any marker attachment. However,
large datasets of annotated images of animals for markerless pose tracking,
especially high-resolution images taken from multiple angles with accurate 3D
annotations, are still scant. Here, we propose a method that uses a motion
capture (mo-cap) system to obtain a large amount of annotated data on animal
movement and posture (2D and 3D) in a semi-automatic manner. Our method is
novel in that it extracts the 3D positions of morphological keypoints (e.g.,
eyes, beak, tail) in reference to the positions of markers attached to the
animals. Using this method, we obtained, and offer here, a new dataset - 3D-POP
with approximately 300k annotated frames (4 million instances) in the form of
videos of groups of one to ten freely moving birds, recorded from 4 different camera
views in a 3.6m x 4.2m area. 3D-POP is the first dataset of flocking birds with
accurate keypoint annotations in 2D and 3D along with bounding box and
individual identities and will facilitate the development of solutions for
problems of 2D to 3D markerless pose, trajectory tracking, and identification
in birds.
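The annotation approach described above reduces to a rigid-body transfer: keypoints such as the eyes, beak, and tail are measured once relative to the attached markers, and in every subsequent frame the marker positions define a rigid transform (e.g., via the Kabsch algorithm) that carries those keypoints along. Below is a minimal Python sketch of that geometry; the marker and keypoint values are invented, and the functions are illustrative rather than the authors' released code.

```python
import numpy as np

def rigid_transform(src, dst):
    """Kabsch algorithm: least-squares rotation R and translation t
    such that dst ~ src @ R.T + t, for matched (N, 3) point sets."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T  # reflection-corrected rotation
    return R, dst_c - R @ src_c

# One-time reference measurements in the mo-cap coordinate frame
# (values are made up for illustration).
markers_ref = np.array([[0.00, 0.00, 0.00],
                        [0.03, 0.00, 0.01],
                        [0.00, 0.02, 0.02],
                        [0.02, 0.03, 0.00]])
keypoints_ref = np.array([[0.05, 0.01, 0.03],    # beak
                          [0.01, 0.01, 0.04],    # eye
                          [-0.08, 0.00, 0.02]])  # tail

def propagate_keypoints(markers_t):
    """3D keypoints at frame t, assuming keypoints move rigidly with markers."""
    R, t = rigid_transform(markers_ref, markers_t)
    return keypoints_ref @ R.T + t

# Sanity check: move the bird by a known rotation and translation.
a = np.deg2rad(30.0)
R0 = np.array([[np.cos(a), -np.sin(a), 0.0],
               [np.sin(a),  np.cos(a), 0.0],
               [0.0,        0.0,       1.0]])
t0 = np.array([1.0, 2.0, 0.5])
assert np.allclose(propagate_keypoints(markers_ref @ R0.T + t0),
                   keypoints_ref @ R0.T + t0)
```

Projecting the propagated 3D keypoints through each calibrated camera then yields the 2D annotations for all four views.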
Related papers
- Benchmarking Monocular 3D Dog Pose Estimation Using In-The-Wild Motion Capture Data [17.042955091063444]
We introduce a new benchmark analysis focusing on 3D canine pose estimation from monocular in-the-wild images.
A multi-modal dataset 3DDogs-Lab was captured indoors, featuring various dog breeds trotting on a walkway.
We create 3DDogs-Wild, a naturalised version of the dataset where the optical markers are in-painted and the subjects are placed in diverse environments.
We show that using 3DDogs-Wild to train the models leads to improved performance when evaluating on in-the-wild data.
arXiv Detail & Related papers (2024-06-20T15:33:39Z)
- SpatialTracker: Tracking Any 2D Pixels in 3D Space [71.58016288648447]
We propose to estimate point trajectories in 3D space to mitigate the issues caused by image projection.
Our method, named SpatialTracker, lifts 2D pixels to 3D using monocular depth estimators.
Tracking in 3D allows us to leverage as-rigid-as-possible (ARAP) constraints while simultaneously learning a rigidity embedding that clusters pixels into different rigid parts.
arXiv Detail & Related papers (2024-04-05T17:59:25Z)
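The lifting step that SpatialTracker's summary describes is, at its core, pinhole back-projection: a pixel (u, v) with predicted depth z maps to z * K^{-1} [u, v, 1]^T in camera space. A minimal sketch with invented intrinsics and depths follows (the full method adds learned feature representations and tracking on top of this geometric step):

```python
import numpy as np

def lift_pixels(uv, depth, K):
    """Back-project pixels (N, 2) with per-pixel depth (N,) into
    camera-space 3D points: X = z * K^{-1} [u, v, 1]^T."""
    rays = np.hstack([uv, np.ones((len(uv), 1))]) @ np.linalg.inv(K).T
    return rays * depth[:, None]

K = np.array([[1000.0,    0.0, 640.0],   # illustrative intrinsics
              [   0.0, 1000.0, 360.0],
              [   0.0,    0.0,   1.0]])
uv = np.array([[640.0, 360.0], [700.0, 400.0]])
z = np.array([2.0, 2.5])  # e.g., output of a monocular depth estimator
print(lift_pixels(uv, z, K))  # first point lies on the optical axis at z=2
```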
- 3D-MuPPET: 3D Multi-Pigeon Pose Estimation and Tracking [14.52333427647304]
We present 3D-MuPPET, a framework to estimate and track 3D poses of up to 10 pigeons at interactive speed using multiple camera views.
For identity matching, we first dynamically match 2D detections to global identities in the first frame, then use a 2D tracker to maintain IDs across views in subsequent frames.
We show that 3D-MuPPET also works outdoors, without additional annotations from natural environments.
arXiv Detail & Related papers (2023-08-29T14:02:27Z)
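The multi-view pose step that 3D-MuPPET's summary describes usually comes down to triangulating matched 2D keypoints across calibrated cameras. Below is a minimal direct linear transform (DLT) sketch with two toy normalized cameras; it illustrates the general technique, not the 3D-MuPPET codebase.

```python
import numpy as np

def triangulate(P_list, uv_list):
    """DLT triangulation of one keypoint seen in several views.
    P_list: 3x4 projection matrices; uv_list: matching pixel coordinates."""
    rows = []
    for P, (u, v) in zip(P_list, uv_list):
        rows.append(u * P[2] - P[0])  # u * (p3 . X) - (p1 . X) = 0
        rows.append(v * P[2] - P[1])  # v * (p3 . X) - (p2 . X) = 0
    _, _, Vt = np.linalg.svd(np.stack(rows))
    X = Vt[-1]                        # null vector of the stacked system
    return X[:3] / X[3]

# Two toy cameras: identity intrinsics, second one shifted along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.1, -0.2, 3.0, 1.0])
uvs = [(P @ X_true)[:2] / (P @ X_true)[2] for P in (P1, P2)]
print(triangulate([P1, P2], uvs))  # ~ [0.1, -0.2, 3.0]
```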
- LASSIE: Learning Articulated Shapes from Sparse Image Ensemble via 3D Part Discovery [72.3681707384754]
We propose a practical problem setting to estimate 3D pose and shape of animals given only a few in-the-wild images of a particular animal species.
We do not assume any form of 2D or 3D ground-truth annotations, nor do we leverage any multi-view or temporal information.
Following these insights, we propose LASSIE, a novel optimization framework which discovers 3D parts in a self-supervised manner.
arXiv Detail & Related papers (2022-07-07T17:00:07Z)
- Gait Recognition in the Wild with Dense 3D Representations and A Benchmark [86.68648536257588]
Existing studies for gait recognition are dominated by 2D representations like the silhouette or skeleton of the human body in constrained scenes.
This paper aims to explore dense 3D representations for gait recognition in the wild.
We build the first large-scale 3D representation-based gait recognition dataset, named Gait3D.
arXiv Detail & Related papers (2022-04-06T03:54:06Z)
- DOVE: Learning Deformable 3D Objects by Watching Videos [89.43105063468077]
We present DOVE, which learns to predict 3D canonical shape, deformation, viewpoint and texture from a single 2D image of a bird.
Our method reconstructs temporally consistent 3D shape and deformation, which allows us to animate and re-render the bird from arbitrary viewpoints.
arXiv Detail & Related papers (2021-07-22T17:58:10Z)
- AcinoSet: A 3D Pose Estimation Dataset and Baseline Models for Cheetahs in the Wild [51.35013619649463]
We present an extensive dataset of free-running cheetahs in the wild, called AcinoSet.
The dataset contains 119,490 frames of multi-view synchronized high-speed video footage, camera calibration files and 7,588 human-annotated frames.
The resulting 3D trajectories, human-checked 3D ground truth, and an interactive tool to inspect the data are also provided.
arXiv Detail & Related papers (2021-03-24T15:54:11Z)
- Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z)
- Exploring Severe Occlusion: Multi-Person 3D Pose Estimation with Gated Convolution [34.301501457959056]
We propose a temporal regression network with a gated convolution module to transform 2D joints to 3D.
A simple yet effective localization approach is also used to transform the normalized pose to the global trajectory.
Our proposed method outperforms most state-of-the-art 2D-to-3D pose estimation methods.
arXiv Detail & Related papers (2020-10-31T04:35:24Z)
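For the last entry, the gated convolution idea can be sketched generically: a temporal convolution over a window of 2D joint detections, with each feature convolution modulated by a learned sigmoid gate. The PyTorch module below is a hedged stand-in; the joint count, widths, and depth are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn

class GatedTemporalBlock(nn.Module):
    """One gated 1D convolution over time: tanh(conv(x)) * sigmoid(conv(x))."""
    def __init__(self, channels, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        self.feat = nn.Conv1d(channels, channels, kernel_size, padding=pad)
        self.gate = nn.Conv1d(channels, channels, kernel_size, padding=pad)

    def forward(self, x):  # x: (batch, channels, time)
        return torch.tanh(self.feat(x)) * torch.sigmoid(self.gate(x))

class Lifter2Dto3D(nn.Module):
    """Temporal regression from per-frame 2D joints to 3D joints."""
    def __init__(self, joints=17, width=256):
        super().__init__()
        self.inp = nn.Conv1d(joints * 2, width, 1)
        self.blocks = nn.Sequential(GatedTemporalBlock(width),
                                    GatedTemporalBlock(width))
        self.out = nn.Conv1d(width, joints * 3, 1)

    def forward(self, x2d):  # x2d: (batch, time, joints, 2)
        b, t, j, _ = x2d.shape
        h = x2d.reshape(b, t, j * 2).transpose(1, 2)  # -> (batch, 2J, time)
        h = self.blocks(self.inp(h))
        return self.out(h).transpose(1, 2).reshape(b, t, j, 3)

pose2d = torch.randn(1, 81, 17, 2)   # an 81-frame window of 17 2D joints
print(Lifter2Dto3D()(pose2d).shape)  # torch.Size([1, 81, 17, 3])
```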