Deep in the Jungle: Towards Automating Chimpanzee Population Estimation
- URL: http://arxiv.org/abs/2601.22917v1
- Date: Fri, 30 Jan 2026 12:40:47 GMT
- Title: Deep in the Jungle: Towards Automating Chimpanzee Population Estimation
- Authors: Tom Raynes, Otto Brookes, Timm Haucke, Lukas Bösch, Anne-Sophie Crunchant, Hjalmar Kühl, Sara Beery, Majid Mirmehdi, Tilo Burghardt
- Abstract summary: Estimation of abundance and density in unmarked populations of great apes relies on statistical frameworks that require animal-to-camera distance measurements. This study introduces and evaluates an only sparsely explored alternative: the integration of computer vision-based monocular depth estimation (MDE) pipelines directly into ecological camera trap workflows for great ape conservation. Using a real-world dataset of 220 camera trap videos documenting a wild chimpanzee population, we combine two MDE models, Dense Prediction Transformers and Depth Anything, with multiple distance sampling strategies. The proposed approach yields population estimates within 22% of those obtained using traditional methods.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The estimation of abundance and density in unmarked populations of great apes relies on statistical frameworks that require animal-to-camera distance measurements. In practice, acquiring these distances depends on labour-intensive manual interpretation of animal observations across large camera trap video corpora. This study introduces and evaluates an only sparsely explored alternative: the integration of computer vision-based monocular depth estimation (MDE) pipelines directly into ecological camera trap workflows for great ape conservation. Using a real-world dataset of 220 camera trap videos documenting a wild chimpanzee population, we combine two MDE models, Dense Prediction Transformers and Depth Anything, with multiple distance sampling strategies. These components are used to generate detection distance estimates, from which population density and abundance are inferred. Comparative analysis against manually derived ground-truth distances shows that calibrated DPT consistently outperforms Depth Anything. This advantage is observed in both distance estimation accuracy and downstream density and abundance inference. Nevertheless, both models exhibit systematic biases. We show that, given complex forest environments, they tend to overestimate detection distances and consequently underestimate density and abundance relative to conventional manual approaches. We further find that failures in animal detection across distance ranges are a primary factor limiting estimation accuracy. Overall, this work provides a case study that shows MDE-driven camera trap distance sampling is a viable and practical alternative to manual distance estimation. The proposed approach yields population estimates within 22% of those obtained using traditional methods.
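The abstract refers to "calibrated DPT" without describing the calibration. The following is a minimal sketch of one common way to do it, not the authors' implementation. It assumes the MDE model outputs relative inverse depth that is linear in true inverse distance, and that a few reference points with known camera distances are available. All function names and the sample values are hypothetical. The last helper shows how a figure such as "within 22%" can be computed as a relative difference between two estimates.

```python
from statistics import mean


def fit_affine(pred, target):
    """Least-squares fit of target = scale * pred + shift."""
    mp, mt = mean(pred), mean(target)
    var = sum((p - mp) ** 2 for p in pred)
    if var == 0:
        raise ValueError("predictions must not all be identical")
    scale = sum((p - mp) * (t - mt) for p, t in zip(pred, target)) / var
    shift = mt - scale * mp
    return scale, shift


def calibrate_inverse_depth(rel_inv_depth, ref_distances_m):
    """Fit scale and shift mapping relative inverse depth to 1 / metres.

    Assumption: the model output is affine in true inverse distance.
    """
    inv_ref = [1.0 / d for d in ref_distances_m]
    return fit_affine(rel_inv_depth, inv_ref)


def to_metres(rel_inv_depth_value, scale, shift):
    """Convert one relative inverse-depth value to a distance in metres."""
    inv = scale * rel_inv_depth_value + shift
    if inv <= 0:
        raise ValueError("calibrated inverse depth must be positive")
    return 1.0 / inv


def relative_difference(estimate, reference):
    """Absolute difference between two estimates as a fraction of the reference."""
    return abs(estimate - reference) / abs(reference)


# Hypothetical reference points: model outputs at known distances of 2, 4 and 8 m.
scale, shift = calibrate_inverse_depth([450.0, 200.0, 75.0], [2.0, 4.0, 8.0])
animal_distance_m = to_metres(150.0, scale, shift)
```

The calibrated distances would then be passed to a distance sampling estimator in place of manually measured ones. The fit is done in inverse-depth space because small errors at long range would otherwise dominate a fit in metres.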
Related papers
- Benchmark on Monocular Metric Depth Estimation in Wildlife Setting [5.296470528744146]
This work introduces the first benchmark for monocular metric depth estimation in wildlife monitoring conditions. We evaluate four state-of-the-art MDE methods (Depth Anything V2, ML Depth Pro, ZoeDepth, and Metric3D) alongside a geometric baseline on 93 camera trap images. Our results demonstrate that Depth Anything V2 achieves the best overall performance with a mean absolute error of 0.454 m and correlation of 0.962.
arXiv Detail & Related papers (2025-10-06T11:43:34Z)
- Bio-inspired visual relative localization for large swarms of UAVs [3.9421388043218655]
We propose a new approach to visual perception for relative localization of agents within large-scale swarms of UAVs. Inspired by biological perception utilized by schools of sardines, swarms of bees, and other large groups of animals capable of moving in a decentralized yet coherent manner, our method does not rely on detecting individual neighbors by each agent and estimating their relative position. A novel swarm control algorithm is proposed to make it compatible with the new relative localization method.
arXiv Detail & Related papers (2024-12-03T11:47:14Z)
- SRPose: Two-view Relative Pose Estimation with Sparse Keypoints [51.49105161103385]
SRPose is a sparse keypoint-based framework for two-view relative pose estimation in camera-to-world and object-to-camera scenarios.
It achieves competitive or superior performance compared to state-of-the-art methods in terms of accuracy and speed.
It is robust to different image sizes and camera intrinsics, and can be deployed with low computing resources.
arXiv Detail & Related papers (2024-07-11T05:46:35Z)
- Evaluating Perceptual Distance Models by Fitting Binomial Distributions to Two-Alternative Forced Choice Data [43.714290271351466]
This paper introduces a more robust distance-model evaluation method using a pure probabilistic approach, applying maximum likelihood estimation to a binomial decision model. Our method demonstrates superior simplicity, interpretability, flexibility, and computational efficiency, as shown through evaluations of various visual distance models on two 2AFC PF datasets.
arXiv Detail & Related papers (2024-03-15T15:21:04Z)
- Of Mice and Pose: 2D Mouse Pose Estimation from Unlabelled Data and Synthetic Prior [0.7499722271664145]
We propose an approach for estimating 2D mouse body pose from unlabelled images using a synthetically generated empirical pose prior.
We adapt this method to the limb structure of the mouse and generate the empirical prior of 2D poses from a synthetic 3D mouse model.
In experiments on a new mouse video dataset, we evaluate the performance of the approach by comparing pose predictions to a manually obtained ground truth.
arXiv Detail & Related papers (2023-07-25T09:31:55Z)
- Rethinking pose estimation in crowds: overcoming the detection information-bottleneck and ambiguity [46.10812760258666]
Frequent interactions between individuals are a fundamental challenge for pose estimation algorithms.
We propose a novel pipeline called bottom-up conditioned top-down pose estimation.
We demonstrate the performance and efficiency of our approach on animal and human pose estimation benchmarks.
arXiv Detail & Related papers (2023-06-13T16:14:40Z)
- Bottom-Up 2D Pose Estimation via Dual Anatomical Centers for Small-Scale Persons [75.86463396561744]
In multi-person 2D pose estimation, bottom-up methods simultaneously predict poses for all persons.
Our method achieves a 38.4% improvement in bounding box precision and a 39.1% improvement in bounding box recall over the state of the art (SOTA).
For the human pose AP evaluation, we achieve a new SOTA (71.0 AP) on the COCO test-dev set with the single-scale testing.
arXiv Detail & Related papers (2022-08-25T10:09:10Z)
- Direct Dense Pose Estimation [138.56533828316833]
Dense human pose estimation is the problem of learning dense correspondences between RGB images and the surfaces of human bodies.
Prior dense pose estimation methods are all based on the Mask R-CNN framework and operate in a top-down manner, first attempting to identify a bounding box for each person.
We propose a novel alternative method for solving the dense pose estimation problem, called Direct Dense Pose (DDP).
arXiv Detail & Related papers (2022-04-04T06:14:38Z)
- Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation [70.32536356351706]
We introduce MRP-Net that constitutes a common deep network backbone with two output heads subscribing to two diverse configurations.
We derive suitable measures to quantify prediction uncertainty at both pose and joint level.
We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z)
- Distance Estimation and Animal Tracking for Wildlife Camera Trapping [0.0]
We propose a fully automatic approach to estimate camera-to-animal distances.
We leverage state-of-the-art relative MDE and a novel alignment procedure to estimate metric distances.
We achieve a mean absolute distance estimation error of only 0.9864 meters at a precision of 90.3% and recall of 63.8%.
arXiv Detail & Related papers (2022-02-09T18:12:18Z)
- Overcoming the Distance Estimation Bottleneck in Camera Trap Distance Sampling [0.0]
Estimating animal abundance is of critical importance to assess, for example, the consequences of land-use change and invasive species on species composition.
This study proposes a fully automated workflow utilizing state-of-the-art methods of image processing and pattern recognition.
arXiv Detail & Related papers (2021-05-10T10:17:34Z)
- Automatic Social Distance Estimation From Images: Performance Evaluation, Test Benchmark, and Algorithm [78.88882860340797]
The COVID-19 virus has caused a global pandemic since March 2020.
Maintaining a minimum of one meter distance from other people is strongly suggested to reduce the risk of infection.
There is no suitable test benchmark for such algorithms.
arXiv Detail & Related papers (2021-03-11T16:15:20Z)
- Multi-person 3D Pose Estimation in Crowded Scenes Based on Multi-View Geometry [62.29762409558553]
Epipolar constraints are at the core of feature matching and depth estimation in multi-person 3D human pose estimation methods.
Despite the satisfactory performance of this formulation in sparser crowd scenes, its effectiveness is frequently challenged under denser crowd circumstances.
In this paper, we depart from the multi-person 3D pose estimation formulation, and instead reformulate it as crowd pose estimation.
arXiv Detail & Related papers (2020-07-21T17:59:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.