GorillaWatch: An Automated System for In-the-Wild Gorilla Re-Identification and Population Monitoring
- URL: http://arxiv.org/abs/2512.07776v1
- Date: Mon, 08 Dec 2025 17:58:20 GMT
- Title: GorillaWatch: An Automated System for In-the-Wild Gorilla Re-Identification and Population Monitoring
- Authors: Maximilian Schall, Felix Leonard Knöfel, Noah Elias König, Jan Jonas Kubeler, Maximilian von Klinski, Joan Wilhelm Linnemann, Xiaoshi Liu, Iven Jelle Schlegelmilch, Ole Woyciniuk, Alexandra Schild, Dante Wasmuht, Magdalena Bermejo Espinet, German Illera Basas, Gerard de Melo
- Abstract summary: Monitoring critically endangered western lowland gorillas is currently hampered by the immense manual effort required to re-identify individuals from camera trap footage. We introduce a benchmark with three novel datasets: Gorilla-SPAC-Wild, the largest video dataset for wild primate re-identification to date; Gorilla-Berlin-Zoo, for assessing cross-domain re-identification generalization; and Gorilla-SPAC-MoT, for evaluating multi-object tracking in camera trap footage. We present GorillaWatch, an end-to-end pipeline integrating detection, tracking, and re-identification.
- Score: 39.4320036008364
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Monitoring critically endangered western lowland gorillas is currently hampered by the immense manual effort required to re-identify individuals from vast archives of camera trap footage. The primary obstacle to automating this process has been the lack of large-scale, "in-the-wild" video datasets suitable for training robust deep learning models. To address this gap, we introduce a comprehensive benchmark with three novel datasets: Gorilla-SPAC-Wild, the largest video dataset for wild primate re-identification to date; Gorilla-Berlin-Zoo, for assessing cross-domain re-identification generalization; and Gorilla-SPAC-MoT, for evaluating multi-object tracking in camera trap footage. Building on these datasets, we present GorillaWatch, an end-to-end pipeline integrating detection, tracking, and re-identification. To exploit temporal information, we introduce a multi-frame self-supervised pretraining strategy that leverages consistency in tracklets to learn domain-specific features without manual labels. To ensure scientific validity, a differentiable adaptation of AttnLRP verifies that our model relies on discriminative biometric traits rather than background correlations. Extensive benchmarking subsequently demonstrates that aggregating features from large-scale image backbones outperforms specialized video architectures. Finally, we address unsupervised population counting by integrating spatiotemporal constraints into standard clustering to mitigate over-segmentation. We publicly release all code and datasets to facilitate scalable, non-invasive monitoring of endangered species.
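The abstract mentions two ideas that a small example can make concrete: aggregating per-frame features into one tracklet embedding, and adding spatiotemporal constraints to clustering for population counting. The sketch below is not the authors' implementation. It assumes mean pooling as the aggregation and a single cannot-link constraint (two tracklets that overlap in time within the same video cannot be the same individual); the abstract does not state which aggregation or constraints GorillaWatch actually uses. The `Tracklet` class, the function names, the toy embeddings, and the threshold are all hypothetical.

```python
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Tracklet:
    video: str    # hypothetical camera-trap video id
    start: int    # first frame index (inclusive)
    end: int      # last frame index (inclusive)
    frames: list  # per-frame embeddings from some image backbone (toy values here)


def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def aggregate(frames):
    """Mean-pool per-frame embeddings into one unit-length tracklet embedding."""
    dim = len(frames[0])
    mean = [sum(f[i] for f in frames) / len(frames) for i in range(dim)]
    return l2_normalize(mean)


def cosine_distance(a, b):
    # Inputs are unit vectors, so the dot product is the cosine similarity.
    return 1.0 - sum(x * y for x, y in zip(a, b))


def co_occur(a, b):
    """Assumed constraint: tracklets overlapping in time in one video are different animals."""
    return a.video == b.video and a.start <= b.end and b.start <= a.end


def cluster(tracklets, threshold):
    """Average-linkage agglomerative clustering that never merges co-occurring tracklets."""
    emb = [aggregate(t.frames) for t in tracklets]
    clusters = [[i] for i in range(len(tracklets))]
    while True:
        best = None
        for (ci, a), (cj, b) in combinations(enumerate(clusters), 2):
            if any(co_occur(tracklets[i], tracklets[j]) for i in a for j in b):
                continue  # cannot-link: skip this candidate merge
            d = sum(cosine_distance(emb[i], emb[j]) for i in a for j in b) / (len(a) * len(b))
            if d <= threshold and (best is None or d < best[0]):
                best = (d, ci, cj)
        if best is None:
            return clusters
        _, ci, cj = best
        clusters[ci] += clusters[cj]  # ci < cj, so deleting cj keeps ci valid
        del clusters[cj]


tracklets = [
    Tracklet("vid_a", 0, 50, [[1.0, 0.0], [0.9, 0.1]]),
    Tracklet("vid_a", 10, 60, [[0.8, 0.2]]),    # similar look, but on screen with tracklet 0
    Tracklet("vid_b", 0, 30, [[0.9, 0.1]]),
    Tracklet("vid_b", 100, 130, [[0.0, 1.0]]),
]
population = cluster(tracklets, threshold=0.3)
print(population)  # [[0, 2], [1], [3]]
```

In this toy run, tracklets 0 and 1 have nearly identical embeddings, yet they stay in separate clusters because they are visible at the same time in the same video. The estimated population size is the number of clusters, here 3.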
Related papers
- Self-Supervised Animal Identification for Long Videos [0.8233028449337972]
We introduce a highly efficient, self-supervised method that reframes animal identification as a global clustering task.
Our framework matches or surpasses supervised baselines trained on over 1,000 labeled frames.
This work enables practical, high-accuracy animal identification on consumer-grade hardware.
arXiv Detail & Related papers (2026-01-14T17:53:59Z)
- PriVi: Towards A General-Purpose Video Model For Primate Behavior In The Wild [50.656578456979496]
We introduce PriVi, a large-scale primate-centric video pretraining dataset.
We pretrain V-JEPA, a large-scale video model, on PriVi to learn primate-specific representations.
Results demonstrate that primate-centric pretraining substantially improves data efficiency and generalization.
arXiv Detail & Related papers (2025-11-12T19:27:40Z)
- What You Have is What You Track: Adaptive and Robust Multimodal Tracking [72.92244578461869]
We present the first comprehensive study on tracker performance with temporally incomplete multimodal data.
Our model achieves SOTA performance across 9 benchmarks, excelling in both conventional complete and missing modality settings.
arXiv Detail & Related papers (2025-07-08T11:40:21Z)
- Wildlife Target Re-Identification Using Self-supervised Learning in Non-Urban Settings [0.0]
Wildlife re-identification aims to match individuals of the same species across different observations.
Current state-of-the-art (SOTA) models rely on class labels to train supervised models for individual classification.
This study investigates self-supervised learning (SSL) for wildlife re-identification.
arXiv Detail & Related papers (2025-07-03T07:56:54Z)
- Tracking Different Ant Species: An Unsupervised Domain Adaptation Framework and a Dataset for Multi-object Tracking [6.0409040218619685]
We propose a data-driven multi-object tracker that employs domain adaptation to achieve the required generalisation.
We present a new dataset and a benchmark for the ant tracking problem.
arXiv Detail & Related papers (2023-01-25T13:00:16Z)
- MBW: Multi-view Bootstrapping in the Wild [30.038254895713276]
Multi-camera systems that train fine-grained detectors have shown promise in detecting such errors.
The approach is based on calibrated cameras and rigid geometry, making it expensive, difficult to manage, and impractical in real-world scenarios.
In this paper, we address these bottlenecks by combining a non-rigid 3D neural prior with deep flow to obtain high-fidelity landmark estimates.
We are able to produce 2D results comparable to state-of-the-art fully supervised methods, along with 3D reconstructions that are impossible with other existing approaches.
arXiv Detail & Related papers (2022-10-04T16:27:54Z)
- Dynamic Curriculum Learning for Great Ape Detection in the Wild [14.212559301656]
We propose an end-to-end curriculum learning approach to improve detector construction in real-world jungle environments.
In contrast to previous semi-supervised methods, our approach gradually improves detection quality by steering training towards self-reinforcement.
We show that such virtuous dynamics and controls can avoid learning collapse and gradually tie detector adjustments to higher model quality.
arXiv Detail & Related papers (2022-04-30T14:02:52Z)
- Unsupervised Domain Adaptive Learning via Synthetic Data for Person Re-identification [101.1886788396803]
Person re-identification (re-ID) has gained more and more attention due to its widespread applications in video surveillance.
Unfortunately, the mainstream deep learning methods still need a large quantity of labeled data to train models.
In this paper, we develop a data collector to automatically generate synthetic re-ID samples in a computer game, and construct a data labeler to simultaneously annotate them.
arXiv Detail & Related papers (2021-09-12T15:51:41Z)
- Self-supervised Human Detection and Segmentation via Multi-view Consensus [116.92405645348185]
We propose a multi-camera framework in which geometric constraints are embedded in the form of multi-view consistency during training.
We show that our approach outperforms state-of-the-art self-supervised person detection and segmentation techniques on images that visually depart from those of standard benchmarks.
arXiv Detail & Related papers (2020-12-09T15:47:21Z)
- TAO: A Large-Scale Benchmark for Tracking Any Object [95.87310116010185]
The Tracking Any Object (TAO) dataset consists of 2,907 high-resolution videos, captured in diverse environments, which are half a minute long on average.
We ask annotators to label objects that move at any point in the video, and give names to them post factum.
Our vocabulary is both significantly larger and qualitatively different from existing tracking datasets.
arXiv Detail & Related papers (2020-05-20T21:07:28Z)