Related papers: Self-supervised Learning on Camera Trap Footage Yields a Strong Universal Face Embedder

URL: http://arxiv.org/abs/2507.10552v1
Date: Mon, 14 Jul 2025 17:59:59 GMT
Title: Self-supervised Learning on Camera Trap Footage Yields a Strong Universal Face Embedder
Authors: Vladimir Iashin, Horace Lee, Dan Schofield, Andrew Zisserman
Abstract summary: This study introduces a fully self-supervised approach to learning robust chimpanzee face embeddings from unlabeled camera-trap footage. We train Vision Transformers on automatically mined face crops, eliminating the need for identity labels. This work underscores the potential of self-supervised learning in biodiversity monitoring and paves the way for scalable, non-invasive population studies.
Score: 48.03572115000886
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Camera traps are revolutionising wildlife monitoring by capturing vast amounts of visual data; however, the manual identification of individual animals remains a significant bottleneck. This study introduces a fully self-supervised approach to learning robust chimpanzee face embeddings from unlabeled camera-trap footage. Leveraging the DINOv2 framework, we train Vision Transformers on automatically mined face crops, eliminating the need for identity labels. Our method demonstrates strong open-set re-identification performance, surpassing supervised baselines on challenging benchmarks such as Bossou, despite utilising no labelled data during training. This work underscores the potential of self-supervised learning in biodiversity monitoring and paves the way for scalable, non-invasive population studies.
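To make the term "open-set re-identification" in the abstract concrete, here is a minimal sketch of how face embeddings could be matched against a gallery of known individuals. It is not the authors' code: the function names, the toy 3-dimensional embeddings, and the 0.8 similarity threshold are all illustrative assumptions, and a real system would use embeddings produced by the trained Vision Transformer.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identify(query, gallery, threshold=0.8):
    """Return (identity, similarity) for the closest gallery entry.

    The identity is None when the best similarity falls below the
    threshold, i.e. the query is treated as an unseen individual.
    The threshold value is an arbitrary assumption for this sketch.
    """
    best_id, best_sim = None, -1.0
    for identity, embedding in gallery.items():
        sim = cosine(query, embedding)
        if sim > best_sim:
            best_id, best_sim = identity, sim
    if best_sim < threshold:
        return None, best_sim
    return best_id, best_sim

# Toy gallery with made-up embeddings for two hypothetical individuals.
gallery = {
    "individual_A": [1.0, 0.0, 0.0],
    "individual_B": [0.0, 1.0, 0.0],
}

known = identify([0.9, 0.1, 0.0], gallery)    # close to individual_A
unknown = identify([0.0, 0.0, 1.0], gallery)  # unlike anyone in the gallery
```

The threshold is what makes the setting open-set: a query far from every gallery entry is rejected as unknown instead of being forced onto its nearest neighbour.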

Related papers

Wildlife Target Re-Identification Using Self-supervised Learning in Non-Urban Settings [0.0]
Wildlife re-identification aims to match individuals of the same species across different observations. Current state-of-the-art (SOTA) models rely on class labels to train supervised models for individual classification. This study investigates Self-Supervised Learning (SSL) for wildlife re-identification.
arXiv Detail & Related papers (2025-07-03T07:56:54Z)
Multimodal Foundation Models for Zero-shot Animal Species Recognition in Camera Trap Images [57.96659470133514]
Motion-activated camera traps constitute an efficient tool for tracking and monitoring wildlife populations across the globe. Supervised learning techniques have been successfully deployed to analyze such imagery; however, training such techniques requires annotations from experts. Reducing the reliance on costly labelled data has immense potential in developing large-scale wildlife tracking solutions with markedly less human labor.
arXiv Detail & Related papers (2023-11-02T08:32:00Z)
Free-ATM: Exploring Unsupervised Learning on Diffusion-Generated Images with Free Attention Masks [64.67735676127208]
Text-to-image diffusion models have shown great potential for benefiting image recognition. Although promising, there has been inadequate exploration dedicated to unsupervised learning on diffusion-generated images. We introduce customized solutions by fully exploiting the aforementioned free attention masks.
arXiv Detail & Related papers (2023-08-13T10:07:46Z)
Unsupervised Learning of Accurate Siamese Tracking [68.58171095173056]
We present a novel unsupervised tracking framework, in which we can learn temporal correspondence both on the classification branch and regression branch. Our tracker outperforms preceding unsupervised methods by a substantial margin, performing on par with supervised methods on large-scale datasets such as TrackingNet and LaSOT.
arXiv Detail & Related papers (2022-04-04T13:39:43Z)
WhoAmI: An Automatic Tool for Visual Recognition of Tiger and Leopard Individuals in the Wild [3.1708876837195157]
We develop automatic algorithms that are able to detect animals, identify the species of animals and to recognize individual animals for two species. We demonstrate the effectiveness of our approach on a data set of camera-trap images recorded in the jungles of Southern India.
arXiv Detail & Related papers (2020-06-17T16:17:46Z)
Visual Identification of Individual Holstein-Friesian Cattle via Deep Metric Learning [8.784100314325395]
Holstein-Friesian cattle exhibit individually-characteristic black and white coat patterns visually akin to those arising from Turing's reaction-diffusion systems. This work takes advantage of these natural markings in order to automate visual detection and biometric identification of individual Holstein-Friesians via convolutional neural networks and deep metric learning techniques.
arXiv Detail & Related papers (2020-06-16T14:41:55Z)
Automatic Detection and Recognition of Individuals in Patterned Species [4.163860911052052]
We develop a framework for automatic detection and recognition of individuals in different patterned species. We use the recently proposed Faster-RCNN object detection framework to efficiently detect animals in images. We evaluate our recognition system on zebra and jaguar images to show generalization to other patterned species.
arXiv Detail & Related papers (2020-05-06T15:29:21Z)
Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation [93.83369981759996]
We propose a self-supervised equivariant attention mechanism (SEAM) to discover additional supervision and narrow the gap. Our method is based on the observation that equivariance is an implicit constraint in fully supervised semantic segmentation. We propose consistency regularization on predicted CAMs from various transformed images to provide self-supervision for network learning.
arXiv Detail & Related papers (2020-04-09T14:57:57Z)
Automatic image-based identification and biomass estimation of invertebrates [70.08255822611812]
Time-consuming sorting and identification of taxa pose strong limitations on how many insect samples can be processed. We propose to replace the standard manual approach of human expert-based sorting and identification with an automatic image-based technology. We use state-of-the-art Resnet-50 and InceptionV3 CNNs for the classification task.
arXiv Detail & Related papers (2020-02-05T21:38:57Z)

This list is automatically generated from the titles and abstracts of the papers on this site.