Related papers: Weakly supervised framework for wildlife detection and counting in challenging Arctic environments: a case study on caribou (Rangifer tarandus)

URL: http://arxiv.org/abs/2601.18891v2
Date: Wed, 28 Jan 2026 18:12:45 GMT
Title: Weakly supervised framework for wildlife detection and counting in challenging Arctic environments: a case study on caribou (Rangifer tarandus)
Authors: Ghazaleh Serati, Samuel Foucher, Jerome Theau
Abstract summary: We propose a weakly supervised patch-level pretraining based on a detection network's architecture. This dataset includes five caribou herds distributed across Alaska.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Caribou across the Arctic has declined in recent decades, motivating scalable and accurate monitoring approaches to guide evidence-based conservation actions and policy decisions. Manual interpretation from this imagery is labor-intensive and error-prone, underscoring the need for automatic and reliable detection across varying scenes. Yet, such automatic detection is challenging due to severe background heterogeneity, dominant empty terrain (class imbalance), small or occluded targets, and wide variation in density and scale. To make the detection model (HerdNet) more robust to these challenges, a weakly supervised patch-level pretraining based on a detection network's architecture is proposed. The detection dataset includes five caribou herds distributed across Alaska. By learning from empty vs. non-empty labels in this dataset, the approach produces early weakly supervised knowledge for enhanced detection compared to HerdNet, which is initialized from generic weights. Accordingly, the patch-based pretrain network attained high accuracy on multi-herd imagery (2017) and on an independent year's (2019) test sets (F1: 93.7%/92.6%, respectively), enabling reliable mapping of regions containing animals to facilitate manual counting on large aerial imagery. Transferred to detection, initialization from weakly supervised pretraining yielded consistent gains over ImageNet weights on both positive patches (F1: 92.6%/93.5% vs. 89.3%/88.6%), and full-image counting (F1: 95.5%/93.3% vs. 91.5%/90.4%). Remaining limitations are false positives from animal-like background clutter and false negatives related to low animal density occlusions. Overall, pretraining on coarse labels prior to detection makes it possible to rely on weakly-supervised pretrained weights even when labeled data are limited, achieving results comparable to generic-weight initialization.

Related papers

Automated Re-Identification of Holstein-Friesian Cattle in Dense Crowds [2.3843187053931456]
We propose a new detect-segment-identify pipeline that leverages the Open-Vocabulary Weight-free Localisation and the Segment Anything models. Our methodology overcomes detection breakdown in dense animal groupings, resulting in a 98.93% accuracy. We show that unsupervised contrastive learning can build on this to yield 94.82% Re-ID accuracy on our test data.
arXiv Detail & Related papers (2026-02-17T19:25:50Z)
When Language Model Guides Vision: Grounding DINO for Cattle Muzzle Detection [0.48429188360918735]
Grounding DINO is a vision-language model capable of detecting muzzles without task-specific training or annotated data. Our model achieves a mean Average Precision (mAP)@0.5 of 76.8%, demonstrating promising performance without requiring annotated data.
arXiv Detail & Related papers (2025-09-08T08:21:34Z)
Generative Edge Detection with Stable Diffusion [52.870631376660924]
Edge detection is typically viewed as a pixel-level classification problem mainly addressed by discriminative methods. We propose a novel approach, named Generative Edge Detector (GED), by fully utilizing the potential of the pre-trained stable diffusion model. We conduct extensive experiments on multiple datasets and achieve competitive performance.
arXiv Detail & Related papers (2024-10-04T01:52:23Z)
Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture [81.93945602120453]
We introduce an approach that is both general and parameter-efficient for face forgery detection. We design a forgery-style mixture formulation that augments the diversity of forgery source domains. We show that the designed model achieves state-of-the-art generalizability with significantly reduced trainable parameters.
arXiv Detail & Related papers (2024-08-23T01:53:36Z)
Bootstrapping Rare Object Detection in High-Resolution Satellite Imagery [2.242884292006914]
This paper addresses the problem of bootstrapping such a rare object detection task. We propose novel offline and online cluster-based approaches for sampling patches. We apply our methods for identifying bomas, or small enclosures for herd animals, in the Serengeti Mara region of Kenya and Tanzania.
arXiv Detail & Related papers (2024-03-05T07:44:13Z)
Locate and Verify: A Two-Stream Network for Improved Deepfake Detection [33.50963446256726]
Current deepfake detection methods are typically inadequate in generalizability. We propose an innovative two-stream network that effectively enlarges the potential regions from which the model extracts evidence. We also propose a Semi-supervised Patch Similarity Learning strategy to estimate patch-level forged location annotations.
arXiv Detail & Related papers (2023-09-20T08:25:19Z)
Augment and Criticize: Exploring Informative Samples for Semi-Supervised Monocular 3D Object Detection [64.65563422852568]
We improve the challenging monocular 3D object detection problem with a general semi-supervised framework. We introduce a novel, simple, yet effective 'Augment and Criticize' framework that explores abundant informative samples from unlabeled data. The two new detectors, dubbed 3DSeMo_DLE and 3DSeMo_FLEX, achieve state-of-the-art results with remarkable improvements for over 3.5% AP_3D/BEV (Easy) on KITTI.
arXiv Detail & Related papers (2023-03-20T16:28:15Z)
ReDFeat: Recoupling Detection and Description for Multimodal Feature Learning [51.07496081296863]
We recouple independent constraints of detection and description of multimodal feature learning with a mutual weighting strategy. We propose a detector that possesses a large receptive field and is equipped with learnable non-maximum suppression layers. We build a benchmark that contains cross visible, infrared, near-infrared and synthetic aperture radar image pairs for evaluating the performance of features in feature matching and image registration tasks.
arXiv Detail & Related papers (2022-05-16T04:24:22Z)
Efficient remedies for outlier detection with variational autoencoders [8.80692072928023]
Likelihoods computed by deep generative models are a candidate metric for outlier detection with unlabeled data. We show that a theoretically-grounded correction readily ameliorates a key bias with VAE likelihood estimates. We also show that the variance of the likelihoods computed over an ensemble of VAEs also enables robust outlier detection.
arXiv Detail & Related papers (2021-08-19T16:00:58Z)
DAAIN: Detection of Anomalous and Adversarial Input using Normalizing Flows [52.31831255787147]
We introduce a novel technique, DAAIN, to detect out-of-distribution (OOD) inputs and adversarial attacks (AA). Our approach monitors the inner workings of a neural network and learns a density estimator of the activation distribution. Our model can be trained on a single GPU, making it compute efficient and deployable without requiring specialized accelerators.
arXiv Detail & Related papers (2021-05-30T22:07:13Z)
Learning a Unified Sample Weighting Network for Object Detection [113.98404690619982]
Region sampling or weighting is significantly important to the success of modern region-based object detectors. We argue that sample weighting should be data-dependent and task-dependent. We propose a unified sample weighting network to predict a sample's task weights.
arXiv Detail & Related papers (2020-06-11T16:19:16Z)

This list is automatically generated from the titles and abstracts of the papers on this site.