Beluga Whale Detection from Satellite Imagery with Point Labels
- URL: http://arxiv.org/abs/2505.12066v1
- Date: Sat, 17 May 2025 16:13:10 GMT
- Title: Beluga Whale Detection from Satellite Imagery with Point Labels
- Authors: Yijie Zheng, Jinxuan Yang, Yu Chen, Yaxuan Wang, Yihang Lu, Guoqing Li,
- Abstract summary: This study introduces an automated pipeline for detecting beluga whales and harp seals in VHR satellite imagery. The pipeline leverages point annotations and the Segment Anything Model (SAM) to generate precise bounding box annotations. YOLOv8 trained on SAM-labeled boxes achieved an $\text{F}_\text{1}$-score of 72.2% for whales overall and 70.3% for harp seals.
- Score: 8.461883879383517
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Very high-resolution (VHR) satellite imagery has emerged as a powerful tool for monitoring marine animals on a large scale. However, existing deep learning-based whale detection methods usually require manually created, high-quality bounding box annotations, which are labor-intensive to produce. Moreover, existing studies often exclude "uncertain whales", individuals that have ambiguous appearances in satellite imagery, limiting the applicability of these models in real-world scenarios. To address these limitations, this study introduces an automated pipeline for detecting beluga whales and harp seals in VHR satellite imagery. The pipeline leverages point annotations and the Segment Anything Model (SAM) to generate precise bounding box annotations, which are used to train YOLOv8 for multiclass detection of certain whales, uncertain whales, and harp seals. Experimental results demonstrated that SAM-generated annotations significantly improved detection performance, achieving higher $\text{F}_\text{1}$-scores compared to traditional buffer-based annotations. YOLOv8 trained on SAM-labeled boxes achieved an $\text{F}_\text{1}$-score of 72.2% for whales overall and 70.3% for harp seals, with superior performance in dense scenes. The proposed approach not only reduces the manual effort required for annotation but also enhances the detection of uncertain whales, offering a more comprehensive solution for marine animal monitoring. This method holds great potential for extending to other species, habitats, and remote sensing platforms, as well as for estimating whale biometrics, thereby advancing ecological monitoring and conservation efforts. The code for our labeling and detection pipeline is publicly available at http://github.com/voyagerxvoyagerx/beluga-seeker .
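The contrast the abstract draws between buffer-based and SAM-generated boxes can be illustrated with a small sketch. This is not the authors' code: it assumes a binary mask has already been produced by a point-prompted segmentation model such as SAM (the model is not called here), and the function names, the toy mask, the buffer size, and the class id are all hypothetical choices made for illustration. The last helper shows the standard $\text{F}_\text{1}$ formula used to report detection quality.

```python
import numpy as np


def buffer_box(x, y, half_size, img_w, img_h):
    """Buffer-based label: a fixed-size square centred on the point, clipped to the image."""
    x0 = max(0, x - half_size)
    y0 = max(0, y - half_size)
    x1 = min(img_w - 1, x + half_size)
    y1 = min(img_h - 1, y + half_size)
    return x0, y0, x1, y1


def mask_to_box(mask):
    """Mask-based label: the tight inclusive box around the True pixels of a binary mask."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None  # empty mask, nothing to label
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


def to_yolo_line(class_id, box, img_w, img_h):
    """Format an inclusive pixel box as 'class cx cy w h', normalised to [0, 1]."""
    x0, y0, x1, y1 = box
    w = x1 - x0 + 1
    h = y1 - y0 + 1
    cx = x0 + w / 2
    cy = y0 + h / 2
    return f"{class_id} {cx / img_w:.6f} {cy / img_h:.6f} {w / img_w:.6f} {h / img_h:.6f}"


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: a 64x64 image with an elongated animal-like blob.
# The mask stands in for the output of a point-prompted segmentation model.
img_w, img_h = 64, 64
mask = np.zeros((img_h, img_w), dtype=bool)
mask[30:34, 20:29] = True      # rows 30..33, columns 20..28
point_x, point_y = 24, 32      # hypothetical point annotation inside the blob

loose = buffer_box(point_x, point_y, half_size=5, img_w=img_w, img_h=img_h)
tight = mask_to_box(mask)
label = to_yolo_line(0, tight, img_w, img_h)  # class id 0 is an arbitrary choice
```

In this toy case the buffer box is a square that ignores the object's shape, while the mask-derived box follows its extent, which is the kind of difference the abstract credits for the improved detection scores.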
Related papers
- Automated Re-Identification of Holstein-Friesian Cattle in Dense Crowds [2.3843187053931456]
We propose a new detect-segment-identify pipeline that leverages the Open-Vocabulary Weight-free Localisation and the Segment Anything models. Our methodology overcomes detection breakdown in dense animal groupings, resulting in 98.93% accuracy. We show that unsupervised contrastive learning can build on this to yield 94.82% Re-ID accuracy on our test data.
arXiv Detail & Related papers (2026-02-17T19:25:50Z) - Weakly supervised framework for wildlife detection and counting in challenging Arctic environments: a case study on caribou (Rangifer tarandus) [0.0]
We propose a weakly supervised patch-level pretraining based on a detection network's architecture. This dataset includes five caribou herds distributed across Alaska.
arXiv Detail & Related papers (2026-01-26T19:02:18Z) - Where are the Whales: A Human-in-the-loop Detection Method for Identifying Whales in High-resolution Satellite Imagery [6.166882357769285]
We present a semi-automated approach for surfacing possible whale detections in satellite imagery. We use a statistical anomaly detection method that flags spatial outliers, i.e. "interesting points". We achieve recalls of 90.3% to 96.4%, while reducing the area requiring expert inspection by up to 99.8%.
arXiv Detail & Related papers (2025-10-16T14:10:51Z) - Automated Detection of Salvin's Albatrosses: Improving Deep Learning Tools for Aerial Wildlife Surveys [4.936287307711449]
Unmanned Aerial Vehicles (UAVs) provide a cost-effective means of capturing high-resolution imagery. We assess the performance of a general-purpose avian detection model, BirdDetector, in estimating the breeding population of Salvin's albatross (Thalassarche salvini) on the Bounty Islands, New Zealand.
arXiv Detail & Related papers (2025-05-15T22:42:44Z) - Weakly Supervised Multiple Instance Learning for Whale Call Detection and Localization in Long-Duration Passive Acoustic Monitoring [2.7418627495572134]
We introduce DSMIL-LocNet, a framework for whale call detection and localization using only bag-level labels. Our dual-stream model processes 2-30 minute audio segments, leveraging spectral and temporal features with attention-based instance selection.
arXiv Detail & Related papers (2025-02-28T08:34:12Z) - Underwater Camouflaged Object Tracking Meets Vision-Language SAM2 [60.47622353256502]
We propose the first large-scale multi-modal underwater camouflaged object tracking dataset, namely UW-COT220. Based on the proposed dataset, this work first evaluates current advanced visual object tracking methods, including SAM- and SAM2-based trackers, in challenging underwater environments. Our findings highlight the improvements of SAM2 over SAM, demonstrating its enhanced ability to handle the complexities of underwater camouflaged objects.
arXiv Detail & Related papers (2024-09-25T13:10:03Z) - SOOD++: Leveraging Unlabeled Data to Boost Oriented Object Detection [59.868772767818975]
We propose a simple yet effective Semi-supervised Oriented Object Detection method termed SOOD++.
Specifically, we observe that objects in aerial images usually have arbitrary orientations and small scales, and tend to be densely aggregated.
Extensive experiments conducted on various multi-oriented object datasets under various labeled settings demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2024-07-01T07:03:51Z) - Detecting Endangered Marine Species in Autonomous Underwater Vehicle Imagery Using Point Annotations and Few-Shot Learning [5.439798554380394]
Seafloor imagery collected by Autonomous Underwater Vehicles (AUVs) can be used to identify individuals within their broader habitat context.
Machine learning models can be used to identify the presence of a particular species in images using a trained object detector.
In this paper, inspired by recent work in few-shot learning, images and annotations of common marine species are exploited to enhance the ability of the detector to identify rare and cryptic species.
arXiv Detail & Related papers (2024-06-04T03:31:42Z) - Lazy Layers to Make Fine-Tuned Diffusion Models More Traceable [70.77600345240867]
A novel arbitrary-in-arbitrary-out (AIAO) strategy makes watermarks resilient to fine-tuning-based removal.
Unlike the existing methods of designing a backdoor for the input/output space of diffusion models, in our method, we propose to embed the backdoor into the feature space of sampled subpaths.
Our empirical studies on the MS-COCO, AFHQ, LSUN, CUB-200, and DreamBooth datasets confirm the robustness of AIAO.
arXiv Detail & Related papers (2024-05-01T12:03:39Z) - Whale Detection Enhancement through Synthetic Satellite Images [13.842008598751445]
We show that we can achieve a 15% performance boost on whale detection compared to using the real data alone for training.
We open source the code of the simulation platform SeaDroneSim2 and the dataset generated through it.
arXiv Detail & Related papers (2023-08-15T13:35:29Z) - SOOD: Towards Semi-Supervised Oriented Object Detection [57.05141794402972]
This paper proposes a novel Semi-supervised Oriented Object Detection model, termed SOOD, built upon the mainstream pseudo-labeling framework.
Our experiments show that when trained with the two proposed losses, SOOD surpasses the state-of-the-art SSOD methods under various settings on the DOTA-v1.5 benchmark.
arXiv Detail & Related papers (2023-04-10T11:10:42Z) - TempNet: Temporal Attention Towards the Detection of Animal Behaviour in Videos [63.85815474157357]
We propose an efficient computer vision- and deep learning-based method for the detection of biological behaviours in videos.
TempNet uses an encoder bridge and residual blocks to maintain model performance with a two-staged, spatial, then temporal, encoder.
We demonstrate its application to the detection of sablefish (Anoplopoma fimbria) startle events.
arXiv Detail & Related papers (2022-11-17T23:55:12Z) - Detecting Cattle and Elk in the Wild from Space [6.810164473908359]
Localizing and counting large ungulates in satellite imagery is an important task for supporting ecological studies.
We propose a baseline method, CowNet, that simultaneously estimates the number of animals in an image (counts) and predicts their location at a pixel level (localizes).
We specifically test the temporal generalization of the resulting models over a large landscape in Point Reyes Seashore, CA.
arXiv Detail & Related papers (2021-06-29T14:35:23Z) - WSSOD: A New Pipeline for Weakly- and Semi-Supervised Object Detection [75.80075054706079]
We propose a weakly- and semi-supervised object detection framework (WSSOD).
An agent detector is first trained on a joint dataset and then used to predict pseudo bounding boxes on weakly-annotated images.
The proposed framework demonstrates remarkable performance on the PASCAL-VOC and MSCOCO benchmarks, achieving performance comparable to that obtained in fully-supervised settings.
arXiv Detail & Related papers (2021-05-21T11:58:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.