Uncovering Anomalous Events for Marine Environmental Monitoring via Visual Anomaly Detection
- URL: http://arxiv.org/abs/2510.10750v2
- Date: Thu, 23 Oct 2025 15:59:05 GMT
- Title: Uncovering Anomalous Events for Marine Environmental Monitoring via Visual Anomaly Detection
- Authors: Laura Weihl, Stefan H. Bengtson, Nejc Novak, Malte Pedersen
- Abstract summary: We introduce AURA, the first multi-annotator benchmark dataset for underwater visual anomaly detection (VAD). We evaluate four VAD models across two marine scenes and show the importance of robust frame selection strategies to extract meaningful video segments. Our results highlight the value of soft and consensus labels and offer a practical approach for supporting scientific exploration and scalable biodiversity monitoring.
- Score: 2.257416403770908
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Underwater video monitoring is a promising strategy for assessing marine biodiversity, but the vast volume of uneventful footage makes manual inspection highly impractical. In this work, we explore the use of visual anomaly detection (VAD) based on deep neural networks to automatically identify interesting or anomalous events. We introduce AURA, the first multi-annotator benchmark dataset for underwater VAD, and evaluate four VAD models across two marine scenes. We demonstrate the importance of robust frame selection strategies to extract meaningful video segments. Our comparison against multiple annotators reveals that VAD performance of current models varies dramatically and is highly sensitive to both the amount of training data and the variability in visual content that defines "normal" scenes. Our results highlight the value of soft and consensus labels and offer a practical approach for supporting scientific exploration and scalable biodiversity monitoring.
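The abstract emphasizes soft and consensus labels derived from multiple annotators. As a minimal illustrative sketch (not the AURA pipeline itself; function names and the majority threshold are assumptions), a soft label per frame can be the fraction of annotators marking it anomalous, and a consensus label a thresholded majority vote:

```python
# Hypothetical sketch of soft vs. consensus labeling from multi-annotator
# binary votes. Not the AURA implementation; names and the 0.5 threshold
# are illustrative choices.

from typing import List

def soft_labels(votes: List[List[int]]) -> List[float]:
    """Per-frame soft label: fraction of annotators who marked the frame anomalous."""
    return [sum(frame) / len(frame) for frame in votes]

def consensus_labels(votes: List[List[int]], threshold: float = 0.5) -> List[int]:
    """Per-frame consensus label: 1 if the agreement fraction reaches the threshold."""
    return [int(sum(frame) / len(frame) >= threshold) for frame in votes]

# Three annotators, four frames: one vote list per frame.
votes = [[1, 1, 0], [0, 0, 0], [1, 1, 1], [1, 0, 0]]
print(soft_labels(votes))       # agreement fraction per frame
print(consensus_labels(votes))  # majority-vote label per frame
```

Soft labels retain annotator disagreement (useful when "normal" is ambiguous, as the abstract notes), while consensus labels give a single hard target for standard evaluation.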
Related papers
- Expose Camouflage in the Water: Underwater Camouflaged Instance Segmentation and Dataset [76.92197418745822]
Camouflaged instance segmentation (CIS) faces greater challenges in accurately segmenting objects that blend closely with their surroundings. Traditional camouflaged instance segmentation methods, trained on terrestrial-dominated datasets with limited underwater samples, may exhibit inadequate performance in underwater scenes. We introduce the first underwater camouflaged instance segmentation dataset, UCIS4K, which comprises 3,953 images of camouflaged marine organisms with instance-level annotations.
arXiv Detail & Related papers (2025-10-20T14:34:51Z) - Neptune-X: Active X-to-Maritime Generation for Universal Maritime Object Detection [54.1960918379255]
Neptune-X is a data-centric generative-selection framework for maritime object detection. X-to-Maritime is a multi-modality-conditioned generative model that synthesizes diverse and realistic maritime scenes. Our approach sets a new benchmark in maritime scene synthesis, significantly improving detection accuracy.
arXiv Detail & Related papers (2025-09-25T04:59:02Z) - FishDet-M: A Unified Large-Scale Benchmark for Robust Fish Detection and CLIP-Guided Model Selection in Diverse Aquatic Visual Domains [1.3791394805787949]
FishDet-M is the largest unified benchmark for fish detection, comprising 13 publicly available datasets spanning diverse aquatic environments. All data are harmonized using COCO-style annotations with both bounding boxes and segmentation masks. FishDet-M establishes a standardized and reproducible platform for evaluating object detection in complex aquatic scenes.
arXiv Detail & Related papers (2025-07-23T18:32:01Z) - UVLM: Benchmarking Video Language Model for Underwater World Understanding [11.475921633970977]
We introduce UVLM, a benchmark for underwater video observation. The dataset includes 419 classes of marine animals, as well as various static plants and terrains. Experiments on two representative VidLMs demonstrate that fine-tuning VidLMs on UVLM significantly improves underwater world understanding.
arXiv Detail & Related papers (2025-07-03T07:08:38Z) - From Sight to Insight: Unleashing Eye-Tracking in Weakly Supervised Video Salient Object Detection [60.11169426478452]
This paper aims to introduce fixation information to assist the detection of salient objects under weak supervision. We propose a Position and Semantic Embedding (PSE) module to provide location and semantic guidance during the feature learning process. An Intra-Inter Mixed Contrastive (MCII) model improves the temporal modeling capabilities under weak supervision.
arXiv Detail & Related papers (2025-06-30T05:01:40Z) - Inland Waterway Object Detection in Multi-environment: Dataset and Approach [12.00732943849236]
This paper introduces the Multi-environment Inland Waterway Vessel dataset (MEIWVD). MEIWVD comprises 32,478 high-quality images from diverse scenarios, including sunny, rainy, foggy, and artificial lighting conditions. This paper proposes a scene-guided image enhancement module to adaptively improve water-surface images based on environmental conditions.
arXiv Detail & Related papers (2025-04-07T08:45:00Z) - Image-Based Relocalization and Alignment for Long-Term Monitoring of Dynamic Underwater Environments [57.59857784298534]
We propose an integrated pipeline that combines Visual Place Recognition (VPR), feature matching, and image segmentation on video-derived images. This method enables robust identification of revisited areas, estimation of rigid transformations, and downstream analysis of ecosystem changes.
arXiv Detail & Related papers (2025-03-06T05:13:19Z) - Multi-Domain Features Guided Supervised Contrastive Learning for Radar Target Detection [8.706031869122917]
Existing solutions either model sea clutter for detection or extract target features based on clutter-target echo differences, including statistical and deep features. We propose a multi-domain features guided supervised contrastive learning (MDFG_SCL) method, which integrates statistical features derived from multi-domain differences with deep features obtained through supervised contrastive learning. Experiments conducted on real-world datasets demonstrate that the proposed shallow-to-deep detector not only achieves effective identification of small maritime targets but also maintains superior detection performance across varying sea conditions.
arXiv Detail & Related papers (2024-12-17T07:33:07Z) - VANE-Bench: Video Anomaly Evaluation Benchmark for Conversational LMMs [64.60035916955837]
VANE-Bench is a benchmark designed to assess the proficiency of Video-LMMs in detecting anomalies and inconsistencies in videos. Our dataset comprises an array of videos synthetically generated using existing state-of-the-art text-to-video generation models. We evaluate nine existing Video-LMMs, both open- and closed-source, on this benchmarking task and find that most of the models have difficulty effectively identifying the subtle anomalies.
arXiv Detail & Related papers (2024-06-14T17:59:01Z) - SeaDSC: A video-based unsupervised method for dynamic scene change
detection in unmanned surface vehicles [3.2716252389196288]
This paper outlines our approach to detect dynamic scene changes in Unmanned Surface Vehicles (USVs).
Our objective is to identify significant changes in the dynamic scenes of maritime video data, particularly those scenes that exhibit a high degree of resemblance.
In our system for dynamic scene change detection, we propose a completely unsupervised learning method.
arXiv Detail & Related papers (2023-11-20T07:34:01Z) - Depth Guided Adaptive Meta-Fusion Network for Few-shot Video Recognition [86.31412529187243]
Few-shot video recognition aims at learning new actions with only very few labeled samples.
We propose a depth guided Adaptive Meta-Fusion Network for few-shot video recognition which is termed as AMeFu-Net.
arXiv Detail & Related papers (2020-10-20T03:06:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.