Related papers: Ecologically Valid Benchmarking and Adaptive Attention: Scalable Marine Bioacoustic Monitoring

Ecologically Valid Benchmarking and Adaptive Attention: Scalable Marine Bioacoustic Monitoring

URL: http://arxiv.org/abs/2509.04682v1
Date: Thu, 04 Sep 2025 22:03:05 GMT
Title: Ecologically Valid Benchmarking and Adaptive Attention: Scalable Marine Bioacoustic Monitoring
Authors: Nicholas R. Rasmussen, Rodrigue Rizk, Longwei Wang, KC Santosh,
Abstract summary: GetNetUPAM is a nested cross-validation framework to model stability under realistic variability.<n>Data are partitioned into distinct site-year segments, preserving recording and ensuring each validation fold reflects a unique environmental subset.<n>ARPA-N achieves a 14.4% gain in average precision over DenseNet baselines and a log2-scale order-of-magnitude drop in variability across all metrics.
Score: 2.558238597112103
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Underwater Passive Acoustic Monitoring (UPAM) provides rich spatiotemporal data for long-term ecological analysis, but intrinsic noise and complex signal dependencies hinder model stability and generalization. Multilayered windowing has improved target sound localization, yet variability from shifting ambient noise, diverse propagation effects, and mixed biological and anthropogenic sources demands robust architectures and rigorous evaluation. We introduce GetNetUPAM, a hierarchical nested cross-validation framework designed to quantify model stability under ecologically realistic variability. Data are partitioned into distinct site-year segments, preserving recording heterogeneity and ensuring each validation fold reflects a unique environmental subset, reducing overfitting to localized noise and sensor artifacts. Site-year blocking enforces evaluation against genuine environmental diversity, while standard cross-validation on random subsets measures generalization across UPAM's full signal distribution, a dimension absent from current benchmarks. Using GetNetUPAM as the evaluation backbone, we propose the Adaptive Resolution Pooling and Attention Network (ARPA-N), a neural architecture for irregular spectrogram dimensions. Adaptive pooling with spatial attention extends the receptive field, capturing global context without excessive parameters. Under GetNetUPAM, ARPA-N achieves a 14.4% gain in average precision over DenseNet baselines and a log2-scale order-of-magnitude drop in variability across all metrics, enabling consistent detection across site-year folds and advancing scalable, accurate bioacoustic monitoring.

Related papers

AgentNoiseBench: Benchmarking Robustness of Tool-Using LLM Agents Under Noisy Condition [72.24180896265192]
We introduce AgentNoiseBench, a framework for evaluating robustness of agentic models under noisy environments.<n>We first conduct an in-depth analysis of biases and uncertainties in real-world scenarios.<n>We then categorize environmental noise into two primary types: user-noise and tool-noise.<n>Building on this analysis, we develop an automated pipeline that injects controllable noise into existing agent-centric benchmarks.
arXiv Detail & Related papers (2026-02-11T20:33:10Z)
Explainable Transformer-CNN Fusion for Noise-Robust Speech Emotion Recognition [2.0391237204597363]
Speech Emotion Recognition systems often degrade in performance when exposed to unpredictable acoustic interference.<n>We propose a Hybrid Transformer-CNN framework that unifies the contextual modeling of Wav2Vec 2.0 with the spectral stability of 1D-Convolutional Neural Networks.
arXiv Detail & Related papers (2025-12-20T10:05:58Z)
Multi-Representation Attention Framework for Underwater Bioacoustic Denoising and Recognition [0.924965746838578]
We introduce a multi-step, attention-guided framework that first segments spectrograms to generate soft masks of biologically relevant energy.<n>Image and mask embeddings are integrated via mid-level fusion, enabling the model to focus on salient spectrogram regions.<n>Using real-world recordings from the Saguenay St. Lawrence Marine Park Research Station in Canada, we demonstrate that segmentation-driven attention and mid-level fusion improve signal discrimination.
arXiv Detail & Related papers (2025-10-29T22:49:15Z)
Global-focal Adaptation with Information Separation for Noise-robust Transfer Fault Diagnosis [48.69961294481149]
We propose an information separation global-focal adversarial network (ISGFAN) for cross-domain fault diagnosis under noise conditions.<n>ISGFAN is built on an information separation architecture that integrates adversarial learning with an improved loss to decouple domain-invariant fault representation.<n>Experiments conducted on three public datasets demonstrate that the proposed method outperforms other prominent existing approaches.
arXiv Detail & Related papers (2025-10-16T01:13:06Z)
FedGSCA: Medical Federated Learning with Global Sample Selector and Client Adaptive Adjuster under Label Noise [3.5585588724306643]
Federated Learning (FL) emerged as a solution for collaborative medical image classification while preserving data privacy.<n>We propose FedGSCA, a novel framework for enhancing robustness in noisy medical FL.<n>We evaluate FedGSCA on one real-world colon slides dataset and two synthetic medical datasets under various noise conditions.
arXiv Detail & Related papers (2025-07-13T08:51:51Z)
Adaptive Control Attention Network for Underwater Acoustic Localization and Domain Adaptation [8.017203108408973]
Localizing acoustic sound sources in the ocean is a challenging task due to the complex and dynamic nature of the environment.<n>We propose a multi-branch network architecture designed to accurately predict the distance between a moving acoustic source and a receiver.<n>Our proposed method outperforms state-of-the-art (SOTA) approaches in similar settings.
arXiv Detail & Related papers (2025-06-20T18:13:30Z)
Evaluating ML Robustness in GNSS Interference Classification, Characterization & Localization [42.14439854721613]
Jamming devices disrupt signals from the global navigation satellite system (GNSS)<n>This paper introduces an extensive dataset comprising snapshots obtained from a low-frequency antenna.<n>Our objective is to assess the resilience of machine learning (ML) models against environmental changes.
arXiv Detail & Related papers (2024-09-23T15:20:33Z)
Multi-Source and Test-Time Domain Adaptation on Multivariate Signals using Spatio-Temporal Monge Alignment [59.75420353684495]
Machine learning applications on signals such as computer vision or biomedical data often face challenges due to the variability that exists across hardware devices or session recordings. In this work, we propose Spatio-Temporal Monge Alignment (STMA) to mitigate these variabilities. We show that STMA leads to significant and consistent performance gains between datasets acquired with very different settings.
arXiv Detail & Related papers (2024-07-19T13:33:38Z)
Stable Neighbor Denoising for Source-free Domain Adaptive Segmentation [91.83820250747935]
Pseudo-label noise is mainly contained in unstable samples in which predictions of most pixels undergo significant variations during self-training. We introduce the Stable Neighbor Denoising (SND) approach, which effectively discovers highly correlated stable and unstable samples. SND consistently outperforms state-of-the-art methods in various SFUDA semantic segmentation settings.
arXiv Detail & Related papers (2024-06-10T21:44:52Z)
AdaStereo: An Efficient Domain-Adaptive Stereo Matching Approach [50.855679274530615]
We present a novel domain-adaptive approach called AdaStereo to align multi-level representations for deep stereo matching networks. Our models achieve state-of-the-art cross-domain performance on multiple benchmarks, including KITTI, Middlebury, ETH3D and DrivingStereo. Our method is robust to various domain adaptation settings, and can be easily integrated into quick adaptation application scenarios and real-world deployments.
arXiv Detail & Related papers (2021-12-09T15:10:47Z)
Certified Adversarial Defenses Meet Out-of-Distribution Corruptions: Benchmarking Robustness and Simple Baselines [65.0803400763215]
This work critically examines how adversarial robustness guarantees change when state-of-the-art certifiably robust models encounter out-of-distribution data. We propose a novel data augmentation scheme, FourierMix, that produces augmentations to improve the spectral coverage of the training data. We find that FourierMix augmentations help eliminate the spectral bias of certifiably robust models enabling them to achieve significantly better robustness guarantees on a range of OOD benchmarks.
arXiv Detail & Related papers (2021-12-01T17:11:22Z)
Discriminative Singular Spectrum Classifier with Applications on Bioacoustic Signal Recognition [67.4171845020675]
We present a bioacoustic signal classifier equipped with a discriminative mechanism to extract useful features for analysis and classification efficiently. Unlike current bioacoustic recognition methods, which are task-oriented, the proposed model relies on transforming the input signals into vector subspaces. The validity of the proposed method is verified using three challenging bioacoustic datasets containing anuran, bee, and mosquito species.
arXiv Detail & Related papers (2021-03-18T11:01:21Z)
Unsupervised Domain Adaptation in Person re-ID via k-Reciprocal Clustering and Large-Scale Heterogeneous Environment Synthesis [76.46004354572956]
We introduce an unsupervised domain adaptation approach for person re-identification. Experimental results show that the proposed ktCUDA and SHRED approach achieves an average improvement of +5.7 mAP in re-identification performance.
arXiv Detail & Related papers (2020-01-14T17:43:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.