pycnet-audio: A Python package to support bioacoustics data processing
- URL: http://arxiv.org/abs/2506.14864v1
- Date: Tue, 17 Jun 2025 17:40:21 GMT
- Authors: Zachary J. Ruff, Damon B. Lesmeister
- Abstract summary: pycnet-audio is intended to provide a practical processing workflow for acoustic data, built around the PNW-Cnet model, which was originally developed by the U.S. Forest Service to support population monitoring of northern spotted owls.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Passive acoustic monitoring is an emerging approach in wildlife research that leverages recent improvements in purpose-made automated recording units (ARUs). The general approach is to deploy ARUs in the field to record on a programmed schedule for extended periods (weeks or months), after which the audio data are retrieved. These data must then be processed, typically either by measuring or analyzing characteristics of the audio itself (e.g. calculating acoustic indices), or by searching for some signal of interest within the recordings, e.g. vocalizations or other sounds produced by some target species, anthropogenic or environmental noise, etc. In the latter case, some method is required to locate the signal(s) of interest within the audio. While very small datasets can simply be searched manually, even modest projects can produce audio datasets on the order of 10^5 hours of recordings, making manual review impractical and necessitating some form of automated detection. pycnet-audio (Ruff 2024) is intended to provide a practical processing workflow for acoustic data, built around the PNW-Cnet model, which was initially developed by the U.S. Forest Service to support population monitoring of northern spotted owls (Strix occidentalis caurina) and other forest owls (Lesmeister and Jenkins 2022; Ruff et al. 2020). PNW-Cnet has been expanded to detect vocalizations of ca. 80 forest wildlife species and numerous forms of anthropogenic and environmental noise (Ruff et al. 2021, 2023).
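To make the automated-detection step concrete, the following is a minimal sketch of the general clip-and-classify workflow the abstract describes: a long ARU recording is split into short fixed-length clips, each clip is converted to a spectrogram, and a trained CNN assigns per-class scores, with scores above a threshold kept as detections. This is an illustration only, not the pycnet-audio API; the clip length, sample rate, threshold, and model file are assumptions.

    import numpy as np
    import librosa
    import tensorflow as tf

    CLIP_SECONDS = 12      # assumed fixed clip length
    SAMPLE_RATE = 16000    # assumed resampling rate
    THRESHOLD = 0.95       # assumed per-class detection threshold

    def clip_spectrograms(wav_path):
        """Yield (offset in seconds, log-mel spectrogram) for consecutive clips."""
        audio, _ = librosa.load(wav_path, sr=SAMPLE_RATE, mono=True)
        samples = CLIP_SECONDS * SAMPLE_RATE
        for start in range(0, len(audio) - samples + 1, samples):
            clip = audio[start:start + samples]
            mel = librosa.feature.melspectrogram(y=clip, sr=SAMPLE_RATE)
            yield start / SAMPLE_RATE, librosa.power_to_db(mel, ref=np.max)

    def detect(wav_path, model, class_names):
        """Return (offset, class name, score) for every score above THRESHOLD."""
        hits = []
        for offset, spec in clip_spectrograms(wav_path):
            x = spec[np.newaxis, ..., np.newaxis]  # add batch and channel dims
            scores = model.predict(x, verbose=0)[0]
            for i, score in enumerate(scores):
                if score >= THRESHOLD:
                    hits.append((offset, class_names[i], float(score)))
        return hits

    # model = tf.keras.models.load_model("detector.h5")  # hypothetical model file

In practice the clips would be batched before prediction for speed and the detections written to a file for review; the per-clip loop here is kept simple for clarity.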
Related papers
- The iNaturalist Sounds Dataset
iNatSounds is a collection of 230,000 audio files capturing sounds from over 5,500 species, contributed by more than 27,000 recordists worldwide. The dataset encompasses sounds from birds, mammals, insects, reptiles, and amphibians, with audio and species labels derived from observations submitted to iNaturalist. We envision models trained on this data powering next-generation public engagement applications, and assisting biologists, ecologists, and land use managers in processing large audio collections.
arXiv Detail & Related papers (2025-05-31T02:07:37Z) - ECOSoundSet: a finely annotated dataset for the automated acoustic identification of Orthoptera and Cicadidae in North, Central and temperate Western Europe
We present ECOSoundSet (European Cicadidae and Orthoptera Sound dataSet), a dataset containing 10,653 recordings of 200 orthopteran and 24 cicada species (217 and 26 taxa, respectively, when including subspecies) present in North, Central, and temperate Western Europe. This dataset could serve as a meaningful complement to recordings already available online for the training of deep learning algorithms for the acoustic classification of orthopterans and cicadas in North, Central, and temperate Western Europe.
arXiv Detail & Related papers (2025-04-29T13:53:33Z) - Automated Bioacoustic Monitoring for South African Bird Species on Unlabeled Data
The framework automatically extracted labeled data from available platforms for selected avian species.
The labeled data were embedded into recordings, including environmental sounds and noise, and were used to train convolutional recurrent neural network (CRNN) models.
The Adapted SED-CRNN model reached an F1 score of 0.73, demonstrating its efficiency under noisy, real-world conditions.
arXiv Detail & Related papers (2024-06-19T14:14:24Z) - Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark
Real Acoustic Fields (RAF) is a new dataset that captures real acoustic room data from multiple modalities.
RAF is the first dataset to provide densely captured room acoustic data.
arXiv Detail & Related papers (2024-03-27T17:59:56Z) - Exploring Meta Information for Audio-based Zero-shot Bird Classification
This study investigates how meta-information can improve zero-shot audio classification.
We use bird species as an example case study due to the availability of rich and diverse meta-data.
arXiv Detail & Related papers (2023-09-15T13:50:16Z) - Transferable Models for Bioacoustics with Human Language Supervision
BioLingual is a new model for bioacoustics based on contrastive language-audio pretraining.
It can identify over a thousand species' calls across taxa, complete bioacoustic tasks zero-shot, and retrieve animal vocalization recordings from natural text queries.
arXiv Detail & Related papers (2023-08-09T14:22:18Z) - Self-Supervised Visual Acoustic Matching
Acoustic matching aims to re-synthesize an audio clip to sound as if it were recorded in a target acoustic environment.
We propose a self-supervised approach to visual acoustic matching where training samples include only the target scene image and audio.
Our approach jointly learns to disentangle room acoustics and re-synthesize audio into the target environment, via a conditional GAN framework and a novel metric.
arXiv Detail & Related papers (2023-07-27T17:59:59Z) - Detection and classification of vocal productions in large scale audio recordings
We propose an automatic data processing pipeline to extract vocal productions from large-scale natural audio recordings.
The pipeline is based on a deep neural network and addresses detection and classification simultaneously.
We test it on two different natural audio datasets, one from a group of Guinea baboons recorded at a primate research center and one from human babies recorded at home.
arXiv Detail & Related papers (2023-02-14T14:07:09Z) - ASiT: Local-Global Audio Spectrogram vIsion Transformer for Event Classification
ASiT is a novel self-supervised learning framework that captures local and global contextual information by employing group masked model learning and self-distillation.
We evaluate our pretrained models on both audio and speech classification tasks, including audio event classification, keyword spotting, and speaker identification.
arXiv Detail & Related papers (2022-11-23T18:21:09Z) - HEAR 2021: Holistic Evaluation of Audio Representations
The HEAR 2021 NeurIPS challenge asks participants to develop a general-purpose audio representation that provides a strong basis for learning.
HEAR 2021 evaluates audio representations using a benchmark suite across a variety of domains, including speech, environmental sound, and music.
Twenty-nine models by thirteen external teams were evaluated on nineteen diverse downstream tasks derived from sixteen datasets.
arXiv Detail & Related papers (2022-03-06T18:13:09Z) - Modelling Animal Biodiversity Using Acoustic Monitoring and Deep Learning
This paper outlines an approach to modelling animal biodiversity that uses state-of-the-art machine learning to automatically extract features from time-series audio signals.
The acquired bird songs are processed using the mel-frequency cepstrum (MFC) to extract features, which are then classified using a multilayer perceptron (MLP); see the sketch following this list.
The proposed method achieved promising results, with a sensitivity of 0.74, a specificity of 0.92, and an accuracy of 0.74.
arXiv Detail & Related papers (2021-03-12T13:50:31Z)
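As referenced in the entry above, here is a minimal sketch of an MFC-plus-MLP pipeline of the kind that paper describes: mel-frequency cepstral coefficients summarize each clip, and a small multilayer perceptron classifies them. The feature size, network widths, and synthetic stand-in data are assumptions for illustration, not the paper's configuration.

    import numpy as np
    import librosa
    from sklearn.neural_network import MLPClassifier

    def mfcc_features(wav_path, sr=22050, n_mfcc=13):
        """Summarize one clip as the mean of its MFCC frames."""
        audio, _ = librosa.load(wav_path, sr=sr, mono=True)
        mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=n_mfcc)
        return mfcc.mean(axis=1)  # one fixed-length vector per clip

    # Synthetic stand-ins for per-clip feature vectors and species labels;
    # real use would build X by running mfcc_features() over labeled clips.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 13))
    y = rng.integers(0, 5, size=200)

    clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
    clf.fit(X, y)
    print("training accuracy:", clf.score(X, y))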
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and accepts no responsibility for any consequences of its use.