BIRB: A Generalization Benchmark for Information Retrieval in
Bioacoustics
- URL: http://arxiv.org/abs/2312.07439v2
- Date: Wed, 13 Dec 2023 15:48:17 GMT
- Title: BIRB: A Generalization Benchmark for Information Retrieval in
Bioacoustics
- Authors: Jenny Hamer, Eleni Triantafillou, Bart van Merriënboer, Stefan Kahl,
Holger Klinck, Tom Denton, Vincent Dumoulin
- Abstract summary: We present BIRB, a complex benchmark centered on the retrieval of bird vocalizations from passively-recorded datasets.
We propose a baseline system for this collection of tasks using representation learning and a nearest-centroid search.
- Score: 7.68184437595058
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The ability for a machine learning model to cope with differences in training
and deployment conditions--e.g. in the presence of distribution shift or the
generalization to new classes altogether--is crucial for real-world use cases.
However, most empirical work in this area has focused on the image domain with
artificial benchmarks constructed to measure individual aspects of
generalization. We present BIRB, a complex benchmark centered on the retrieval
of bird vocalizations from passively-recorded datasets given focal recordings
from a large citizen science corpus available for training. We propose a
baseline system for this collection of tasks using representation learning and
a nearest-centroid search. Our thorough empirical evaluation and analysis
surfaces open research directions, suggesting that BIRB fills the need for a
more realistic and complex benchmark to drive progress on robustness to
distribution shifts and generalization of ML models.
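The paper's code is not reproduced here, but the baseline it describes, embedding recordings with a learned representation and classifying queries by their nearest class centroid, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, array shapes, and the use of cosine similarity over L2-normalized embeddings are assumptions for the example.

```python
import numpy as np

def nearest_centroid_search(support_emb, support_labels, query_emb):
    """Assign each query to the class whose centroid is most similar.

    support_emb: (n, d) embeddings of labelled focal recordings
    support_labels: (n,) integer class ids
    query_emb: (m, d) embeddings of query clips
    Returns an (m,) array of predicted class ids.
    """
    classes = np.unique(support_labels)
    # One centroid per class: the mean embedding of that class's examples.
    centroids = np.stack([
        support_emb[support_labels == c].mean(axis=0) for c in classes
    ])
    # Normalize so the dot product equals cosine similarity.
    centroids = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    queries = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    sims = queries @ centroids.T  # (m, n_classes) similarity matrix
    return classes[sims.argmax(axis=1)]
```

In this setup the embedding model does the heavy lifting; the retrieval step itself is deliberately simple, which is what makes the benchmark a probe of representation quality rather than of the search method.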
Related papers
- When is an Embedding Model More Promising than Another? [33.540506562970776]
Embedders play a central role in machine learning, projecting any object into numerical representations that can be leveraged to perform various downstream tasks.
The evaluation of embedding models typically depends on domain-specific empirical approaches.
We present a unified approach to evaluate embedders, drawing upon the concepts of sufficiency and informativeness.
arXiv Detail & Related papers (2024-06-11T18:13:46Z)
- Back to Basics: A Simple Recipe for Improving Out-of-Domain Retrieval in Dense Encoders [63.28408887247742]
We study whether training procedures can be improved to yield better generalization capabilities in the resulting models.
We recommend a simple recipe for training dense encoders: Train on MSMARCO with parameter-efficient methods, such as LoRA, and opt for using in-batch negatives unless given well-constructed hard negatives.
arXiv Detail & Related papers (2023-11-16T10:42:58Z)
- LocoMuJoCo: A Comprehensive Imitation Learning Benchmark for Locomotion [20.545058017790428]
Imitation Learning holds great promise for enabling agile locomotion in embodied agents.
We present a novel benchmark designed to facilitate rigorous evaluation and comparison of IL algorithms.
This benchmark encompasses a diverse set of environments, including quadrupeds, bipeds, and musculoskeletal human models.
arXiv Detail & Related papers (2023-11-04T19:41:50Z)
- Consistency Regularization for Generalizable Source-free Domain Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset.
Existing SFDA methods only assess their adapted models on the target training set, neglecting data from unseen but identically distributed test sets.
We propose a consistency regularization framework to develop a more generalizable SFDA method.
arXiv Detail & Related papers (2023-08-03T07:45:53Z)
- Robust Saliency-Aware Distillation for Few-shot Fine-grained Visual Recognition [57.08108545219043]
Recognizing novel sub-categories with scarce samples is an essential and challenging research topic in computer vision.
Existing literature addresses this challenge by employing local-based representation approaches.
This article proposes a novel model, Robust Saliency-aware Distillation (RSaD), for few-shot fine-grained visual recognition.
arXiv Detail & Related papers (2023-05-12T00:13:17Z)
- Generalization Properties of Retrieval-based Models [50.35325326050263]
Retrieval-based machine learning methods have enjoyed success on a wide range of problems.
Despite growing literature showcasing the promise of these models, the theoretical underpinning for such models remains underexplored.
We present a formal treatment of retrieval-based models to characterize their generalization ability.
arXiv Detail & Related papers (2022-10-06T00:33:01Z)
- Adaptive Local-Component-aware Graph Convolutional Network for One-shot Skeleton-based Action Recognition [54.23513799338309]
We present an Adaptive Local-Component-aware Graph Convolutional Network for skeleton-based action recognition.
Our method provides a stronger representation than the global embedding and helps our model reach state-of-the-art.
arXiv Detail & Related papers (2022-09-21T02:33:07Z)
- Current Trends in Deep Learning for Earth Observation: An Open-source Benchmark Arena for Image Classification [7.511257876007757]
'AiTLAS: Benchmark Arena' is an open-source benchmark framework for evaluating state-of-the-art deep learning approaches for image classification.
We present a comprehensive comparative analysis of more than 400 models derived from nine different state-of-the-art architectures.
arXiv Detail & Related papers (2022-07-14T20:18:58Z)
- Semi-Supervised Domain Generalization with Stochastic StyleMatch [90.98288822165482]
In real-world applications, we might have only a few labels available from each source domain due to high annotation cost.
In this work, we investigate semi-supervised domain generalization, a more realistic and practical setting.
Our proposed approach, StyleMatch, is inspired by FixMatch, a state-of-the-art semi-supervised learning method based on pseudo-labeling.
arXiv Detail & Related papers (2021-06-01T16:00:08Z)
- BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models [41.45240621979654]
We introduce BEIR, a heterogeneous benchmark for information retrieval.
We study the effectiveness of nine state-of-the-art retrieval models in a zero-shot evaluation setup.
Dense-retrieval models are computationally more efficient but often underperform other approaches.
arXiv Detail & Related papers (2021-04-17T23:29:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.