Related papers: Leveraging Habitat Information for Fine-grained Bird Identification

URL: http://arxiv.org/abs/2312.14999v1
Date: Fri, 22 Dec 2023 16:23:22 GMT
Title: Leveraging Habitat Information for Fine-grained Bird Identification
Authors: Tin Nguyen, Anh Nguyen
Abstract summary: We are the first to explore integrating habitat information, one of the four major cues for identifying birds by ornithologists, into modern bird classifiers. We focus on two leading model types: CNNs and ViTs trained on the downstream bird datasets; and original, multi-modal CLIP. Training CNNs and ViTs with habitat-augmented data results in an improvement of up to +0.83 and +0.23 points on NABirds and CUB-200, respectively.
Score: 4.392299539811761
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Traditional bird classifiers mostly rely on the visual characteristics of birds. Some prior works even train classifiers to be invariant to the background, completely discarding the living environment of birds. Instead, we are the first to explore integrating habitat information, one of the four major cues for identifying birds by ornithologists, into modern bird classifiers. We focus on two leading model types: (1) CNNs and ViTs trained on the downstream bird datasets; and (2) original, multi-modal CLIP. Training CNNs and ViTs with habitat-augmented data results in an improvement of up to +0.83 and +0.23 points on NABirds and CUB-200, respectively. Similarly, adding habitat descriptors to the prompts for CLIP yields a substantial accuracy boost of up to +0.99 and +1.1 points on NABirds and CUB-200, respectively. We find consistent accuracy improvement after integrating habitat features into the image augmentation process and into the textual descriptors of vision-language CLIP classifiers. Code is available at: https://anonymous.4open.science/r/reasoning-8B7E/.
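The prompt-side idea in the abstract (adding habitat descriptors to CLIP prompts) might look roughly like the sketch below. This is an illustration under stated assumptions, not the authors' code: the function name `build_habitat_prompts`, the prompt template, and the habitat lists are all hypothetical, and the paper's actual wording and descriptor source may differ.

```python
from typing import Dict, List


def build_habitat_prompts(
    habitats: Dict[str, List[str]],
    template: str = "a photo of a {name}, a bird that lives in {habitat}.",
) -> Dict[str, List[str]]:
    """Return one text prompt per (class, habitat) pair.

    The template is an assumed wording, chosen only for illustration.
    """
    prompts: Dict[str, List[str]] = {}
    for name, places in habitats.items():
        if places:
            prompts[name] = [
                template.format(name=name, habitat=h) for h in places
            ]
        else:
            # Fall back to a plain class-name prompt when no habitat is known.
            prompts[name] = [f"a photo of a {name}."]
    return prompts


# Hypothetical habitat descriptors, written by hand for illustration only.
EXAMPLE_HABITATS = {
    "Mallard": ["ponds", "marshes"],
    "Horned Lark": ["open fields"],
    "Painted Bunting": [],
}

EXAMPLE_PROMPTS = build_habitat_prompts(EXAMPLE_HABITATS)
```

In a zero-shot setup, each prompt would then be passed through a CLIP text encoder and compared with the image embedding; how the per-class prompt scores are combined is left open here.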

Related papers

Modeling Habitat Shifts: Integrating Convolutional Neural Networks and Tabular Data for Species Migration Prediction [0.0]
We propose a solution to accurately model whether bird species are present in a specific habitat. Our approach makes use of satellite imagery and environmental features to predict bird presence across various climates. Both systems predict the distribution of birds with an average accuracy of 85%, offering a scalable but reliable method to understand bird migration.
arXiv Detail & Related papers (2025-07-15T05:17:58Z)
BioCLIP 2: Emergent Properties from Scaling Hierarchical Contrastive Learning [51.341003735575335]
We find emergent behaviors in biological vision models via large-scale contrastive vision-language training. We train BioCLIP 2 on TreeOfLife-200M to distinguish different species. We identify emergent properties in the learned embedding space of BioCLIP 2.
arXiv Detail & Related papers (2025-05-29T17:48:20Z)
A Bird Song Detector for improving bird identification through Deep Learning: a case study from Doñana [2.7924253850013416]
We develop a pipeline for automatic bird vocalization identification in Doñana National Park (SW Spain). We manually annotated 461 minutes of audio from three habitats across nine locations, yielding 3,749 annotations for 34 classes. Applying the Bird Song Detector before classification improved species identification, as all classification models performed better when analyzing only the segments where birds were detected.
arXiv Detail & Related papers (2025-03-19T13:19:06Z)
External Knowledge Injection for CLIP-Based Class-Incremental Learning [62.516402566610395]
Class-Incremental Learning (CIL) enables learning systems to continuously adapt to evolving data streams. We introduce ExterNal knowledGe INjEction (ENGINE) for CLIP-based CIL.
arXiv Detail & Related papers (2025-03-11T15:00:22Z)
Visual WetlandBirds Dataset: Bird Species Identification and Behavior Recognition in Videos [0.0]
This study introduces the first fine-grained video dataset specifically designed for bird behavior detection and species classification. The proposed dataset comprises 178 videos recorded in Spanish wetlands, capturing 13 different bird species performing 7 distinct behavior classes.
arXiv Detail & Related papers (2025-01-15T16:34:20Z)
AudioProtoPNet: An interpretable deep learning model for bird sound classification [1.49199020343864]
This study introduces AudioProtoPNet, an adaptation of the Prototypical Part Network (ProtoPNet) for multi-label bird sound classification. It is an inherently interpretable model that uses a ConvNeXt backbone to extract embeddings. The model was trained on the BirdSet training dataset, which consists of 9,734 bird species and over 6,800 hours of recordings.
arXiv Detail & Related papers (2024-04-16T09:37:41Z)
BirdSAT: Cross-View Contrastive Masked Autoencoders for Bird Species Classification and Mapping [22.30038765017189]
We propose a metadata-aware self-supervised learning (SSL) framework useful for fine-grained classification and ecological mapping of bird species around the world. Our framework unifies two SSL strategies, Contrastive Learning (CL) and Masked Image Modeling (MIM), while also enriching the embedding space with metadata available with ground-level imagery of birds. We demonstrate that our models learn fine-grained and geographically conditioned features of birds by evaluating on two downstream tasks: fine-grained visual classification (FGVC) and cross-modal retrieval.
arXiv Detail & Related papers (2023-10-29T22:08:00Z)
Exploring Meta Information for Audio-based Zero-shot Bird Classification [113.17261694996051]
This study investigates how meta-information can improve zero-shot audio classification. We use bird species as an example case study due to the availability of rich and diverse meta-data.
arXiv Detail & Related papers (2023-09-15T13:50:16Z)
Recognition of Unseen Bird Species by Learning from Field Guides [23.137536032163855]
We exploit field guides to learn bird species recognition, in particular zero-shot recognition of unseen species. We study two approaches: (1) a contrastive encoding of illustrations, which can be fed into standard zero-shot learning schemes; and (2) a novel method that leverages the fact that illustrations are also images. Our results show that illustrations from field guides, which are readily available for a wide range of species, are indeed a competitive source of side information for zero-shot learning.
arXiv Detail & Related papers (2022-06-03T09:13:46Z)
Calibrating Class Activation Maps for Long-Tailed Visual Recognition [60.77124328049557]
We present two effective modifications of CNNs to improve network learning from long-tailed distributions. First, we present a Class Activation Map Calibration (CAMC) module to improve the learning and prediction of network classifiers. Second, we investigate the use of normalized classifiers for representation learning in long-tailed problems.
arXiv Detail & Related papers (2021-08-29T05:45:03Z)
No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data [78.69828864672978]
A central challenge in training classification models in real-world federated systems is learning with non-IID data. We propose a novel and simple algorithm called Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model. Experimental results demonstrate that CCVR achieves state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10.
arXiv Detail & Related papers (2021-06-09T12:02:29Z)
Your "Flamingo" is My "Bird": Fine-Grained, or Not [60.25769809922673]
We investigate how to tailor for different fine-grained definitions under divergent levels of expertise. We first conduct a comprehensive human study where we confirm that most participants prefer multi-granularity labels. We then discover the key intuition that: coarse-level label prediction exacerbates fine-grained feature learning.
arXiv Detail & Related papers (2020-11-18T02:24:54Z)
ALICE: Active Learning with Contrastive Natural Language Explanations [69.03658685761538]
We propose Active Learning with Contrastive Explanations (ALICE) to improve data efficiency in learning. ALICE first uses active learning to select the most informative pairs of label classes to elicit contrastive natural language explanations. It then extracts knowledge from these explanations using a semantic parser.
arXiv Detail & Related papers (2020-09-22T01:02:07Z)
Feathers dataset for Fine-Grained Visual Categorization [0.0]
FeatherV1 is the first publicly available bird plumage dataset for machine learning. It can raise interest in a new task in the fine-grained visual recognition domain.
arXiv Detail & Related papers (2020-04-18T12:40:43Z)
Transferring Dense Pose to Proximal Animal Classes [83.84439508978126]
We show that it is possible to transfer the knowledge existing in dense pose recognition for humans, as well as in more general object detectors and segmenters, to the problem of dense pose recognition in other classes. We do this by establishing a DensePose model for the new animal which is also geometrically aligned to humans. We also introduce two benchmark datasets labelled in the manner of DensePose for the class chimpanzee and use them to evaluate our approach.
arXiv Detail & Related papers (2020-02-28T21:43:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.