WildlifeDatasets: An open-source toolkit for animal re-identification
- URL: http://arxiv.org/abs/2311.09118v2
- Date: Thu, 14 Dec 2023 08:04:16 GMT
- Title: WildlifeDatasets: An open-source toolkit for animal re-identification
- Authors: Vojtěch Čermák, Lukas Picek, Lukáš Adam, Kostas Papafitsoros
- Abstract summary: WildlifeDatasets is an open-source toolkit for ecologists and computer-vision / machine-learning researchers.
WildlifeDatasets is written in Python and allows straightforward access to publicly available wildlife datasets.
We provide the first-ever foundation model for individual re-identification within a wide range of species - MegaDescriptor.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this paper, we present WildlifeDatasets
(https://github.com/WildlifeDatasets/wildlife-datasets) - an open-source
toolkit intended primarily for ecologists and computer-vision /
machine-learning researchers. The toolkit is written in Python, allows
straightforward access to publicly available wildlife datasets, and provides a
wide variety of methods for dataset pre-processing, performance analysis, and
model fine-tuning. We showcase the toolkit in various scenarios and baseline
experiments, including, to the best of our knowledge, the most comprehensive
experimental comparison of datasets and methods for wildlife re-identification,
covering both local descriptors and deep learning approaches. Furthermore, we
provide the first-ever foundation model for individual re-identification within
a wide range of species - MegaDescriptor - that provides state-of-the-art
performance on animal re-identification datasets and outperforms other
pre-trained models such as CLIP and DINOv2 by a significant margin. To make the
model available to the general public and to allow easy integration with any
existing wildlife monitoring applications, we provide multiple MegaDescriptor
flavors (i.e., Small, Medium, and Large) through the HuggingFace hub
(https://huggingface.co/BVRA).
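As a rough illustration of the workflow the abstract describes, the sketch below loads one of the supported datasets with the toolkit and extracts descriptors with a MegaDescriptor checkpoint from the BVRA HuggingFace organisation via timm. Exact class and model names (e.g. MacaqueFaces, 'hf-hub:BVRA/MegaDescriptor-L-384') are assumptions based on the project's public documentation and may differ between versions; this is not code from the paper.

```python
# Minimal sketch (hypothetical names): dataset access with wildlife-datasets
# and feature extraction with a MegaDescriptor flavor from the HF hub.
import timm
import torch
from wildlife_datasets import datasets

# Download one of the publicly available datasets and load it as a pandas
# DataFrame with image paths and individual identities.
datasets.MacaqueFaces.get_data("data/MacaqueFaces")
dataset = datasets.MacaqueFaces("data/MacaqueFaces")
print(dataset.df.head())  # columns typically include 'identity' and 'path'

# Load a MegaDescriptor checkpoint and use it as a frozen feature extractor.
model = timm.create_model(
    "hf-hub:BVRA/MegaDescriptor-L-384", num_classes=0, pretrained=True
)
model.eval()

with torch.no_grad():
    dummy = torch.randn(1, 3, 384, 384)  # one pre-processed query image
    embedding = model(dummy)             # descriptor used for matching
print(embedding.shape)
```

With embeddings in hand, re-identification reduces to nearest-neighbour matching (e.g. cosine similarity) between query and database descriptors, which is consistent with how the abstract positions MegaDescriptor as a general-purpose descriptor for many species.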
Related papers
- Meta-Feature Adapter: Integrating Environmental Metadata for Enhanced Animal Re-identification [7.272706868932979]
We propose a lightweight module designed to integrate environmental metadata into vision-language foundation models, such as CLIP.
Our approach translates environmental metadata into natural language descriptions, encodes them into metadata-aware text embeddings, and incorporates these embeddings into image features through a cross-attention mechanism.
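A minimal, self-contained sketch of the mechanism described above (not the authors' code; all module names, dimensions, and the example metadata string are hypothetical): verbalised metadata is encoded into text embeddings and fused into image features via cross-attention with a residual connection.

```python
# Hypothetical illustration of metadata-aware cross-attention fusion.
import torch
import torch.nn as nn

class MetadataCrossAttention(nn.Module):
    def __init__(self, dim: int = 512, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, image_tokens: torch.Tensor, meta_tokens: torch.Tensor):
        # image_tokens: (B, N, dim) patch/region features from the image encoder
        # meta_tokens:  (B, M, dim) embeddings of a verbalised description such as
        #               "captured at dawn, 24 °C, near a waterhole"
        fused, _ = self.attn(query=image_tokens, key=meta_tokens, value=meta_tokens)
        return self.norm(image_tokens + fused)  # residual fusion

# Usage with random stand-ins for encoder outputs.
module = MetadataCrossAttention()
img = torch.randn(2, 49, 512)   # e.g. a 7x7 grid of visual tokens
meta = torch.randn(2, 12, 512)  # encoded metadata description
print(module(img, meta).shape)  # torch.Size([2, 49, 512])
```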
arXiv Detail & Related papers (2025-01-23T04:14:59Z)
- Multispecies Animal Re-ID Using a Large Community-Curated Dataset [0.19418036471925312]
We construct a dataset that includes 49 species, 37K individual animals, and 225K images, using this data to train a single embedding network for all species.
Our model consistently outperforms models trained separately on each species, achieving an average gain of 12.5% in top-1 accuracy.
The model is already in production use for 60+ species in a large-scale wildlife monitoring system.
arXiv Detail & Related papers (2024-12-07T09:56:33Z)
- Categorical Keypoint Positional Embedding for Robust Animal Re-Identification [22.979350771097966]
Animal re-identification (ReID) has become an indispensable tool in ecological research.
Unlike human ReID, animal ReID faces significant challenges due to the high variability in animal poses, diverse environmental conditions, and the inability to directly apply pre-trained models to animal data.
This work introduces an innovative keypoint propagation mechanism, which utilizes a single annotated pre-trained diffusion model.
arXiv Detail & Related papers (2024-12-01T14:09:00Z)
- BirdSAT: Cross-View Contrastive Masked Autoencoders for Bird Species Classification and Mapping [22.30038765017189]
We propose a metadata-aware self-supervised learning (SSL) framework useful for fine-grained classification and ecological mapping of bird species around the world.
Our framework unifies two SSL strategies: Contrastive Learning (CL) and Masked Image Modeling (MIM), while also enriching the embedding space with metadata available with ground-level imagery of birds.
We demonstrate that our models learn fine-grained and geographically conditioned features of birds, by evaluating on two downstream tasks: fine-grained visual classification (FGVC) and cross-modal retrieval.
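The general recipe described above can be made concrete with a short, hedged sketch (not BirdSAT's code): a contrastive term between paired embeddings (e.g. ground-level image vs. a cross-view or metadata-conditioned embedding) is combined with an MAE-style masked-reconstruction term. All names, weights, and tensor shapes are hypothetical.

```python
# Hypothetical combination of a contrastive (CL) and a masked-image-modeling
# (MIM) objective, illustrating the unified SSL training signal.
import torch
import torch.nn.functional as F

def info_nce(a: torch.Tensor, b: torch.Tensor, temperature: float = 0.07):
    # a, b: (B, D) paired embeddings of the two views.
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    logits = a @ b.t() / temperature
    targets = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, targets)

def combined_ssl_loss(ground_emb, cross_view_emb, reconstructed_patches,
                      target_patches, mask, mim_weight: float = 1.0):
    cl_loss = info_nce(ground_emb, cross_view_emb)
    # Reconstruction error only on masked patches, as in MAE-style MIM.
    mim_loss = ((reconstructed_patches - target_patches) ** 2).mean(dim=-1)
    mim_loss = (mim_loss * mask).sum() / mask.sum().clamp(min=1)
    return cl_loss + mim_weight * mim_loss
```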
arXiv Detail & Related papers (2023-10-29T22:08:00Z)
- SPRINT: A Unified Toolkit for Evaluating and Demystifying Zero-shot Neural Sparse Retrieval [92.27387459751309]
We provide SPRINT, a unified Python toolkit for evaluating neural sparse retrieval.
We establish strong and reproducible zero-shot sparse retrieval baselines on the widely used BEIR benchmark.
We show that SPLADEv2 produces sparse representations with a majority of tokens outside of the original query and document.
arXiv Detail & Related papers (2023-07-19T22:48:02Z)
- infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information [68.76707843019886]
infoVerse is a universal framework for dataset characterization.
infoVerse captures multidimensional characteristics of datasets by incorporating various model-driven meta-information.
In three real-world applications (data pruning, active learning, and data annotation), the samples chosen on infoVerse space consistently outperform strong baselines.
arXiv Detail & Related papers (2023-05-30T18:12:48Z)
- Modeling Entities as Semantic Points for Visual Information Extraction in the Wild [55.91783742370978]
We propose an alternative approach to precisely and robustly extract key information from document images.
We explicitly model entities as semantic points, i.e., center points of entities are enriched with semantic information describing the attributes and relationships of different entities.
The proposed method can achieve significantly enhanced performance on entity labeling and linking, compared with previous state-of-the-art models.
arXiv Detail & Related papers (2023-03-23T08:21:16Z)
- Omnidata: A Scalable Pipeline for Making Multi-Task Mid-Level Vision Datasets from 3D Scans [103.92680099373567]
This paper introduces a pipeline to parametrically sample and render multi-task vision datasets from comprehensive 3D scans from the real world.
Changing the sampling parameters allows one to "steer" the generated datasets to emphasize specific information.
Common architectures trained on a generated starter dataset reached state-of-the-art performance on multiple common vision tasks and benchmarks.
arXiv Detail & Related papers (2021-10-11T04:21:46Z)
- Unsupervised Domain Adaptive Learning via Synthetic Data for Person Re-identification [101.1886788396803]
Person re-identification (re-ID) has attracted increasing attention due to its widespread applications in video surveillance.
Unfortunately, the mainstream deep learning methods still need a large quantity of labeled data to train models.
In this paper, we develop a data collector to automatically generate synthetic re-ID samples in a computer game, and construct a data labeler to simultaneously annotate them.
arXiv Detail & Related papers (2021-09-12T15:51:41Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)