WildlifeDatasets: An open-source toolkit for animal re-identification
- URL: http://arxiv.org/abs/2311.09118v2
- Date: Thu, 14 Dec 2023 08:04:16 GMT
- Title: WildlifeDatasets: An open-source toolkit for animal re-identification
- Authors: Vojtěch Čermák, Lukas Picek, Lukáš Adam, Kostas Papafitsoros
- Abstract summary: WildlifeDatasets is an open-source toolkit for ecologists and computer-vision / machine-learning researchers.
WildlifeDatasets is written in Python and allows straightforward access to publicly available wildlife datasets.
We provide the first-ever foundation model for individual re-identification within a wide range of species - MegaDescriptor.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this paper, we present WildlifeDatasets
(https://github.com/WildlifeDatasets/wildlife-datasets) - an open-source
toolkit intended primarily for ecologists and computer-vision /
machine-learning researchers. WildlifeDatasets is written in Python, allows
straightforward access to publicly available wildlife datasets, and provides a
wide variety of methods for dataset pre-processing, performance analysis, and
model fine-tuning. We showcase the toolkit in various scenarios and baseline
experiments, including, to the best of our knowledge, the most comprehensive
experimental comparison of datasets and methods for wildlife re-identification,
covering both local descriptors and deep learning approaches. Furthermore, we
provide the first-ever foundation model for individual re-identification within
a wide range of species - MegaDescriptor - that provides state-of-the-art
performance on animal re-identification datasets and outperforms other
pre-trained models such as CLIP and DINOv2 by a significant margin. To make the
model available to the general public and to allow easy integration with any
existing wildlife monitoring applications, we provide multiple MegaDescriptor
flavors (i.e., Small, Medium, and Large) through the HuggingFace hub
(https://huggingface.co/BVRA).
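A minimal usage sketch in Python, assuming the wildlife-datasets package exposes dataset classes (e.g., MacaqueFaces) with a get_data() download helper and a .df metadata frame, and that the MegaDescriptor flavors on the HuggingFace hub load through timm under identifiers such as BVRA/MegaDescriptor-L-384; these names and calls are illustrative assumptions based on the project pages, not confirmed by the abstract above:

    # Sketch: access a public dataset via wildlife-datasets and extract
    # MegaDescriptor embeddings. Dataset/model identifiers and the
    # get_data()/.df API are assumptions, not guaranteed by this abstract.
    import timm
    import torch
    from wildlife_datasets import datasets

    # Download one of the publicly available datasets and load its metadata.
    datasets.MacaqueFaces.get_data("data/MacaqueFaces")   # fetches the images
    dataset = datasets.MacaqueFaces("data/MacaqueFaces")  # builds the metadata frame
    print(dataset.df.head())                              # identities, image paths, etc.

    # Load a MegaDescriptor flavor from the HuggingFace hub through timm and
    # use it as a frozen feature extractor for re-identification.
    model = timm.create_model(
        "hf-hub:BVRA/MegaDescriptor-L-384", pretrained=True, num_classes=0
    )
    model.eval()

    images = torch.randn(1, 3, 384, 384)                  # placeholder image batch
    with torch.no_grad():
        embeddings = model(images)                        # one descriptor per image
    print(embeddings.shape)

In this sketch the extracted descriptors would then be compared (e.g., by cosine similarity) to match individuals across images, which is the typical re-identification workflow the toolkit targets.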
Related papers
- Public Computer Vision Datasets for Precision Livestock Farming: A Systematic Survey [3.3651853492305177]
This study presents the first systematic survey of publicly available livestock CV datasets.
Among 58 public datasets identified and analyzed, almost half of them are for cattle, followed by swine, poultry, and other animals.
Individual animal detection and color imaging are the dominant application and imaging modality for livestock.
arXiv Detail & Related papers (2024-06-15T13:22:41Z)
- BirdSAT: Cross-View Contrastive Masked Autoencoders for Bird Species Classification and Mapping [22.30038765017189]
We propose a metadata-aware self-supervised learning (SSL) framework useful for fine-grained classification and ecological mapping of bird species around the world.
Our framework unifies two SSL strategies: Contrastive Learning (CL) and Masked Image Modeling (MIM), while also enriching the embedding space with metadata available with ground-level imagery of birds.
We demonstrate that our models learn fine-grained and geographically conditioned features of birds, by evaluating on two downstream tasks: fine-grained visual classification (FGVC) and cross-modal retrieval.
arXiv Detail & Related papers (2023-10-29T22:08:00Z)
- SPRINT: A Unified Toolkit for Evaluating and Demystifying Zero-shot Neural Sparse Retrieval [92.27387459751309]
We provide SPRINT, a unified Python toolkit for evaluating neural sparse retrieval.
We establish strong and reproducible zero-shot sparse retrieval baselines on the well-established BEIR benchmark.
We show that SPLADEv2 produces sparse representations with a majority of tokens outside of the original query and document.
arXiv Detail & Related papers (2023-07-19T22:48:02Z)
- infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information [68.76707843019886]
infoVerse is a universal framework for dataset characterization.
infoVerse captures multidimensional characteristics of datasets by incorporating various model-driven meta-information.
In three real-world applications (data pruning, active learning, and data annotation), the samples chosen on infoVerse space consistently outperform strong baselines.
arXiv Detail & Related papers (2023-05-30T18:12:48Z)
- Modeling Entities as Semantic Points for Visual Information Extraction in the Wild [55.91783742370978]
We propose an alternative approach to precisely and robustly extract key information from document images.
We explicitly model entities as semantic points, i.e., center points of entities are enriched with semantic information describing the attributes and relationships of different entities.
The proposed method can achieve significantly enhanced performance on entity labeling and linking, compared with previous state-of-the-art models.
arXiv Detail & Related papers (2023-03-23T08:21:16Z)
- Towards Individual Grevy's Zebra Identification via Deep 3D Fitting and Metric Learning [2.004276260443012]
This paper combines deep learning techniques for species detection, 3D model fitting, and metric learning in one pipeline to perform individual animal identification.
We show in a small study on the SMALST dataset that the use of 3D model fitting can indeed benefit performance.
Back-projected textures from 3D fitted models improve identification accuracy from 48.0% to 56.8% compared to 2D bounding box approaches.
arXiv Detail & Related papers (2022-06-05T20:44:54Z)
- Omnidata: A Scalable Pipeline for Making Multi-Task Mid-Level Vision Datasets from 3D Scans [103.92680099373567]
This paper introduces a pipeline to parametrically sample and render multi-task vision datasets from comprehensive 3D scans from the real world.
Changing the sampling parameters allows one to "steer" the generated datasets to emphasize specific information.
Common architectures trained on a generated starter dataset reached state-of-the-art performance on multiple common vision tasks and benchmarks.
arXiv Detail & Related papers (2021-10-11T04:21:46Z)
- Unsupervised Domain Adaptive Learning via Synthetic Data for Person Re-identification [101.1886788396803]
Person re-identification (re-ID) has gained more and more attention due to its widespread applications in video surveillance.
Unfortunately, the mainstream deep learning methods still need a large quantity of labeled data to train models.
In this paper, we develop a data collector to automatically generate synthetic re-ID samples in a computer game, and construct a data labeler to simultaneously annotate them.
arXiv Detail & Related papers (2021-09-12T15:51:41Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.