Rare Wildlife Recognition with Self-Supervised Representation Learning
- URL: http://arxiv.org/abs/2211.05636v1
- Date: Sat, 29 Oct 2022 17:57:38 GMT
- Title: Rare Wildlife Recognition with Self-Supervised Representation Learning
- Authors: Xiaochen Zheng
- Abstract summary: We present a methodology to reduce the amount of required training data by resorting to self-supervised pretraining.
We show that a combination of MoCo, CLD, and geometric augmentations outperforms conventional models pretrained on ImageNet by a large margin.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated animal censuses with aerial imagery are a vital ingredient of
wildlife conservation. Recent models are generally based on supervised learning
and thus require vast amounts of training data. Because animals are scarce and
appear at minuscule sizes, annotating them in aerial imagery is a highly tedious
process. In this project, we present a methodology to reduce the amount of
required training data by resorting to self-supervised pretraining. In detail,
we examine a combination of recent contrastive learning methodologies like
Momentum Contrast (MoCo) and Cross-Level Instance-Group Discrimination (CLD) to
condition our model on the aerial images without the requirement for labels. We
show that a combination of MoCo, CLD, and geometric augmentations outperforms
conventional models pretrained on ImageNet by a large margin. Meanwhile,
strategies that smooth the label or prediction distribution in supervised
learning have proven useful in preventing models from overfitting. We therefore
combine the self-supervised contrastive models with image mixup strategies and
find that this helps in learning more robust visual representations.
Crucially, our methods still yield favorable results even if we reduce the
number of annotated training animals to just 10%, at which point our best model
scores double the recall of the baseline at similar precision. This effectively
allows the number of required annotations to be cut to a fraction while still
training high-accuracy models in such highly challenging settings.
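
To make the pretraining recipe above concrete, the following is a minimal PyTorch sketch of MoCo-style momentum contrast paired with geometric augmentations (rotations and flips suit nadir aerial imagery, where animals have no canonical orientation). The backbone, crop size, queue size, and temperature are illustrative assumptions, and the CLD group-discrimination branch is omitted for brevity; this is not the authors' released code.

```python
# Minimal MoCo-style pretraining sketch with geometric augmentations.
# Hyperparameters are illustrative assumptions, not the paper's values.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import transforms

# Rotations and flips suit nadir aerial imagery, where animals
# have no canonical orientation.
geometric_aug = transforms.Compose([
    transforms.RandomResizedCrop(96, scale=(0.5, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.RandomRotation(degrees=180),
    transforms.ToTensor(),
])

class MoCo(nn.Module):
    """Momentum Contrast: a query encoder, a momentum-updated key
    encoder, and a queue of negative keys (He et al., 2020)."""

    def __init__(self, encoder_fn, dim=128, queue_size=4096, m=0.999, t=0.2):
        super().__init__()
        self.m, self.t = m, t
        self.encoder_q = encoder_fn(num_classes=dim)
        self.encoder_k = copy.deepcopy(self.encoder_q)
        for p in self.encoder_k.parameters():
            p.requires_grad = False          # keys are never backpropagated
        self.register_buffer("queue", F.normalize(torch.randn(dim, queue_size), dim=0))
        self.register_buffer("ptr", torch.zeros(1, dtype=torch.long))

    @torch.no_grad()
    def _momentum_update(self):
        for pq, pk in zip(self.encoder_q.parameters(), self.encoder_k.parameters()):
            pk.data = pk.data * self.m + pq.data * (1.0 - self.m)

    @torch.no_grad()
    def _enqueue(self, keys):
        ptr, bs = int(self.ptr), keys.shape[0]
        self.queue[:, ptr:ptr + bs] = keys.T  # assumes queue_size % bs == 0
        self.ptr[0] = (ptr + bs) % self.queue.shape[1]

    def forward(self, im_q, im_k):
        q = F.normalize(self.encoder_q(im_q), dim=1)
        with torch.no_grad():
            self._momentum_update()
            k = F.normalize(self.encoder_k(im_k), dim=1)
        # InfoNCE: one positive logit per query, negatives from the queue.
        l_pos = torch.einsum("nc,nc->n", q, k).unsqueeze(-1)
        l_neg = torch.einsum("nc,ck->nk", q, self.queue.clone().detach())
        logits = torch.cat([l_pos, l_neg], dim=1) / self.t
        labels = torch.zeros(logits.shape[0], dtype=torch.long, device=logits.device)
        self._enqueue(k)
        return F.cross_entropy(logits, labels)
```

A torchvision backbone plugs in directly, e.g. `loss = MoCo(torchvision.models.resnet18)(view_q, view_k)`, where `view_q` and `view_k` are two independently sampled `geometric_aug` views of the same image batch.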
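The image mixup regularization mentioned above can likewise be illustrated in a few lines. The paper pairs mixup with its contrastive models; the standard supervised interpolated-loss usage shown here follows the original mixup formulation (Zhang et al., 2018) and is assumed for illustration only.

```python
# Minimal input-mixup sketch (Zhang et al., 2018).
import torch
import torch.nn.functional as F

def mixup_batch(x, y, alpha=0.2):
    """Convexly combine a batch with a shuffled copy of itself."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    return lam * x + (1.0 - lam) * x[perm], y, y[perm], lam

# Usage during fine-tuning: interpolate the loss with the same weight,
# which smooths the prediction distribution much like label smoothing.
# x_mix, y_a, y_b, lam = mixup_batch(images, labels)
# logits = model(x_mix)
# loss = lam * F.cross_entropy(logits, y_a) + (1 - lam) * F.cross_entropy(logits, y_b)
```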
Related papers
- Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify model weaknesses by testing the model on the counterfactual image dataset, then employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
(arXiv, 2024-06-19)
- Pre-Trained Vision-Language Models as Partial Annotators [40.89255396643592]
Pre-trained vision-language models learn from massive data to model unified representations of images and natural language.
In this paper, we investigate a novel "pre-trained annotating - weakly-supervised learning" paradigm for pre-trained model application and experiment on image classification tasks.
(arXiv, 2024-05-23)
- EfficientTrain++: Generalized Curriculum Learning for Efficient Visual Backbone Training [79.96741042766524]
We reformulate the training curriculum as a soft-selection function.
We show that gradually exposing the contents of natural images can be readily achieved by controlling the intensity of data augmentation.
The resulting method, EfficientTrain++, is simple, general, yet surprisingly effective.
(arXiv, 2024-05-14)
- No Data Augmentation? Alternative Regularizations for Effective Training on Small Datasets [0.0]
We study alternative regularization strategies to push the limits of supervised learning on small image classification datasets.
In particular, we employ a model-agnostic heuristic to select (semi-)optimal learning rate and weight decay pairs via the norm of the model parameters.
We reach a test accuracy of 66.5%, on par with the best state-of-the-art methods.
(arXiv, 2023-09-04)
- Bag of Tricks for Long-Tail Visual Recognition of Animal Species in Camera Trap Images [2.294014185517203]
We evaluate recently proposed techniques to address the long-tail visual recognition of animal species in camera trap images.
In general, square-root sampling was the method that most improved performance on minority classes, by around 10%; a sketch of this sampling scheme appears after this list.
The proposed approach achieved the best trade-off between the performance of the tail class and the cost of the head classes' accuracy.
(arXiv, 2022-06-24)
- Self-Supervised Pretraining and Controlled Augmentation Improve Rare Wildlife Recognition in UAV Images [9.220908533011068]
We present a methodology to reduce the amount of required training data by resorting to self-supervised pretraining.
We show that a combination of MoCo, CLD, and geometric augmentations outperforms conventional models pre-trained on ImageNet by a large margin.
(arXiv, 2021-08-17)
- Zoo-Tuning: Adaptive Transfer from a Zoo of Models [82.9120546160422]
Zoo-Tuning learns to adaptively transfer the parameters of pretrained models to the target task.
We evaluate our approach on a variety of tasks, including reinforcement learning, image classification, and facial landmark detection.
(arXiv, 2021-06-29)
- Self-Damaging Contrastive Learning [92.34124578823977]
Unlabeled data in reality is commonly imbalanced and shows a long-tail distribution.
This paper proposes a principled framework called Self-Damaging Contrastive Learning (SDCLR) to automatically balance representation learning without knowing the classes.
Our experiments show that SDCLR significantly improves not only overall accuracies but also balancedness.
(arXiv, 2021-06-06)
- Deep learning with self-supervision and uncertainty regularization to count fish in underwater images [28.261323753321328]
Effective conservation actions require effective population monitoring.
Monitoring populations through image sampling has made data collection cheaper, wide-reaching and less intrusive.
Counting animals from such data is challenging, particularly when densely packed in noisy images.
Deep learning is the state-of-the-art method for many computer vision tasks, but it has yet to be properly explored for counting animals.
(arXiv, 2021-04-30)
- More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval [112.1756171062067]
We introduce a novel semi-supervised framework for cross-modal retrieval.
At the centre of our design is a sequential photo-to-sketch generation model.
We also introduce a discriminator-guided mechanism to guard against unfaithful generation.
(arXiv, 2021-03-25)
- Deep Low-Shot Learning for Biological Image Classification and Visualization from Limited Training Samples [52.549928980694695]
In situ hybridization (ISH) gene expression pattern images from the same developmental stage are compared.
Labeling training data with precise stages is very time-consuming, even for biologists.
We propose a deep two-step low-shot learning framework to accurately classify ISH images using limited training images.
(arXiv, 2020-10-20)
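
As referenced in the camera-trap entry above, here is a minimal sketch of square-root sampling for long-tailed data: each class is drawn with probability proportional to the square root of its frequency, which flattens the head-tail imbalance. The sampler API is standard PyTorch; the wiring is illustrative, not the cited paper's code.

```python
# Square-root sampling sketch for long-tailed class distributions.
from collections import Counter

import torch
from torch.utils.data import WeightedRandomSampler

def sqrt_sampler(labels):
    counts = Counter(labels)                                     # per-class sizes n_c
    per_sample_w = {c: n ** 0.5 / n for c, n in counts.items()}  # weight = sqrt(n_c)/n_c
    weights = torch.tensor([per_sample_w[y] for y in labels], dtype=torch.double)
    # A class with n_c samples is drawn with total probability proportional
    # to n_c * sqrt(n_c)/n_c = sqrt(n_c).
    return WeightedRandomSampler(weights, num_samples=len(labels), replacement=True)

# Usage: DataLoader(dataset, batch_size=64, sampler=sqrt_sampler(dataset_labels))
```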
This list is automatically generated from the titles and abstracts of the papers listed on this site.