How many images do I need? Understanding how sample size per class
affects deep learning model performance metrics for balanced designs in
autonomous wildlife monitoring
- URL: http://arxiv.org/abs/2010.08186v1
- Date: Fri, 16 Oct 2020 06:28:35 GMT
- Title: How many images do I need? Understanding how sample size per class
affects deep learning model performance metrics for balanced designs in
autonomous wildlife monitoring
- Authors: Saleh Shahinfar, Paul Meek, Greg Falzon
- Abstract summary: We explore in depth the issues of deep learning model performance for progressively increasing per class (species) sample sizes.
We provide ecologists with an approximation formula to estimate, a priori, how many images per animal species they need for a given accuracy level.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning (DL) algorithms are the state of the art in automated
classification of wildlife camera trap images. The challenge is that the
ecologist cannot know in advance how many images per species they need to
collect for model training in order to achieve their desired classification
accuracy. In fact, there is limited empirical evidence in the context of camera
trapping to demonstrate that increasing sample size will lead to improved
accuracy. In this study we explore in depth the issues of deep learning model
performance for progressively increasing per class (species) sample sizes. We
also provide ecologists with an approximation formula to estimate, a priori, how
many images per animal species they need for a given accuracy level. This
will help ecologists allocate resources and work optimally and design studies
efficiently. To investigate the effect of the number of training images,
seven training sets with 10, 20, 50, 150, 500, 1000 images per class were
designed. Six deep learning architectures, namely ResNet-18, ResNet-50,
ResNet-152, DenseNet-121, DenseNet-161, and DenseNet-201, were trained and tested on a
common exclusive testing set of 250 images per class. The whole experiment was
repeated on three similar datasets from Australia, Africa and North America and
the results were compared. Simple regression equations for use by practitioners
to approximate model performance metrics are provided. Generalized additive
models (GAM) are shown to be effective in modelling DL performance metrics
based on the number of training images per class, tuning scheme and dataset.
Key-words: Camera Traps, Deep Learning, Ecological Informatics, Generalised
Additive Models, Learning Curves, Predictive Modelling, Wildlife.
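The design described in the abstract (balanced training subsets of increasing size, a shared held-out test set of 250 images per class, and a simple regression for approximating performance) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the nesting of smaller subsets inside larger ones is an assumption, and the log-linear curve `score = a + b * ln(n)` is assumed purely for illustration, since the abstract does not state the form of the regression equations.

```python
import math
import random
from collections import defaultdict

# Per-class training sizes as listed in the abstract (six values are listed there).
TRAIN_SIZES = [10, 20, 50, 150, 500, 1000]
TEST_PER_CLASS = 250  # shared test set size per class, from the abstract


def balanced_splits(labels, train_sizes=TRAIN_SIZES,
                    test_per_class=TEST_PER_CLASS, seed=0):
    """Return (test_indices, {size: train_indices}) with equal counts per class.

    Assumptions (not from the paper): the test images are disjoint from every
    training subset, and smaller training subsets are nested in larger ones.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    test, train = [], {n: [] for n in train_sizes}
    needed = test_per_class + max(train_sizes)
    for label in sorted(by_class):
        idxs = by_class[label]
        if len(idxs) < needed:
            raise ValueError(
                f"class {label!r} has {len(idxs)} images, needs {needed}")
        rng.shuffle(idxs)
        test.extend(idxs[:test_per_class])   # held-out test images
        pool = idxs[test_per_class:]         # remaining images for training
        for n in train_sizes:
            train[n].extend(pool[:n])        # first n images of the pool
    return test, train


def fit_log_curve(sizes, scores):
    """Ordinary least-squares fit of score = a + b * ln(n) (assumed form)."""
    xs = [math.log(n) for n in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(scores) / len(scores)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def images_needed(a, b, target):
    """Invert the fitted curve: smallest integer n with a + b * ln(n) >= target."""
    if b <= 0:
        raise ValueError("the fitted curve does not increase with sample size")
    return math.ceil(math.exp((target - a) / b))
```

In use, one would train a model on each `train[n]`, evaluate every model on the same `test` indices, pass the resulting (size, score) pairs to `fit_log_curve`, and then call `images_needed` with the desired accuracy. Extrapolating far beyond the largest observed training size should be treated with caution.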
Related papers
- Pushing Boundaries: Exploring Zero Shot Object Classification with Large
Multimodal Models [0.09264362806173355]
Large Language and Vision Assistant models (LLVAs) engage users in rich conversational experiences intertwined with image-based queries.
This paper takes a unique perspective on LMMs, exploring their efficacy in performing image classification tasks using tailored prompts.
Our study includes a benchmarking analysis across four diverse datasets: MNIST, Cats Vs. Dogs, Hymenoptera (Ants Vs. Bees), and an unconventional dataset comprising Pox Vs. Non-Pox skin images.
arXiv Detail & Related papers (2023-12-30T03:19:54Z)
- Delving Deeper into Data Scaling in Masked Image Modeling [145.36501330782357]
We conduct an empirical study on the scaling capability of masked image modeling (MIM) methods for visual recognition.
Specifically, we utilize the web-collected Coyo-700M dataset.
Our goal is to investigate how the performance changes on downstream tasks when scaling with different sizes of data and models.
arXiv Detail & Related papers (2023-05-24T15:33:46Z)
- The effectiveness of MAE pre-pretraining for billion-scale pretraining [65.98338857597935]
We introduce an additional pre-pretraining stage that is simple and uses the self-supervised MAE technique to initialize the model.
We measure the effectiveness of pre-pretraining on 10 different visual recognition tasks spanning image classification, video recognition, object detection, low-shot classification and zero-shot recognition.
arXiv Detail & Related papers (2023-03-23T17:56:12Z)
- Facilitated machine learning for image-based fruit quality assessment in
developing countries [68.8204255655161]
Automated image classification is a common task for supervised machine learning in food science.
We propose an alternative method based on pre-trained vision transformers (ViTs).
It can be easily implemented with limited resources on a standard device.
arXiv Detail & Related papers (2022-07-10T19:52:20Z)
- KNN-Diffusion: Image Generation via Large-Scale Retrieval [40.6656651653888]
Learning to adapt enables several new capabilities.
Fine-tuning trained models to new samples can be achieved by simply adding them to the table.
Our diffusion-based model trains on images only, by leveraging a joint Text-Image multi-modal metric.
arXiv Detail & Related papers (2022-04-06T14:13:35Z)
- Choosing an Appropriate Platform and Workflow for Processing Camera Trap
Data using Artificial Intelligence [0.18350044465969417]
Camera traps have transformed how ecologists study wildlife species distributions, activity patterns, and interspecific interactions.
The potential of Artificial Intelligence (AI), specifically Deep Learning (DL), to process camera-trap data has gained considerable attention.
Using DL for these applications involves training algorithms, such as Convolutional Neural Networks (CNNs), to automatically detect objects and classify species.
arXiv Detail & Related papers (2022-02-04T18:13:09Z)
- Meta Internal Learning [88.68276505511922]
Internal learning for single-image generation is a framework, where a generator is trained to produce novel images based on a single image.
We propose a meta-learning approach that enables training over a collection of images, in order to model the internal statistics of the sample image more effectively.
Our results show that the models obtained are as suitable as single-image GANs for many common image applications.
arXiv Detail & Related papers (2021-10-06T16:27:38Z)
- Deep Low-Shot Learning for Biological Image Classification and
Visualization from Limited Training Samples [52.549928980694695]
In situ hybridization (ISH) gene expression pattern images from the same developmental stage are compared.
Labeling training data with precise stages is very time-consuming even for biologists.
We propose a deep two-step low-shot learning framework to accurately classify ISH images using limited training images.
arXiv Detail & Related papers (2020-10-20T06:06:06Z)
- Background Splitting: Finding Rare Classes in a Sea of Background [55.03789745276442]
We focus on the real-world problem of training accurate deep models for image classification of a small number of rare categories.
In these scenarios, almost all images belong to the background category in the dataset (>95% of the dataset is background).
We demonstrate that both standard fine-tuning approaches and state-of-the-art approaches for training on imbalanced datasets do not produce accurate deep models in the presence of this extreme imbalance.
arXiv Detail & Related papers (2020-08-28T23:05:15Z)
- Multi-task pre-training of deep neural networks for digital pathology [8.74883469030132]
We first assemble and transform many digital pathology datasets into a pool of 22 classification tasks and almost 900k images.
We show that our models used as feature extractors either improve significantly over ImageNet pre-trained models or provide comparable performance.
arXiv Detail & Related papers (2020-05-05T08:50:17Z)