Deep learning with self-supervision and uncertainty regularization to
count fish in underwater images
- URL: http://arxiv.org/abs/2104.14964v1
- Date: Fri, 30 Apr 2021 13:02:19 GMT
- Title: Deep learning with self-supervision and uncertainty regularization to
count fish in underwater images
- Authors: Penny Tarling, Mauricio Cantor, Albert Clapés and Sergio Escalera
- Abstract summary: Effective conservation actions require effective population monitoring.
Monitoring populations through image sampling has made data collection cheaper, wide-reaching and less intrusive.
Counting animals from such data is challenging, particularly when densely packed in noisy images.
Deep learning is the state-of-the-art method for many computer vision tasks, but it has yet to be properly explored to count animals.
- Score: 28.261323753321328
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Effective conservation actions require effective population monitoring.
However, accurately counting animals in the wild to inform conservation
decision-making is difficult. Monitoring populations through image sampling has
made data collection cheaper, wide-reaching and less intrusive but created a
need to process and analyse this data efficiently. Counting animals from such
data is challenging, particularly when densely packed in noisy images.
Attempting this manually is slow and expensive, while traditional computer
vision methods are limited in their generalisability. Deep learning is the
state-of-the-art method for many computer vision tasks, but it has yet to be
properly explored to count animals. To this end, we employ deep learning, with
a density-based regression approach, to count fish in low-resolution sonar
images. We introduce a large dataset of sonar videos, deployed to record wild
mullet schools (Mugil liza), with a subset of 500 labelled images. We utilise
abundant unlabelled data in a self-supervised task to improve the supervised
counting task. For the first time in this context, by introducing uncertainty
quantification, we improve model training and provide an accompanying measure
of prediction uncertainty for more informed biological decision-making.
Finally, we demonstrate the generalisability of our proposed counting framework
through testing it on a recent benchmark dataset of high-resolution annotated
underwater images from varying habitats (DeepFish). From experiments on both
contrasting datasets, we demonstrate our network outperforms the few other deep
learning models implemented for solving this task. By providing an open-source
framework along with training data, our study puts forth an efficient deep
learning template for crowd counting aquatic animals thereby contributing
effective methods to assess natural populations from the ever-increasing visual
data.
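The abstract mentions density-based regression and uncertainty quantification without giving details. As a minimal sketch of the general idea (not the authors' implementation), the following assumes dot annotations given as (row, col) pixel coordinates, builds a density map whose sum equals the number of annotated fish, and shows one common heteroscedastic regression loss in which the model predicts a log-variance alongside its estimate. The function names, the Gaussian width `sigma` and the form of the loss are all illustrative assumptions.

```python
import numpy as np


def density_map(points, shape, sigma=2.0):
    """Build a density map from dot annotations given as (row, col) pairs.

    Assumption: each fish is marked by a single point inside the image,
    and sigma is an illustrative kernel width, not a value from the paper.
    """
    h, w = shape
    rows = np.arange(h)[:, None]
    cols = np.arange(w)[None, :]
    dmap = np.zeros(shape, dtype=np.float64)
    for r, c in points:
        g = np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2.0 * sigma ** 2))
        # Normalise so that each annotated fish contributes exactly 1.
        dmap += g / g.sum()
    return dmap


def count_from_density(dmap):
    """The count estimate is the sum (discrete integral) of the density map."""
    return float(dmap.sum())


def heteroscedastic_loss(pred, target, log_var):
    """Gaussian negative log-likelihood with a predicted log-variance.

    Assumption: a commonly used formulation of an uncertainty-weighted
    regression loss; the paper's exact loss may differ.
    """
    pred, target, log_var = map(np.asarray, (pred, target, log_var))
    return float(np.mean(0.5 * np.exp(-log_var) * (pred - target) ** 2
                         + 0.5 * log_var))


# Hypothetical annotations for a 64x64 image containing three fish.
annotations = [(10, 12), (11, 14), (30, 40)]
dmap = density_map(annotations, shape=(64, 64))
estimated_count = count_from_density(dmap)

# With log_var = 0 the loss reduces to half the mean squared error.
example_loss = heteroscedastic_loss([1.0, 3.0], [2.0, 2.0], [0.0, 0.0])
```

In such a setup a network would be trained to regress the density map from the image, and the predicted log-variance would serve as the accompanying measure of uncertainty for each estimate.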
Related papers
- Multimodal Foundation Models for Zero-shot Animal Species Recognition in
Camera Trap Images [57.96659470133514]
Motion-activated camera traps constitute an efficient tool for tracking and monitoring wildlife populations across the globe.
Supervised learning techniques have been successfully deployed to analyze such imagery; however, training such techniques requires annotations from experts.
Reducing the reliance on costly labelled data has immense potential in developing large-scale wildlife tracking solutions with markedly less human labor.
arXiv Detail & Related papers (2023-11-02T08:32:00Z)
- Automated wildlife image classification: An active learning tool for
ecological applications [0.44970015278813025]
Wildlife camera trap images are being used extensively to investigate animal abundance, habitat associations, and behavior.
Artificial intelligence systems can take over this task but usually need a large number of already-labeled training images to achieve sufficient performance.
We propose a label-efficient learning strategy that enables researchers with small or medium-sized image databases to leverage the potential of modern machine learning.
arXiv Detail & Related papers (2023-03-28T08:51:15Z)
- Rare Wildlife Recognition with Self-Supervised Representation Learning [0.0]
We present a methodology to reduce the amount of required training data by resorting to self-supervised pretraining.
We show that a combination of MoCo, CLD, and geometric augmentations outperforms conventional models pretrained on ImageNet by a large margin.
arXiv Detail & Related papers (2022-10-29T17:57:38Z)
- Curious Representation Learning for Embodied Intelligence [81.21764276106924]
Self-supervised representation learning has achieved remarkable success in recent years.
Yet to build truly intelligent agents, we must construct representation learning algorithms that can learn from environments.
We propose a framework, curious representation learning, which jointly learns a reinforcement learning policy and a visual representation model.
arXiv Detail & Related papers (2021-05-03T17:59:20Z)
- Few-Cost Salient Object Detection with Adversarial-Paced Learning [95.0220555274653]
This paper proposes to learn the effective salient object detection model based on the manual annotation on a few training images only.
We name this task as the few-cost salient object detection and propose an adversarial-paced learning (APL)-based framework to facilitate the few-cost learning scenario.
arXiv Detail & Related papers (2021-04-05T14:15:49Z)
- Unifying data for fine-grained visual species classification [15.14767769034929]
We present an initial deep convolutional neural network model, trained on 2.9M images across 465 fine-grained species.
The long-term goal is to enable scientists to make conservation recommendations from near real-time analysis of species abundance and population health.
arXiv Detail & Related papers (2020-09-24T01:04:18Z)
- A Realistic Fish-Habitat Dataset to Evaluate Algorithms for Underwater
Visual Analysis [2.6476746128312194]
We present DeepFish as a benchmark suite with a large-scale dataset to train and test methods for several computer vision tasks.
The dataset consists of approximately 40 thousand images collected underwater from 20 habitats in the marine environments of tropical Australia.
Our experiments provide an in-depth analysis of the dataset characteristics, and the performance evaluation of several state-of-the-art approaches.
arXiv Detail & Related papers (2020-08-28T12:20:59Z)
- Naive-Student: Leveraging Semi-Supervised Learning in Video Sequences
for Urban Scene Segmentation [57.68890534164427]
In this work, we ask if we may leverage semi-supervised learning in unlabeled video sequences and extra images to improve the performance on urban scene segmentation.
We simply predict pseudo-labels for the unlabeled data and train subsequent models with both human-annotated and pseudo-labeled data.
Our Naive-Student model, trained with such simple yet effective iterative semi-supervised learning, attains state-of-the-art results at all three Cityscapes benchmarks.
arXiv Detail & Related papers (2020-05-20T18:00:05Z)
- Temperate Fish Detection and Classification: a Deep Learning based
Approach [6.282069822653608]
We propose a two-step deep learning approach for the detection and classification of temperate fishes without pre-filtering.
The first step is to detect each single fish in an image, independent of species and sex.
In the second step, we adopt a Convolutional Neural Network (CNN) with the Squeeze-and-Excitation (SE) architecture for classifying each fish in the image without pre-filtering.
arXiv Detail & Related papers (2020-05-14T12:40:57Z)
- Neural Networks Are More Productive Teachers Than Human Raters: Active
Mixup for Data-Efficient Knowledge Distillation from a Blackbox Model [57.41841346459995]
We study how to train a student deep neural network for visual recognition by distilling knowledge from a blackbox teacher model in a data-efficient manner.
We propose an approach that blends mixup and active learning.
arXiv Detail & Related papers (2020-03-31T05:44:55Z)
- Laplacian Denoising Autoencoder [114.21219514831343]
We propose to learn data representations with a novel type of denoising autoencoder.
The noisy input data is generated by corrupting latent clean data in the gradient domain.
Experiments on several visual benchmarks demonstrate that better representations can be learned with the proposed approach.
arXiv Detail & Related papers (2020-03-30T16:52:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.