AutoFish: Dataset and Benchmark for Fine-grained Analysis of Fish
- URL: http://arxiv.org/abs/2501.03767v1
- Date: Tue, 07 Jan 2025 13:14:25 GMT
- Title: AutoFish: Dataset and Benchmark for Fine-grained Analysis of Fish
- Authors: Stefan Hein Bengtson, Daniel Lehotský, Vasiliki Ismiroglou, Niels Madsen, Thomas B. Moeslund, Malte Pedersen
- Abstract summary: The dataset comprises 1,500 images of 454 specimens of visually similar fish placed in various constellations on a white conveyor belt.
The data was collected in a controlled environment using an RGB camera.
We establish baseline instance segmentation results using two variations of the Mask2Former architecture.
- Abstract: Automated fish documentation processes are expected to play an essential role in sustainable fisheries management and in addressing the challenges of overfishing in the near future. In this paper, we present a novel and publicly available dataset named AutoFish designed for fine-grained fish analysis. The dataset comprises 1,500 images of 454 specimens of visually similar fish placed in various constellations on a white conveyor belt and annotated with instance segmentation masks, IDs, and length measurements. The data was collected in a controlled environment using an RGB camera. The annotation procedure involved manual point annotations, initial segmentation masks proposed by the Segment Anything Model (SAM), and subsequent manual correction of the masks. We establish baseline instance segmentation results using two variations of the Mask2Former architecture, with the best performing model reaching an mAP of 89.15%. Additionally, we present two baseline length estimation methods, the best performing being a custom MobileNetV2-based regression model reaching an MAE of 0.62 cm in images with no occlusion and 1.38 cm in images with occlusion. Link to project page: https://vap.aau.dk/autofish/.
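The abstract reports length-estimation MAE separately for images with and without occlusion. A minimal sketch of how such a per-group mean absolute error could be computed is given here. The function name `mae_by_group`, the group labels, and the sample lengths are hypothetical illustrations and are not taken from the AutoFish dataset or code.

```python
from collections import defaultdict


def mae_by_group(records):
    """Return the mean absolute error (in cm) for each group.

    records: iterable of (group, predicted_cm, true_cm) tuples.
    """
    abs_errors = defaultdict(list)
    for group, predicted_cm, true_cm in records:
        abs_errors[group].append(abs(predicted_cm - true_cm))
    # Average the absolute errors within each group.
    return {group: sum(errs) / len(errs) for group, errs in abs_errors.items()}


# Made-up example values (hypothetical, not AutoFish measurements).
records = [
    ("no_occlusion", 30.5, 30.0),
    ("no_occlusion", 24.0, 25.0),
    ("occlusion", 28.0, 30.0),
    ("occlusion", 41.0, 40.0),
]
result = mae_by_group(records)
print(result)
```

Grouping before averaging keeps the two conditions from masking each other, which is presumably why the abstract reports them as separate figures.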
Related papers
- Counting Fish with Temporal Representations of Sonar Video [15.713015426791221]
We propose an alternative lightweight computer vision method for fish counting based on analyzing echograms.
We achieve a count error of 23% on representative data from the Kenai River in Alaska, demonstrating the feasibility of our approach.
arXiv Detail & Related papers (2025-02-07T18:02:28Z)
- GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model [66.35608254724566]
State-space models (SSMs) have showcased effective performance in modeling long-range dependencies with subquadratic complexity.
However, pure SSM-based models still face challenges related to stability and achieving optimal performance on computer vision tasks.
Our paper addresses the challenges of scaling SSM-based models for computer vision, particularly the instability and inefficiency of large model sizes.
arXiv Detail & Related papers (2024-07-18T17:59:58Z)
- A Fair Ranking and New Model for Panoptic Scene Graph Generation [51.78798765130832]
Decoupled SceneFormer (DSFormer) is a novel two-stage model that outperforms all existing scene graph models.
As a core design principle, DSFormer encodes subject and object masks directly into feature space.
arXiv Detail & Related papers (2024-07-12T12:28:08Z)
- FishNet: Deep Neural Networks for Low-Cost Fish Stock Estimation [0.0]
FishNet is an automated computer vision system for both taxonomic classification and fish size estimation.
We use a dataset of 300,000 hand-labeled images containing 1.2M fish of 163 different species.
FishNet achieves a 92% intersection over union on the fish segmentation task, an 89% top-1 classification accuracy on single fish species classification, and a 2.3 cm mean absolute error on the fish length estimation task.
arXiv Detail & Related papers (2024-03-16T12:44:08Z)
- BA-SAM: Scalable Bias-Mode Attention Mask for Segment Anything Model [65.92173280096588]
We address the challenge of image resolution variation for the Segment Anything Model (SAM).
SAM, known for its zero-shot generalizability, exhibits a performance degradation when faced with datasets with varying image sizes.
We present a bias-mode attention mask that allows each token to prioritize neighboring information.
arXiv Detail & Related papers (2024-01-04T15:34:44Z)
- Delving Deeper into Data Scaling in Masked Image Modeling [145.36501330782357]
We conduct an empirical study on the scaling capability of masked image modeling (MIM) methods for visual recognition.
Specifically, we utilize the web-collected Coyo-700M dataset.
Our goal is to investigate how the performance changes on downstream tasks when scaling with different sizes of data and models.
arXiv Detail & Related papers (2023-05-24T15:33:46Z)
- CAE v2: Context Autoencoder with CLIP Target [63.61868058214267]
Masked image modeling (MIM) learns visual representation by masking and reconstructing image patches.
Applying the reconstruction supervision on the CLIP representation has been proven effective for MIM.
To investigate strategies for refining the CLIP-targeted MIM, we study two critical elements in MIM, i.e., the supervision position and the mask ratio.
arXiv Detail & Related papers (2022-11-17T18:58:33Z)
- Scaling up instance annotation via label propagation [69.8001043244044]
We propose a highly efficient annotation scheme for building large datasets with object segmentation masks.
We exploit these similarities by using hierarchical clustering on mask predictions made by a segmentation model.
We show that we obtain 1M object segmentation masks with a total annotation time of only 290 hours.
arXiv Detail & Related papers (2021-10-05T18:29:34Z)
- Affinity LCFCN: Learning to Segment Fish with Weak Supervision [15.245008639754328]
We propose an automatic segmentation model efficiently trained on images labeled with only point-level supervision.
Our approach uses a fully convolutional neural network with one branch that outputs per-pixel scores and another that outputs an affinity matrix.
We validate our model on the DeepFish dataset, which contains many fish habitats from the north-eastern Australian region.
arXiv Detail & Related papers (2020-11-06T00:33:20Z)
- Counting Fish and Dolphins in Sonar Images Using Deep Learning [0.40611352512781856]
Current fish and dolphin abundance estimates are obtained by on-site sampling using visual and capture/release strategies.
We propose a novel deep learning approach for estimating fish and dolphin abundance from sonar images taken from the back of a trolling boat.
arXiv Detail & Related papers (2020-07-24T23:52:03Z)
- Temperate Fish Detection and Classification: a Deep Learning based Approach [6.282069822653608]
We propose a two-step deep learning approach for the detection and classification of temperate fishes without pre-filtering.
The first step is to detect each single fish in an image, independent of species and sex.
In the second step, we adopt a Convolutional Neural Network (CNN) with the Squeeze-and-Excitation (SE) architecture for classifying each fish in the image without pre-filtering.
arXiv Detail & Related papers (2020-05-14T12:40:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.