An interpretable framework using foundation models for fish sex identification
- URL: http://arxiv.org/abs/2602.19022v1
- Date: Sun, 22 Feb 2026 03:21:26 GMT
- Title: An interpretable framework using foundation models for fish sex identification
- Authors: Zheng Miao, Tien-Chieh Hung
- Abstract summary: We propose FishProtoNet, a non-invasive computer vision-based framework for sex identification of delta smelt (Hypomesus transpacificus). FishProtoNet provides interpretability through learned prototype representations while improving robustness by leveraging foundation models to reduce the influence of background noise. FishProtoNet demonstrates strong performance in delta smelt sex identification during early spawning and post-spawning stages.
- Score: 0.3867363075280543
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate sex identification in fish is vital for optimizing breeding and management strategies in aquaculture, particularly for species at risk of extinction. However, most existing methods are invasive or stressful and may cause additional mortality, posing severe risks to threatened or endangered fish populations. To address these challenges, we propose FishProtoNet, a robust, non-invasive computer vision-based framework for sex identification of delta smelt (Hypomesus transpacificus), an endangered fish species native to California, across its full life cycle. Unlike traditional deep learning methods, FishProtoNet provides interpretability through learned prototype representations while improving robustness by leveraging foundation models to reduce the influence of background noise. Specifically, the FishProtoNet framework consists of three key components: extraction of fish regions of interest (ROIs) using a visual foundation model, feature extraction from the fish ROIs, and fish sex identification based on an interpretable prototype network. FishProtoNet demonstrates strong performance in delta smelt sex identification during the early spawning and post-spawning stages, achieving accuracies of 74.40% and 81.16% and corresponding F1 scores of 74.27% and 79.43%, respectively. In contrast, delta smelt sex identification at the subadult stage remains challenging for current computer vision methods, likely due to less pronounced morphological differences in immature fish. The source code of FishProtoNet is publicly available at: https://github.com/zhengmiao1/Fish_sex_identification
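The third component named in the abstract, classification by comparison with prototypes, can be illustrated with a minimal sketch. This is not the FishProtoNet implementation: it assumes feature vectors have already been extracted from fish ROIs, it uses per-class mean vectors as prototypes instead of learned ones, and the function names (`class_mean_prototypes`, `classify`) and the toy 2-D values are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def class_mean_prototypes(features: List[Vector], labels: List[str]) -> Dict[str, List[float]]:
    """Build one prototype per class as the mean feature vector.

    Assumption: class means stand in for the learned prototypes
    described in the abstract.
    """
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for vec, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def classify(vec: Vector, prototypes: Dict[str, List[float]]) -> Tuple[str, Dict[str, float]]:
    """Return the label of the nearest prototype and the distance to every prototype."""
    distances = {label: math.dist(vec, proto) for label, proto in prototypes.items()}
    return min(distances, key=distances.get), distances


# Toy 2-D "features" standing in for embeddings of fish ROIs (made-up values).
train_features = [(0.9, 0.1), (1.1, -0.1), (-1.0, 0.2), (-0.8, -0.2)]
train_labels = ["female", "female", "male", "male"]

prototypes = class_mean_prototypes(train_features, train_labels)
label, distances = classify((0.7, 0.0), prototypes)
print(label, distances)
```

The per-class distances returned alongside the label are what make this style of classifier inspectable: a prediction can be traced back to how close the input lies to each prototype.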
Related papers
- Jellyfish Species Identification: A CNN Based Artificial Neural Network Approach [0.0]
Jellyfish play a crucial role in maintaining marine ecosystems but pose significant challenges for biodiversity and conservation. In this study, we proposed a deep learning framework for jellyfish species detection and classification using an underwater image dataset.
arXiv Detail & Related papers (2025-07-15T09:10:36Z) - Flatfish Lesion Detection Based on Part Segmentation Approach and Lesion Image Generation [0.5937476291232799]
The flatfish is a major farmed species consumed globally in large quantities. Traditionally, lesions were detected through visual inspection, but observing large numbers of fish is challenging. This study augments fish lesion images using generative adversarial networks and image harmonization methods.
arXiv Detail & Related papers (2024-07-16T03:32:10Z) - RoFIR: Robust Fisheye Image Rectification Framework Impervious to Optical Center Deviation [88.54817424560056]
We propose a distortion vector map (DVM) that measures the degree and direction of local distortion.
By learning the DVM, the model can independently identify local distortions at each pixel without relying on global distortion patterns.
In the pre-training stage, it predicts the distortion vector map and perceives the local distortion features of each pixel.
In the fine-tuning stage, it predicts a pixel-wise flow map for deviated fisheye image rectification.
arXiv Detail & Related papers (2024-06-27T06:38:56Z) - Cross-Modality Perturbation Synergy Attack for Person Re-identification [66.48494594909123]
Cross-modality person re-identification (ReID) systems are based on RGB images. The main challenge in cross-modality ReID lies in effectively dealing with visual differences between different modalities. Existing attack methods have primarily focused on the characteristics of the visible image modality. This study proposes a universal perturbation attack specifically designed for cross-modality ReID.
arXiv Detail & Related papers (2024-01-18T15:56:23Z) - Egocentric Whole-Body Motion Capture with FisheyeViT and Diffusion-Based Motion Refinement [65.08165593201437]
We explore egocentric whole-body motion capture using a single fisheye camera, which simultaneously estimates human body and hand motion.
This task presents significant challenges due to the lack of high-quality datasets, fisheye camera distortion, and human body self-occlusion.
We propose a novel approach that leverages FisheyeViT to extract fisheye image features, which are converted into pixel-aligned 3D heatmap representations for 3D human body pose prediction.
arXiv Detail & Related papers (2023-11-28T07:13:47Z) - TempNet: Temporal Attention Towards the Detection of Animal Behaviour in Videos [63.85815474157357]
We propose an efficient computer vision- and deep learning-based method for the detection of biological behaviours in videos.
TempNet uses an encoder bridge and residual blocks to maintain model performance with a two-staged, spatial, then temporal, encoder.
We demonstrate its application to the detection of sablefish (Anoplopoma fimbria) startle events.
arXiv Detail & Related papers (2022-11-17T23:55:12Z) - Fish Disease Detection Using Image Based Machine Learning Technique in Aquaculture [0.971137838903781]
Fish diseases in aquaculture constitute a significant hazard to nutriment security.
Image pre-processing and segmentation have been applied to reduce noise and exaggerate the image.
In the second portion, we extract the involved features to classify the diseases with the help of the Support Vector Machine (SVM) algorithm of machine learning.
arXiv Detail & Related papers (2021-05-09T13:22:44Z) - ES-Net: Erasing Salient Parts to Learn More in Re-Identification [46.624740579314924]
We propose a novel network, Erasing-Salient Net (ES-Net), to learn comprehensive features by erasing the salient areas in an image.
Our ES-Net outperforms state-of-the-art methods on three Person re-ID benchmarks and two Vehicle re-ID benchmarks.
arXiv Detail & Related papers (2021-03-10T08:19:46Z) - Movement Tracks for the Automatic Detection of Fish Behavior in Videos [63.85815474157357]
We offer a dataset of sablefish (Anoplopoma fimbria) startle behaviors in underwater videos, and investigate the use of deep learning (DL) methods for behavior detection on it.
Our proposed detection system identifies fish instances using DL-based frameworks, determines trajectory tracks, derives novel behavior-specific features, and employs Long Short-Term Memory (LSTM) networks to identify startle behavior in sablefish.
arXiv Detail & Related papers (2020-11-28T05:51:19Z) - FishNet: A Unified Embedding for Salmon Recognition [0.37798600249187286]
We propose FishNet, based on a deep learning technique that has been successfully used for identifying humans.
Our experiments show that this architecture learns a useful representation based on images of salmon heads.
FishNet achieves a false positive rate of 1% and a true positive rate of 96%.
arXiv Detail & Related papers (2020-10-20T17:35:01Z) - Temperate Fish Detection and Classification: a Deep Learning based Approach [6.282069822653608]
We propose a two-step deep learning approach for the detection and classification of temperate fishes without pre-filtering.
The first step is to detect each single fish in an image, independent of species and sex.
In the second step, we adopt a Convolutional Neural Network (CNN) with the Squeeze-and-Excitation (SE) architecture for classifying each fish in the image without pre-filtering.
arXiv Detail & Related papers (2020-05-14T12:40:57Z) - Retinopathy of Prematurity Stage Diagnosis Using Object Segmentation and Convolutional Neural Networks [68.96150598294072]
Retinopathy of Prematurity (ROP) is an eye disorder primarily affecting premature infants with lower weights.
It causes proliferation of vessels in the retina and could result in vision loss and, eventually, retinal detachment, leading to blindness.
In recent years, there has been a significant effort to automate the diagnosis using deep learning.
This paper builds upon the success of previous models and develops a novel architecture, which combines object segmentation and convolutional neural networks (CNN).
Our proposed system first trains an object segmentation model to identify the demarcation line at a pixel level and adds the resulting mask as an additional "color" channel in
arXiv Detail & Related papers (2020-04-03T14:07:41Z)