Related papers: EPANet: Efficient Path Aggregation Network for Underwater Fish Detection

EPANet: Efficient Path Aggregation Network for Underwater Fish Detection

URL: http://arxiv.org/abs/2508.00528v1
Date: Fri, 01 Aug 2025 11:09:13 GMT
Title: EPANet: Efficient Path Aggregation Network for Underwater Fish Detection
Authors: Jinsong Yang, Zeyuan Hu, Yichen Li,
Abstract summary: We propose an efficient path aggregation network (EPANet) for underwater fish detection (UFD)<n>EPANet consists of two key components: an efficient path aggregation feature pyramid network (EPA-FPN) and a multi-scale diverse-division short path bottleneck (MS-DDSP bottleneck)<n> experiments on benchmark UFD datasets demonstrate that EPANet outperforms state-of-the-art methods in terms of detection accuracy and inference speed.
Score: 6.6069949373696994
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Underwater fish detection (UFD) remains a challenging task in computer vision due to low object resolution, significant background interference, and high visual similarity between targets and surroundings. Existing approaches primarily focus on local feature enhancement or incorporate complex attention mechanisms to highlight small objects, often at the cost of increased model complexity and reduced efficiency. To address these limitations, we propose an efficient path aggregation network (EPANet), which leverages complementary feature integration to achieve accurate and lightweight UFD. EPANet consists of two key components: an efficient path aggregation feature pyramid network (EPA-FPN) and a multi-scale diverse-division short path bottleneck (MS-DDSP bottleneck). The EPA-FPN introduces long-range skip connections across disparate scales to improve semantic-spatial complementarity, while cross-layer fusion paths are adopted to enhance feature integration efficiency. The MS-DDSP bottleneck extends the conventional bottleneck structure by introducing finer-grained feature division and diverse convolutional operations, thereby increasing local feature diversity and representation capacity. Extensive experiments on benchmark UFD datasets demonstrate that EPANet outperforms state-of-the-art methods in terms of detection accuracy and inference speed, while maintaining comparable or even lower parameter complexity.

Related papers

Residual Prior-driven Frequency-aware Network for Image Fusion [6.90874640835234]
Image fusion aims to integrate complementary information across modalities to generate high-quality fused images.<n>We propose a Residual Prior-driven Frequency-aware Network, termed as RPFNet.
arXiv Detail & Related papers (2025-07-09T10:48:00Z)
SDS-Net: Shallow-Deep Synergism-detection Network for infrared small target detection [0.18641315013048293]
Current CNN-based infrared small target detection methods overlook the heterogeneity between shallow and deep features.<n>The dependency relationships and fusion mechanisms fail to fully exploit the complementarity of multilevel features.<n>This paper proposes a shallow-deep synergistic detection network (SDS-Net) that efficiently models multilevel feature representations.
arXiv Detail & Related papers (2025-06-06T12:44:41Z)
Efficient Feature Fusion for UAV Object Detection [9.632727117779178]
Small objects, in particular, occupy small portions of images, making their accurate detection difficult.<n>Existing multi-scale feature fusion methods address these challenges by aggregating features across different resolutions.<n>We propose a novel feature fusion framework specifically designed for UAV object detection tasks.
arXiv Detail & Related papers (2025-01-29T20:39:16Z)
PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection [59.355022416218624]
integration of point and voxel representations is becoming more common in LiDAR-based 3D object detection. We propose a novel two-stage 3D object detector, called Point-Voxel Attention Fusion Network (PVAFN) PVAFN uses a multi-pooling strategy to integrate both multi-scale and region-specific information effectively.
arXiv Detail & Related papers (2024-08-26T19:43:01Z)
ELGC-Net: Efficient Local-Global Context Aggregation for Remote Sensing Change Detection [65.59969454655996]
We propose an efficient change detection framework, ELGC-Net, which leverages rich contextual information to precisely estimate change regions. Our proposed ELGC-Net sets a new state-of-the-art performance in remote sensing change detection benchmarks. We also introduce ELGC-Net-LW, a lighter variant with significantly reduced computational complexity, suitable for resource-constrained settings.
arXiv Detail & Related papers (2024-03-26T17:46:25Z)
DRSI-Net: Dual-Residual Spatial Interaction Network for Multi-Person Pose Estimation [7.418828517897727]
We propose a dual-residual spatial interaction network (DRSI-Net) for MPPE with high accuracy and low complexity. Compared to other methods, DRSI-Net performs residual spatial information interactions on the neighbouring features. The proposed DRSI-Net outperforms other state-of-the-art methods in accuracy and complexity.
arXiv Detail & Related papers (2024-02-26T15:10:22Z)
Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning. CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
Transformer-based Context Condensation for Boosting Feature Pyramids in Object Detection [77.50110439560152]
Current object detectors typically have a feature pyramid (FP) module for multi-level feature fusion (MFF) We propose a novel and efficient context modeling mechanism that can help existing FPs deliver better MFF results. In particular, we introduce a novel insight that comprehensive contexts can be decomposed and condensed into two types of representations for higher efficiency.
arXiv Detail & Related papers (2022-07-14T01:45:03Z)
Learning to Aggregate Multi-Scale Context for Instance Segmentation in Remote Sensing Images [28.560068780733342]
A novel context aggregation network (CATNet) is proposed to improve the feature extraction process. The proposed model exploits three lightweight plug-and-play modules, namely dense feature pyramid network (DenseFPN), spatial context pyramid ( SCP), and hierarchical region of interest extractor (HRoIE)
arXiv Detail & Related papers (2021-11-22T08:55:25Z)
Resolution Adaptive Networks for Efficient Inference [53.04907454606711]
We propose a novel Resolution Adaptive Network (RANet), which is inspired by the intuition that low-resolution representations are sufficient for classifying "easy" inputs. In RANet, the input images are first routed to a lightweight sub-network that efficiently extracts low-resolution representations. High-resolution paths in the network maintain the capability to recognize the "hard" samples.
arXiv Detail & Related papers (2020-03-16T16:54:36Z)
Depthwise Non-local Module for Fast Salient Object Detection Using a Single Thread [136.2224792151324]
We propose a new deep learning algorithm for fast salient object detection. The proposed algorithm achieves competitive accuracy and high inference efficiency simultaneously with a single CPU thread.
arXiv Detail & Related papers (2020-01-22T15:23:48Z)

This list is automatically generated from the titles and abstracts of the papers in this site.