Lightweight Multi-Scale Feature Extraction with Fully Connected LMF Layer for Salient Object Detection
- URL: http://arxiv.org/abs/2508.07170v1
- Date: Sun, 10 Aug 2025 04:06:48 GMT
- Title: Lightweight Multi-Scale Feature Extraction with Fully Connected LMF Layer for Salient Object Detection
- Authors: Yunpeng Shi, Lei Chen, Xiaolu Shen, Yanju Guo
- Abstract summary: This paper proposes a novel lightweight multi-scale feature extraction layer, termed the LMF layer. By integrating multiple LMF layers, we develop LMFNet, a lightweight network tailored for salient object detection. We show that LMFNet achieves state-of-the-art or comparable results on five benchmark datasets with only 0.81M parameters.
- Score: 8.924457924091408
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the domain of computer vision, multi-scale feature extraction is vital for tasks such as salient object detection. However, achieving this capability in lightweight networks remains challenging due to the trade-off between efficiency and performance. This paper proposes a novel lightweight multi-scale feature extraction layer, termed the LMF layer, which employs depthwise separable dilated convolutions in a fully connected structure. By integrating multiple LMF layers, we develop LMFNet, a lightweight network tailored for salient object detection. Our approach significantly reduces the number of parameters while maintaining competitive performance. Here, we show that LMFNet achieves state-of-the-art or comparable results on five benchmark datasets with only 0.81M parameters, outperforming several traditional and lightweight models in terms of both efficiency and accuracy. Our work not only addresses the challenge of multi-scale learning in lightweight networks but also demonstrates the potential for broader applications in image processing tasks. The related code files are available at https://github.com/Shi-Yun-peng/LMFNet
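The parameter savings that motivate depthwise separable dilated convolutions can be illustrated with simple arithmetic. The sketch below uses the generic textbook formulas, not the authors' LMF layer; the channel count (64), kernel size (3), dilation rates (1, 2, 4) and all function names are illustrative assumptions that do not come from the paper.

```python
def standard_conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a depthwise k x k conv followed by a pointwise 1 x 1 conv."""
    depthwise = c_in * k * k      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def effective_kernel_size(k, dilation):
    """Spatial extent covered by a k x k kernel at a given dilation rate."""
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    c_in = c_out = 64  # hypothetical channel width, not taken from the paper
    k = 3
    print("standard conv weights:  ", standard_conv_params(c_in, c_out, k))
    print("depthwise separable:    ", depthwise_separable_params(c_in, c_out, k))
    # Dilation widens the receptive field without adding any weights.
    for d in (1, 2, 4):
        print(f"dilation {d}: effective kernel size {effective_kernel_size(k, d)}")
```

Under these assumed sizes the separable form needs roughly an eighth of the weights of a standard convolution, and raising the dilation rate enlarges the area each filter sees at no extra parameter cost. Combining several dilation rates is one common way to obtain multi-scale features cheaply.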
Related papers
- From One-to-One to Many-to-Many: Dynamic Cross-Layer Injection for Deep Vision-Language Fusion [91.35078719566472]
Vision-Language Models (VLMs) create a severe visual feature bottleneck by using a crude, asymmetric connection. We introduce Cross-Layer Injection (CLI), a novel and lightweight framework that forges a dynamic many-to-many bridge between the two modalities.
arXiv Detail & Related papers (2026-01-15T18:59:10Z)
- LGM-Pose: A Lightweight Global Modeling Network for Real-time Human Pose Estimation [9.000760165185532]
A single-branch lightweight global modeling network (LGM-Pose) is proposed to address these challenges. In the network, a lightweight MobileViM Block is designed with a proposed Lightweight Attentional Representation Module (LARM).
arXiv Detail & Related papers (2025-06-05T02:29:04Z)
- LSNet: See Large, Focus Small [67.05569159984691]
We introduce LS (Large-Small) convolution, which combines large-kernel perception and small-kernel aggregation. LSNet achieves superior performance and efficiency over existing lightweight networks in various vision tasks.
arXiv Detail & Related papers (2025-03-29T16:00:54Z)
- LWGANet: A Lightweight Group Attention Backbone for Remote Sensing Visual Tasks [20.924609707499915]
This article introduces LWGANet, a specialized lightweight backbone network tailored for RS visual tasks. LWGA module, tailored for RS imagery, adeptly harnesses redundant features to extract a wide range of spatial information. The results confirm LWGANet's widespread applicability and its ability to maintain an optimal balance between high performance and low complexity.
arXiv Detail & Related papers (2025-01-17T08:56:17Z)
- WDMoE: Wireless Distributed Mixture of Experts for Large Language Models [68.45482959423323]
Large Language Models (LLMs) have achieved significant success in various natural language processing tasks.
We propose a wireless distributed Mixture of Experts (WDMoE) architecture to enable collaborative deployment of LLMs across edge servers at the base station (BS) and mobile devices in wireless networks.
arXiv Detail & Related papers (2024-11-11T02:48:00Z)
- Lumen: Unleashing Versatile Vision-Centric Capabilities of Large Multimodal Models [87.47400128150032]
We propose a novel LMM architecture named Lumen, a Large multimodal model with versatile vision-centric capability enhancement.
Lumen first promotes fine-grained vision-language concept alignment.
Then the task-specific decoding is carried out by flexibly routing the shared representation to lightweight task decoders.
arXiv Detail & Related papers (2024-03-12T04:13:45Z)
- Lightweight Salient Object Detection in Optical Remote-Sensing Images via Semantic Matching and Edge Alignment [61.45639694373033]
We propose a novel lightweight network for optical remote sensing images (ORSI-SOD) based on semantic matching and edge alignment, termed SeaNet.
Specifically, SeaNet includes a lightweight MobileNet-V2 for feature extraction, a dynamic semantic matching module (DSMM) for high-level features, and a portable decoder for inference.
arXiv Detail & Related papers (2023-01-07T04:33:51Z)
- MSCFNet: A Lightweight Network With Multi-Scale Context Fusion for Real-Time Semantic Segmentation [27.232578592161673]
We devise a novel lightweight network using a multi-scale context fusion scheme (MSCFNet).
The proposed MSCFNet contains only 1.15M parameters, achieves 71.9% Mean IoU and can run at over 50 FPS on a single Titan XP GPU configuration.
arXiv Detail & Related papers (2021-03-24T08:28:26Z)
- Lightweight Image Super-Resolution with Multi-scale Feature Interaction Network [15.846394239848959]
We present a lightweight multi-scale feature interaction network (MSFIN).
For lightweight SISR, MSFIN expands the receptive field and adequately exploits the informative features of the low-resolution observed images.
Our proposed MSFIN can achieve comparable performance against the state-of-the-arts with a more lightweight model.
arXiv Detail & Related papers (2021-03-24T07:25:21Z)
- Feature Flow: In-network Feature Flow Estimation for Video Object Detection [56.80974623192569]
Optical flow is widely used in computer vision tasks to provide pixel-level motion information.
A common approach is to forward optical flow to a neural network and fine-tune this network on the task dataset.
We propose a novel network (IFF-Net) with an In-network Feature Flow estimation module for video object detection.
arXiv Detail & Related papers (2020-09-21T07:55:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.