BAFNet: Bilateral Attention Fusion Network for Lightweight Semantic Segmentation of Urban Remote Sensing Images
- URL: http://arxiv.org/abs/2409.10269v1
- Date: Mon, 16 Sep 2024 13:25:42 GMT
- Title: BAFNet: Bilateral Attention Fusion Network for Lightweight Semantic Segmentation of Urban Remote Sensing Images
- Authors: Wentao Wang, Xili Wang
- Abstract summary: We propose a lightweight semantic segmentation network called bilateral attention fusion network (BAFNet) to efficiently segment high-resolution urban remote sensing images.
BAFNet not only outperforms advanced lightweight models in accuracy but also demonstrates comparable performance to non-lightweight state-of-the-art methods on two datasets.
- Score: 6.153725909241752
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large-scale semantic segmentation networks often achieve high performance, while their application can be challenging when faced with limited sample sizes and computational resources. In scenarios with restricted network size and computational complexity, models encounter significant challenges in capturing long-range dependencies and recovering detailed information in images. We propose a lightweight bilateral semantic segmentation network called bilateral attention fusion network (BAFNet) to efficiently segment high-resolution urban remote sensing images. The model consists of two paths, namely dependency path and remote-local path. The dependency path utilizes large kernel attention to acquire long-range dependencies in the image. Besides, multi-scale local attention and efficient remote attention are designed to construct remote-local path. Finally, a feature aggregation module is designed to effectively utilize the different features of the two paths. Our proposed method was tested on public high-resolution urban remote sensing datasets Vaihingen and Potsdam, with mIoU reaching 83.20% and 86.53%, respectively. As a lightweight semantic segmentation model, BAFNet not only outperforms advanced lightweight models in accuracy but also demonstrates comparable performance to non-lightweight state-of-the-art methods on two datasets, despite a tenfold variance in floating-point operations and a fifteenfold difference in network parameters.
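The mIoU figures quoted in the abstract are mean intersection-over-union scores. As a minimal sketch of how such a score is commonly computed (not the authors' evaluation code), the pure-Python function below averages per-class IoU over flattened label lists. The function name `mean_iou`, the toy labels, and the choice to skip classes absent from both prediction and ground truth are assumptions for illustration; the paper's exact protocol, such as which classes are included, is not stated here.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over classes that occur in the prediction or the ground truth."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes   # pixels where pred == target == c
    pred_count = [0] * num_classes     # pixels predicted as class c
    target_count = [0] * num_classes   # pixels labelled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: ignore classes absent from both inputs
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical toy example: six pixels, three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.4f}")
```

In the toy example the per-class IoUs are 1/3, 2/3 and 1/2, so the mean is 0.5. On real imagery the same counts would be accumulated over all pixels of all test tiles before taking the ratios.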
Related papers
- Multi-view Aggregation Network for Dichotomous Image Segmentation [76.75904424539543]
Dichotomous Image Segmentation (DIS) has recently emerged as a task aimed at high-precision object segmentation from high-resolution natural images.
Existing methods rely on tedious multiple encoder-decoder streams and stages to gradually complete the global localization and local refinement.
Inspired by it, we model DIS as a multi-view object perception problem and provide a parsimonious multi-view aggregation network (MVANet).
Experiments on the popular DIS-5K dataset show that our MVANet significantly outperforms state-of-the-art methods in both accuracy and speed.
arXiv Detail & Related papers (2024-04-11T03:00:00Z)
- Lightweight Salient Object Detection in Optical Remote-Sensing Images via Semantic Matching and Edge Alignment [61.45639694373033]
We propose a novel lightweight network for salient object detection in optical remote sensing images (ORSI-SOD) based on semantic matching and edge alignment, termed SeaNet.
Specifically, SeaNet includes a lightweight MobileNet-V2 for feature extraction, a dynamic semantic matching module (DSMM) for high-level features, and a portable decoder for inference.
arXiv Detail & Related papers (2023-01-07T04:33:51Z)
- DPNet: Dual-Path Network for Real-time Object Detection with Lightweight Attention [15.360769793764526]
This paper presents a dual-path network, named DPNet, with a lightweight attention scheme for real-time object detection.
DPNet achieves state-of-the-art trade-off between detection accuracy and implementation efficiency.
arXiv Detail & Related papers (2022-09-28T09:11:01Z)
- Learning to Aggregate Multi-Scale Context for Instance Segmentation in Remote Sensing Images [28.560068780733342]
A novel context aggregation network (CATNet) is proposed to improve the feature extraction process.
The proposed model exploits three lightweight plug-and-play modules, namely dense feature pyramid network (DenseFPN), spatial context pyramid (SCP), and hierarchical region of interest extractor (HRoIE).
arXiv Detail & Related papers (2021-11-22T08:55:25Z)
- DPNET: Dual-Path Network for Efficient Object Detection with Lightweight Self-Attention [16.13989397708127]
DPNet is a dual path network for efficient object detection with lightweight self-attention.
It achieves 29.0% AP on COCO dataset, with only 1.14 GFLOPs and 2.27M model size for a 320x320 image.
arXiv Detail & Related papers (2021-10-31T13:38:16Z)
- EPMF: Efficient Perception-aware Multi-sensor Fusion for 3D Semantic Segmentation [62.210091681352914]
We study multi-sensor fusion for 3D semantic segmentation, which serves many applications such as autonomous driving and robotics.
In this work, we investigate a collaborative fusion scheme called perception-aware multi-sensor fusion (PMF).
We propose a two-stream network to extract features from the two modalities separately. The extracted features are fused by effective residual-based fusion modules.
arXiv Detail & Related papers (2021-06-21T10:47:26Z)
- Boundary-Aware Segmentation Network for Mobile and Web Applications [60.815545591314915]
Boundary-Aware Network (BASNet) is integrated with a predict-refine architecture and a hybrid loss for highly accurate image segmentation.
BASNet runs at over 70 fps on a single GPU which benefits many potential real applications.
Based on BASNet, we further developed two (close to) commercial applications: AR COPY & PASTE, in which BASNet is integrated with augmented reality for "COPYING" and "PASTING" real-world objects, and OBJECT CUT, which is a web-based tool for automatic object background removal.
arXiv Detail & Related papers (2021-01-12T19:20:26Z)
- Multi-Attention-Network for Semantic Segmentation of Fine Resolution Remote Sensing Images [10.835342317692884]
The accuracy of semantic segmentation in remote sensing images has been increased significantly by deep convolutional neural networks.
This paper proposes a Multi-Attention-Network (MANet) to address these issues.
A novel attention mechanism of kernel attention with linear complexity is proposed to alleviate the large computational demand in attention.
arXiv Detail & Related papers (2020-09-03T09:08:02Z)
- Adaptive Context-Aware Multi-Modal Network for Depth Completion [107.15344488719322]
We propose to adopt the graph propagation to capture the observed spatial contexts.
We then apply the attention mechanism on the propagation, which encourages the network to model the contextual information adaptively.
Finally, we introduce the symmetric gated fusion strategy to exploit the extracted multi-modal features effectively.
Our model, named Adaptive Context-Aware Multi-Modal Network (ACMNet), achieves the state-of-the-art performance on two benchmarks.
arXiv Detail & Related papers (2020-08-25T06:00:06Z)
- Multi-scale Interaction for Real-time LiDAR Data Segmentation on an Embedded Platform [62.91011959772665]
Real-time semantic segmentation of LiDAR data is crucial for autonomously driving vehicles.
Current approaches that operate directly on the point cloud use complex spatial aggregation operations.
We propose a projection-based method, called Multi-scale Interaction Network (MINet), which is very efficient and accurate.
arXiv Detail & Related papers (2020-08-20T19:06:11Z)
- A novel Deep Structure U-Net for Sea-Land Segmentation in Remote Sensing Images [30.39131853354783]
This paper presents a novel deep neural network structure for pixel-wise sea-land segmentation, a Residual Dense U-Net (RDU-Net).
RDU-Net is a combination of both down-sampling and up-sampling paths to achieve satisfactory results.
arXiv Detail & Related papers (2020-03-17T16:00:59Z)