Improved-Flow Warp Module for Remote Sensing Semantic Segmentation
- URL: http://arxiv.org/abs/2205.04160v1
- Date: Mon, 9 May 2022 10:15:18 GMT
- Title: Improved-Flow Warp Module for Remote Sensing Semantic Segmentation
- Authors: Yinjie Zhang, Yi Liu, Wei Guo
- Abstract summary: We propose a new module, called improved-flow warp module (IFWM), to adjust semantic feature maps across different scales for remote sensing semantic segmentation.
IFWM computes the offsets of pixels in a learnable way, which alleviates the misalignment of the multi-scale features.
We validate our method on several remote sensing datasets, and the results prove the effectiveness of our method.
- Score: 9.505303195320023
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Remote sensing semantic segmentation aims to automatically assign
each pixel of an aerial image a specific label. In this letter, we propose a new
module, called the improved-flow warp module (IFWM), to adjust semantic feature
maps across different scales for remote sensing semantic segmentation. The
improved-flow warp module is applied alongside the feature extraction process in
the convolutional neural network. First, IFWM computes per-pixel offsets in a
learnable way, which alleviates the misalignment of the multi-scale features.
Second, the offsets guide the up-sampling of low-resolution deep features to
improve feature alignment, which boosts the accuracy of semantic segmentation.
We validate our method on several remote sensing datasets, and the results
demonstrate the effectiveness of our method.
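The paper ships no code, so as a rough sketch only: assuming IFWM follows the common semantic-flow design (a small convolutional head predicts per-pixel offsets, which then warp the bilinearly up-sampled low-resolution features), the core warp step might look like the following. All names are illustrative, and the offsets would come from a learned layer in practice:

```python
import numpy as np

def bilinear_sample(feat, ys, xs):
    """Sample a feature map (h, w, C) at fractional coordinates (ys, xs)
    via bilinear interpolation, clamping samples to the map borders."""
    h, w, _ = feat.shape
    ys = np.clip(ys, 0, h - 1)
    xs = np.clip(xs, 0, w - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[..., None]  # fractional weights, broadcast over channels
    wx = (xs - x0)[..., None]
    return (feat[y0, x0] * (1 - wy) * (1 - wx)
            + feat[y0, x1] * (1 - wy) * wx
            + feat[y1, x0] * wy * (1 - wx)
            + feat[y1, x1] * wy * wx)

def flow_warp(low_res_feat, offsets):
    """Up-sample low-res features to the offset field's resolution, shifting
    each output pixel by its predicted (dy, dx) offset to realign features."""
    H, W, _ = offsets.shape           # offsets: (H, W, 2), learned in practice
    h, w, _ = low_res_feat.shape
    yy, xx = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    # Map the (offset-shifted) target grid into low-resolution coordinates.
    ys = (yy + offsets[..., 0]) * (h - 1) / max(H - 1, 1)
    xs = (xx + offsets[..., 1]) * (w - 1) / max(W - 1, 1)
    return bilinear_sample(low_res_feat, ys, xs)
```

With zero offsets this reduces to plain bilinear up-sampling; the learned offsets are what let the module correct the misalignment between scales.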
Related papers
- Leveraging Adaptive Implicit Representation Mapping for Ultra High-Resolution Image Segmentation [19.87987918759425]
Implicit representation mapping (IRM) can translate image features to any continuous resolution, showcasing its potent capability for ultra-high-resolution image segmentation refinement.
Current IRM-based methods for refining ultra-high-resolution image segmentation often rely on CNN-based encoders to extract image features.
We propose a novel approach that leverages Adaptive Implicit Representation Mapping (AIRM) for ultra-high-resolution image segmentation refinement.
arXiv Detail & Related papers (2024-07-31T00:34:37Z) - Spatially-Adaptive Feature Modulation for Efficient Image Super-Resolution [90.16462805389943]
We develop a spatially-adaptive feature modulation (SAFM) mechanism upon a vision transformer (ViT)-like block.
The proposed method is 3× smaller than state-of-the-art efficient SR methods.
arXiv Detail & Related papers (2023-02-27T14:19:31Z) - LENet: Lightweight And Efficient LiDAR Semantic Segmentation Using Multi-Scale Convolution Attention [0.0]
We propose a projection-based semantic segmentation network called LENet with an encoder-decoder structure for LiDAR-based semantic segmentation.
The encoder is composed of a novel multi-scale convolutional attention (MSCA) module with varying receptive field sizes to capture features.
We show that our proposed method is lighter, more efficient, and robust compared to state-of-the-art semantic segmentation methods.
arXiv Detail & Related papers (2023-01-11T02:51:38Z) - Unsupervised domain adaptation semantic segmentation of high-resolution remote sensing imagery with invariant domain-level context memory [10.210120085157161]
This study proposes a novel unsupervised domain adaptation semantic segmentation network (MemoryAdaptNet) for the semantic segmentation of HRS imagery.
MemoryAdaptNet constructs an output space adversarial learning scheme to bridge the domain distribution discrepancy between source domain and target domain.
Experiments under three cross-domain tasks indicate that our proposed MemoryAdaptNet is remarkably superior to the state-of-the-art methods.
arXiv Detail & Related papers (2022-08-16T12:35:57Z) - Semantic Labeling of High Resolution Images Using EfficientUNets and Transformers [5.177947445379688]
We propose a new segmentation model that combines convolutional neural networks with deep transformers.
Our results demonstrate that the proposed methodology improves segmentation accuracy compared to state-of-the-art techniques.
arXiv Detail & Related papers (2022-06-20T12:03:54Z) - Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion [62.269219152425556]
Segmentation-based methods have drawn extensive attention in the scene text detection field.
We propose a Differentiable Binarization (DB) module that integrates the binarization process into a segmentation network.
An efficient Adaptive Scale Fusion (ASF) module is proposed to improve the scale robustness by fusing features of different scales adaptively.
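For reference, the binarization step in the DB paper is the approximate step function B = 1 / (1 + e^(-k(P - T))), with amplification factor k (typically 50): nearly binary in output, yet differentiable, so the per-pixel threshold map T can be learned end-to-end. A minimal sketch:

```python
import numpy as np

def differentiable_binarization(prob_map, thresh_map, k=50.0):
    """Soft binarization B = 1 / (1 + exp(-k * (P - T))).

    For large k this approaches a hard threshold of P at T, but gradients
    still flow into both the probability map P and the threshold map T."""
    return 1.0 / (1.0 + np.exp(-k * (prob_map - thresh_map)))
```
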
arXiv Detail & Related papers (2022-02-21T15:30:14Z) - CE-FPN: Enhancing Channel Information for Object Detection [12.954675966833372]
Feature pyramid network (FPN) has been an effective framework to extract multi-scale features in object detection.
We present a novel channel enhancement network (CE-FPN) with three simple yet effective modules to alleviate these problems.
Our experiments show that CE-FPN achieves competitive performance compared to state-of-the-art FPN-based detectors on MS COCO benchmark.
arXiv Detail & Related papers (2021-03-19T05:51:53Z) - Feature Flow: In-network Feature Flow Estimation for Video Object Detection [56.80974623192569]
Optical flow is widely used in computer vision tasks to provide pixel-level motion information.
A common approach is to feed optical flow to a neural network and fine-tune this network on the task dataset.
We propose a novel network (IFF-Net) with an In-network Feature Flow estimation module for video object detection.
arXiv Detail & Related papers (2020-09-21T07:55:50Z) - Dual Attention GANs for Semantic Image Synthesis [101.36015877815537]
We propose a novel Dual Attention GAN (DAGAN) to synthesize photo-realistic and semantically-consistent images.
We also propose two novel modules, i.e., a position-wise Spatial Attention Module (SAM) and a scale-wise Channel Attention Module (CAM).
DAGAN achieves remarkably better results than state-of-the-art methods, while using fewer model parameters.
arXiv Detail & Related papers (2020-08-29T17:49:01Z) - Volumetric Transformer Networks [88.85542905676712]
We introduce a learnable module, the volumetric transformer network (VTN).
VTN predicts channel-wise warping fields to reconfigure intermediate CNN features spatially and channel-wise.
Our experiments show that VTN consistently boosts the features' representation power and consequently the networks' accuracy on fine-grained image recognition and instance-level image retrieval.
arXiv Detail & Related papers (2020-07-18T14:00:12Z) - AlignSeg: Feature-Aligned Segmentation Networks [109.94809725745499]
We propose Feature-Aligned Networks (AlignSeg) to address misalignment issues during the feature aggregation process.
Our network achieves new state-of-the-art mIoU scores of 82.6% and 45.95% on Cityscapes and ADE20K, respectively.
arXiv Detail & Related papers (2020-02-24T10:00:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.