SA-MixNet: Structure-aware Mixup and Invariance Learning for
Scribble-supervised Road Extraction in Remote Sensing Images
- URL: http://arxiv.org/abs/2403.01381v1
- Date: Sun, 3 Mar 2024 02:56:43 GMT
- Title: SA-MixNet: Structure-aware Mixup and Invariance Learning for
Scribble-supervised Road Extraction in Remote Sensing Images
- Authors: Jie Feng, Hao Huang, Junpeng Zhang, Weisheng Dong, Dingwen Zhang,
Licheng Jiao
- Abstract summary: We propose a structure-aware Mixup scheme that pastes road regions from one image onto another, creating image scenes with increased complexity.
A discriminator-based regularization is designed to enhance road connectivity while preserving road structure.
Our framework demonstrates superior performance on the DeepGlobe, Wuhan, and Massachusetts datasets.
- Score: 85.52629779976137
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mainstream weakly supervised road extractors rely on highly confident
pseudo-labels propagated from scribbles, and their performance often degrades
as image scenes become more varied. We argue that such degradation is
due to the model's poor invariance to scenes of different complexities,
whereas existing solutions to this problem are commonly based on crafted priors
that cannot be derived from scribbles. To eliminate the reliance on such
priors, we propose a novel Structure-aware Mixup and Invariance Learning
framework (SA-MixNet) for weakly supervised road extraction that improves the
model invariance in a data-driven manner. Specifically, we design a
structure-aware Mixup scheme to paste road regions from one image onto another
for creating an image scene with increased complexity while preserving the
road's structural integrity. Then an invariance regularization is imposed on
the predictions of the constructed and original images to minimize their conflicts,
thereby forcing the model to behave consistently across varied scenes. Moreover,
a discriminator-based regularization is designed to enhance road connectivity
while preserving road structure. Combining these designs, our framework
demonstrates superior performance on the DeepGlobe, Wuhan, and Massachusetts
datasets, outperforming state-of-the-art techniques by 1.47%, 2.12%, and 4.09%
in IoU, respectively, and showing its potential as a plug-and-play component.
The code will be made publicly available.
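To make the framework's two central ideas concrete, here is a minimal PyTorch-style sketch of a structure-aware mixup and an invariance (consistency) regularization. The function names, the use of a pseudo-label road mask, and the MSE consistency term are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def structure_aware_mixup(img_a, img_b, road_mask_b):
    """Paste the road regions of img_b onto img_a, guided by a (pseudo-label)
    road mask, so the mixed scene is more complex but road structure is kept.

    img_a, img_b:  (B, 3, H, W) image tensors
    road_mask_b:   (B, 1, H, W) binary road mask for img_b
    """
    mixed = img_a * (1.0 - road_mask_b) + img_b * road_mask_b
    return mixed

def invariance_loss(model, img_a, img_b, road_mask_b):
    """Consistency term: the prediction on the mixed image should agree with
    the predictions on the original images, composited with the same mask."""
    with torch.no_grad():  # targets are taken from the original scenes
        pred_a = torch.sigmoid(model(img_a))
        pred_b = torch.sigmoid(model(img_b))
        target = pred_a * (1.0 - road_mask_b) + pred_b * road_mask_b

    mixed = structure_aware_mixup(img_a, img_b, road_mask_b)
    pred_mixed = torch.sigmoid(model(mixed))
    return F.mse_loss(pred_mixed, target)  # penalize conflicting predictions
```

In training, such a term would typically be added to the scribble-supervised segmentation loss with a weighting coefficient, so the model is pushed to predict the same roads regardless of scene complexity.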
Related papers
- Adapting Diffusion Models for Improved Prompt Compliance and Controllable Image Synthesis [43.481539150288434]
This work introduces a new family of factor graph Diffusion Models (FG-DMs).
FG-DMs model the joint distribution of images and conditioning variables, such
as semantic, sketch, depth, or normal maps, via a factor graph decomposition.
arXiv Detail & Related papers (2024-10-29T00:54:00Z) - Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on a Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z) - Searching a Compact Architecture for Robust Multi-Exposure Image Fusion [55.37210629454589]
Two major stumbling blocks hinder development: pixel misalignment and inefficient inference.
This study introduces an architecture search-based paradigm incorporating self-alignment and detail repletion modules for robust multi-exposure image fusion.
The proposed method outperforms various competitive schemes, achieving a noteworthy 3.19% improvement in PSNR for general scenarios and an impressive 23.5% enhancement in misaligned scenarios.
arXiv Detail & Related papers (2023-05-20T17:01:52Z) - RISP: Rendering-Invariant State Predictor with Differentiable Simulation
and Rendering for Cross-Domain Parameter Estimation [110.4255414234771]
Existing solutions require massive training data or lack generalizability to unknown rendering configurations.
We propose a novel approach that marries domain randomization and differentiable rendering gradients to address this problem.
Our approach achieves significantly lower reconstruction errors and has better generalizability among unknown rendering configurations.
arXiv Detail & Related papers (2022-05-11T17:59:51Z) - Encoding Robustness to Image Style via Adversarial Feature Perturbations [72.81911076841408]
We adapt adversarial training by directly perturbing feature statistics, rather than image pixels, to produce robust models.
Our proposed method, Adversarial Batch Normalization (AdvBN), is a single network layer that generates worst-case feature perturbations during training.
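For intuition only, a rough sketch of perturbing per-channel feature statistics rather than pixels follows; the bound, the tanh parameterization, and how the perturbation directions are obtained (e.g. a few gradient-ascent steps on the task loss) are assumptions, not the paper's exact AdvBN layer.

```python
import torch

def perturb_feature_statistics(feat, delta_mean, delta_std, eps=0.1):
    """Shift a feature map's per-channel mean and std by bounded perturbations.

    feat:                  (B, C, H, W) intermediate feature map
    delta_mean, delta_std: (1, C, 1, 1) perturbation directions
    eps:                   bound on the relative shift of the statistics
    """
    mean = feat.mean(dim=(2, 3), keepdim=True)
    std = feat.std(dim=(2, 3), keepdim=True) + 1e-6
    normalized = (feat - mean) / std
    # Apply a bounded, adversarially chosen shift to the channel statistics.
    new_std = std * (1.0 + eps * torch.tanh(delta_std))
    new_mean = mean + eps * torch.tanh(delta_mean) * std
    return normalized * new_std + new_mean
```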
arXiv Detail & Related papers (2020-09-18T17:52:34Z) - Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently-used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
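As a rough, self-contained sketch of the dense dilated-convolution idea (each layer sees the concatenation of all earlier outputs, and increasing dilation rates enlarge the receptive field); the channel counts and dilation rates here are arbitrary choices, not the paper's block design.

```python
import torch
import torch.nn as nn

class DenseDilatedBlock(nn.Module):
    """Dense combination of dilated convolutions (illustrative sketch only)."""
    def __init__(self, channels=64, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels * (i + 1), channels, kernel_size=3,
                          padding=d, dilation=d),
                nn.ReLU(inplace=True),
            )
            for i, d in enumerate(dilations)
        )

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            # Each layer consumes the concatenation of all previous outputs.
            feats.append(layer(torch.cat(feats, dim=1)))
        return feats[-1]
```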
arXiv Detail & Related papers (2020-02-07T03:45:25Z)