Dynamic Dual Sampling Module for Fine-Grained Semantic Segmentation
- URL: http://arxiv.org/abs/2105.11657v1
- Date: Tue, 25 May 2021 04:25:47 GMT
- Title: Dynamic Dual Sampling Module for Fine-Grained Semantic Segmentation
- Authors: Chen Shi, Xiangtai Li, Yanran Wu, Yunhai Tong, Yi Xu
- Abstract summary: We propose a Dynamic Dual Sampling Module (DDSM) to conduct dynamic affinity modeling and propagate semantic context to local details.
Experimental results on both the Cityscapes and Camvid datasets validate the effectiveness and efficiency of the proposed approach.
- Score: 27.624291416260185
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Representation of semantic context and local details is the essential issue
for building modern semantic segmentation models. However, the
interrelationship between semantic context and local details is not well
explored in previous works. In this paper, we propose a Dynamic Dual Sampling
Module (DDSM) to conduct dynamic affinity modeling and propagate semantic
context to local details, which yields a more discriminative representation.
Specifically, a dynamic sampling strategy is used to sparsely sample
representative pixels and channels in the higher layer, forming adaptive
compact support for each pixel and channel in the lower layer. The sampled
features with high semantics are aggregated according to the affinities and
then propagated to detailed lower-layer features, leading to a fine-grained
segmentation result with well-preserved boundaries. Experimental results on both
Cityscapes and Camvid datasets validate the effectiveness and efficiency of the
proposed approach. Code and models will be available at
\url{https://github.com/Fantasticarl/DDSM}.
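The abstract above describes two steps: sparsely sampling representative high-level features to form a compact support set, then propagating their context to low-level features via affinities. The following is a minimal NumPy sketch of that idea, not the paper's implementation: it substitutes a simple activation-norm criterion for the paper's learned dynamic sampler, and covers only the spatial (pixel) branch, not the channel branch. All function and variable names are illustrative.

```python
import numpy as np

def dual_sample_refine(f_low, f_high, k=4):
    """Hedged sketch of dynamic dual sampling (spatial branch only).

    f_low : (C, N_low) low-level features (N_low = H*W positions)
    f_high: (C, N_high) high-level features

    Selects the k high-level positions with the largest L2 norm (a
    stand-in for the paper's learned sampler), computes per-pixel
    affinities to them, and propagates the sampled context to f_low.
    """
    # 1. Sparse sampling: keep the k most "active" high-level positions.
    scores = np.linalg.norm(f_high, axis=0)          # (N_high,)
    idx = np.argsort(scores)[-k:]                    # top-k indices
    support = f_high[:, idx]                         # (C, k) compact support

    # 2. Affinity between every low-level pixel and the sampled support.
    logits = f_low.T @ support                       # (N_low, k)
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    affinity = np.exp(logits)
    affinity /= affinity.sum(axis=1, keepdims=True)  # row-wise softmax

    # 3. Aggregate sampled context and fuse it with the local details.
    context = support @ affinity.T                   # (C, N_low)
    return f_low + context

rng = np.random.default_rng(0)
f_low = rng.standard_normal((8, 16))   # e.g. a 4x4 low-level map, 8 channels
f_high = rng.standard_normal((8, 9))   # e.g. a 3x3 high-level map
out = dual_sample_refine(f_low, f_high, k=4)
print(out.shape)  # (8, 16)
```

Because only k positions are sampled, the affinity matrix is N_low x k rather than N_low x N_high, which is the source of the efficiency claim in the abstract.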
Related papers
- A Noise and Edge extraction-based dual-branch method for Shallowfake and Deepfake Localization [15.647035299476894]
We develop a dual-branch model that integrates manually designed feature noise with conventional CNN features.
The model outperforms existing state-of-the-art (SoTA) models.
arXiv Detail & Related papers (2024-09-02T02:18:34Z) - StyDeSty: Min-Max Stylization and Destylization for Single Domain Generalization [85.18995948334592]
Single domain generalization (single DG) aims at learning a robust model generalizable to unseen domains from only one training domain.
State-of-the-art approaches have mostly relied on data augmentations, such as adversarial perturbation and style enhancement, to synthesize new data.
We propose StyDeSty, which explicitly accounts for the alignment of the source and pseudo domains in the process of data augmentation.
arXiv Detail & Related papers (2024-06-01T02:41:34Z) - MetaSeg: Content-Aware Meta-Net for Omni-Supervised Semantic
Segmentation [17.59676962334776]
Noisy labels, inevitably existing in pseudo segmentation labels generated from weak object-level annotations, severely hamper model optimization for semantic segmentation.
Inspired by recent advances in meta learning, we argue that rather than struggling to tolerate noise hidden behind clean labels passively, a more feasible solution would be to find out the noisy regions actively.
We present a novel meta-learning-based semantic segmentation method, MetaSeg, that comprises a primary content-aware meta-net (CAM-Net) to serve as a noise indicator for an arbitrary segmentation model counterpart.
arXiv Detail & Related papers (2024-01-22T07:31:52Z) - IDRNet: Intervention-Driven Relation Network for Semantic Segmentation [34.09179171102469]
Co-occurrent visual patterns suggest that pixel relation modeling facilitates dense prediction tasks.
Despite the impressive results, existing paradigms often suffer from inadequate or ineffective contextual information aggregation.
We propose a novel Intervention-Driven Relation Network (IDRNet).
arXiv Detail & Related papers (2023-10-16T18:37:33Z) - Improving Semi-Supervised and Domain-Adaptive Semantic Segmentation with
Self-Supervised Depth Estimation [94.16816278191477]
We present a framework for semi-supervised and domain-adaptive semantic segmentation.
It is enhanced by self-supervised monocular depth estimation trained only on unlabeled image sequences.
We validate the proposed model on the Cityscapes dataset.
arXiv Detail & Related papers (2021-08-28T01:33:38Z) - Global Aggregation then Local Distribution for Scene Parsing [99.1095068574454]
We show that our approach can be modularized as an end-to-end trainable block and easily plugged into existing semantic segmentation networks.
Our approach allows us to set a new state of the art on major semantic segmentation benchmarks including Cityscapes, ADE20K, Pascal Context, Camvid and COCO-stuff.
arXiv Detail & Related papers (2021-07-28T03:46:57Z) - EAN: Event Adaptive Network for Enhanced Action Recognition [66.81780707955852]
We propose a unified action recognition framework to investigate the dynamic nature of video content.
First, when extracting local cues, we generate dynamic-scale spatio-temporal kernels to adaptively fit the diverse events.
Second, to accurately aggregate these cues into a global video representation, we propose to mine the interactions only among a few selected foreground objects by a Transformer.
arXiv Detail & Related papers (2021-07-22T15:57:18Z) - BoundarySqueeze: Image Segmentation as Boundary Squeezing [104.43159799559464]
We propose a novel method for fine-grained high-quality image segmentation of both objects and scenes.
Inspired by dilation and erosion from morphological image processing, we treat pixel-level segmentation as squeezing the object boundary.
Our method yields large gains on COCO and Cityscapes for both instance and semantic segmentation, and outperforms the previous state-of-the-art PointRend in both accuracy and speed under the same setting.
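The dilation/erosion intuition in this blurb can be illustrated with a morphological gradient: the boundary is the band squeezed between a mask's dilation and its erosion. This is a minimal NumPy sketch of that intuition only, not the BoundarySqueeze method itself; the 3x3 structuring element and helper names are illustrative.

```python
import numpy as np

def dilate(mask):
    """3x3 binary dilation via a padded sliding maximum."""
    p = np.pad(mask, 1)
    return np.max([p[i:i + mask.shape[0], j:j + mask.shape[1]]
                   for i in range(3) for j in range(3)], axis=0)

def erode(mask):
    """3x3 binary erosion = complement of dilating the complement."""
    return 1 - dilate(1 - mask)

def boundary_band(mask):
    """Morphological gradient: the band between dilation and erosion."""
    return dilate(mask) - erode(mask)

mask = np.zeros((7, 7), dtype=int)
mask[2:5, 2:5] = 1                 # a 3x3 square object
band = boundary_band(mask)
print(band.sum())  # 24: the 5x5 dilation minus the single eroded pixel
```

Squeezing this band from both sides (shrinking the dilation, growing the erosion) toward the true contour is, loosely, the refinement the paper's title alludes to.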
arXiv Detail & Related papers (2021-05-25T04:58:51Z) - Improving Semantic Segmentation via Decoupled Body and Edge Supervision [89.57847958016981]
Existing semantic segmentation approaches either aim to improve an object's inner consistency by modeling the global context, or refine object details along boundaries by multi-scale feature fusion.
In this paper, a new paradigm for semantic segmentation is proposed.
Our insight is that appealing performance of semantic segmentation requires explicitly modeling the object body and edge, which correspond to the low- and high-frequency components of the image.
We show that the proposed framework with various baselines or backbone networks leads to better object inner consistency and object boundaries.
arXiv Detail & Related papers (2020-07-20T12:11:22Z) - Unsupervised Learning Consensus Model for Dynamic Texture Videos
Segmentation [12.462608802359936]
We present an effective unsupervised learning consensus model (ULCM) for the segmentation of dynamic texture videos.
In the proposed model, the values of the requantized local binary pattern (LBP) histogram around the pixel to be classified are used as features.
Experiments conducted on the challenging SynthDB dataset show that ULCM is significantly faster, simpler to implement, and requires fewer parameters.
arXiv Detail & Related papers (2020-06-29T16:40:59Z)
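The ULCM blurb describes its per-pixel feature: a requantized histogram of local binary patterns around the pixel. As a rough illustration of that feature (not the paper's exact variant, whose neighbourhood size and requantization are not given here), a plain 8-neighbour LBP with a 16-bin histogram can be sketched as follows; all names are illustrative.

```python
import numpy as np

def lbp_codes(img):
    """8-neighbour local binary pattern code for each interior pixel.

    Each neighbour >= centre contributes one bit, giving a code in [0, 255].
    """
    c = img[1:-1, 1:-1]
    # Clockwise neighbour offsets starting at the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(c, dtype=np.int32)
    for bit, (dy, dx) in enumerate(offsets):
        nb = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        codes |= (nb >= c).astype(np.int32) << bit
    return codes

def lbp_histogram(img, bins=16):
    """Requantized LBP histogram: 256 raw codes folded into `bins` bins."""
    h, _ = np.histogram(lbp_codes(img), bins=bins, range=(0, 256))
    return h / h.sum()

rng = np.random.default_rng(1)
frame = rng.integers(0, 256, size=(16, 16))
feat = lbp_histogram(frame)
print(feat.shape)  # (16,) normalized histogram feature
```

Computing this histogram over a window around each pixel gives the per-pixel feature vector that the consensus model then clusters into texture segments.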
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences.