Coupling Global Context and Local Contents for Weakly-Supervised
Semantic Segmentation
- URL: http://arxiv.org/abs/2304.09059v2
- Date: Wed, 26 Apr 2023 06:33:30 GMT
- Title: Coupling Global Context and Local Contents for Weakly-Supervised
Semantic Segmentation
- Authors: Chunyan Wang, Dong Zhang, Liyan Zhang, Jinhui Tang
- Abstract summary: We propose a single-stage Weakly-Supervised Semantic Segmentation (WSSS) model supervised only by image-level class labels.
A flexible context aggregation module is proposed to capture the global object context in different granular spaces.
A semantically consistent feature fusion module is proposed in a bottom-up parameter-learnable fashion to aggregate the fine-grained local contents.
- Score: 54.419401869108846
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Thanks to the advantages of the friendly annotations and the satisfactory
performance, Weakly-Supervised Semantic Segmentation (WSSS) approaches have
been extensively studied. Recently, single-stage WSSS has been revived to
alleviate the expensive computational costs and the complicated
training procedures of multi-stage WSSS. However, the results of such an immature
model suffer from problems of background incompleteness and object
incompleteness. We empirically find that they are caused by the insufficiency
of the global object context and the lack of the local regional contents,
respectively. Based on these observations, we propose a single-stage WSSS model
supervised only by image-level class labels, termed the Weakly Supervised
Feature Coupling Network (WS-FCN), which can capture the multi-scale context
formed from the adjacent feature grids, and encode the fine-grained spatial
information from the low-level features into the high-level ones. Specifically,
a flexible context aggregation module is proposed to capture the global object
context in different granular spaces. Besides, a semantically consistent
feature fusion module is proposed in a bottom-up parameter-learnable fashion to
aggregate the fine-grained local contents. Based on these two modules, WS-FCN
is trained end-to-end in a self-supervised fashion. Extensive experimental
results on the challenging PASCAL VOC 2012 and MS COCO 2014 demonstrate the
effectiveness and efficiency of WS-FCN, which achieves state-of-the-art
results of 65.02% and 64.22% mIoU on the PASCAL VOC 2012 val and test sets,
respectively, and 34.12% mIoU on the MS COCO 2014 val set. The code and weights
have been released at: https://github.com/ChunyanWang1/ws-fcn.
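The abstract reports results as mIoU (mean intersection-over-union). As a rough illustration of how such a figure is usually computed, here is a minimal sketch in plain Python. It is not taken from the WS-FCN code release: the function name `mean_iou`, its parameters, and the convention of skipping classes absent from both prediction and ground truth are assumptions made for this example.

```python
from typing import Optional, Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int,
             ignore_index: Optional[int] = None) -> float:
    """Mean IoU over flattened per-pixel class labels (illustrative sketch)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes
    for p, t in zip(pred, target):
        if ignore_index is not None and t == ignore_index:
            continue  # skip pixels marked as "void" in the ground truth
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1
    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: classes absent from both are skipped
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Four pixels, two classes: IoU is 1/2 for class 0 and 2/3 for class 1.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

Benchmark evaluations typically accumulate the per-class counts over the whole dataset before dividing, which this function also does if the labels of all images are concatenated into one sequence.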
Related papers
- Enhanced Semantic Segmentation for Large-Scale and Imbalanced Point Clouds [6.253217784798542]
Small-sized objects are prone to be under-sampled or misclassified due to their low occurrence frequency.
We propose the Multilateral Cascading Network (MCNet) for large-scale and sample-imbalanced point cloud scenes.
arXiv Detail & Related papers (2024-09-21T02:23:01Z)
- FLea: Addressing Data Scarcity and Label Skew in Federated Learning via Privacy-preserving Feature Augmentation [15.298650496155508]
Federated Learning (FL) enables model development by leveraging data distributed across numerous edge devices without transferring local data to a central server.
Existing FL methods face challenges when dealing with scarce and label-skewed data across devices, resulting in local model overfitting and drift.
We propose a pioneering framework called FLea, incorporating the following key components.
arXiv Detail & Related papers (2023-12-04T20:24:09Z)
- Submodel Partitioning in Hierarchical Federated Learning: Algorithm Design and Convergence Analysis [15.311309249848739]
Hierarchical federated learning (HFL) has demonstrated promising scalability advantages over the traditional "star-topology" architecture-based federated learning (FL).
In this paper, we propose hierarchical independent submodel training (HIST) for resource-constrained Internet of Things (IoT) devices.
The key idea behind HIST is a hierarchical version of model partitioning, where we partition the global model into disjoint submodels in each round and distribute them across different cells.
arXiv Detail & Related papers (2023-10-27T04:42:59Z)
- Global Relation Modeling and Refinement for Bottom-Up Human Pose Estimation [4.24515544235173]
We propose a convolutional neural network for bottom-up human pose estimation.
Our model has the ability to focus on different granularity from local to global regions.
Our results on the COCO and CrowdPose datasets demonstrate that it is an efficient framework for multi-person pose estimation.
arXiv Detail & Related papers (2023-03-27T02:54:08Z)
- Disentangled Federated Learning for Tackling Attributes Skew via Invariant Aggregation and Diversity Transferring [104.19414150171472]
Attribute skew prevents current federated learning (FL) frameworks from following consistent optimization directions among the clients.
We propose disentangled federated learning (DFL) to disentangle the domain-specific and cross-invariant attributes into two complementary branches.
Experiments verify that DFL facilitates FL with higher performance, better interpretability, and faster convergence rate, compared with SOTA FL methods.
arXiv Detail & Related papers (2022-06-14T13:12:12Z)
- Global Aggregation then Local Distribution for Scene Parsing [99.1095068574454]
We show that our approach can be modularized as an end-to-end trainable block and easily plugged into existing semantic segmentation networks.
Our approach allows us to build new state of the art on major semantic segmentation benchmarks including Cityscapes, ADE20K, Pascal Context, Camvid and COCO-stuff.
arXiv Detail & Related papers (2021-07-28T03:46:57Z)
- Revisiting LSTM Networks for Semi-Supervised Text Classification via Mixed Objective Function [106.69643619725652]
We develop a training strategy that allows even a simple BiLSTM model, when trained with cross-entropy loss, to achieve competitive results.
We report state-of-the-art results on text classification tasks across several benchmark datasets.
arXiv Detail & Related papers (2020-09-08T21:55:22Z)
- Prior Guided Feature Enrichment Network for Few-Shot Segmentation [64.91560451900125]
State-of-the-art semantic segmentation methods require sufficient labeled data to achieve good results.
Few-shot segmentation is proposed to tackle this problem by learning a model that quickly adapts to new classes with a few labeled support samples.
These frameworks still face the challenge of reduced generalization ability on unseen classes due to inappropriate use of high-level semantic information.
arXiv Detail & Related papers (2020-08-04T10:41:32Z)
- Global Context-Aware Progressive Aggregation Network for Salient Object Detection [117.943116761278]
We propose a novel network named GCPANet to integrate low-level appearance features, high-level semantic features, and global context features.
We show that the proposed approach outperforms the state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2020-03-02T04:26:10Z)