Graph-Segmenter: Graph Transformer with Boundary-aware Attention for
Semantic Segmentation
- URL: http://arxiv.org/abs/2308.07592v1
- Date: Tue, 15 Aug 2023 06:30:19 GMT
- Title: Graph-Segmenter: Graph Transformer with Boundary-aware Attention for
Semantic Segmentation
- Authors: Zizhang Wu, Yuanzhu Gan, Tianhao Xu, Fan Wang
- Abstract summary: We propose a Graph-Segmenter, including a Graph Transformer and a Boundary-aware Attention module.
Our proposed network, a Graph Transformer with Boundary-aware Attention, can achieve state-of-the-art segmentation performance.
- Score: 14.716537714651576
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The transformer-based semantic segmentation approaches, which divide the
image into different regions by sliding windows and model the relation inside
each window, have achieved outstanding success. However, since the relation
modeling between windows was not the primary emphasis of previous work, it was
not fully utilized. To address this issue, we propose a Graph-Segmenter,
including a Graph Transformer and a Boundary-aware Attention module, which is
an effective network for simultaneously modeling the more profound relation
between windows in a global view and various pixels inside each window as a
local one, and for substantial low-cost boundary adjustment. Specifically, we
treat every window and pixel inside the window as nodes to construct graphs for
both views and devise the Graph Transformer. The introduced boundary-aware
attention module optimizes the edge information of the target objects by
modeling the relationships between pixels on the object's edge. Extensive
experiments on three widely used semantic segmentation datasets (Cityscapes,
ADE-20k and PASCAL Context) demonstrate that our proposed network, a Graph
Transformer with Boundary-aware Attention, can achieve state-of-the-art
segmentation performance.
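The abstract's idea of treating every window, and every pixel inside a window, as graph nodes can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`partition_windows`, `attention_adjacency`, `build_graphs`), the mean pooling used to summarize a window, and the scaled dot-product edge weights are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import math


def partition_windows(feature_map, window):
    """Split an H x W grid of feature vectors into non-overlapping windows.

    Returns a list of windows; each window is a flat list of pixel vectors.
    """
    height, width = len(feature_map), len(feature_map[0])
    assert height % window == 0 and width % window == 0
    windows = []
    for top in range(0, height, window):
        for left in range(0, width, window):
            pixels = [feature_map[r][c]
                      for r in range(top, top + window)
                      for c in range(left, left + window)]
            windows.append(pixels)
    return windows


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_adjacency(nodes):
    """Row-normalized edge weights from scaled dot-product similarity.

    Assumption: edge weights are attention-style; the paper may differ.
    """
    scale = math.sqrt(len(nodes[0]))
    return [softmax([sum(a * b for a, b in zip(u, v)) / scale for v in nodes])
            for u in nodes]


def mean_vector(vectors):
    """Average a list of equal-length vectors (assumed window summary)."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def build_graphs(feature_map, window):
    """Build one local graph per window and one global graph over windows."""
    windows = partition_windows(feature_map, window)
    # Local view: pixels inside each window are the nodes.
    local_graphs = [attention_adjacency(pixels) for pixels in windows]
    # Global view: each window, summarized by its mean feature, is a node.
    window_nodes = [mean_vector(pixels) for pixels in windows]
    global_graph = attention_adjacency(window_nodes)
    return local_graphs, global_graph


# Toy 4 x 4 feature map with 2-dimensional features per pixel.
feature_map = [[[float(r), float(c)] for c in range(4)] for r in range(4)]
local_graphs, global_graph = build_graphs(feature_map, window=2)
print(len(local_graphs), len(local_graphs[0]), len(global_graph))
```

With a 4 x 4 map and 2 x 2 windows, the sketch yields four local graphs of four pixel nodes each and one global graph of four window nodes. Every adjacency row sums to 1 because of the softmax normalization.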
Related papers
- Graph Transformer GANs with Graph Masked Modeling for Architectural
Layout Generation [153.92387500677023]
We present a novel graph Transformer generative adversarial network (GTGAN) to learn effective graph node relations.
The proposed graph Transformer encoder combines graph convolutions and self-attentions in a Transformer to model both local and global interactions.
We also propose a novel self-guided pre-training method for graph representation learning.
arXiv Detail & Related papers (2024-01-15T14:36:38Z)
- Graph Information Bottleneck for Remote Sensing Segmentation [8.879224757610368]
This paper treats images as graph structures and introduces a simple contrastive vision GNN architecture for remote sensing segmentation.
Specifically, we construct a node-masked and edge-masked graph view to obtain an optimal graph structure representation.
We replace the convolutional module in UNet with the SC-ViG module to complete the segmentation and classification tasks.
arXiv Detail & Related papers (2023-12-05T07:23:22Z)
- Dynamic Graph Message Passing Networks for Visual Recognition [112.49513303433606]
Modelling long-range dependencies is critical for scene understanding tasks in computer vision.
A fully-connected graph is beneficial for such modelling, but its computational overhead is prohibitive.
We propose a dynamic graph message passing network that significantly reduces the computational complexity.
arXiv Detail & Related papers (2022-09-20T14:41:37Z)
- Graph Reasoning Transformer for Image Parsing [67.76633142645284]
We propose a novel Graph Reasoning Transformer (GReaT) for image parsing to enable image patches to interact following a relation reasoning pattern.
Compared to the conventional transformer, GReaT has higher interaction efficiency and a more purposeful interaction pattern.
Results show that GReaT achieves consistent performance gains with slight computational overheads on the state-of-the-art transformer baselines.
arXiv Detail & Related papers (2022-09-20T08:21:37Z)
- BI-GCN: Boundary-Aware Input-Dependent Graph Convolution Network for
Biomedical Image Segmentation [21.912509900254364]
We apply graph convolution to the segmentation task and propose an improved Laplacian.
Our method outperforms the state-of-the-art approaches on the segmentation of polyps in colonoscopy images and of the optic disc and optic cup in colour fundus images.
arXiv Detail & Related papers (2021-10-27T21:12:27Z)
- Segmenter: Transformer for Semantic Segmentation [79.9887988699159]
We introduce Segmenter, a transformer model for semantic segmentation.
We build on the recent Vision Transformer (ViT) and extend it to semantic segmentation.
It outperforms the state of the art on the challenging ADE20K dataset and performs on-par on Pascal Context and Cityscapes.
arXiv Detail & Related papers (2021-05-12T13:01:44Z)
- Segmentation-grounded Scene Graph Generation [47.34166260639392]
We propose a framework for pixel-level segmentation-grounded scene graph generation.
Our framework is agnostic to the underlying scene graph generation method.
It is learned in a multi-task manner with both target and auxiliary datasets.
arXiv Detail & Related papers (2021-04-29T08:54:08Z)
- Improving Semantic Segmentation via Decoupled Body and Edge Supervision [89.57847958016981]
Existing semantic segmentation approaches either aim to improve the object's inner consistency by modeling the global context, or refine object details along their boundaries by multi-scale feature fusion.
In this paper, a new paradigm for semantic segmentation is proposed.
Our insight is that appealing performance of semantic segmentation requires explicitly modeling the object body and edge, which correspond to the high and low frequency of the image.
We show that the proposed framework with various baselines or backbone networks leads to better object inner consistency and object boundaries.
arXiv Detail & Related papers (2020-07-20T12:11:22Z)
- Bidirectional Graph Reasoning Network for Panoptic Segmentation [126.06251745669107]
We introduce a Bidirectional Graph Reasoning Network (BGRNet) to mine the intra-modular and inter-modular relations within and between foreground things and background stuff classes.
BGRNet first constructs image-specific graphs in both instance and semantic segmentation branches that enable flexible reasoning at the proposal level and class level.
arXiv Detail & Related papers (2020-04-14T02:32:10Z)
- Relation Transformer Network [25.141472361426818]
We propose a novel transformer formulation for scene graph generation and relation prediction.
We leverage the encoder-decoder architecture of the transformer for rich feature embedding of nodes and edges.
Our relation prediction module classifies the directed relation from the learned node and edge embedding.
arXiv Detail & Related papers (2020-04-13T20:47:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.