Related papers: UnSeGArmaNet: Unsupervised Image Segmentation using Graph Neural Networks with Convolutional ARMA Filters

URL: http://arxiv.org/abs/2410.06114v1
Date: Tue, 8 Oct 2024 15:10:09 GMT
Title: UnSeGArmaNet: Unsupervised Image Segmentation using Graph Neural Networks with Convolutional ARMA Filters
Authors: Kovvuri Sai Gopal Reddy, Bodduluri Saran, A. Mudit Adityaja, Saurabh J. Shigwan, Nitin Kumar, Snehasis Mukherjee
Abstract summary: We propose an unsupervised segmentation framework with a pre-trained ViT. By harnessing the graph structure inherent within the image, the proposed method achieves a notable performance in segmentation. The proposed method provides state-of-the-art performance (even comparable to supervised methods) on benchmark image segmentation datasets.
Score: 10.940349832919699
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: The data-hungry approach of supervised classification drives the interest of researchers toward unsupervised approaches, especially for problems such as medical image segmentation, where labeled data are difficult to obtain. Motivated by the recent success of Vision Transformers (ViT) in various computer vision tasks, we propose an unsupervised segmentation framework with a pre-trained ViT. Moreover, by harnessing the graph structure inherent within the image, the proposed method achieves notable segmentation performance, especially on medical images. We further introduce a modularity-based loss function coupled with an Auto-Regressive Moving Average (ARMA) filter to capture the inherent graph topology within the image. Finally, we observe that employing Scaled Exponential Linear Unit (SELU) and SiLU (Swish) activation functions within the proposed Graph Neural Network (GNN) architecture enhances segmentation performance. The proposed method provides state-of-the-art performance (even comparable to supervised methods) on benchmark image segmentation datasets such as ECSSD, DUTS, and CUB, as well as challenging medical image segmentation datasets such as KVASIR, CVC-ClinicDB, and ISIC-2018. The GitHub repository of the code is available at https://github.com/ksgr5566/UnSeGArmaNet.
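Illustrative note: the abstract mentions a modularity-based loss but does not state its form. As a rough sketch only, the snippet below computes the standard Newman modularity of a hard partition of a small graph and negates it to obtain a quantity to minimize. The function names `modularity` and `modularity_loss` and the toy adjacency matrix are hypothetical; a trainable loss would typically use soft cluster assignments produced by the GNN, and the paper's actual formulation may differ.

```python
def modularity(adj, labels):
    """Newman modularity Q of a hard partition (assumed form, not the paper's code).

    adj    -- symmetric adjacency matrix as a list of lists
    labels -- cluster label for each node
    """
    n = len(adj)
    degrees = [sum(row) for row in adj]
    two_m = sum(degrees)  # twice the total edge weight
    if two_m == 0:
        return 0.0
    total = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                # observed edge weight minus the weight expected by chance
                total += adj[i][j] - degrees[i] * degrees[j] / two_m
    return total / two_m


def modularity_loss(adj, labels):
    """Negated modularity, so that minimizing the loss favors better partitions."""
    return -modularity(adj, labels)


# Toy graph (hypothetical): two disconnected edges, 0-1 and 2-3.
toy_adj = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
print(modularity(toy_adj, [0, 0, 1, 1]))  # partition that follows the edges
print(modularity(toy_adj, [0, 1, 0, 1]))  # partition that cuts every edge
```

On this toy graph the partition that keeps connected nodes together scores higher than the one that separates them, which is the behavior a modularity-based objective rewards.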

Related papers

Image Segmentation: Inducing graph-based learning [4.499833362998488]
This study explores the potential of graph neural networks (GNNs) to enhance semantic segmentation across diverse image modalities. GNNs explicitly model relationships between image regions by constructing and operating on a graph representation of the image features. Our analysis demonstrates the versatility of GNNs in addressing diverse segmentation challenges and highlights their potential to improve segmentation accuracy in various applications.
arXiv Detail & Related papers (2025-01-07T13:09:44Z)
UnSegMedGAT: Unsupervised Medical Image Segmentation using Graph Attention Networks Clustering [10.862430265350804]
We propose an unsupervised segmentation framework using a pre-trained Dino-ViT. We leverage the inherent graph structure within the image to realize a significant performance gain for segmentation in medical images. Our method achieves state-of-the-art performance, matching or even significantly surpassing existing (semi-)supervised techniques such as MedSAM.
arXiv Detail & Related papers (2024-11-04T10:42:21Z)
UnSegGNet: Unsupervised Image Segmentation using Graph Neural Networks [9.268228808049951]
This research contributes to the broader field of unsupervised medical imaging and computer vision. It presents an innovative methodology for image segmentation that aligns with real-world challenges. The proposed method holds promise for diverse applications, including medical imaging, remote sensing, and object recognition.
arXiv Detail & Related papers (2024-05-09T19:02:00Z)
Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing. Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery. We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
Graph Information Bottleneck for Remote Sensing Segmentation [8.879224757610368]
This paper treats images as graph structures and introduces a simple contrastive vision GNN architecture for remote sensing segmentation. Specifically, we construct a node-masked and edge-masked graph view to obtain an optimal graph structure representation. We replace the convolutional module in UNet with the SC-ViG module to complete the segmentation and classification tasks.
arXiv Detail & Related papers (2023-12-05T07:23:22Z)
Unsupervised Domain Adaptation with Histogram-gated Image Translation for Delayered IC Image Analysis [2.720699926154399]
Histogram-gated Image Translation (HGIT) is an unsupervised domain adaptation framework which transforms images from a given source dataset to the domain of a target dataset. Our method achieves the best performance compared to the reported domain adaptation techniques, and is also reasonably close to the fully supervised benchmark.
arXiv Detail & Related papers (2022-09-27T15:53:22Z)
Two-Stream Graph Convolutional Network for Intra-oral Scanner Image Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes. Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z)
Learning Hierarchical Graph Representation for Image Manipulation Detection [50.04902159383709]
The objective of image manipulation detection is to identify and locate the manipulated regions in the images. Recent approaches mostly adopt the sophisticated Convolutional Neural Networks (CNNs) to capture the tampering artifacts left in the images. We propose a hierarchical Graph Convolutional Network (HGCN-Net), which consists of two parallel branches.
arXiv Detail & Related papers (2022-01-15T01:54:25Z)
Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models. Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings. We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)
Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation [49.90178055521207]
This work addresses weakly supervised semantic segmentation (WSSS), with the goal of bridging the gap between image-level annotations and pixel-level segmentation. We formulate WSSS as a novel group-wise learning task that explicitly models semantic dependencies in a group of images to estimate more reliable pseudo ground-truths. In particular, we devise a graph neural network (GNN) for group-wise semantic mining, wherein input images are represented as graph nodes.
arXiv Detail & Related papers (2020-12-09T12:40:13Z)
SCG-Net: Self-Constructing Graph Neural Networks for Semantic Segmentation [23.623276007011373]
We propose a module that learns a long-range dependency graph directly from the image and uses it to propagate contextual information efficiently. The module is optimised via a novel adaptive diagonal enhancement method and a variational lower bound. When incorporated into a neural network (SCG-Net), semantic segmentation is performed in an end-to-end manner and competitive performance is achieved.
arXiv Detail & Related papers (2020-09-03T12:13:09Z)
High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification [84.43394420267794]
We propose a novel framework that learns high-order relation and topology information for discriminative features and robust alignment. Our framework significantly outperforms the state of the art by 6.5% mAP on the Occluded-Duke dataset.
arXiv Detail & Related papers (2020-03-18T12:18:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.