Full-scale Representation Guided Network for Retinal Vessel Segmentation
- URL: http://arxiv.org/abs/2501.18921v1
- Date: Fri, 31 Jan 2025 06:52:57 GMT
- Title: Full-scale Representation Guided Network for Retinal Vessel Segmentation
- Authors: Sunyong Seo, Huisu Yoon, Semin Kim, Jongha Lee,
- Abstract summary: U-Net has remained state-of-the-art (SOTA) for retinal vessel segmentation over the past decade.<n>We introduce a Full Scale Guided Network (FSG-Net), where the feature representation network with modernized convolution blocks extracts full-scale information.<n>We show that the proposed network demonstrates competitive results compared to current SOTA models on various public datasets.
- Score: 1.3024517678456733
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The U-Net architecture and its variants have remained state-of-the-art (SOTA) for retinal vessel segmentation over the past decade. In this study, we introduce a Full Scale Guided Network (FSG-Net), where the feature representation network with modernized convolution blocks extracts full-scale information and the guided convolution block refines that information. Attention-guided filter is introduced to the guided convolution block under the interpretation that the filter behaves like the unsharp mask filter. Passing full-scale information to the attention block allows for the generation of improved attention maps, which are then passed to the attention-guided filter, resulting in performance enhancement of the segmentation network. The structure preceding the guided convolution block can be replaced by any U-Net variant, which enhances the scalability of the proposed approach. For a fair comparison, we re-implemented recent studies available in public repositories to evaluate their scalability and reproducibility. Our experiments also show that the proposed network demonstrates competitive results compared to current SOTA models on various public datasets. Ablation studies demonstrate that the proposed model is competitive with much smaller parameter sizes. Lastly, by applying the proposed model to facial wrinkle segmentation, we confirmed the potential for scalability to similar tasks in other domains. Our code is available on https://github.com/ZombaSY/FSG-Net-pytorch.
Related papers
- Threshold Attention Network for Semantic Segmentation of Remote Sensing Images [3.5449012582104795]
Self-attention mechanism (SA) is an effective approach for designing segmentation networks.<n>We propose a novel threshold attention mechanism (TAM) for semantic segmentation.<n>Based on TAM, we present a threshold attention network (TANet) for semantic segmentation.
arXiv Detail & Related papers (2025-01-14T10:09:55Z) - The Master Key Filters Hypothesis: Deep Filters Are General [51.900488744931785]
Convolutional neural network (CNN) filters become increasingly specialized in deeper layers.
Recent observations of clusterable repeating patterns in depthwise separable CNNs (DS-CNNs) trained on ImageNet motivated this paper.
Our analysis of DS-CNNs reveals that deep filters maintain generality, contradicting the expected transition to class-specific filters.
arXiv Detail & Related papers (2024-12-21T20:04:23Z) - GraftNet: Towards Domain Generalized Stereo Matching with a
Broad-Spectrum and Task-Oriented Feature [2.610470075814367]
We propose to leverage the feature of a model trained on large-scale datasets to deal with the domain shift.
With the cosine similarity based cost volume as a bridge, the feature will be grafted to an ordinary cost aggregation module.
Experiments show that the model generalization ability can be improved significantly with this broad-spectrum and task-oriented feature.
arXiv Detail & Related papers (2022-04-01T03:10:04Z) - Swin Transformer coupling CNNs Makes Strong Contextual Encoders for VHR
Image Road Extraction [11.308473487002782]
We propose a dual-branch network block named ConSwin that combines ResNet and SwinTransformers for road extraction tasks.
Our proposed method outperforms state-of-the-art methods on both the Massachusetts and CHN6-CUG datasets in terms of overall accuracy, IOU, and F1 indicators.
arXiv Detail & Related papers (2022-01-10T06:05:12Z) - Augmenting Convolutional networks with attention-based aggregation [55.97184767391253]
We show how to augment any convolutional network with an attention-based global map to achieve non-local reasoning.
We plug this learned aggregation layer with a simplistic patch-based convolutional network parametrized by 2 parameters (width and depth)
It yields surprisingly competitive trade-offs between accuracy and complexity, in particular in terms of memory consumption.
arXiv Detail & Related papers (2021-12-27T14:05:41Z) - Crowd Counting via Perspective-Guided Fractional-Dilation Convolution [75.36662947203192]
This paper proposes a novel convolution neural network-based crowd counting method, termed Perspective-guided Fractional-Dilation Network (PFDNet)
By modeling the continuous scale variations, the proposed PFDNet is able to select the proper fractional dilation kernels for adapting to different spatial locations.
It significantly improves the flexibility of the state-of-the-arts that only consider the discrete representative scales.
arXiv Detail & Related papers (2021-07-08T07:57:00Z) - Adversarial Feature Augmentation and Normalization for Visual
Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z) - Scaling Wide Residual Networks for Panoptic Segmentation [29.303735643858026]
Wide Residual Networks (Wide-ResNets) are a shallow but wide model variant of the Residual Networks (ResNets)
We revisit its architecture design for the recent challenging panoptic segmentation task, which aims to unify semantic segmentation and instance segmentation.
We demonstrate that such a simple scaling scheme, coupled with grid search, identifies several SWideRNets that significantly advance state-of-the-art performance on panoptic segmentation datasets in both the fast model regime and strong model regime.
arXiv Detail & Related papers (2020-11-23T19:14:11Z) - Towards Efficient Scene Understanding via Squeeze Reasoning [71.1139549949694]
We propose a novel framework called Squeeze Reasoning.
Instead of propagating information on the spatial map, we first learn to squeeze the input feature into a channel-wise global vector.
We show that our approach can be modularized as an end-to-end trained block and can be easily plugged into existing networks.
arXiv Detail & Related papers (2020-11-06T12:17:01Z) - Semantic Segmentation With Multi Scale Spatial Attention For Self
Driving Cars [2.7317088388886384]
We present a novel neural network using multi scale feature fusion at various scales for accurate and efficient semantic image segmentation.
We used ResNet based feature extractor, dilated convolutional layers in downsampling part, atrous convolutional layers in the upsampling part and used concat operation to merge them.
A new attention module is proposed to encode more contextual information and enhance the receptive field of the network.
arXiv Detail & Related papers (2020-06-30T20:19:09Z) - ResNeSt: Split-Attention Networks [86.25490825631763]
We present a modularized architecture, which applies the channel-wise attention on different network branches to leverage their success in capturing cross-feature interactions and learning diverse representations.
Our model, named ResNeSt, outperforms EfficientNet in accuracy and latency trade-off on image classification.
arXiv Detail & Related papers (2020-04-19T20:40:31Z) - Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.