UACANet: Uncertainty Augmented Context Attention for Polyp Segmentation
- URL: http://arxiv.org/abs/2107.02368v1
- Date: Tue, 6 Jul 2021 03:11:12 GMT
- Title: UACANet: Uncertainty Augmented Context Attention for Polyp Segmentation
- Authors: Taehun Kim, Hyemin Lee, Daijin Kim
- Abstract summary: We construct a modified U-Net-shaped network with an additional encoder and decoder.
In each prediction module, the previously predicted saliency map is utilized to compute foreground, background, and uncertain area maps.
We achieve 76.6% mean Dice on the ETIS dataset, a 13.8% improvement over the previous state-of-the-art method.
- Score: 12.089183640843416
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose the Uncertainty Augmented Context Attention network (UACANet)
for polyp segmentation, which considers the uncertain area of the saliency map.
We construct a modified U-Net-shaped network with an additional encoder and
decoder, compute a saliency map in each bottom-up stream prediction module, and
propagate it to the next prediction module. In each prediction module, the
previously predicted saliency map is utilized to compute foreground, background,
and uncertain area maps, and we aggregate the feature map with the three area
maps into one representation each. Then we compute the relation between each
representation and each pixel in the feature map. We conduct experiments on
five popular polyp segmentation benchmarks, Kvasir, CVC-ClinicDB, ETIS,
CVC-ColonDB and CVC-300, and achieve state-of-the-art performance. In
particular, we achieve 76.6% mean Dice on the ETIS dataset, a 13.8% improvement
over the previous state-of-the-art method.
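The foreground/background/uncertain decomposition described in the abstract can be sketched by thresholding the predicted saliency map. The thresholds and the exact uncertainty formulation below are illustrative assumptions, not the paper's definition:

```python
import numpy as np

def area_maps(saliency, lo=0.3, hi=0.7):
    """Split a predicted saliency map (values in [0, 1]) into foreground,
    background, and uncertain area maps.

    The thresholds lo/hi are illustrative; UACANet derives the uncertain
    area from the saliency prediction itself, not necessarily by fixed
    thresholds.
    """
    foreground = (saliency >= hi).astype(np.float32)
    background = (saliency <= lo).astype(np.float32)
    # Pixels that are neither confidently foreground nor background
    uncertain = 1.0 - foreground - background
    return foreground, background, uncertain

saliency = np.array([[0.1, 0.5], [0.8, 0.95]])
fg, bg, unc = area_maps(saliency)
```

Each of the three maps can then be used to mask the feature map, yielding one aggregated representation per region, against which per-pixel relations (attention) are computed.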
Related papers
- SATR: Zero-Shot Semantic Segmentation of 3D Shapes [74.08209893396271]
We explore the task of zero-shot semantic segmentation of 3D shapes by using large-scale off-the-shelf 2D image recognition models.
We develop the Assignment with Topological Reweighting (SATR) algorithm and evaluate it on ShapeNetPart and our proposed FAUST benchmarks.
SATR achieves state-of-the-art performance and outperforms a baseline algorithm by 1.3% and 4% average mIoU.
arXiv Detail & Related papers (2023-04-11T00:43:16Z)
- Semantic Segmentation by Early Region Proxy [53.594035639400616]
We present a novel and efficient modeling that starts from interpreting the image as a tessellation of learnable regions.
To model region-wise context, we exploit Transformer to encode regions in a sequence-to-sequence manner.
Semantic segmentation is now carried out as per-region prediction on top of the encoded region embeddings.
arXiv Detail & Related papers (2022-03-26T10:48:32Z)
- Improving Lidar-Based Semantic Segmentation of Top-View Grid Maps by Learning Features in Complementary Representations [3.0413873719021995]
We introduce a novel way to predict semantic information from sparse, single-shot LiDAR measurements in the context of autonomous driving.
The approach is aimed specifically at improving the semantic segmentation of top-view grid maps.
For each representation a tailored deep learning architecture is developed to effectively extract semantic information.
arXiv Detail & Related papers (2022-03-02T14:49:51Z)
- ACDNet: Adaptively Combined Dilated Convolution for Monocular Panorama Depth Estimation [9.670696363730329]
We propose an ACDNet based on the adaptively combined dilated convolution to predict the dense depth map for a monocular panoramic image.
We conduct depth estimation experiments on three datasets (both virtual and real-world) and the experimental results demonstrate that our proposed ACDNet substantially outperforms the current state-of-the-art (SOTA) methods.
arXiv Detail & Related papers (2021-12-29T08:04:19Z)
- End-to-End Segmentation via Patch-wise Polygons Prediction [93.91375268580806]
The leading segmentation methods represent the output map as a pixel grid.
We study an alternative representation in which the object edges are modeled, per image patch, as a polygon with $k$ vertices that is coupled with per-patch label probabilities.
arXiv Detail & Related papers (2021-12-05T10:42:40Z)
- Dynamic Semantic Occupancy Mapping using 3D Scene Flow and Closed-Form Bayesian Inference [3.0389083199673337]
We leverage state-of-the-art semantic segmentation and 3D flow estimation using deep learning to provide measurements for map inference.
We develop a continuous (i.e., can be queried at arbitrary resolution) Bayesian model that propagates the scene with flow and infers a 3D semantic occupancy map with better performance than its static counterpart.
arXiv Detail & Related papers (2021-08-06T15:51:40Z)
- CAMERAS: Enhanced Resolution And Sanity preserving Class Activation Mapping for image saliency [61.40511574314069]
Backpropagation image saliency aims at explaining model predictions by estimating model-centric importance of individual pixels in the input.
We propose CAMERAS, a technique to compute high-fidelity backpropagation saliency maps without requiring any external priors.
arXiv Detail & Related papers (2021-06-20T08:20:56Z)
- CPP-Net: Context-aware Polygon Proposal Network for Nucleus Segmentation [71.81734047345587]
We propose a Context-aware Polygon Proposal Network (CPP-Net) for nucleus segmentation.
First, we sample a point set rather than one single pixel within each cell for distance prediction.
Second, we propose a Confidence-based Weighting Module, which adaptively fuses the predictions from the sampled point set.
Third, we introduce a novel Shape-Aware Perceptual (SAP) loss that constrains the shape of the predicted polygons.
arXiv Detail & Related papers (2021-02-13T05:59:52Z)
- TORNADO-Net: mulTiview tOtal vaRiatioN semAntic segmentation with Diamond inceptiOn module [23.112192919085825]
TORNADO-Net is a neural network for 3D LiDAR point cloud semantic segmentation.
We incorporate multi-view (bird's-eye and range) projection feature extraction with an encoder-decoder ResNet architecture.
We also take advantage of the fact that LiDAR data encompasses a 360-degree field of view and use circular padding.
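Because a range-view LiDAR projection wraps around horizontally, padding the azimuth axis circularly keeps convolutions seamless at the image border. A minimal NumPy sketch of the padding idea (TORNADO-Net itself applies this inside a convolutional network; the function name here is illustrative):

```python
import numpy as np

def circular_pad_azimuth(range_image, pad=1):
    """Pad only the last (azimuth/width) axis with wrapped values, so the
    left edge sees columns from the right edge and vice versa."""
    return np.pad(range_image, ((0, 0), (pad, pad)), mode="wrap")

img = np.array([[1, 2, 3, 4]])   # one row, four azimuth bins
padded = circular_pad_azimuth(img)
```

With `mode="wrap"`, the columns that fall off one side reappear on the other, which is exactly the continuity a 360-degree scan provides.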
arXiv Detail & Related papers (2020-08-24T16:32:41Z)
- Graph-PCNN: Two Stage Human Pose Estimation with Graph Pose Refinement [54.29252286561449]
We propose a two-stage graph-based and model-agnostic framework, called Graph-PCNN.
In the first stage, a heatmap regression network is applied to obtain a rough localization result, and a set of proposal keypoints, called guided points, is sampled.
In the second stage, a visual feature is extracted for each guided point based on its localization.
The relationships between guided points are explored by the graph pose refinement module to obtain more accurate localization results.
arXiv Detail & Related papers (2020-07-21T04:59:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.