Selective Segmentation Networks Using Top-Down Attention
- URL: http://arxiv.org/abs/2002.01125v1
- Date: Tue, 4 Feb 2020 04:47:23 GMT
- Title: Selective Segmentation Networks Using Top-Down Attention
- Authors: Mahdi Biparva, John Tsotsos
- Abstract summary: Convolutional neural networks model the transformation of the input sensory data at the bottom of a network hierarchy to the semantic information at the top of the visual hierarchy.
We propose a unified 2-pass framework for object segmentation that augments Bottom-Up convnets with a Top-Down selection network.
We evaluate the proposed network on benchmark datasets for semantic segmentation, and show that networks with the Top-Down selection capability outperform the baseline model.
- Score: 1.0152838128195465
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convolutional neural networks model the transformation of the input sensory
data at the bottom of a network hierarchy to the semantic information at the
top of the visual hierarchy. Feedforward processing is sufficient for some
object recognition tasks. Top-Down selection is potentially required in
addition to the Bottom-Up feedforward pass. It can, in part, address the
shortcoming of the loss of location information imposed by the hierarchical
feature pyramids. We propose a unified 2-pass framework for object segmentation
that augments Bottom-Up convnets with a Top-Down selection network. We utilize
the top-down selection gating activities to modulate the bottom-up hidden
activities for segmentation predictions. We develop an end-to-end multi-task
framework with loss terms satisfying task requirements at the two ends of the
network. We evaluate the proposed network on benchmark datasets for semantic
segmentation, and show that networks with the Top-Down selection capability
outperform the baseline model. Additionally, we shed light on the superior
aspects of the new segmentation paradigm and qualitatively and quantitatively
support the efficiency of the novel framework over the baseline model that
relies purely on parametric skip connections.
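
The gating idea in the abstract can be illustrated with a toy sketch. The one below is not the authors' architecture: the 1-D signal, the average pooling, the winner-take-all rule at the top, and the function names `bottom_up`, `top_down_gates`, and `modulate` are all assumptions chosen for illustration. It only shows how a coarse top-level selection might be pushed back down to recover location at the input resolution and then used to modulate the bottom-up hidden activities.

```python
# Illustrative sketch only. Pooling, gating rule, and names are assumptions,
# not the method described in the paper.

def bottom_up(signal, levels=2):
    """Feedforward pass: average-pool by 2 per level and keep every hidden layer."""
    hidden = [list(signal)]
    for _ in range(levels):
        prev = hidden[-1]
        pooled = [(prev[i] + prev[i + 1]) / 2 for i in range(0, len(prev) - 1, 2)]
        hidden.append(pooled)
    return hidden  # hidden[0] is the finest layer, hidden[-1] the coarse top


def top_down_gates(hidden):
    """Selection pass: start with a winner-take-all gate at the top, push it down."""
    top = hidden[-1]
    peak = max(top)
    gate = [1.0 if v == peak else 0.0 for v in top]
    gates = [gate]
    for layer in reversed(hidden[:-1]):
        # Upsample the coarser gate by 2 (index clamped for odd-length layers).
        up = [gate[min(i // 2, len(gate) - 1)] for i in range(len(layer))]
        # Keep a location only if it was also active in the bottom-up pass.
        gate = [g if a > 0 else 0.0 for g, a in zip(up, layer)]
        gates.append(gate)
    return gates[::-1]  # same order as `hidden`


def modulate(hidden, gates):
    """Multiply each bottom-up hidden layer elementwise by its top-down gate."""
    return [[a * g for a, g in zip(layer, gate)] for layer, gate in zip(hidden, gates)]


if __name__ == "__main__":
    hidden = bottom_up([0, 0, 0, 0, 3, 1, 0, 0])
    gates = top_down_gates(hidden)
    print(gates[0])                       # gate at input resolution
    print(modulate(hidden, gates)[0])     # gated input-level activities
```

In this toy setting the top layer has lost the exact position of the active inputs, and the downward pass restores it by combining the coarse selection with the stored bottom-up activities. A learned selection network would replace the hand-written gating rule.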
Related papers
- Lidar Panoptic Segmentation and Tracking without Bells and Whistles [48.078270195629415]
We propose a detection-centric network for lidar segmentation and tracking.
One of the core components of our network is the object instance detection branch.
We evaluate our method on several 3D/4D LPS benchmarks and observe that our model establishes a new state-of-the-art among open-sourced models.
arXiv Detail & Related papers (2023-10-19T04:44:43Z) - Location-Aware Self-Supervised Transformers [74.76585889813207]
We propose to pretrain networks for semantic segmentation by predicting the relative location of image parts.
We control the difficulty of the task by masking a subset of the reference patch features visible to those of the query.
Our experiments show that this location-aware pretraining leads to representations that transfer competitively to several challenging semantic segmentation benchmarks.
arXiv Detail & Related papers (2022-12-05T16:24:29Z) - Learning Target-aware Representation for Visual Tracking via Informative Interactions [49.552877881662475]
We introduce a novel backbone architecture to improve target-perception ability of feature representation for tracking.
The proposed GIM module and InBN mechanism are general and applicable to different backbone types including CNN and Transformer.
arXiv Detail & Related papers (2022-01-07T16:22:27Z) - Quality-Aware Memory Network for Interactive Volumetric Image
Segmentation [15.504425842953676]
We propose a quality-aware memory network for interactive segmentation of 3D medical images.
A quality assessment module is introduced to suggest the next slice to segment based on the current segmentation quality of each slice.
The proposed network leads to a robust interactive segmentation engine, which can generalize well to various types of user annotations.
arXiv Detail & Related papers (2021-06-20T12:34:19Z) - A Novel Adaptive Deep Network for Building Footprint Segmentation [0.0]
We propose a novel network based on the Pix2Pix methodology to solve the problem of inaccurate boundaries obtained by converting satellite images into maps.
Our framework includes two generators where the first generator extracts localization features in order to merge them with the boundary features extracted from the second generator to segment all detailed building edges.
Different strategies are implemented to enhance the quality of the proposed networks' results, and the proposed network outperforms state-of-the-art networks in segmentation accuracy by a large margin on all evaluation metrics.
arXiv Detail & Related papers (2021-02-27T18:13:48Z) - PC-RGNN: Point Cloud Completion and Graph Neural Network for 3D Object Detection [57.49788100647103]
LiDAR-based 3D object detection is an important task for autonomous driving.
Current approaches suffer from sparse and partial point clouds of distant and occluded objects.
In this paper, we propose a novel two-stage approach, namely PC-RGNN, dealing with such challenges by two specific solutions.
arXiv Detail & Related papers (2020-12-18T18:06:43Z) - DINE: A Framework for Deep Incomplete Network Embedding [33.97952453310253]
We propose a Deep Incomplete Network Embedding method, namely DINE.
We first complete the missing part including both nodes and edges in a partially observable network by using the expectation-maximization framework.
We evaluate DINE over three networks on multi-label classification and link prediction tasks.
arXiv Detail & Related papers (2020-08-09T04:59:35Z) - Hierarchical Bi-Directional Feature Perception Network for Person Re-Identification [12.259747100939078]
Previous Person Re-Identification (Re-ID) models aim to focus on the most discriminative region of an image.
We propose a novel model named Hierarchical Bi-directional Feature Perception Network (HBFP-Net) to correlate multi-level information and reinforce each other.
Experiments on mainstream benchmarks, including the Market-1501, CUHK03 and DukeMTMC-ReID datasets, show that our method outperforms the recent SOTA Re-ID models.
arXiv Detail & Related papers (2020-08-08T12:33:32Z) - Pre-Trained Models for Heterogeneous Information Networks [57.78194356302626]
We propose a self-supervised pre-training and fine-tuning framework, PF-HIN, to capture the features of a heterogeneous information network.
PF-HIN consistently and significantly outperforms state-of-the-art alternatives on each of these tasks, on four datasets.
arXiv Detail & Related papers (2020-07-07T03:36:28Z) - Gated Path Selection Network for Semantic Segmentation [72.44994579325822]
We develop a novel network named Gated Path Selection Network (GPSNet), which aims to learn adaptive receptive fields.
In GPSNet, we first design a two-dimensional multi-scale network - SuperNet, which densely incorporates features from growing receptive fields.
To dynamically select desirable semantic context, a gate prediction module is further introduced.
arXiv Detail & Related papers (2020-01-19T12:32:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.