Densely Nested Top-Down Flows for Salient Object Detection
- URL: http://arxiv.org/abs/2102.09133v1
- Date: Thu, 18 Feb 2021 03:14:02 GMT
- Title: Densely Nested Top-Down Flows for Salient Object Detection
- Authors: Chaowei Fang, Haibin Tian, Dingwen Zhang, Qiang Zhang, Jungong Han,
Junwei Han
- Abstract summary: This paper revisits the role of top-down modeling in salient object detection.
It designs a novel densely nested top-down flows (DNTDF)-based framework.
In every stage of DNTDF, features from higher levels are read in via progressive compression shortcut paths (PCSP).
- Score: 137.74130900326833
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Salient object detection (SOD), which aims to identify pixel-wise
salient object regions in an input image, has received great attention in
recent years. One mainstream family of SOD methods consists of a bottom-up
feature encoding procedure and a top-down information decoding procedure. While
numerous approaches have explored bottom-up feature extraction for this task,
the design of top-down flows remains under-studied. To this end, this paper
revisits the role of top-down modeling in salient object detection and designs
a novel densely nested top-down flows (DNTDF)-based framework. In every stage
of DNTDF, features from higher levels are read in via progressive compression
shortcut paths (PCSP). The notable characteristics of the proposed method are
as follows. 1) The propagation of high-level features, which usually carry
relatively strong semantic information, is enhanced in the decoding procedure;
2) with the help of PCSP, the gradient vanishing issues caused by non-linear
operations in top-down information flows can be alleviated; 3) thanks to the
full exploration of high-level features, the decoding process is relatively
memory efficient compared with that of existing methods. Integrating DNTDF with
EfficientNet, we construct a highly lightweight SOD model with very low
computational complexity. To demonstrate the effectiveness of the proposed
model, comprehensive experiments are conducted on six widely-used benchmark
datasets. Comparisons with state-of-the-art methods as well as
carefully-designed baseline models verify our insights on top-down flow
modeling for SOD. The code of this paper is available at
https://github.com/new-stone-object/DNTD.
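The abstract gives only a high-level description of the architecture, so the following is a minimal, hypothetical PyTorch sketch of how a densely nested top-down decoder with progressive compression shortcut paths might be organized: every decoding stage reads in all higher-level encoder features through per-level compression-plus-upsampling paths before fusing them with the current-level feature. Module names, channel widths (loosely EfficientNet-like), and the exact form of the compression are assumptions for illustration, not the authors' implementation.

```python
# Minimal, illustrative sketch of a densely nested top-down decoder.
# All names, channel widths, and the compression scheme are assumptions;
# they are NOT taken from the DNTDF paper or its released code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PCSP(nn.Module):
    """Assumed form of a progressive compression shortcut path: a linear 1x1
    convolution that compresses a higher-level feature map, followed by
    bilinear upsampling to the spatial size of the current decoding stage."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.compress = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x, size):
        return F.interpolate(self.compress(x), size=size, mode="bilinear",
                             align_corners=False)


class DenselyNestedDecoder(nn.Module):
    """Every decoding stage reads in ALL higher-level encoder features via
    PCSPs (dense nesting), instead of only the adjacent level."""

    def __init__(self, enc_channels=(24, 40, 112, 320), dec_ch=32):
        # enc_channels: illustrative widths, roughly EfficientNet-like.
        super().__init__()
        n = len(enc_channels)
        # One PCSP per (decoding stage i, higher level j) pair with j > i.
        self.pcsps = nn.ModuleList([
            nn.ModuleList([PCSP(enc_channels[j], dec_ch)
                           for j in range(i + 1, n)])
            for i in range(n)
        ])
        # Fuse the current-level feature with all compressed shortcuts.
        self.fuse = nn.ModuleList([
            nn.Conv2d(enc_channels[i] + dec_ch * (n - 1 - i), dec_ch,
                      kernel_size=3, padding=1)
            for i in range(n)
        ])
        self.predict = nn.Conv2d(dec_ch, 1, kernel_size=1)

    def forward(self, feats):
        # feats: encoder features ordered from the finest (largest spatial
        # size) to the coarsest (smallest spatial size) level.
        n = len(feats)
        outs = []
        for i in range(n):
            size = feats[i].shape[-2:]
            shortcuts = [pcsp(feats[j], size)
                         for pcsp, j in zip(self.pcsps[i], range(i + 1, n))]
            fused = self.fuse[i](torch.cat([feats[i], *shortcuts], dim=1))
            outs.append(self.predict(F.relu(fused)))
        return outs[0]  # saliency logits at the finest decoder resolution
```

In such a setup, a backbone like EfficientNet would supply the list of multi-scale feature maps, and the finest-resolution logits would be upsampled to the input size and passed through a sigmoid to form the saliency map; keeping the shortcut paths linear (1x1 convolution plus bilinear upsampling) is one plausible reading of the abstract's point that PCSP avoids stacking extra non-linearities along the top-down flow.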
Related papers
- PGNeXt: High-Resolution Salient Object Detection via Pyramid Grafting Network [24.54269823691119]
We present an advanced study on more challenging high-resolution salient object detection (HRSOD) from both dataset and network framework perspectives.
To compensate for the lack of an HRSOD dataset, we thoughtfully collect a large-scale high-resolution salient object detection dataset, called UHRSD.
All the images are finely annotated at the pixel level, far exceeding previous low-resolution SOD datasets.
arXiv Detail & Related papers (2024-08-02T09:31:21Z)
- Out-of-distribution detection based on subspace projection of high-dimensional features output by the last convolutional layer [5.902332693463877]
This paper concentrates on the high-dimensional features output by the final convolutional layer, which contain rich image features.
Our key idea is to project these high-dimensional features into two specific feature subspaces, trained with Predefined Evenly-Distributed Class Centroids (PEDCC)-Loss.
Our method requires only the training of the classification network model, eschewing any need for input pre-processing or specific OOD data pre-tuning.
arXiv Detail & Related papers (2024-05-02T18:33:02Z)
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- LocATe: End-to-end Localization of Actions in 3D with Transformers [91.28982770522329]
LocATe is an end-to-end approach that jointly localizes and recognizes actions in a 3D sequence.
Unlike transformer-based object-detection and classification models which consider image or patch features as input, LocATe's transformer model is capable of capturing long-term correlations between actions in a sequence.
We introduce a new, challenging, and more realistic benchmark dataset, BABEL-TAL-20 (BT20), where the performance of state-of-the-art methods is significantly worse.
arXiv Detail & Related papers (2022-03-21T03:35:32Z)
- Activation to Saliency: Forming High-Quality Labels for Unsupervised Salient Object Detection [54.92703325989853]
We propose a two-stage Activation-to-Saliency (A2S) framework that effectively generates high-quality saliency cues.
No human annotations are involved in our framework during the whole training process.
Our framework achieves significant performance gains compared with existing USOD methods.
arXiv Detail & Related papers (2021-12-07T11:54:06Z)
- Occlusion-Robust Object Pose Estimation with Holistic Representation [42.27081423489484]
State-of-the-art (SOTA) object pose estimators take a two-stage approach.
We develop a novel occlude-and-blackout batch augmentation technique.
We also develop a multi-precision supervision architecture to encourage holistic pose representation learning.
arXiv Detail & Related papers (2021-10-22T08:00:26Z)
- BiconNet: An Edge-preserved Connectivity-based Approach for Salient Object Detection [3.3517146652431378]
We show that our model can use any existing saliency-based SOD framework as its backbone.
Through comprehensive experiments on five benchmark datasets, we demonstrate that our proposed method outperforms state-of-the-art SOD approaches.
arXiv Detail & Related papers (2021-02-27T21:39:04Z)
- EDN: Salient Object Detection via Extremely-Downsampled Network [66.38046176176017]
We introduce an Extremely-Downsampled Network (EDN), which employs an extreme downsampling technique to effectively learn a global view of the whole image.
Experiments demonstrate that EDN achieves state-of-the-art performance with real-time speed.
arXiv Detail & Related papers (2020-12-24T04:23:48Z)
- PC-RGNN: Point Cloud Completion and Graph Neural Network for 3D Object Detection [57.49788100647103]
LiDAR-based 3D object detection is an important task for autonomous driving.
Current approaches suffer from sparse and partial point clouds of distant and occluded objects.
In this paper, we propose a novel two-stage approach, namely PC-RGNN, dealing with such challenges by two specific solutions.
arXiv Detail & Related papers (2020-12-18T18:06:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences.