Disentangled High Quality Salient Object Detection
- URL: http://arxiv.org/abs/2108.03551v1
- Date: Sun, 8 Aug 2021 02:14:15 GMT
- Title: Disentangled High Quality Salient Object Detection
- Authors: Lv Tang, Bo Li, Shouhong Ding, Mofei Song
- Abstract summary: We propose a novel deep learning framework for high-resolution salient object detection (SOD).
It disentangles the task into a low-resolution saliency classification network (LRSCN) and a high-resolution refinement network (HRRN).
- Score: 8.416690566816305
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Aiming at discovering and locating the most distinctive objects in visual
scenes, salient object detection (SOD) plays an essential role in various
computer vision systems. Coming to the era of high resolution, SOD methods are
facing new challenges. The major limitation of previous methods is that they
try to identify the salient regions and estimate accurate object
boundaries simultaneously with a single regression task at low resolution. This
practice ignores the inherent difference between the two difficult problems,
resulting in poor detection quality. In this paper, we propose a novel deep
learning framework for the high-resolution SOD task, which disentangles the task
into a low-resolution saliency classification network (LRSCN) and a
high-resolution refinement network (HRRN). As a pixel-wise classification task,
LRSCN is designed to capture sufficient semantics at low-resolution to identify
the definite salient, background and uncertain image regions. HRRN is a
regression task, which aims at accurately refining the saliency value of pixels
in the uncertain region to preserve a clear object boundary at high-resolution
with limited GPU memory. It is worth noting that by introducing uncertainty
into the training process, our HRRN can well address the high-resolution
refinement task without using any high-resolution training data. Extensive
experiments on high-resolution saliency datasets as well as some widely used
saliency benchmarks show that the proposed method achieves superior performance
compared to the state-of-the-art methods.
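The two-stage split described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the class ordering (0 = background, 1 = uncertain, 2 = salient), the nearest-neighbour upsampling, and the idea of passing in a precomputed refined map are all assumptions made for illustration. The sketch only shows how a low-resolution three-class map could gate where high-resolution regression values are used.

```python
import numpy as np

# Hypothetical label convention (not from the paper):
BACKGROUND, UNCERTAIN, SALIENT = 0, 1, 2

def trimap_from_probs(probs):
    """Turn per-pixel class probabilities of shape (3, H, W) into a label map (H, W)."""
    return np.argmax(probs, axis=0)

def upsample_nearest(label_map, scale):
    """Nearest-neighbour upsampling of a 2D label map by an integer factor."""
    return np.repeat(np.repeat(label_map, scale, axis=0), scale, axis=1)

def merge_with_refinement(trimap_hr, refined):
    """Keep definite regions as hard 0/1 values; use refined values only where uncertain."""
    out = np.where(trimap_hr == SALIENT, 1.0, 0.0)
    uncertain = trimap_hr == UNCERTAIN
    out[uncertain] = refined[uncertain]
    return out
```

In this sketch `refined` stands in for the output of a refinement network at full resolution. Only the pixels labelled uncertain read from it, which mirrors the abstract's statement that HRRN refines the saliency value of pixels in the uncertain region.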
Related papers
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- Recurrent Multi-scale Transformer for High-Resolution Salient Object Detection [68.65338791283298]
Salient Object Detection (SOD) aims to identify and segment the most conspicuous objects in an image or video.
Traditional SOD methods are largely limited to low-resolution images, making them difficult to adapt to the development of High-Resolution SOD.
In this work, we first propose a new HRS10K dataset, which contains 10,500 high-quality annotated images at 2K-8K resolution.
arXiv Detail & Related papers (2023-08-07T17:49:04Z)
- One-stage Low-resolution Text Recognition with High-resolution Knowledge Transfer [53.02254290682613]
Current solutions for low-resolution text recognition typically rely on a two-stage pipeline.
We propose an efficient and effective knowledge distillation framework to achieve multi-level knowledge transfer.
Experiments show that the proposed one-stage pipeline significantly outperforms super-resolution based two-stage frameworks.
arXiv Detail & Related papers (2023-08-05T02:33:45Z)
- Cross-resolution Face Recognition via Identity-Preserving Network and Knowledge Distillation [12.090322373964124]
Cross-resolution face recognition is a challenging problem for modern deep face recognition systems.
This paper proposes a new approach that enforces the network to focus on the discriminative information stored in the low-frequency components of a low-resolution image.
arXiv Detail & Related papers (2023-03-15T14:52:46Z)
- Pyramid Grafting Network for One-Stage High Resolution Saliency Detection [29.013012579688347]
We propose a one-stage framework called Pyramid Grafting Network (PGNet) to extract features from different resolution images independently.
An attention-based Cross-Model Grafting Module (CMGM) is proposed to enable CNN branch to combine broken detailed information more holistically.
We contribute a new Ultra-High-Resolution Saliency Detection dataset UHRSD, containing 5,920 images at 4K-8K resolutions.
arXiv Detail & Related papers (2022-04-11T12:22:21Z)
- Activation to Saliency: Forming High-Quality Labels for Unsupervised Salient Object Detection [54.92703325989853]
We propose a two-stage Activation-to-Saliency (A2S) framework that effectively generates high-quality saliency cues.
No human annotations are involved in our framework during the whole training process.
Our framework reports significant performance compared with existing USOD methods.
arXiv Detail & Related papers (2021-12-07T11:54:06Z)
- Multi-Scale Aligned Distillation for Low-Resolution Detection [68.96325141432078]
This paper focuses on boosting the performance of low-resolution models by distilling knowledge from a high- or multi-resolution model.
On several instance-level detection tasks and datasets, the low-resolution models trained via our approach perform competitively with high-resolution models trained via conventional multi-scale training.
arXiv Detail & Related papers (2021-09-14T12:53:35Z)
- Hierarchical Deep CNN Feature Set-Based Representation Learning for Robust Cross-Resolution Face Recognition [59.29808528182607]
Cross-resolution face recognition (CRFR) is important in intelligent surveillance and biometric forensics.
Existing shallow learning-based and deep learning-based methods focus on mapping the HR-LR face pairs into a joint feature space.
In this study, we desire to fully exploit the multi-level deep convolutional neural network (CNN) feature set for robust CRFR.
arXiv Detail & Related papers (2021-03-25T14:03:42Z)
- Multi Scale Identity-Preserving Image-to-Image Translation Network for Low-Resolution Face Recognition [7.6702700993064115]
We propose an identity-preserving end-to-end image-to-image translation deep neural network.
It is capable of super-resolving very low-resolution faces to their high-resolution counterparts while preserving identity-related information.
arXiv Detail & Related papers (2020-10-23T09:21:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.