Related papers: a novel attention-based network for fast salient object detection

URL: http://arxiv.org/abs/2112.10481v1
Date: Mon, 20 Dec 2021 12:30:20 GMT
Title: a novel attention-based network for fast salient object detection
Authors: Bin Zhang, Yang Wu, Xiaojing Zhang and Ming Ma
Abstract summary: In current salient object detection networks, the most popular approach is a U-shaped structure. We propose a new deep convolutional network architecture with three contributions. Results demonstrate that the proposed method can compress the model to 1/3 of its original size with almost no loss of accuracy.
Score: 14.246237737452105
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In current salient object detection networks, the most popular approach is a U-shaped structure. However, its massive number of parameters consumes computing and storage resources, which makes deployment on memory-limited devices infeasible. Shallower networks cannot match the accuracy of the U-shaped structure, and deeper networks with more parameters do not converge quickly to a global minimum loss. To overcome these disadvantages, we propose a new deep convolutional network architecture with three contributions: (1) using smaller convolutional neural networks (CNNs) in our improved salient object features compression and reinforcement extraction module (ISFCREM) to reduce the number of model parameters; (2) introducing a channel attention mechanism in ISFCREM to weight different channels and improve feature representation; (3) applying a new optimizer that accumulates long-term gradient information during training to adaptively tune the learning rate. The results demonstrate that the proposed method can compress the model to 1/3 of its original size with almost no loss of accuracy, and that it converges faster and more smoothly than other models on six widely used salient object detection datasets. Our code is published at https://gitee.com/binzhangbinzhangbin/code-a-novel-attention-based-network-for-fast-salient-object-detection.git
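The channel attention idea in contribution (2) can be illustrated with a minimal NumPy sketch. This is not the authors' ISFCREM; it assumes a generic squeeze-and-excitation style gate, and the function name `channel_attention`, the weight matrices `w1` and `w2`, and the reduction ratio `r` are hypothetical names chosen for illustration.

```python
import numpy as np

def channel_attention(features, w1, w2):
    """Reweight the channels of a (C, H, W) feature map.

    Assumed squeeze-and-excitation style gate (not the paper's module):
    w1 has shape (C // r, C) and w2 has shape (C, C // r).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = features.mean(axis=(1, 2))             # shape (C,)
    # Excitation: small bottleneck MLP with ReLU, then a sigmoid gate.
    hidden = np.maximum(w1 @ squeezed, 0.0)           # shape (C // r,)
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # shape (C,), values in (0, 1)
    # Scale every channel by its learned weight.
    return features * weights[:, None, None], weights

# Toy usage with random (untrained) weights.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
features = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
reweighted, weights = channel_attention(features, w1, w2)
```

In a trained network `w1` and `w2` would be learned, so that informative channels receive weights close to 1 and less useful ones are suppressed.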

Related papers

MPruner: Optimizing Neural Network Size with CKA-Based Mutual Information Pruning [7.262751938473306]
Pruning is a well-established technique that reduces the size of neural networks while mathematically guaranteeing accuracy preservation. We develop a new pruning algorithm, MPruner, that leverages mutual information through vector similarity. MPruner achieved up to a 50% reduction in parameters and memory usage for CNN and transformer-based models, with minimal to no loss in accuracy.
arXiv Detail & Related papers (2024-08-24T05:54:47Z)
SIGMA: Sinkhorn-Guided Masked Video Modeling [69.31715194419091]
Sinkhorn-guided Masked Video Modelling (SIGMA) is a novel video pretraining method. We distribute features of space-time tubes evenly across a limited number of learnable clusters. Experimental results on ten datasets validate the effectiveness of SIGMA in learning more performant, temporally-aware, and robust video representations.
arXiv Detail & Related papers (2024-07-22T08:04:09Z)
SAR Despeckling Using Overcomplete Convolutional Networks [53.99620005035804]
Despeckling is an important problem in remote sensing as speckle degrades SAR images. Recent studies show that convolutional neural networks (CNNs) outperform classical despeckling methods. This study employs an overcomplete CNN architecture to focus on learning low-level features by restricting the receptive field. We show that the proposed network improves despeckling performance compared to recent despeckling methods on synthetic and real SAR images.
arXiv Detail & Related papers (2022-05-31T15:55:37Z)
Tied & Reduced RNN-T Decoder [0.0]
We study ways to make the RNN-T decoder (prediction network + joint network) smaller and faster without degradation in recognition performance. Our prediction network performs a simple weighted averaging of the input embeddings, and shares its embedding matrix weights with the joint network's output layer. This simple design, when used in conjunction with additional Edit-based Minimum Bayes Risk (EMBR) training, reduces the RNN-T decoder from 23M parameters to just 2M, without affecting word-error rate (WER).
arXiv Detail & Related papers (2021-09-15T18:19:16Z)
Group Fisher Pruning for Practical Network Compression [58.25776612812883]
We present a general channel pruning approach that can be applied to various complicated structures. We derive a unified metric based on Fisher information to evaluate the importance of a single channel and coupled channels. Our method can be used to prune any structures including those with coupled channels.
arXiv Detail & Related papers (2021-08-02T08:21:44Z)
AGSFCOS: Based on attention mechanism and Scale-Equalizing pyramid network of object detection [10.824032219531095]
Our model has a certain improvement in accuracy compared with the current popular detection models on the COCO dataset. Our optimal model can get 39.5% COCO AP under the background of ResNet50.
arXiv Detail & Related papers (2021-05-20T08:41:02Z)
Spatio-Temporal Inception Graph Convolutional Networks for Skeleton-Based Action Recognition [126.51241919472356]
We design a simple and highly modularized graph convolutional network architecture for skeleton-based action recognition. Our network is constructed by repeating a building block that aggregates multi-granularity information from both the spatial and temporal paths.
arXiv Detail & Related papers (2020-11-26T14:43:04Z)
A Variational Information Bottleneck Based Method to Compress Sequential Networks for Human Action Recognition [9.414818018857316]
We propose a method to effectively compress Recurrent Neural Networks (RNNs) used for Human Action Recognition (HAR). We use a Variational Information Bottleneck (VIB) theory-based pruning approach to limit the information flow through the sequential cells of RNNs to a small subset. We combine our pruning method with a specific group-lasso regularization technique that significantly improves compression. It is shown that our method achieves over 70 times greater compression than the nearest competitor with comparable accuracy for the task of action recognition on UCF11.
arXiv Detail & Related papers (2020-10-03T12:41:51Z)
Ensembled sparse-input hierarchical networks for high-dimensional datasets [8.629912408966145]
We show that dense neural networks can be a practical data analysis tool in settings with small sample sizes. A proposed method appropriately prunes the network structure by tuning only two L1-penalty parameters. On a collection of real-world datasets with different sizes, EASIER-net selected network architectures in a data-adaptive manner and achieved higher prediction accuracy than off-the-shelf methods on average.
arXiv Detail & Related papers (2020-05-11T02:08:53Z)
Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs. Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
Depthwise Non-local Module for Fast Salient Object Detection Using a Single Thread [136.2224792151324]
We propose a new deep learning algorithm for fast salient object detection. The proposed algorithm achieves competitive accuracy and high inference efficiency simultaneously with a single CPU thread.
arXiv Detail & Related papers (2020-01-22T15:23:48Z)

This list is automatically generated from the titles and abstracts of the papers on this site.