Embedded Self-Distillation in Compact Multi-Branch Ensemble Network for Remote Sensing Scene Classification
- URL: http://arxiv.org/abs/2104.00222v1
- Date: Thu, 1 Apr 2021 03:08:52 GMT
- Title: Embedded Self-Distillation in Compact Multi-Branch Ensemble Network for Remote Sensing Scene Classification
- Authors: Qi Zhao, Yujing Ma, Shuchang Lyu, Lijiang Chen
- Abstract summary: We propose a multi-branch ensemble network to enhance feature representation ability.
We embed a self-distillation (SD) method to transfer knowledge from the ensemble network to its main branch.
Results show that the proposed ESD-MBENet achieves better accuracy than previous, more complex state-of-the-art (SOTA) models.
- Score: 17.321718779142817
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The remote sensing (RS) image scene classification task faces many challenges due
to interference from the differing characteristics of diverse geographical elements. To address
this problem, we propose a multi-branch ensemble network that enhances feature representation
ability by fusing features in both the final output logits and the intermediate feature maps.
However, simply adding branches increases model complexity and reduces inference efficiency.
To address this issue, we embed a self-distillation (SD) method that transfers knowledge from
the ensemble network to its main branch. After optimizing with SD, the main branch achieves
performance close to that of the full ensemble, so the other branches can be pruned at inference
to simplify the whole model. In this paper, we first design a compact multi-branch ensemble
network that can be trained in an end-to-end manner. Then, we apply the SD method to both the
output logits and the feature maps. Compared to previous methods, our proposed architecture
(ESD-MBENet) achieves strong classification accuracy with a compact design. Extensive
experiments on three benchmark RS datasets (AID, NWPU-RESISC45, and UC-Merced) with three
classic baseline models (VGG16, ResNet50, and DenseNet121) show that ESD-MBENet achieves
better accuracy than previous, more complex state-of-the-art (SOTA) models. Moreover, extensive
visualization analysis makes our method more convincing and interpretable.
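To make the training scheme concrete, here is a minimal sketch of the idea (not the authors' implementation; the module sizes, the averaging ensemble, and the KL/MSE loss choices are all assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiBranchNet(nn.Module):
    """Hypothetical sketch: a shared stem feeds one main branch plus
    auxiliary branches whose outputs are fused into an ensemble."""
    def __init__(self, num_classes=30, width=64, num_aux=2):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, width, 3, stride=2, padding=1),
            nn.BatchNorm2d(width), nn.ReLU(inplace=True))
        def branch():
            return nn.Sequential(
                nn.Conv2d(width, width, 3, padding=1),
                nn.BatchNorm2d(width), nn.ReLU(inplace=True),
                nn.AdaptiveAvgPool2d(1))
        self.main_branch = branch()
        self.aux_branches = nn.ModuleList(branch() for _ in range(num_aux))
        self.main_fc = nn.Linear(width, num_classes)
        self.aux_fcs = nn.ModuleList(
            nn.Linear(width, num_classes) for _ in range(num_aux))

    def forward(self, x, inference=False):
        h = self.stem(x)
        main_feat = self.main_branch(h).flatten(1)
        main_logits = self.main_fc(main_feat)
        if inference:                      # auxiliary branches pruned away
            return main_logits
        feats = [main_feat] + [b(h).flatten(1) for b in self.aux_branches]
        logits = [main_logits] + [fc(f) for fc, f in zip(self.aux_fcs, feats[1:])]
        ens_logits = torch.stack(logits).mean(0)   # ensemble by averaging
        ens_feat = torch.stack(feats).mean(0)
        return main_logits, main_feat, ens_logits, ens_feat

def sd_loss(main_logits, main_feat, ens_logits, ens_feat, T=4.0):
    # Self-distillation: ensemble (teacher) -> main branch (student),
    # applied to both the output logits (KL) and pooled features (MSE).
    kl = F.kl_div(F.log_softmax(main_logits / T, dim=1),
                  F.softmax(ens_logits.detach() / T, dim=1),
                  reduction="batchmean") * T * T
    mse = F.mse_loss(main_feat, ens_feat.detach())
    return kl + mse
```

During training, one would combine cross-entropy on each branch with sd_loss; at inference, model(x, inference=True) runs the main branch only.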
Related papers
- DiTMoS: Delving into Diverse Tiny-Model Selection on Microcontrollers [34.282971510732736]
We introduce DiTMoS, a novel DNN training and inference framework with a selector-classifiers architecture.
A composition of weak models can exhibit high diversity, and their union can significantly raise the accuracy upper bound.
We deploy DiTMoS on the Nucleo STM32F767ZI board and evaluate it on three time-series datasets for human activity recognition, keyword spotting, and emotion recognition.
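A minimal sketch of a selector-classifiers pipeline in that spirit (purely illustrative, not DiTMoS itself; all shapes and module sizes are assumptions):

```python
import torch
import torch.nn as nn

class SelectorClassifiers(nn.Module):
    """Hypothetical sketch: a small selector picks which weak
    classifier handles each input sample."""
    def __init__(self, in_dim=128, num_classes=6, num_classifiers=4):
        super().__init__()
        self.selector = nn.Sequential(
            nn.Linear(in_dim, 32), nn.ReLU(), nn.Linear(32, num_classifiers))
        self.classifiers = nn.ModuleList(
            nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                          nn.Linear(64, num_classes))
            for _ in range(num_classifiers))

    def forward(self, x):
        choice = self.selector(x).argmax(dim=1)          # per-sample routing
        out = torch.stack([clf(x) for clf in self.classifiers], dim=1)
        return out[torch.arange(x.size(0)), choice]      # gather chosen logits

model = SelectorClassifiers()
logits = model(torch.randn(8, 128))   # -> (8, 6)
```

On a microcontroller only the selected classifier would actually execute; running all of them here keeps the sketch compact.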
arXiv Detail & Related papers (2024-03-14T02:11:38Z)
- SODAWideNet -- Salient Object Detection with an Attention augmented Wide Encoder Decoder network without ImageNet pre-training [3.66237529322911]
We explore developing a neural network trained from scratch on Salient Object Detection, without ImageNet pre-training.
We propose SODAWideNet, an encoder-decoder-style network for Salient Object Detection.
Two variants, SODAWideNet-S (3.03M) and SODAWideNet (9.03M), achieve competitive performance against state-of-the-art models on five datasets.
arXiv Detail & Related papers (2023-11-08T16:53:44Z)
- HKNAS: Classification of Hyperspectral Imagery Based on Hyper Kernel Neural Architecture Search [104.45426861115972]
We propose to directly generate structural parameters by utilizing specifically designed hyper kernels.
We obtain three kinds of networks to separately conduct pixel-level or image-level classifications with 1-D or 3-D convolutions.
A series of experiments on six public datasets demonstrate that the proposed methods achieve state-of-the-art results.
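A loose sketch of the hyper-kernel idea (the actual HKNAS formulation differs; every name here is hypothetical): a learned hyper-network produces the weights of a convolution rather than storing them directly.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HyperConv2d(nn.Module):
    """Illustrative sketch: conv weights are generated by a small
    hyper-network from a learned embedding, not stored directly."""
    def __init__(self, in_ch, out_ch, k=3, embed_dim=16):
        super().__init__()
        self.shape = (out_ch, in_ch, k, k)
        self.embed = nn.Parameter(torch.randn(embed_dim))
        n_weights = out_ch * in_ch * k * k
        self.hyper = nn.Linear(embed_dim, n_weights)  # the "hyper kernel"

    def forward(self, x):
        w = self.hyper(self.embed).view(self.shape)
        return F.conv2d(x, w, padding=self.shape[-1] // 2)

layer = HyperConv2d(8, 16)
y = layer(torch.randn(1, 8, 32, 32))   # -> (1, 16, 32, 32)
```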
arXiv Detail & Related papers (2023-04-23T17:27:40Z)
- Self-distilled Feature Aggregation for Self-supervised Monocular Depth Estimation [11.929584800629673]
We propose the Self-Distilled Feature Aggregation (SDFA) module for simultaneously aggregating a pair of low-scale and high-scale features.
We propose an SDFA-based network for self-supervised monocular depth estimation, and design a self-distilled training strategy to train the proposed network.
Experimental results on the KITTI dataset demonstrate that the proposed method outperforms the comparative state-of-the-art methods in most cases.
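As a generic sketch of aggregating a low-scale and a high-scale feature pair (a simplification under assumptions, not the paper's SDFA module):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureAggregation(nn.Module):
    """Simplified sketch: upsample the low-resolution feature, then fuse
    it with the high-resolution feature via a learned 1x1 projection."""
    def __init__(self, low_ch, high_ch, out_ch):
        super().__init__()
        self.fuse = nn.Conv2d(low_ch + high_ch, out_ch, kernel_size=1)

    def forward(self, low, high):
        low_up = F.interpolate(low, size=high.shape[-2:],
                               mode="bilinear", align_corners=False)
        return self.fuse(torch.cat([low_up, high], dim=1))

agg = FeatureAggregation(low_ch=64, high_ch=32, out_ch=32)
out = agg(torch.randn(1, 64, 16, 16), torch.randn(1, 32, 64, 64))  # (1,32,64,64)
```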
arXiv Detail & Related papers (2022-09-15T07:00:52Z)
- Rich CNN-Transformer Feature Aggregation Networks for Super-Resolution [50.10987776141901]
Recent vision transformers along with self-attention have achieved promising results on various computer vision tasks.
We introduce an effective hybrid architecture for super-resolution (SR) tasks, which leverages local features from CNNs and long-range dependencies captured by transformers.
Our proposed method achieves state-of-the-art SR results on numerous benchmark datasets.
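A bare-bones sketch of such a hybrid (hypothetical layer sizes, not the paper's architecture): a conv stage supplies local features, and a transformer encoder over the flattened spatial positions adds long-range dependencies.

```python
import torch
import torch.nn as nn

class HybridBlock(nn.Module):
    """Sketch: local features from a conv stage, long-range context
    from a transformer encoder over flattened spatial positions."""
    def __init__(self, ch=32, heads=4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True))
        self.attn = nn.TransformerEncoderLayer(
            d_model=ch, nhead=heads, dim_feedforward=2 * ch,
            batch_first=True)

    def forward(self, x):
        b, c, h, w = x.shape
        local = self.conv(x)
        tokens = local.flatten(2).transpose(1, 2)     # (B, H*W, C)
        global_ = self.attn(tokens)
        return local + global_.transpose(1, 2).view(b, c, h, w)

y = HybridBlock()(torch.randn(1, 32, 24, 24))   # -> (1, 32, 24, 24)
```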
arXiv Detail & Related papers (2022-03-15T06:52:25Z)
- Local Similarity Pattern and Cost Self-Reassembling for Deep Stereo Matching Networks [3.7384509727711923]
We introduce a pairwise feature for deep stereo matching networks, named LSP (Local Similarity Pattern).
By explicitly revealing the neighbor relationships, LSP contains rich structural information, which can be leveraged to aid more discriminative feature description.
We also design a dynamic self-reassembling refinement strategy and apply it to the cost distribution and the disparity map, respectively.
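One plausible way to compute such a local similarity pattern (an assumption, not the paper's exact definition): the cosine similarity between each position's feature vector and its 3x3 neighbours.

```python
import torch
import torch.nn.functional as F

def local_similarity_pattern(feat, k=3):
    """Sketch: for each spatial position, cosine similarity between the
    center feature vector and each of its k*k neighbours."""
    b, c, h, w = feat.shape
    feat = F.normalize(feat, dim=1)                     # unit-length features
    neigh = F.unfold(feat, k, padding=k // 2)           # (B, C*k*k, H*W)
    neigh = neigh.view(b, c, k * k, h, w)
    center = feat.unsqueeze(2)                          # (B, C, 1, H, W)
    return (neigh * center).sum(1)                      # (B, k*k, H, W)

lsp = local_similarity_pattern(torch.randn(1, 16, 20, 20))  # -> (1, 9, 20, 20)
```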
arXiv Detail & Related papers (2021-12-02T06:52:54Z)
- Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
- Learning Deep Interleaved Networks with Asymmetric Co-Attention for Image Restoration [65.11022516031463]
We present a deep interleaved network (DIN) that learns how information at different states should be combined for high-quality (HQ) image reconstruction.
In this paper, we propose asymmetric co-attention (AsyCA) which is attached at each interleaved node to model the feature dependencies.
Our presented DIN can be trained end-to-end and applied to various image restoration tasks.
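A rough sketch of attention-weighted fusion of two feature streams at a node (a generic stand-in, not the paper's AsyCA module):

```python
import torch
import torch.nn as nn

class CoAttentionFusion(nn.Module):
    """Generic sketch: predict per-channel weights from the concatenated
    streams, then take a weighted sum of the two inputs."""
    def __init__(self, ch):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * ch, 2 * ch, 1),
            nn.Sigmoid())
        self.ch = ch

    def forward(self, a, b):
        w = self.gate(torch.cat([a, b], dim=1))         # (B, 2C, 1, 1)
        wa, wb = w[:, :self.ch], w[:, self.ch:]
        return wa * a + wb * b

fused = CoAttentionFusion(32)(torch.randn(1, 32, 8, 8), torch.randn(1, 32, 8, 8))
```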
arXiv Detail & Related papers (2020-10-29T15:32:00Z)
- ResNeSt: Split-Attention Networks [86.25490825631763]
We present a modularized architecture that applies channel-wise attention across different network branches to leverage their success in capturing cross-feature interactions and learning diverse representations.
Our model, named ResNeSt, outperforms EfficientNet in accuracy and latency trade-off on image classification.
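A simplified sketch of the split-attention mechanism (hypothetical sizes; the real block also uses grouped convolutions and cardinality): r parallel transforms of the input are re-weighted by softmax attention computed from their sum.

```python
import torch
import torch.nn as nn

class SplitAttention(nn.Module):
    """Simplified sketch of split attention: r parallel transforms are
    re-weighted by softmax attention derived from their summed context."""
    def __init__(self, ch, radix=2, reduce=4):
        super().__init__()
        self.radix = radix
        self.convs = nn.ModuleList(
            nn.Conv2d(ch, ch, 3, padding=1) for _ in range(radix))
        self.fc = nn.Sequential(
            nn.Linear(ch, ch // reduce), nn.ReLU(inplace=True),
            nn.Linear(ch // reduce, ch * radix))

    def forward(self, x):
        splits = torch.stack([conv(x) for conv in self.convs], dim=1)  # (B,r,C,H,W)
        gap = splits.sum(1).mean(dim=(2, 3))                # (B, C) global context
        attn = self.fc(gap).view(x.size(0), self.radix, -1) # (B, r, C)
        attn = attn.softmax(dim=1).unsqueeze(-1).unsqueeze(-1)
        return (splits * attn).sum(1)                       # (B, C, H, W)

y = SplitAttention(32)(torch.randn(1, 32, 8, 8))
```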
arXiv Detail & Related papers (2020-04-19T20:40:31Z)
- Searching Central Difference Convolutional Networks for Face Anti-Spoofing [68.77468465774267]
Face anti-spoofing (FAS) plays a vital role in face recognition systems.
Most state-of-the-art FAS methods rely on stacked convolutions and expert-designed networks.
Here we propose a novel frame-level FAS method based on Central Difference Convolution (CDC).
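CDC decomposes into a vanilla convolution minus a center term, since sum_p w(p) * (x(p0+p) - x(p0)) = conv(x) - (sum_p w(p)) * x(p0); a minimal sketch of that decomposition (the blending weight theta follows the commonly used default, other details are assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CDConv2d(nn.Module):
    """Central Difference Convolution (sketch): blends vanilla convolution
    with a central-difference term, y = conv(x) - theta * sum_w * x_center."""
    def __init__(self, in_ch, out_ch, k=3, theta=0.7):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, padding=k // 2, bias=False)
        self.theta = theta

    def forward(self, x):
        out = self.conv(x)
        if self.theta == 0:
            return out
        # 1x1 kernel holding each filter's spatial sum, applied to x itself
        kernel_sum = self.conv.weight.sum(dim=(2, 3), keepdim=True)
        out_center = F.conv2d(x, kernel_sum)
        return out - self.theta * out_center

y = CDConv2d(3, 8)(torch.randn(1, 3, 16, 16))   # -> (1, 8, 16, 16)
```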
arXiv Detail & Related papers (2020-03-09T12:48:37Z)