Exploiting Features with Split-and-Share Module
- URL: http://arxiv.org/abs/2108.04500v2
- Date: Wed, 11 Aug 2021 00:34:07 GMT
- Title: Exploiting Features with Split-and-Share Module
- Authors: Jaemin Lee, Minseok Seo, Jongchan Park, Dong-Geol Choi
- Abstract summary: Split-and-Share Module (SSM) splits a given feature into parts, which are partially shared by multiple sub-classifiers.
SSM can be easily integrated into any architecture without bells and whistles.
We have extensively validated the efficacy of SSM on the ImageNet-1K classification task.
- Score: 6.245453620070586
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep convolutional neural networks (CNNs) have shown state-of-the-art
performances in various computer vision tasks. Advances in CNN architectures
have focused mainly on designing convolutional blocks of the feature
extractors, but less on the classifiers that exploit extracted features. In
this work, we propose the Split-and-Share Module (SSM), a classifier that splits a
given feature into parts, which are partially shared by multiple
sub-classifiers. Our intuition is that the more the features are shared, the
more common they will become, and SSM can encourage such structural
characteristics in the split features. SSM can be easily integrated into any
architecture without bells and whistles. We have extensively validated the
efficacy of SSM on the ImageNet-1K classification task, and SSM has shown
consistent and significant improvements over baseline architectures. In
addition, we analyze the effect of SSM using Grad-CAM visualizations.
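The abstract describes SSM only at a high level. The PyTorch sketch below shows one plausible reading, in which a pooled feature is split into K chunks and sub-classifier i consumes the first i+1 chunks, so earlier chunks are shared by more sub-classifiers. The class name `SplitAndShareModule`, the cumulative sharing scheme, the logit averaging, and `num_splits=4` are all illustrative assumptions, not details confirmed by the paper.

```python
import torch
import torch.nn as nn


class SplitAndShareModule(nn.Module):
    """Hypothetical SSM-style classifier head (a sketch, not the paper's code)."""

    def __init__(self, feat_dim: int, num_classes: int, num_splits: int = 4):
        super().__init__()
        assert feat_dim % num_splits == 0, "feat_dim must divide evenly"
        self.chunk = feat_dim // num_splits
        # Sub-classifier i consumes the first (i + 1) chunks, so chunk 0
        # is shared by every sub-classifier (assumed sharing scheme).
        self.heads = nn.ModuleList(
            nn.Linear(self.chunk * (i + 1), num_classes)
            for i in range(num_splits)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, feat_dim) globally pooled feature from any backbone.
        logits = [head(x[:, : self.chunk * (i + 1)])
                  for i, head in enumerate(self.heads)]
        # Average the sub-classifier logits for the final prediction.
        return torch.stack(logits).mean(dim=0)
```

In use, one might replace a backbone's final classifier, e.g. `resnet50.fc = SplitAndShareModule(2048, 1000)`, which matches the abstract's claim that SSM drops into any architecture; the paper presumably also supervises each sub-classifier during training, which this sketch omits.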
Related papers
- Investigation of Hierarchical Spectral Vision Transformer Architecture for Classification of Hyperspectral Imagery [7.839253919389809]
The theoretical justification for vision Transformers outperforming CNN architectures in HSI classification remains an open question.
A unified hierarchical spectral vision Transformer architecture, specifically tailored for HSI classification, is investigated.
It is concluded that the unique strength of vision Transformers can be attributed to their overarching architecture.
arXiv Detail & Related papers (2024-09-14T00:53:13Z) - Brain-Inspired Stepwise Patch Merging for Vision Transformers [6.108377966393714]
We propose a novel technique called Stepwise Patch Merging (SPM), which enhances the subsequent attention mechanism's ability to 'see' better.
Extensive experiments conducted on benchmark datasets, including ImageNet-1K, COCO, and ADE20K, demonstrate that SPM significantly improves the performance of various models.
arXiv Detail & Related papers (2024-09-11T03:04:46Z) - Demystify Transformers & Convolutions in Modern Image Deep Networks [82.32018252867277]
This paper aims to identify the real gains of popular convolution and attention operators through a detailed study.
We find that the key difference among these feature transformation modules, such as attention or convolution, lies in their spatial feature aggregation approach.
Our experiments on various tasks and an analysis of inductive bias show a significant performance boost due to advanced network-level and block-level designs.
arXiv Detail & Related papers (2022-11-10T18:59:43Z) - Deep Image Clustering with Contrastive Learning and Multi-scale Graph Convolutional Networks [58.868899595936476]
This paper presents a new deep clustering approach termed image clustering with contrastive learning and multi-scale graph convolutional networks (IcicleGCN).
Experiments on multiple image datasets demonstrate the superior clustering performance of IcicleGCN over the state-of-the-art.
arXiv Detail & Related papers (2022-07-14T19:16:56Z) - Towards efficient feature sharing in MIMO architectures [102.40140369542755]
Multi-input multi-output architectures propose to train multiple subnetworks within one base network and then average the subnetwork predictions to benefit from ensembling for free.
Despite some relative success, these architectures are wasteful in their use of parameters.
We highlight in this paper that the learned subnetworks fail to share even generic features, which limits their applicability on smaller mobile and AR/VR devices.
arXiv Detail & Related papers (2022-05-20T12:33:34Z) - Multi-level Second-order Few-shot Learning [111.0648869396828]
We propose a Multi-level Second-order (MlSo) few-shot learning network for supervised or unsupervised few-shot image classification and few-shot action recognition.
We leverage so-called power-normalized second-order base learner streams combined with features that express multiple levels of visual abstraction.
We demonstrate respectable results on standard datasets such as Omniglot, mini-ImageNet, tiered-ImageNet, Open MIC, fine-grained datasets such as CUB Birds, Stanford Dogs and Cars, and action recognition datasets such as HMDB51, UCF101, and mini-MIT.
arXiv Detail & Related papers (2022-01-15T19:49:00Z) - Specificity-preserving RGB-D Saliency Detection [103.3722116992476]
We propose a specificity-preserving network (SP-Net) for RGB-D saliency detection.
Two modality-specific networks and a shared learning network are adopted to generate individual and shared saliency maps.
Experiments on six benchmark datasets demonstrate that our SP-Net outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2021-08-18T14:14:22Z) - DMSANet: Dual Multi Scale Attention Network [0.0]
We propose a new attention module that not only achieves the best performance but also has fewer parameters than most existing models.
Our attention module can easily be integrated with other convolutional neural networks because of its lightweight nature.
arXiv Detail & Related papers (2021-06-13T10:31:31Z) - ResNeSt: Split-Attention Networks [86.25490825631763]
We present a modularized architecture, which applies channel-wise attention across different network branches to leverage their success in capturing cross-feature interactions and learning diverse representations.
Our model, named ResNeSt, outperforms EfficientNet in accuracy and latency trade-off on image classification.
arXiv Detail & Related papers (2020-04-19T20:40:31Z) - Group Based Deep Shared Feature Learning for Fine-grained Image Classification [31.84610555517329]
We present a new deep network architecture that explicitly models shared features and removes their effect to achieve enhanced classification results.
We call this framework Group based deep Shared Feature Learning (GSFL) and the resulting learned network GSFL-Net.
A key benefit of our specialized autoencoder is that it is versatile and can be combined with state-of-the-art fine-grained feature extraction models and trained together with them to improve their performance directly.
arXiv Detail & Related papers (2020-04-04T00:01:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the list (including all information) and is not responsible for any consequences of its use.