Towards efficient feature sharing in MIMO architectures
- URL: http://arxiv.org/abs/2205.10139v1
- Date: Fri, 20 May 2022 12:33:34 GMT
- Title: Towards efficient feature sharing in MIMO architectures
- Authors: Rémy Sun, Alexandre Ramé, Clément Masson, Nicolas Thome,
Matthieu Cord
- Abstract summary: Multi-input multi-output architectures propose to train multiple subnetworks within one base network and then average the subnetwork predictions to benefit from ensembling for free.
Despite some relative success, these architectures are wasteful in their use of parameters.
We highlight in this paper that the learned subnetworks fail to share even generic features, which limits their applicability on smaller mobile and AR/VR devices.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-input multi-output architectures propose to train multiple subnetworks
within one base network and then average the subnetwork predictions to benefit
from ensembling for free. Despite some relative success, these architectures
are wasteful in their use of parameters. Indeed, we highlight in this paper
that the learned subnetworks fail to share even generic features, which limits
their applicability on smaller mobile and AR/VR devices. We posit this behavior
stems from an ill-posed part of the multi-input multi-output framework. To
solve this issue, we propose a novel unmixing step in MIMO architectures that
allows subnetworks to properly share features. Preliminary experiments on
CIFAR-100 show our adjustments allow feature sharing and improve model
performance for small architectures.
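The core MIMO mechanism described above can be sketched in a few lines. This is a hypothetical toy model (the layer sizes, mixing by summation, and single shared layer are assumptions for illustration, not the paper's exact network): several inputs are mixed into one shared backbone, each subnetwork head produces its own prediction, and at test time the same input is repeated so the predictions can be averaged for a free ensemble.

```python
import numpy as np

rng = np.random.default_rng(0)

def mimo_forward(x1, x2, w_shared, heads):
    """Toy MIMO forward pass: two inputs are mixed (here, summed) into
    one shared backbone; each head yields one subnetwork prediction."""
    mixed = x1 + x2                       # input mixing
    feats = np.tanh(mixed @ w_shared)     # shared backbone features
    return [feats @ h for h in heads]     # one prediction per subnetwork

def ensemble(preds):
    """At test time the same input feeds every subnetwork and the
    predictions are averaged -- 'ensembling for free'."""
    return np.mean(preds, axis=0)

# Hypothetical dimensions for the sketch.
d_in, d_hid, n_cls = 8, 16, 4
w_shared = rng.normal(size=(d_in, d_hid))
heads = [rng.normal(size=(d_hid, n_cls)) for _ in range(2)]

x = rng.normal(size=(1, d_in))
preds = mimo_forward(x, x, w_shared, heads)  # repeat the input at test time
avg = ensemble(preds)
```

The paper's proposed unmixing step would sit between the shared features and the heads so that subnetworks can properly share the backbone; its exact form is not reproduced here.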
Related papers
- AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation [48.82264764771652]
We introduce AsCAN -- a hybrid architecture, combining both convolutional and transformer blocks.
AsCAN supports a variety of tasks: recognition, segmentation, class-conditional image generation.
We then scale the same architecture to solve a large-scale text-to-image task and show state-of-the-art performance.
arXiv Detail & Related papers (2024-11-07T18:43:17Z) - Multi-objective Differentiable Neural Architecture Search [58.67218773054753]
We propose a novel NAS algorithm that encodes user preferences for the trade-off between performance and hardware metrics.
Our method outperforms existing MOO NAS methods across a broad range of qualitatively different search spaces and datasets.
arXiv Detail & Related papers (2024-02-28T10:09:04Z) - DRESS: Dynamic REal-time Sparse Subnets [7.76526807772015]
We propose a novel training algorithm, Dynamic REal-time Sparse Subnets (DRESS)
DRESS samples multiple sub-networks from the same backbone network through row-based unstructured sparsity, and jointly trains these sub-networks in parallel with weighted loss.
Experiments on public vision datasets show that DRESS yields significantly higher accuracy than state-of-the-art sub-networks.
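The DRESS recipe above can be illustrated with a minimal sketch, assuming a toy reading of "row-based unstructured sparsity" (keep the largest-magnitude weights within each row; the thresholding and nesting details are assumptions, not the paper's algorithm): each sampled mask defines a sub-network, and all sub-networks contribute weighted terms to one joint loss.

```python
import numpy as np

rng = np.random.default_rng(0)

def row_sparse_masks(weight, sparsities):
    """Sample one binary mask per target sparsity: within each row,
    keep the (1 - sparsity) fraction of largest-magnitude entries."""
    masks = []
    for s in sparsities:
        k = max(1, int(round(weight.shape[1] * (1 - s))))
        mask = np.zeros_like(weight)
        idx = np.argsort(-np.abs(weight), axis=1)[:, :k]  # top-k per row
        np.put_along_axis(mask, idx, 1.0, axis=1)
        masks.append(mask)
    return masks

def joint_weighted_loss(weight, masks, loss_fn, loss_weights):
    """Jointly train the sub-networks: each masked subnet contributes
    a weighted term to one combined loss."""
    return sum(a * loss_fn(weight * m) for a, m in zip(loss_weights, masks))

W = rng.normal(size=(4, 10))
masks = row_sparse_masks(W, sparsities=[0.5, 0.8])
loss = joint_weighted_loss(W, masks,
                           loss_fn=lambda w: float(np.mean(w ** 2)),
                           loss_weights=[0.5, 0.5])
```

In the sketch the 0.5-sparse mask keeps 5 of 10 weights per row and the 0.8-sparse mask keeps 2, so the sparser subnet is (approximately) nested inside the denser one.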
arXiv Detail & Related papers (2022-07-01T22:05:07Z) - Build a Robust QA System with Transformer-based Mixture of Experts [0.29005223064604074]
We build a robust question answering system that can adapt to out-of-domain datasets.
We show that our combination of best architecture and data augmentation techniques achieves a 53.477 F1 score in the out-of-domain evaluation.
arXiv Detail & Related papers (2022-03-20T02:38:29Z) - Rich CNN-Transformer Feature Aggregation Networks for Super-Resolution [50.10987776141901]
Recent vision transformers along with self-attention have achieved promising results on various computer vision tasks.
We introduce an effective hybrid architecture for super-resolution (SR) tasks, which leverages local features from CNNs and long-range dependencies captured by transformers.
Our proposed method achieves state-of-the-art SR results on numerous benchmark datasets.
arXiv Detail & Related papers (2022-03-15T06:52:25Z) - Exploiting Features with Split-and-Share Module [6.245453620070586]
Split-and-Share Module (SSM) splits a given feature into parts, which are partially shared by multiple sub-classifiers.
SSM can be easily integrated into any architecture without bells and whistles.
We have extensively validated the efficacy of SSM on ImageNet-1K classification task.
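The split-with-partial-sharing idea behind SSM can be sketched as follows. This is a hypothetical illustration, not the paper's module: each sub-classifier receives a contiguous slice of the feature vector, and adjacent slices overlap so parts of the feature are shared between heads.

```python
import numpy as np

def split_and_share(feature, n_heads, overlap):
    """Give each of n_heads sub-classifiers a contiguous slice of the
    feature; consecutive slices overlap by `overlap` entries, so those
    entries are shared between neighboring heads (toy sharing scheme)."""
    d = feature.shape[0]
    step = d // n_heads
    width = step + overlap
    return [feature[i * step : i * step + width] for i in range(n_heads)]

feature = np.arange(12.0)
views = split_and_share(feature, n_heads=3, overlap=2)
```

With a 12-dimensional feature, 3 heads, and overlap 2, the first two views are `feature[0:6]` and `feature[4:10]`, so they share entries 4 and 5.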
arXiv Detail & Related papers (2021-08-10T08:11:26Z) - Multi-path Neural Networks for On-device Multi-domain Visual Classification [55.281139434736254]
This paper proposes a novel approach to automatically learn a multi-path network for multi-domain visual classification on mobile devices.
The proposed multi-path network is learned from neural architecture search by applying one reinforcement learning controller for each domain to select the best path in the super-network created from a MobileNetV3-like search space.
The determined multi-path model selectively shares parameters across domains in shared nodes while keeping domain-specific parameters within non-shared nodes in individual domain paths.
arXiv Detail & Related papers (2020-10-10T05:13:49Z) - Boosting Share Routing for Multi-task Learning [0.12891210250935145]
Multi-task learning (MTL) aims to make full use of the knowledge contained in multi-task supervision signals to improve the overall performance.
How to make the knowledge of multiple tasks shared appropriately is an open problem for MTL.
We propose a general framework called Multi-Task Neural Architecture Search (MTNAS) to efficiently find a suitable sharing route for a given MTL problem.
arXiv Detail & Related papers (2020-09-01T12:37:19Z) - When Residual Learning Meets Dense Aggregation: Rethinking the Aggregation of Deep Neural Networks [57.0502745301132]
We propose Micro-Dense Nets, a novel architecture with global residual learning and local micro-dense aggregations.
Our micro-dense block can be integrated with neural architecture search based models to boost their performance.
arXiv Detail & Related papers (2020-04-19T08:34:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.