Enhancing Once-For-All: A Study on Parallel Blocks, Skip Connections and
Early Exits
- URL: http://arxiv.org/abs/2302.01888v1
- Date: Fri, 3 Feb 2023 17:53:40 GMT
- Title: Enhancing Once-For-All: A Study on Parallel Blocks, Skip Connections and
Early Exits
- Authors: Simone Sarti, Eugenio Lomurno, Andrea Falanti, Matteo Matteucci
- Abstract summary: Once-For-All (OFA) is an eco-friendly algorithm characterised by the ability to generate easily adaptable models.
OFA is improved from an architectural point of view by including early exits, parallel blocks and dense skip connections.
OFAv2 improves accuracy on the Tiny ImageNet dataset by up to 12.07% compared to the original version of OFA.
- Score: 7.0895962209555465
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The use of Neural Architecture Search (NAS) techniques to automate the design
of neural networks has become increasingly popular in recent years. The
proliferation of devices with different hardware characteristics using such
neural networks, as well as the need to reduce the power consumption for their
search, has led to the realisation of Once-For-All (OFA), an eco-friendly
algorithm characterised by the ability to generate easily adaptable models
through a single learning process. In order to improve this paradigm and
develop high-performance yet eco-friendly NAS techniques, this paper presents
OFAv2, an extension of OFA aimed at improving its performance while
maintaining the same ecological advantage. The algorithm is improved from an
architectural point of view by including early exits, parallel blocks and dense
skip connections. The training process is extended by two new phases called
Elastic Level and Elastic Height. A new Knowledge Distillation technique is
presented to handle multi-output networks, and finally a new strategy for
dynamic teacher network selection is proposed. These modifications allow OFAv2
to improve its accuracy on the Tiny ImageNet dataset by up to 12.07% compared
to the original version of OFA, while maintaining the algorithm's flexibility
and advantages.
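The abstract names the architectural ingredients (early exits, i.e. multiple outputs) and a knowledge distillation scheme for multi-output networks without detailing them. The sketch below, in PyTorch, illustrates the general setting under stated assumptions: a two-exit network whose exits are all trained with a combined hard-label and distillation loss. The layer sizes, the alpha and temperature values, and the uniform averaging over exits are illustrative choices, not the OFAv2 formulation; the fixed teacher_logits tensor stands in for the dynamically selected teacher network.

    # Minimal PyTorch sketch (assumptions noted above): an early-exit network
    # where every exit is trained with a hard-label term plus a distillation
    # term against a teacher's soft predictions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class EarlyExitNet(nn.Module):
        def __init__(self, num_classes=200):  # 200 classes as in Tiny ImageNet
            super().__init__()
            self.stage1 = nn.Sequential(nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU())
            self.exit1 = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_classes))
            self.stage2 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
            self.exit2 = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes))

        def forward(self, x):
            x = self.stage1(x)
            out1 = self.exit1(x)   # early exit on intermediate features
            x = self.stage2(x)
            out2 = self.exit2(x)   # final exit
            return [out1, out2]

    def multi_exit_kd_loss(exit_logits, teacher_logits, targets, alpha=0.5, T=4.0):
        # Average a cross-entropy term and a soft-target KL term over all exits.
        losses = []
        for logits in exit_logits:
            ce = F.cross_entropy(logits, targets)
            kd = F.kl_div(F.log_softmax(logits / T, dim=1),
                          F.softmax(teacher_logits / T, dim=1),
                          reduction="batchmean") * (T * T)
            losses.append(alpha * ce + (1.0 - alpha) * kd)
        return torch.stack(losses).mean()

    student = EarlyExitNet()
    x = torch.randn(4, 3, 64, 64)              # Tiny ImageNet-sized inputs
    teacher_logits = torch.randn(4, 200)       # stand-in for a selected teacher's output
    targets = torch.randint(0, 200, (4,))
    loss = multi_exit_kd_loss(student(x), teacher_logits, targets)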
Related papers
- Combining Neural Architecture Search and Automatic Code Optimization: A Survey [0.8796261172196743]
Two notable techniques are Hardware-aware Neural Architecture Search (HW-NAS) and Automatic Code Optimization (ACO)
HW-NAS automatically designs accurate yet hardware-friendly neural networks, while ACO involves searching for the best compiler optimizations to apply to neural networks.
This survey explores recent works that combine these two techniques within a single framework.
arXiv Detail & Related papers (2024-08-07T22:40:05Z)
- SONATA: Self-adaptive Evolutionary Framework for Hardware-aware Neural Architecture Search [0.7646713951724011]
HW-aware Neural Architecture Search (HW-aware NAS) emerges as an attractive strategy for automating the design of neural networks (NNs).
We propose SONATA, a self-adaptive evolutionary algorithm for HW-aware NAS.
Our method leverages adaptive evolutionary operators guided by the learned importance of NN design parameters.
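The summary only states that the evolutionary operators are guided by learned parameter importance. As a hypothetical illustration of that idea, the sketch below scales per-parameter mutation rates by importance scores (which could come, for example, from a surrogate accuracy predictor); the function names, search space and weighting rule are assumptions, not SONATA's actual operators.

    # Hypothetical sketch of an importance-guided mutation operator for
    # HW-aware NAS; names and the weighting rule are assumptions.
    import random

    def mutate(parent, search_space, importance, base_rate=0.1):
        # Mutate each design parameter with a probability scaled by its
        # learned importance (importances are normalised to sum to one).
        total = sum(importance.values()) or 1.0
        child = dict(parent)
        for gene, choices in search_space.items():
            rate = base_rate * len(search_space) * importance.get(gene, 0.0) / total
            if random.random() < min(rate, 1.0):
                child[gene] = random.choice(choices)
        return child

    search_space = {"kernel_size": [3, 5, 7], "expansion": [3, 4, 6], "depth": [2, 3, 4]}
    parent = {"kernel_size": 3, "expansion": 4, "depth": 3}
    importance = {"kernel_size": 0.2, "expansion": 0.5, "depth": 0.3}  # assumed learned scores
    child = mutate(parent, search_space, importance)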
arXiv Detail & Related papers (2024-02-20T18:15:11Z)
- Neural Architecture Transfer 2: A Paradigm for Improving Efficiency in Multi-Objective Neural Architecture Search [7.967995669387532]
We present NATv2, an extension of Neural Architecture Transfer (NAT) that improves multi-objective search algorithms.
NATv2 achieves qualitative improvements in the extractable sub-networks by exploiting the improved super-networks.
arXiv Detail & Related papers (2023-07-03T12:25:09Z)
- Multi-agent Reinforcement Learning with Graph Q-Networks for Antenna Tuning [60.94661435297309]
The scale of mobile networks makes it challenging to optimize antenna parameters using manual intervention or hand-engineered strategies.
We propose a new multi-agent reinforcement learning algorithm to optimize mobile network configurations globally.
We empirically demonstrate the performance of the algorithm on an antenna tilt tuning problem and a joint tilt and power control problem in a simulated environment.
arXiv Detail & Related papers (2023-01-20T17:06:34Z)
- Receptive Field Refinement for Convolutional Neural Networks Reliably Improves Predictive Performance [1.52292571922932]
We present a new approach to receptive field analysis that can yield these types of theoretical and empirical performance gains.
Our approach is able to improve ImageNet1K performance across a wide range of well-known, state-of-the-art (SOTA) model classes.
arXiv Detail & Related papers (2022-11-26T05:27:44Z)
- Neural Networks with A La Carte Selection of Activation Functions [0.0]
Activation functions (AFs) are pivotal to the success (or failure) of a neural network.
We combine a slew of known AFs into successful architectures, proposing three methods to do so beneficially.
We show that all methods often produce significantly better results for 25 classification problems when compared with a standard network composed of ReLU hidden units and a softmax output unit.
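The three combination methods are not spelled out in this summary. As a rough, assumed illustration of selecting activations per layer from a pool of known functions, the sketch below builds a small MLP whose hidden-layer activations are drawn at random from such a pool; the pool contents, layer widths and random assignment are placeholders rather than the paper's methods.

    # Assumed illustration: build an MLP whose hidden activations are drawn
    # from a pool of known activation functions (not the paper's three methods).
    import random
    import torch.nn as nn

    ACTIVATION_POOL = [nn.ReLU, nn.Tanh, nn.SiLU, nn.ELU, nn.LeakyReLU]

    def build_mlp(in_dim, hidden_dims, num_classes, seed=0):
        random.seed(seed)
        layers, prev = [], in_dim
        for width in hidden_dims:
            layers += [nn.Linear(prev, width), random.choice(ACTIVATION_POOL)()]
            prev = width
        layers.append(nn.Linear(prev, num_classes))  # softmax is applied by the loss
        return nn.Sequential(*layers)

    model = build_mlp(in_dim=64, hidden_dims=[128, 128, 64], num_classes=10)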
arXiv Detail & Related papers (2022-06-24T09:09:39Z)
- FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z)
- Learning to Continuously Optimize Wireless Resource in a Dynamic Environment: A Bilevel Optimization Perspective [52.497514255040514]
This work develops a new approach that enables data-driven methods to continuously learn and optimize resource allocation strategies in a dynamic environment.
We propose to build the notion of continual learning into wireless system design, so that the learning model can incrementally adapt to the new episodes.
Our design is based on a novel bilevel optimization formulation which ensures a certain "fairness" across different data samples.
arXiv Detail & Related papers (2021-05-03T07:23:39Z)
- CondenseNet V2: Sparse Feature Reactivation for Deep Networks [87.38447745642479]
Reusing features in deep networks through dense connectivity is an effective way to achieve high computational efficiency.
We propose an alternative approach named sparse feature reactivation (SFR), aiming at actively increasing the utility of features for reusing.
Our experiments show that the proposed models achieve promising performance on image classification (ImageNet and CIFAR) and object detection (MS COCO) in terms of both theoretical efficiency and practical speed.
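For context, the sketch below shows the DenseNet-style dense connectivity this summary refers to, where each layer consumes the concatenation of all earlier feature maps; it illustrates the feature-reuse pattern that SFR builds on, not the sparse reactivation module itself, and the channel sizes are arbitrary.

    # Sketch of DenseNet-style feature reuse: each layer consumes the
    # concatenation of all earlier feature maps. Channel sizes are arbitrary.
    import torch
    import torch.nn as nn

    class DenseBlock(nn.Module):
        def __init__(self, in_channels, growth_rate=16, num_layers=4):
            super().__init__()
            self.layers = nn.ModuleList()
            channels = in_channels
            for _ in range(num_layers):
                self.layers.append(nn.Sequential(
                    nn.BatchNorm2d(channels), nn.ReLU(),
                    nn.Conv2d(channels, growth_rate, 3, padding=1)))
                channels += growth_rate

        def forward(self, x):
            features = [x]
            for layer in self.layers:
                features.append(layer(torch.cat(features, dim=1)))
            return torch.cat(features, dim=1)

    block = DenseBlock(in_channels=32)
    out = block(torch.randn(1, 32, 16, 16))  # -> shape (1, 96, 16, 16): 32 + 4 * 16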
arXiv Detail & Related papers (2021-04-09T14:12:43Z)
- Efficient Feature Transformations for Discriminative and Generative Continual Learning [98.10425163678082]
We propose a simple task-specific feature map transformation strategy for continual learning.
These provide powerful flexibility for learning new tasks, achieved with minimal parameters added to the base architecture.
We demonstrate the efficacy and efficiency of our method with an extensive set of experiments in discriminative (CIFAR-100 and ImageNet-1K) and generative sequences of tasks.
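The form of the transformation is not given in this summary. One common lightweight choice, used here purely as an assumed illustration, is a per-channel scale and shift selected by task identity on top of a shared convolution; the class name and dimensions below are invented for the example, not the paper's exact method.

    # Assumed illustration of a lightweight task-specific feature transformation:
    # a per-channel scale and shift selected by task id on top of a shared
    # convolution (only 2 * out_ch extra parameters per task).
    import torch
    import torch.nn as nn

    class TaskConditionedConv(nn.Module):
        def __init__(self, in_ch, out_ch, num_tasks):
            super().__init__()
            self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)   # shared base weights
            self.scale = nn.Parameter(torch.ones(num_tasks, out_ch))
            self.shift = nn.Parameter(torch.zeros(num_tasks, out_ch))

        def forward(self, x, task_id):
            y = self.conv(x)
            s = self.scale[task_id].view(1, -1, 1, 1)
            b = self.shift[task_id].view(1, -1, 1, 1)
            return s * y + b

    layer = TaskConditionedConv(16, 32, num_tasks=5)
    out = layer(torch.randn(2, 16, 8, 8), task_id=3)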
arXiv Detail & Related papers (2021-03-25T01:48:14Z)
- Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
arXiv Detail & Related papers (2020-03-24T09:56:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.