FocusFormer: Focusing on What We Need via Architecture Sampler
- URL: http://arxiv.org/abs/2208.10861v1
- Date: Tue, 23 Aug 2022 10:42:56 GMT
- Title: FocusFormer: Focusing on What We Need via Architecture Sampler
- Authors: Jing Liu, Jianfei Cai, Bohan Zhuang
- Abstract summary: Vision Transformers (ViTs) have underpinned the recent breakthroughs in computer vision.
One-shot neural architecture search decouples the supernet training and architecture specialization for diverse deployment scenarios.
We devise a simple yet effective method, called FocusFormer, to bridge the gap between supernet training and architecture deployment.
- Score: 45.150346855368
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Vision Transformers (ViTs) have underpinned the recent breakthroughs in
computer vision. However, designing the architectures of ViTs is laborious and
heavily relies on expert knowledge. To automate the design process and
incorporate deployment flexibility, one-shot neural architecture search
decouples the supernet training and architecture specialization for diverse
deployment scenarios. To cope with the enormous number of sub-networks in the
supernet, existing methods treat all architectures as equally important and
randomly sample some of them in each update step during training. During
architecture search, these methods focus on finding architectures on the Pareto
frontier of performance and resource consumption, which forms a gap between
training and deployment. In this paper, we devise a simple yet effective
method, called FocusFormer, to bridge such a gap. To this end, we propose to
learn an architecture sampler to assign higher sampling probabilities to those
architectures on the Pareto frontier under different resource constraints
during supernet training, making them sufficiently optimized and hence
improving their performance. During specialization, we can directly use the
well-trained architecture sampler to obtain accurate architectures satisfying
the given resource constraint, which significantly improves the search
efficiency. Extensive experiments on CIFAR-100 and ImageNet show that our
FocusFormer is able to improve the performance of the searched architectures
while significantly reducing the search cost. For example, on ImageNet, our
FocusFormer-Ti with 1.4G FLOPs outperforms AutoFormer-Ti by 0.5% in terms of
the Top-1 accuracy.
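The mechanism described in the abstract (bias sub-network sampling during supernet training toward architectures on the Pareto frontier of accuracy versus resource cost, then reuse that sampler at specialization time) can be illustrated with a minimal sketch. The names below (Arch, pareto_frontier, ArchitectureSampler, frontier_prob) are hypothetical and not taken from the paper's code; in particular, FocusFormer learns its sampler, whereas this toy version uses a fixed rule for brevity.

```python
import random
from dataclasses import dataclass

@dataclass
class Arch:
    # One candidate sub-network of the supernet (hypothetical encoding).
    config: dict      # e.g. {"depth": 12, "embed_dim": 192}
    flops: float      # resource cost in GFLOPs
    score: float      # proxy quality, e.g. weight-shared validation accuracy

def pareto_frontier(archs):
    # Keep architectures that no other candidate dominates, i.e. nothing is
    # at least as accurate and at least as cheap, and strictly better on one axis.
    frontier = []
    for a in archs:
        dominated = any(
            b.score >= a.score and b.flops <= a.flops
            and (b.score > a.score or b.flops < a.flops)
            for b in archs
        )
        if not dominated:
            frontier.append(a)
    return frontier

class ArchitectureSampler:
    # Toy sampler: during supernet training, give most of the sampling
    # probability to architectures on the current Pareto frontier that fit
    # the requested FLOPs budget, instead of sampling uniformly at random.
    def __init__(self, candidates, frontier_prob=0.8):
        self.candidates = candidates
        self.frontier_prob = frontier_prob

    def sample(self, flops_budget):
        feasible = [a for a in self.candidates if a.flops <= flops_budget]
        frontier = pareto_frontier(feasible)
        if frontier and random.random() < self.frontier_prob:
            return random.choice(frontier)   # focus updates on promising archs
        return random.choice(feasible)       # keep some exploration

# Usage: at each training step, draw a budget and train the sampled sub-network.
candidates = [Arch({"id": i}, flops=1.0 + 0.1 * i, score=70.0 + random.random() * 10)
              for i in range(50)]
sampler = ArchitectureSampler(candidates)
subnet = sampler.sample(flops_budget=1.4)   # e.g. the 1.4 GFLOPs tier from the abstract
```

In the actual method the frontier estimate and the learned sampler would be updated as supernet training progresses; the fixed candidate list here only illustrates the sampling bias.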
Related papers
- EM-DARTS: Hierarchical Differentiable Architecture Search for Eye Movement Recognition [54.99121380536659]
Eye movement biometrics have received increasing attention thanks to their highly secure identification.
Deep learning (DL) models have recently been applied successfully to eye movement recognition.
However, the DL architecture is still determined by human prior knowledge.
We propose EM-DARTS, a hierarchical differentiable architecture search algorithm to automatically design the DL architecture for eye movement recognition.
arXiv Detail & Related papers (2024-09-22T13:11:08Z)
- Building Optimal Neural Architectures using Interpretable Knowledge [15.66288233048004]
AutoBuild is a scheme which learns to align the latent embeddings of operations and architecture modules with the ground-truth performance of the architectures they appear in.
We show that by mining a relatively small set of evaluated architectures, AutoBuild can learn to build high-quality architectures directly or help to reduce the search space to focus on relevant areas.
arXiv Detail & Related papers (2024-03-20T04:18:38Z)
- POPNASv3: a Pareto-Optimal Neural Architecture Search Solution for Image and Time Series Classification [8.190723030003804]
This article presents the third version of a sequential model-based NAS algorithm targeting different hardware environments and multiple classification tasks.
Our method is able to find competitive architectures within large search spaces, while keeping a flexible structure and data processing pipeline to adapt to different tasks.
The experiments performed on images and time series classification datasets provide evidence that POPNASv3 can explore a large set of assorted operators and converge to optimal architectures suited for the type of data provided under different scenarios.
arXiv Detail & Related papers (2022-12-13T17:14:14Z)
- Pareto-aware Neural Architecture Generation for Diverse Computational Budgets [94.27982238384847]
Existing methods often perform an independent architecture search process for each target budget.
We propose a Pareto-aware Neural Architecture Generator (PNAG) which only needs to be trained once and dynamically produces the optimal architecture for any given budget via inference.
Such a joint search algorithm not only greatly reduces the overall search cost but also improves the results.
arXiv Detail & Related papers (2022-10-14T08:30:59Z)
- Pruning-as-Search: Efficient Neural Architecture Search via Channel Pruning and Structural Reparameterization [50.50023451369742]
Pruning-as-Search (PaS) is an end-to-end channel pruning method to search out desired sub-network automatically and efficiently.
Our proposed architecture outperforms prior art by around 1.0% top-1 accuracy on the ImageNet-1000 classification task.
arXiv Detail & Related papers (2022-06-02T17:58:54Z)
- Elastic Architecture Search for Diverse Tasks with Different Resources [87.23061200971912]
We study a new challenging problem of efficient deployment for diverse tasks with different resources, where the resource constraint and task of interest corresponding to a group of classes are dynamically specified at testing time.
Previous NAS approaches seek to design architectures for all classes simultaneously, which may not be optimal for some individual tasks.
We present a novel and general framework, called Elastic Architecture Search (EAS), permitting instant specializations at runtime for diverse tasks with various resource constraints.
arXiv Detail & Related papers (2021-08-03T00:54:27Z)
- CHASE: Robust Visual Tracking via Cell-Level Differentiable Neural Architecture Search [14.702573109803307]
We propose a novel cell-level differentiable architecture search mechanism to automate the network design of the tracking module.
The proposed approach is simple and efficient, with no need to stack a series of modules to construct a network.
Our approach is easy to incorporate into existing trackers, which is empirically validated using different differentiable architecture search-based methods and tracking objectives.
arXiv Detail & Related papers (2021-07-02T15:16:45Z)
- Stage-Wise Neural Architecture Search [65.03109178056937]
Modern convolutional networks such as ResNet and NASNet have achieved state-of-the-art results in many computer vision applications.
These networks consist of stages, which are sets of layers that operate on representations in the same resolution.
It has been demonstrated that increasing the number of layers in each stage improves the prediction ability of the network.
However, the resulting architecture becomes computationally expensive in terms of floating point operations, memory requirements and inference time.
arXiv Detail & Related papers (2020-04-23T14:16:39Z)