SuperSAM: Crafting a SAM Supernetwork via Structured Pruning and Unstructured Parameter Prioritization
- URL: http://arxiv.org/abs/2501.08504v1
- Date: Wed, 15 Jan 2025 00:54:12 GMT
- Title: SuperSAM: Crafting a SAM Supernetwork via Structured Pruning and Unstructured Parameter Prioritization
- Authors: Waqwoya Abebe, Sadegh Jafari, Sixing Yu, Akash Dutta, Jan Strube, Nathan R. Tallent, Luanzheng Guo, Pablo Munoz, Ali Jannesari
- Abstract summary: We propose a search space design strategy for Vision Transformer (ViT)-based architectures.
In particular, we convert the Segment Anything Model (SAM) into a weight-sharing supernetwork called SuperSAM.
Our approach involves automating the search space design via layer-wise structured pruning and parameter prioritization.
- Score: 6.8331250697000865
- License:
- Abstract: Neural Architecture Search (NAS) is a powerful approach for automating the design of efficient neural architectures. In contrast to traditional NAS methods, recently proposed one-shot NAS methods prove to be more efficient in performing NAS. One-shot NAS works by generating a singular weight-sharing supernetwork that acts as a search space (container) of subnetworks. Despite its achievements, designing the one-shot search space remains a major challenge. In this work, we propose a search space design strategy for Vision Transformer (ViT)-based architectures. In particular, we convert the Segment Anything Model (SAM) into a weight-sharing supernetwork called SuperSAM. Our approach involves automating the search space design via layer-wise structured pruning and parameter prioritization. While the structured pruning applies probabilistic removal of certain transformer layers, parameter prioritization performs weight reordering and slicing of MLP blocks in the remaining layers. We train supernetworks on several datasets using the sandwich rule. For deployment, we enhance subnetwork discovery by utilizing a program autotuner to identify efficient subnetworks within the search space. The resulting subnetworks are 30-70% smaller in size compared to the original pre-trained SAM ViT-B, yet outperform the pretrained model. Our work introduces a new and effective method for ViT NAS search-space design.
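To make the two core mechanisms concrete, below is a minimal, hypothetical PyTorch sketch of (a) parameter prioritization, i.e. importance-based reordering and slicing of an MLP block's hidden units, and (b) a sandwich-rule training step over a weight-sharing supernetwork. The helper names (mlp_importance, reorder_mlp, sandwich_step) and the supernet.max_config()/min_config() interface are illustrative assumptions, not the authors' implementation; SuperSAM applies these ideas to SAM's ViT-B image encoder.

```python
# Hypothetical sketch only: importance-based MLP reordering/slicing and a
# sandwich-rule training step for a weight-sharing supernetwork.
import torch
import torch.nn as nn
import torch.nn.functional as F


def mlp_importance(fc1: nn.Linear) -> torch.Tensor:
    # Score each hidden unit, e.g. by the L1 norm of its incoming weights
    # (one plausible criterion; the paper's exact metric may differ).
    return fc1.weight.abs().sum(dim=1)  # shape: [hidden_dim]


def reorder_mlp(fc1: nn.Linear, fc2: nn.Linear) -> None:
    # Parameter prioritization: permute hidden units so the most important
    # come first; slicing the first k units then keeps the top-ranked weights.
    order = torch.argsort(mlp_importance(fc1), descending=True)
    with torch.no_grad():
        fc1.weight.copy_(fc1.weight[order])
        fc1.bias.copy_(fc1.bias[order])
        fc2.weight.copy_(fc2.weight[:, order])


def sliced_mlp_forward(x, fc1: nn.Linear, fc2: nn.Linear, keep: int):
    # Run the MLP block with only the first `keep` (pre-sorted) hidden units.
    h = F.gelu(F.linear(x, fc1.weight[:keep], fc1.bias[:keep]))
    return F.linear(h, fc2.weight[:, :keep], fc2.bias)


def sandwich_step(supernet, batch, loss_fn, sample_config, num_random=2):
    # One sandwich-rule update: accumulate gradients from the largest
    # subnetwork, the smallest subnetwork, and a few randomly sampled ones.
    configs = [supernet.max_config(), supernet.min_config()]  # assumed API
    configs += [sample_config() for _ in range(num_random)]
    total = 0.0
    for cfg in configs:
        preds = supernet(batch["image"], config=cfg)  # shared weights, sliced
        loss = loss_fn(preds, batch["mask"])
        loss.backward()                               # gradients accumulate
        total += loss.item()
    return total / len(configs)                       # caller runs optimizer.step()
```

After reorder_mlp, every prefix of the hidden dimension is itself a highest-importance subnetwork, which is what makes simple slicing usable as a search-space design.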
Related papers
- Efficient Few-Shot Neural Architecture Search by Counting the Number of Nonlinear Functions [29.76210308781724]
We introduce a novel few-shot NAS method that exploits the number of nonlinear functions to split the search space.
Our method is efficient, since it does not require comparing gradients of a supernet to split the space.
In addition, we have found that dividing the space allows us to reduce the channel dimensions required for each supernet.
arXiv Detail & Related papers (2024-12-19T09:31:53Z)
- DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions [121.05720140641189]
We develop a family of models with the distilling neural architecture (DNA) techniques.
Our proposed DNA models can rate all architecture candidates, as opposed to previous works that can only access a sub-search space using heuristic algorithms.
Our models achieve state-of-the-art top-1 accuracy of 78.9% and 83.6% on ImageNet for a mobile convolutional network and a small vision transformer, respectively.
arXiv Detail & Related papers (2024-03-02T22:16:47Z)
- $\alpha$NAS: Neural Architecture Search using Property Guided Synthesis [1.2746672439030722]
We develop techniques that enable efficient neural architecture search (NAS) in a significantly larger design space.
Our key insights are as follows: (1) the abstract search space is significantly smaller than the original search space, and (2) architectures with similar program properties also have similar performance.
We implement our approach, $\alpha$NAS, within an evolutionary framework, where the mutations are guided by the program properties.
arXiv Detail & Related papers (2022-05-08T21:48:03Z)
- Evolutionary Neural Cascade Search across Supernetworks [68.8204255655161]
We introduce ENCAS - Evolutionary Neural Cascade Search.
ENCAS can be used to search over multiple pretrained supernetworks.
We test ENCAS on common computer vision benchmarks.
arXiv Detail & Related papers (2022-03-08T11:06:01Z)
- UENAS: A Unified Evolution-based NAS Framework [11.711427415684955]
UENAS is an evolution-based NAS framework with a broader search space.
We propose three strategies to alleviate the huge search cost caused by the expanded search space.
UENAS achieves error rates of 2.81% on CIFAR-10, 20.24% on CIFAR-100, and 33% on Tiny-ImageNet.
arXiv Detail & Related papers (2022-03-08T09:14:37Z)
- Enabling NAS with Automated Super-Network Generation [60.72821429802335]
Recent Neural Architecture Search (NAS) solutions have produced impressive results training super-networks and then deriving subnetworks.
We present BootstrapNAS, a software framework for automatic generation of super-networks for NAS.
arXiv Detail & Related papers (2021-12-20T21:45:48Z)
- Efficient Transfer Learning via Joint Adaptation of Network Architecture and Weight [66.8543732597723]
Recent works in neural architecture search (NAS) can aid transfer learning by establishing sufficient network search space.
We propose a novel framework consisting of two modules: the neural architecture search module for architecture transfer and the neural weight search module for weight transfer.
These two modules conduct search on the target task based on a reduced super-network, so we only need to train once on the source task.
arXiv Detail & Related papers (2021-05-19T08:58:04Z)
- Revisiting Neural Architecture Search [0.0]
We propose a novel approach to search for the complete neural network without much human effort; it is a step closer towards AutoML-nirvana.
Our method starts from a complete graph mapped to a neural network and searches for the connections and operations by balancing the exploration and exploitation of the search space.
arXiv Detail & Related papers (2020-10-12T13:57:30Z)
- Powering One-shot Topological NAS with Stabilized Share-parameter Proxy [65.09967910722932]
One-shot NAS methods have attracted much interest from the research community due to their remarkable training efficiency and capacity to discover high-performance models.
In this work, we try to enhance the one-shot NAS by exploring high-performing network architectures in our large-scale Topology Augmented Search Space.
The proposed method achieves state-of-the-art performance under Multiply-Adds (MAdds) constraint on ImageNet.
arXiv Detail & Related papers (2020-05-21T08:18:55Z)
- Progressive Automatic Design of Search Space for One-Shot Neural Architecture Search [15.017964136568061]
It has been observed that a model with higher one-shot model accuracy does not necessarily perform better when trained stand-alone.
We propose Progressive Automatic Design of search space, named PAD-NAS.
In this way, PAD-NAS can automatically design the operations for each layer and achieve a trade-off between search space quality and model diversity.
arXiv Detail & Related papers (2020-05-15T14:21:07Z)
- MTL-NAS: Task-Agnostic Neural Architecture Search towards General-Purpose Multi-Task Learning [71.90902837008278]
We propose to incorporate neural architecture search (NAS) into general-purpose multi-task learning (GP-MTL).
In order to adapt to different task combinations, we disentangle the GP-MTL networks into single-task backbones.
We also propose a novel single-shot gradient-based search algorithm that closes the performance gap between the searched architectures and the final evaluation architecture.
arXiv Detail & Related papers (2020-03-31T09:49:14Z)
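As background for several of the entries above (MTL-NAS and other single-shot, gradient-based approaches), the usual implementation trick is to relax the discrete choice among candidate operations into a softmax over learnable architecture parameters. The sketch below shows only this generic relaxation; it is not the MTL-NAS algorithm, and the operation set is an arbitrary illustration.

```python
# Generic sketch of single-shot gradient-based operation selection
# (softmax relaxation over candidate ops); illustrative, not MTL-NAS itself.
import torch
import torch.nn as nn


class MixedOp(nn.Module):
    """Weighted mixture of candidate operations; the architecture weights
    (alpha) are trained by gradient descent together with the op weights."""

    def __init__(self, channels: int):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Identity(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.Conv2d(channels, channels, kernel_size=5, padding=2),
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = torch.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

    def discretize(self) -> nn.Module:
        # After search, keep only the operation with the largest alpha.
        return self.ops[int(self.alpha.argmax())]
```

Training then optimizes the op weights and the alpha parameters (jointly or alternately), and a final discretize() pass yields the searched architecture.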
This list is automatically generated from the titles and abstracts of the papers on this site.