SalNAS: Efficient Saliency-prediction Neural Architecture Search with self-knowledge distillation
- URL: http://arxiv.org/abs/2407.20062v1
- Date: Mon, 29 Jul 2024 14:48:34 GMT
- Title: SalNAS: Efficient Saliency-prediction Neural Architecture Search with self-knowledge distillation
- Authors: Chakkrit Termritthikun, Ayaz Umer, Suwichaya Suwanwimolkul, Feng Xia, Ivan Lee
- Abstract summary: Recent advancements in deep convolutional neural networks have significantly improved the performance of saliency prediction.
We propose a new Neural Architecture Search framework for saliency prediction with two contributions.
By utilizing Self-KD, SalNAS outperforms other state-of-the-art saliency prediction models in most evaluation rubrics.
- Score: 7.625269122161064
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent advancements in deep convolutional neural networks have significantly improved the performance of saliency prediction. However, manually configuring neural network architectures requires domain expertise and remains time-consuming and error-prone. To solve this, we propose a new Neural Architecture Search (NAS) framework for saliency prediction with two contributions. Firstly, a supernet for saliency prediction is built as a weight-sharing network containing all candidate architectures, by integrating dynamic convolution into the encoder-decoder of the supernet; the resulting model is termed SalNAS. Secondly, although SalNAS is highly efficient (20.98 million parameters), it can suffer from a lack of generalization. To solve this, we propose a self-knowledge distillation approach, termed Self-KD, that trains the student SalNAS with the weighted average of the ground truth and the prediction from the teacher model. The teacher model, while sharing the same architecture, contains the best-performing weights chosen by cross-validation. Self-KD generalizes well without the need to compute gradients in the teacher model, enabling an efficient training system. By utilizing Self-KD, SalNAS outperforms other state-of-the-art saliency prediction models in most evaluation rubrics across seven benchmark datasets while remaining a lightweight model. The code will be available at https://github.com/chakkritte/SalNAS
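The central Self-KD idea, training the student against a weighted blend of the ground truth saliency map and the (frozen) teacher prediction, can be illustrated with a minimal sketch. The loss choice (KL divergence over normalized saliency maps) and the blending weight alpha below are illustrative assumptions, not the exact formulation used in SalNAS.

```python
# Sketch of a Self-KD-style loss for saliency prediction (assumptions: the target
# is a convex blend alpha*GT + (1-alpha)*teacher, compared to the student via KL
# divergence; SalNAS's exact loss and weighting may differ).
import torch

def self_kd_loss(student_map, teacher_map, gt_map, alpha=0.5, eps=1e-8):
    def to_dist(x):
        # Flatten each map and normalize it into a spatial probability distribution.
        x = x.flatten(1)
        x = x - x.amin(dim=1, keepdim=True)
        return x / (x.sum(dim=1, keepdim=True) + eps)

    student = to_dist(student_map)
    with torch.no_grad():                      # no gradient flows through the teacher
        teacher = to_dist(teacher_map)
    target = alpha * to_dist(gt_map) + (1.0 - alpha) * teacher

    # KL(target || student), averaged over the batch.
    return (target * (torch.log(target + eps) - torch.log(student + eps))).sum(dim=1).mean()

# Hypothetical usage with dummy saliency maps of shape (batch, height, width):
b, h, w = 4, 64, 64
loss = self_kd_loss(torch.rand(b, h, w), torch.rand(b, h, w), torch.rand(b, h, w))
```

Because the teacher output is computed without gradient tracking, the sketch reflects the abstract's claim that Self-KD avoids backpropagating through the teacher.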
Related papers
- Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed Mixture-of-Experts [55.470959564665705]
Weight-sharing supernets are crucial for performance estimation in cutting-edge neural architecture search (NAS) frameworks.
The proposed method attains state-of-the-art (SoTA) performance in NAS for fast machine translation models.
It excels in NAS for building memory-efficient task-agnostic BERT models.
arXiv Detail & Related papers (2023-06-08T00:35:36Z) - Meta-prediction Model for Distillation-Aware NAS on Unseen Datasets [55.2118691522524]
Distillation-aware Neural Architecture Search (DaNAS) aims to search for an optimal student architecture.
We propose a distillation-aware meta accuracy prediction model, DaSS (Distillation-aware Student Search), which can predict a given architecture's final performances on a dataset.
arXiv Detail & Related papers (2023-05-26T14:00:35Z) - GP-NAS-ensemble: a model for NAS Performance Prediction [6.785608131249699]
GP-NAS-ensemble is proposed to predict the performance of a neural network architecture with a small training dataset.
Our method ranked second in the performance prediction track of the CVPR 2022 Second Lightweight NAS Challenge.
arXiv Detail & Related papers (2023-01-23T00:17:52Z) - NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction [37.357949900603295]
We propose a neural architecture representation model that can be used to estimate attributes holistically.
Experiment results show that our proposed framework can be used to predict the latency and accuracy attributes of both cell architectures and whole deep neural networks.
arXiv Detail & Related papers (2022-11-15T10:15:21Z) - PRE-NAS: Predictor-assisted Evolutionary Neural Architecture Search [34.06028035262884]
We propose a novel evolutionary-based NAS strategy, Predictor-assisted E-NAS (PRE-NAS).
PRE-NAS leverages new evolutionary search strategies and integrates high-fidelity weight inheritance over generations.
Experiments on NAS-Bench-201 and DARTS search spaces show that PRE-NAS can outperform state-of-the-art NAS methods.
arXiv Detail & Related papers (2022-04-27T06:40:39Z) - Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective [88.39981851247727]
We propose a novel framework called training-free neural architecture search (TE-NAS).
TE-NAS ranks architectures by analyzing the spectrum of the neural tangent kernel (NTK) and the number of linear regions in the input space.
We show that: (1) these two measurements imply the trainability and expressivity of a neural network; (2) they strongly correlate with the network's test accuracy.
arXiv Detail & Related papers (2021-02-23T07:50:44Z) - Weak NAS Predictors Are All You Need [91.11570424233709]
Recent predictor-based NAS approaches attempt to solve the problem with two key steps: sampling some architecture-performance pairs and fitting a proxy accuracy predictor.
We shift the paradigm from finding a complicated predictor that covers the whole architecture space to a set of weaker predictors that progressively move towards the high-performance sub-space.
Our method costs fewer samples to find the top-performance architectures on NAS-Bench-101 and NAS-Bench-201, and it achieves the state-of-the-art ImageNet performance on the NASNet search space.
arXiv Detail & Related papers (2021-02-21T01:58:43Z) - PEng4NN: An Accurate Performance Estimation Engine for Efficient Automated Neural Network Architecture Search [0.0]
Neural network (NN) models are increasingly used in scientific simulations, AI, and other high-performance computing fields.
NAS attempts to find well-performing NN models for specialized datasets, where performance is measured by key metrics that capture the NN's capabilities.
We propose a performance estimation strategy that reduces the resources for training NNs and increases NAS throughput without jeopardizing accuracy.
arXiv Detail & Related papers (2021-01-11T20:49:55Z) - Direct Federated Neural Architecture Search [0.0]
We present an effective, hardware-agnostic, and computationally lightweight approach for direct federated NAS that searches in a single stage for ready-to-deploy neural network models.
Our results show an order of magnitude reduction in resource consumption while edging out prior art in accuracy.
arXiv Detail & Related papers (2020-10-13T08:11:35Z) - FBNetV3: Joint Architecture-Recipe Search using Predictor Pretraining [65.39532971991778]
We present an accuracy predictor that scores architecture and training recipes jointly, guiding both sample selection and ranking.
We run fast evolutionary searches in just CPU minutes to generate architecture-recipe pairs for a variety of resource constraints.
FBNetV3 comprises a family of state-of-the-art compact neural networks that outperform both automatically and manually designed competitors.
arXiv Detail & Related papers (2020-06-03T05:20:21Z) - DDPNAS: Efficient Neural Architecture Search via Dynamic Distribution Pruning [135.27931587381596]
We propose an efficient and unified NAS framework termed DDPNAS via dynamic distribution pruning.
In particular, we first sample architectures from a joint categorical distribution. Then the search space is dynamically pruned and its distribution is updated every few epochs.
With the proposed efficient network generation method, we directly obtain the optimal neural architectures under given constraints (a minimal sketch of this sample-and-prune loop appears after this entry).
arXiv Detail & Related papers (2019-05-28T06:35:52Z)
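The DDPNAS entry above describes a sample-then-prune search loop over a categorical distribution of candidate operations. The toy sketch below illustrates such a loop in plain Python; the per-layer distributions, the reward-proportional update, and the pruning schedule are illustrative assumptions rather than the rule used in the DDPNAS paper.

```python
# Toy sketch of dynamic-distribution pruning in the spirit of DDPNAS
# (assumptions: one categorical distribution per layer over candidate ops,
# probabilities nudged toward better-scoring ops, and the weakest op pruned
# every few epochs; the real DDPNAS update rule may differ).
import random

OPS = ["conv3x3", "conv5x5", "sep_conv", "skip", "max_pool"]
NUM_LAYERS = 8
# One categorical distribution over candidate ops per layer, initially uniform.
dists = [{op: 1.0 / len(OPS) for op in OPS} for _ in range(NUM_LAYERS)]

def sample_architecture(dists):
    """Draw one candidate op per layer from its categorical distribution."""
    return [random.choices(list(d), weights=list(d.values()), k=1)[0] for d in dists]

def update_and_prune(dists, arch, reward, lr=0.1, prune=False):
    """Reinforce the sampled ops in proportion to the observed reward and,
    when `prune` is set, drop the least likely op from each layer."""
    for d, op in zip(dists, arch):
        d[op] += lr * reward
        if prune and len(d) > 1:
            d.pop(min(d, key=d.get))   # prune the weakest candidate op
        total = sum(d.values())
        for k in d:
            d[k] /= total              # renormalize to a valid distribution

for epoch in range(1, 21):
    arch = sample_architecture(dists)
    reward = random.random()           # stand-in for a validation score of the sampled architecture
    update_and_prune(dists, arch, reward, prune=(epoch % 5 == 0))
```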