Deep-n-Cheap: An Automated Search Framework for Low Complexity Deep Learning
- URL: http://arxiv.org/abs/2004.00974v3
- Date: Sat, 5 Sep 2020 21:24:04 GMT
- Title: Deep-n-Cheap: An Automated Search Framework for Low Complexity Deep Learning
- Authors: Sourya Dey, Saikrishna C. Kanala, Keith M. Chugg, Peter A. Beerel
- Abstract summary: We present Deep-n-Cheap -- an open-source AutoML framework to search for deep learning models.
Our framework is targeted for deployment on both benchmark and custom datasets.
Deep-n-Cheap includes a user-customizable complexity penalty which trades off performance with training time or number of parameters.
- Score: 3.479254848034425
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present Deep-n-Cheap -- an open-source AutoML framework to search for deep
learning models. This search includes both architecture and training
hyperparameters, and supports convolutional neural networks and multi-layer
perceptrons. Our framework is targeted for deployment on both benchmark and
custom datasets, and as a result, offers a greater degree of search space
customizability as compared to a more limited search over only pre-existing
models from literature. We also introduce the technique of 'search transfer',
which demonstrates the generalization capabilities of the models found by our
framework to multiple datasets.
Deep-n-Cheap includes a user-customizable complexity penalty which trades off
performance with training time or number of parameters. Specifically, our
framework results in models offering performance comparable to state-of-the-art
while taking 1-2 orders of magnitude less time to train than models from other
AutoML and model search frameworks. Additionally, this work investigates and
develops various insights regarding the search process. In particular, we show
the superiority of a greedy strategy and justify our choice of Bayesian
optimization as the primary search methodology over random / grid search.
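
To make the complexity trade-off concrete, below is a minimal sketch of a penalized search objective driven by Bayesian optimization, in the spirit of the framework described above. The additive penalty form, the weight `W_C`, the `train_and_evaluate` stub, and the use of scikit-optimize's `gp_minimize` are illustrative assumptions, not Deep-n-Cheap's actual implementation.

```python
# Minimal sketch: complexity-penalized objective under Bayesian optimization.
# train_and_evaluate() is a hypothetical stand-in for actually training a
# model; here it is simulated so the script runs end-to-end.
import math
import random

from skopt import gp_minimize
from skopt.space import Integer, Real

W_C = 0.1  # user-chosen complexity weight (an assumption, not the paper's value)

def train_and_evaluate(num_layers, learning_rate):
    """Hypothetical proxy: returns (validation_loss, training_time_seconds)."""
    random.seed(hash((int(num_layers), round(float(learning_rate), 6))) % 2**32)
    val_loss = (abs(math.log10(learning_rate) + 2) * 0.1
                + 1.0 / num_layers + random.uniform(0, 0.02))
    train_time = 5.0 * num_layers  # deeper nets take longer to train
    return val_loss, train_time

def objective(params):
    num_layers, learning_rate = params
    val_loss, train_time = train_and_evaluate(num_layers, learning_rate)
    # Additive penalty: trade off performance against training time.
    return val_loss + W_C * math.log1p(train_time)

space = [Integer(1, 8, name="num_layers"),
         Real(1e-4, 1e-1, prior="log-uniform", name="learning_rate")]

result = gp_minimize(objective, space, n_calls=20, random_state=0)
print("best params:", result.x, "best penalized loss:", result.fun)
```

Setting `W_C = 0` recovers a pure performance search, while larger values push the optimizer toward faster-training (here, shallower) models.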
Related papers
- MGAS: Multi-Granularity Architecture Search for Trade-Off Between Model Effectiveness and Efficiency [10.641875933652647]
We introduce multi-granularity architecture search (MGAS) to discover both effective and efficient neural networks.
We learn discretization functions specific to each granularity level to adaptively determine the unit remaining ratio according to the evolving architecture.
Extensive experiments on CIFAR-10, CIFAR-100 and ImageNet demonstrate that MGAS outperforms other state-of-the-art methods in achieving a better trade-off between model performance and model size.
arXiv Detail & Related papers (2023-10-23T16:32:18Z)
- POPNASv3: a Pareto-Optimal Neural Architecture Search Solution for Image and Time Series Classification [8.190723030003804]
This article presents the third version of a sequential model-based NAS algorithm targeting different hardware environments and multiple classification tasks.
Our method is able to find competitive architectures within large search spaces, while keeping a flexible structure and data processing pipeline to adapt to different tasks.
The experiments performed on images and time series classification datasets provide evidence that POPNASv3 can explore a large set of assorted operators and converge to optimal architectures suited for the type of data provided under different scenarios.
arXiv Detail & Related papers (2022-12-13T17:14:14Z)
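
The Pareto-optimal selection that POPNASv3 targets reduces, at its core, to discarding candidates that are dominated on every objective. Below is a generic sketch with invented candidates and two objectives (accuracy to maximize, training time to minimize); it is not POPNASv3's code.

```python
# Generic sketch: keep only Pareto-optimal candidates when trading off
# accuracy (maximize) against training time (minimize). Candidates are invented.
candidates = [
    {"name": "arch_a", "accuracy": 0.91, "train_time": 120.0},
    {"name": "arch_b", "accuracy": 0.93, "train_time": 300.0},
    {"name": "arch_c", "accuracy": 0.90, "train_time": 150.0},  # dominated by arch_a
    {"name": "arch_d", "accuracy": 0.95, "train_time": 900.0},
]

def dominates(p, q):
    """p dominates q if it is no worse on both objectives and better on at least one."""
    return (p["accuracy"] >= q["accuracy"] and p["train_time"] <= q["train_time"]
            and (p["accuracy"] > q["accuracy"] or p["train_time"] < q["train_time"]))

pareto_front = [c for c in candidates
                if not any(dominates(other, c) for other in candidates)]
print([c["name"] for c in pareto_front])  # ['arch_a', 'arch_b', 'arch_d']
```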
- Automatic Discovery of Composite SPMD Partitioning Strategies in PartIR [1.2507285499419876]
We present an automatic partitioner that identifies efficient combinations of partitioning strategies for many model architectures and accelerator systems.
Our key finding is that a Monte Carlo Tree Search-based partitioner, which incorporates partition-specific compiler analysis directly into the search and is guided by user goals, matches expert-level strategies for various models.
arXiv Detail & Related papers (2022-10-07T17:46:46Z)
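
As a rough illustration of the MCTS-based approach in the PartIR entry above, the toy below searches over per-layer sharding choices with UCB1 selection and random rollouts. The sharding axes, the `simulated_cost` stand-in for compiler analysis, and all constants are invented for the sketch.

```python
# Toy MCTS skeleton for sequential partitioning decisions: pick a sharding
# axis for each layer; simulated_cost() stands in for the compiler analysis
# an actual partitioner would consult.
import math
import random

AXES = ["batch", "model", "replicate"]  # candidate sharding choices per layer
NUM_LAYERS = 4

def simulated_cost(plan):
    """Invented stand-in for a compiler cost estimate of a full plan."""
    return sum({"batch": 1.0, "model": 1.5, "replicate": 2.0}[a] for a in plan)

class Node:
    def __init__(self, plan):
        self.plan = plan          # sharding decisions made so far
        self.children = {}        # axis -> Node
        self.visits = 0
        self.total_reward = 0.0

def rollout(plan):
    while len(plan) < NUM_LAYERS:           # complete the plan randomly
        plan = plan + [random.choice(AXES)]
    return -simulated_cost(plan)            # reward = negative cost

def select_child(node):
    def ucb(child):                         # UCB1: exploit vs. explore
        if child.visits == 0:
            return float("inf")
        return (child.total_reward / child.visits
                + math.sqrt(2 * math.log(node.visits) / child.visits))
    return max(node.children.values(), key=ucb)

def mcts(iterations=500):
    root = Node([])
    for _ in range(iterations):
        node, path = root, [root]
        while len(node.plan) < NUM_LAYERS:   # selection / expansion
            for axis in AXES:
                if axis not in node.children:
                    node.children[axis] = Node(node.plan + [axis])
            node = select_child(node)
            path.append(node)
        reward = rollout(node.plan)          # simulation
        for n in path:                       # backpropagation
            n.visits += 1
            n.total_reward += reward
    plan, node = [], root                    # read off the most-visited plan
    while node.children:
        node = max(node.children.values(), key=lambda c: c.visits)
        plan = node.plan
    return plan

random.seed(0)
print("best plan found:", mcts())  # expected: all-'batch' under this toy cost
```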
- AutoBERT-Zero: Evolving BERT Backbone from Scratch [94.89102524181986]
We propose an Operation-Priority Neural Architecture Search (OP-NAS) algorithm to automatically search for promising hybrid backbone architectures.
We optimize both the search algorithm and evaluation of candidate models to boost the efficiency of our proposed OP-NAS.
Experiments show that the searched architecture (named AutoBERT-Zero) significantly outperforms BERT and its variants of different model capacities in various downstream tasks.
arXiv Detail & Related papers (2021-07-15T16:46:01Z)
- Redefining Neural Architecture Search of Heterogeneous Multi-Network Models by Characterizing Variation Operators and Model Components [71.03032589756434]
We investigate the effect of different variation operators in a complex domain, that of multi-network heterogeneous neural models.
We characterize the variation operators by their effect on the complexity and performance of the model, and the models by diverse metrics that estimate the quality of their constituent parts.
arXiv Detail & Related papers (2021-06-16T17:12:26Z)
- Efficient Data-specific Model Search for Collaborative Filtering [56.60519991956558]
Collaborative filtering (CF) is a fundamental approach for recommender systems.
In this paper, motivated by the recent advances in automated machine learning (AutoML), we propose to design a data-specific CF model.
The key is a new framework that unifies state-of-the-art (SOTA) CF methods and splits them into disjoint stages: input encoding, embedding function, interaction function, and prediction function.
arXiv Detail & Related papers (2021-06-14T14:30:32Z)
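
The staged decomposition described in the collaborative-filtering entry above lends itself to a plug-and-play interface, sketched below with NumPy. The concrete stage choices (embedding tables, dot-product vs. re-weighted interaction, sigmoid prediction) are illustrative assumptions, not the paper's design.

```python
# Schematic sketch of a CF model split into swappable stages, so a search
# procedure can mix and match per-stage choices.
import numpy as np

rng = np.random.default_rng(0)
NUM_USERS, NUM_ITEMS, DIM = 100, 200, 16

# Stage 1: input encoding (here: index lookup, standing in for one-hot).
# Stage 2: embedding function (here: plain embedding tables).
user_table = rng.normal(size=(NUM_USERS, DIM))
item_table = rng.normal(size=(NUM_ITEMS, DIM))

# Stage 3: interaction function -- the searchable component in this sketch.
def dot_interaction(u, v):
    return u @ v

def weighted_interaction(u, v, w=rng.normal(size=DIM)):
    return (u * v) @ w  # element-wise product re-weighted per dimension

# Stage 4: prediction function (here: a sigmoid to map scores to [0, 1]).
def predict(user_id, item_id, interaction):
    score = interaction(user_table[user_id], item_table[item_id])
    return 1.0 / (1.0 + np.exp(-score))

# A search procedure would iterate over per-stage choices like these:
for interaction in (dot_interaction, weighted_interaction):
    print(interaction.__name__, predict(user_id=3, item_id=7, interaction=interaction))
```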
- One-Shot Neural Ensemble Architecture Search by Diversity-Guided Search Space Shrinking [97.60915598958968]
We propose a one-shot neural ensemble architecture search (NEAS) solution that addresses two key challenges.
For the first challenge, we introduce a novel diversity-based metric to guide search space shrinking.
For the second challenge, we enable a new search dimension to learn layer sharing among different models for efficiency purposes.
arXiv Detail & Related papers (2021-04-01T16:29:49Z)
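
One simple way to realize a diversity-guided shrinking step, loosely inspired by the NEAS entry above: score each candidate operator by how differently it transforms shared probe inputs, then drop the most redundant one. The operators and the pairwise-distance metric are made up for illustration.

```python
# Toy sketch: rank candidate operators by how differently they transform the
# same inputs (mean pairwise output distance), then shrink the search space
# by dropping the least "diverse" (most redundant) operator.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 8))  # a shared batch of probe inputs

operators = {
    "identity": lambda a: a,
    "relu":     lambda a: np.maximum(a, 0.0),
    "scaled":   lambda a: 1.01 * a,   # nearly redundant with identity
    "tanh":     np.tanh,
}

outputs = {name: op(x) for name, op in operators.items()}

def diversity(name):
    """Mean distance from this operator's output to every other operator's."""
    others = [o for n, o in outputs.items() if n != name]
    return float(np.mean([np.linalg.norm(outputs[name] - o) for o in others]))

scores = {name: diversity(name) for name in operators}
least_diverse = min(scores, key=scores.get)
print("diversity scores:", {k: round(v, 2) for k, v in scores.items()})
print("shrink search space by removing:", least_diverse)  # 'scaled' or 'identity'
```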
- Efficient Model Performance Estimation via Feature Histories [27.008927077173553]
An important step in the task of neural network design is the evaluation of a model's performance.
In this work, we use the evolution history of features of a network during the early stages of training to build a proxy classifier.
We show that our method can be combined with multiple search algorithms to find better solutions to a wide range of tasks.
arXiv Detail & Related papers (2021-03-07T20:41:57Z)
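
The feature-history idea above can be caricatured as: summarize each candidate's first few training epochs into a feature vector and fit a cheap proxy that predicts final performance. The synthetic histories, the chosen summary features, and the linear least-squares proxy below are all assumptions for the sketch.

```python
# Simplified sketch of performance estimation from early-training histories:
# summarize each candidate's first few epochs into features, then fit a cheap
# linear proxy for final accuracy. All data here is synthetic; a real setup
# would log feature/validation statistics per epoch during actual training.
import numpy as np

rng = np.random.default_rng(0)
NUM_MODELS, EARLY_EPOCHS = 40, 5

# Synthetic "histories": validation accuracy over the first few epochs.
slopes = rng.uniform(0.02, 0.10, size=NUM_MODELS)
starts = rng.uniform(0.30, 0.50, size=NUM_MODELS)
epochs = np.arange(1, EARLY_EPOCHS + 1)
histories = (starts[:, None] + slopes[:, None] * epochs[None, :]
             + rng.normal(0, 0.005, size=(NUM_MODELS, EARLY_EPOCHS)))

# Ground-truth final accuracy (what the proxy should predict cheaply).
final_acc = np.minimum(starts + 8 * slopes, 0.95) + rng.normal(0, 0.01, NUM_MODELS)

# Features per model: last early accuracy, mean improvement, plus a bias term.
features = np.column_stack([
    histories[:, -1],
    np.diff(histories, axis=1).mean(axis=1),
    np.ones(NUM_MODELS),
])

train, test = slice(0, 30), slice(30, None)
coef, *_ = np.linalg.lstsq(features[train], final_acc[train], rcond=None)
pred = features[test] @ coef
print("mean absolute proxy error:", float(np.abs(pred - final_acc[test]).mean()))
```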
- NASE: Learning Knowledge Graph Embedding for Link Prediction via Neural Architecture Search [9.634626241415916]
Link prediction is the task of predicting missing connections between entities in a knowledge graph (KG).
Previous work has tried to use Automated Machine Learning (AutoML) to search for the best model for a given dataset.
We propose a novel Neural Architecture Search (NAS) framework for the link prediction task.
arXiv Detail & Related papers (2020-08-18T03:34:09Z)
- AutoPose: Searching Multi-Scale Branch Aggregation for Pose Estimation [96.29533512606078]
We present AutoPose, a novel neural architecture search (NAS) framework.
It is capable of automatically discovering multiple parallel branches of cross-scale connections towards accurate and high-resolution 2D human pose estimation.
arXiv Detail & Related papers (2020-08-16T22:27:43Z)
- AutoSTR: Efficient Backbone Search for Scene Text Recognition [80.7290173000068]
Scene text recognition (STR) is very challenging due to the diversity of text instances and the complexity of scenes.
We propose automated STR (AutoSTR) to search data-dependent backbones to boost text recognition performance.
Experiments demonstrate that, by searching data-dependent backbones, AutoSTR can outperform the state-of-the-art approaches on standard benchmarks.
arXiv Detail & Related papers (2020-03-14T06:51:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.