Differentiable NAS Framework and Application to Ads CTR Prediction
- URL: http://arxiv.org/abs/2110.14812v1
- Date: Mon, 25 Oct 2021 05:46:27 GMT
- Title: Differentiable NAS Framework and Application to Ads CTR Prediction
- Authors: Ravi Krishna, Aravind Kalaiah, Bichen Wu, Maxim Naumov, Dheevatsa
Mudigere, Misha Smelyanskiy, Kurt Keutzer
- Abstract summary: We implement an extensible and modular framework for Differentiable Neural Architecture Search (DNAS).
We apply DNAS to the problem of ads click-through rate (CTR) prediction, arguably the highest-value and most worked-on AI problem at hyperscalers today.
We develop and tailor novel search spaces to a Deep Learning Recommendation Model (DLRM) backbone for CTR prediction, and report state-of-the-art results on the Criteo Kaggle CTR prediction dataset.
- Score: 30.74403362212425
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural architecture search (NAS) methods aim to automatically find the
optimal deep neural network (DNN) architecture as measured by a given objective
function, typically some combination of task accuracy and inference efficiency.
For many areas, such as computer vision and natural language processing, this
is a critical, yet still time-consuming process. New NAS methods have recently
made progress in improving the efficiency of this process. We implement an
extensible and modular framework for Differentiable Neural Architecture Search
(DNAS) to help solve this problem. We include an overview of the major
components of our codebase and how they interact, as well as a section on
implementing extensions to it (including a sample), in order to help users
adopt our framework for their applications across different categories of deep
learning models. To assess the capabilities of our methodology and
implementation, we apply DNAS to the problem of ads click-through rate (CTR)
prediction, arguably the highest-value and most worked-on AI problem at
hyperscalers today. We develop and tailor novel search spaces to a Deep
Learning Recommendation Model (DLRM) backbone for CTR prediction, and report
state-of-the-art results on the Criteo Kaggle CTR prediction dataset.
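The abstract does not include code, but the standard DNAS mechanism it builds on can be sketched briefly: each searchable layer superimposes candidate operators, weighted by a differentiable (e.g. Gumbel-Softmax) relaxation of a categorical architecture distribution, so that task loss and an efficiency penalty can be minimized jointly by gradient descent. The PyTorch sketch below is illustrative only; the `MixedOp` class, the candidate MLP widths, and the cost term are hypothetical stand-ins, not the paper's actual framework or search spaces.

```python
# Minimal DNAS sketch (illustrative; not the paper's implementation).
# MixedOp, the candidate widths, and the cost term are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """Weighted superposition of candidate ops; the weights come from a
    Gumbel-Softmax sample over learnable architecture logits."""
    def __init__(self, candidate_ops):
        super().__init__()
        self.ops = nn.ModuleList(candidate_ops)
        # One architecture logit per candidate op.
        self.alpha = nn.Parameter(torch.zeros(len(candidate_ops)))

    def forward(self, x, tau=1.0):
        # Differentiable, approximately one-hot sample over the ops.
        w = F.gumbel_softmax(self.alpha, tau=tau)
        return sum(w_i * op(x) for w_i, op in zip(w, self.ops))

# Example: searching over hidden widths of a single MLP block, in the
# spirit of tuning the MLPs of a DLRM-style backbone (sizes made up).
layer = MixedOp([
    nn.Sequential(nn.Linear(64, d), nn.ReLU(), nn.Linear(d, 64))
    for d in (32, 128, 512)
])
y = layer(torch.randn(8, 64))  # weights and arch logits train jointly
# A typical DNAS objective adds a differentiable efficiency term:
#   loss = task_loss + lam * expected_cost(arch_weights)
```

After the search converges, the usual DNAS recipe keeps the highest-probability operator in each searchable layer and retrains the resulting discrete architecture from scratch.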
Related papers
- DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions [121.05720140641189]
We develop a family of models with the distilling neural architecture (DNA) techniques.
Our proposed DNA models can rate all architecture candidates, as opposed to previous works that can only access a sub-search space using algorithms.
Our models achieve state-of-the-art top-1 accuracy of 78.9% and 83.6% on ImageNet for a mobile convolutional network and a small vision transformer, respectively.
arXiv Detail & Related papers (2024-03-02T22:16:47Z)
- DCP-NAS: Discrepant Child-Parent Neural Architecture Search for 1-bit CNNs [53.82853297675979]
1-bit convolutional neural networks (CNNs) with binary weights and activations show their potential for resource-limited embedded devices.
One natural approach is to use 1-bit CNNs to reduce the computation and memory cost of NAS.
We introduce Discrepant Child-Parent Neural Architecture Search (DCP-NAS) to efficiently search 1-bit CNNs.
arXiv Detail & Related papers (2023-06-27T11:28:29Z)
- Neural Architecture Search for Speech Emotion Recognition [72.1966266171951]
We propose to apply neural architecture search (NAS) techniques to automatically configure the SER models.
We show that NAS can improve SER performance (54.89% to 56.28%) while maintaining model parameter sizes.
arXiv Detail & Related papers (2022-03-31T10:16:10Z)
- Neural Architecture Search for Dense Prediction Tasks in Computer Vision [74.9839082859151]
Deep learning has led to a rising demand for neural network architecture engineering.
Neural architecture search (NAS) aims at automatically designing neural network architectures in a data-driven manner rather than manually.
NAS has become applicable to a much wider range of problems in computer vision.
arXiv Detail & Related papers (2022-02-15T08:06:50Z)
- Evolutionary Neural Architecture Search Supporting Approximate Multipliers [0.5414308305392761]
We propose a multi-objective NAS method based on Cartesian genetic programming for evolving convolutional neural networks (CNNs).
The most suitable approximate multipliers are automatically selected from a library of approximate multipliers.
Evolved CNNs are compared with common human-created CNNs of a similar complexity on the CIFAR-10 benchmark problem.
arXiv Detail & Related papers (2021-01-28T09:26:03Z)
- MS-RANAS: Multi-Scale Resource-Aware Neural Architecture Search [94.80212602202518]
We propose Multi-Scale Resource-Aware Neural Architecture Search (MS-RANAS)
We employ a one-shot architecture search approach to reduce the search cost.
We achieve state-of-the-art results in terms of accuracy-speed trade-off.
arXiv Detail & Related papers (2020-09-29T11:56:01Z)
- NAS-Navigator: Visual Steering for Explainable One-Shot Deep Neural Network Synthesis [53.106414896248246]
We present a framework that allows analysts to effectively build the solution sub-graph space and guide the network search by injecting their domain knowledge.
Applying this technique in an iterative manner allows analysts to converge to the best performing neural network architecture for a given application.
arXiv Detail & Related papers (2020-09-28T01:48:45Z)
- NAS-Bench-NLP: Neural Architecture Search Benchmark for Natural Language Processing [12.02718579660613]
We step outside the computer vision domain by leveraging the language modeling task, which is the core of natural language processing (NLP).
We have provided a search space of recurrent neural networks on text datasets and trained 14k architectures within it.
We have conducted both intrinsic and extrinsic evaluation of the trained models using datasets for semantic relatedness and language understanding evaluation.
arXiv Detail & Related papers (2020-06-12T12:19:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.