LLMatic: Neural Architecture Search via Large Language Models and Quality Diversity Optimization
- URL: http://arxiv.org/abs/2306.01102v7
- Date: Wed, 10 Apr 2024 13:18:37 GMT
- Title: LLMatic: Neural Architecture Search via Large Language Models and Quality Diversity Optimization
- Authors: Muhammad U. Nasir, Sam Earle, Julian Togelius, Steven James, Christopher Cleghorn
- Abstract summary: Large Language Models (LLMs) have emerged as powerful tools capable of accomplishing a broad spectrum of tasks.
We propose using the coding abilities of LLMs to introduce meaningful variations to code defining neural networks.
By merging the code-generating abilities of LLMs with the diversity and robustness of QD solutions, we introduce \texttt{LLMatic}, a Neural Architecture Search (NAS) algorithm.
- Score: 4.951599300340954
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have emerged as powerful tools capable of accomplishing a broad spectrum of tasks. Their abilities span numerous areas, and one area where they have made a significant impact is in the domain of code generation. Here, we propose using the coding abilities of LLMs to introduce meaningful variations to code defining neural networks. Meanwhile, Quality-Diversity (QD) algorithms are known to discover diverse and robust solutions. By merging the code-generating abilities of LLMs with the diversity and robustness of QD solutions, we introduce \texttt{LLMatic}, a Neural Architecture Search (NAS) algorithm. While LLMs struggle to conduct NAS directly through prompts, \texttt{LLMatic} uses a procedural approach, leveraging QD for prompts and network architecture to create diverse and high-performing networks. We test \texttt{LLMatic} on the CIFAR-10 and NAS-bench-201 benchmarks, demonstrating that it can produce competitive networks while evaluating just $2,000$ candidates, even without prior knowledge of the benchmark domain or exposure to any previous top-performing models for the benchmark. The open-sourced code is available in \url{https://github.com/umair-nasir14/LLMatic}.
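The QD-plus-LLM loop the abstract describes can be sketched, very loosely, as a MAP-Elites-style archive whose mutation operator is delegated to an LLM. Everything below is an illustrative assumption rather than LLMatic's actual implementation: the `llm_mutate` stub stands in for a real LLM edit to network-defining code, and the toy fitness and depth/width behaviour descriptor replace real training and evaluation.

```python
import random

def llm_mutate(arch):
    """Stand-in for an LLM edit to network-defining code: here we just
    perturb depth or width at random (purely illustrative)."""
    arch = dict(arch)
    if random.random() < 0.5:
        arch["depth"] = max(1, arch["depth"] + random.choice([-1, 1]))
    else:
        arch["width"] = max(8, arch["width"] + random.choice([-8, 8]))
    return arch

def fitness(arch):
    """Toy proxy for validation accuracy (peaks at depth 8, width 64)."""
    return -abs(arch["depth"] - 8) - abs(arch["width"] - 64) / 8

def descriptor(arch):
    """Behaviour descriptor: bin architectures by depth and width so the
    archive keeps diverse networks, not just the single best one."""
    return (arch["depth"] // 4, arch["width"] // 32)

def map_elites(evaluations=2000, seed=0):
    random.seed(seed)
    seed_arch = {"depth": 4, "width": 32}
    archive = {descriptor(seed_arch): (fitness(seed_arch), seed_arch)}
    for _ in range(evaluations):
        parent = random.choice(list(archive.values()))[1]
        child = llm_mutate(parent)
        f, cell = fitness(child), descriptor(child)
        if cell not in archive or f > archive[cell][0]:
            archive[cell] = (f, child)  # keep the best elite per niche
    return archive

archive = map_elites()
best = max(archive.values(), key=lambda entry: entry[0])
print(len(archive), best[0])
```

The budget of 2,000 evaluations mirrors the figure quoted in the abstract; the rest of the loop is the standard MAP-Elites pattern of "sample an elite, mutate, insert if it improves its niche".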
Related papers
- Search for Efficient Large Language Models [52.98684997131108]
Large Language Models (LLMs) have long held sway in the realms of artificial intelligence research.
Weight pruning, quantization, and distillation have been embraced to compress LLMs, targeting memory reduction and inference acceleration.
Most model compression techniques concentrate on weight optimization, overlooking the exploration of optimal architectures.
arXiv Detail & Related papers (2024-09-25T21:32:12Z)
- Large Language Model Assisted Adversarial Robustness Neural Architecture Search [14.122460940115069]
This paper proposes a novel LLM-assisted optimizer (LLMO) to address adversarial robustness neural architecture search (ARNAS).
We design prompts using the standard CRISPE framework (i.e., Capacity and Role, Insight, Statement, Personality, and Experiment).
We iteratively refine the prompt, and the responses from Gemini are adapted as solutions to ARNAS instances.
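A CRISPE-structured prompt of the kind this entry describes can be assembled from its five named fields. The wording of each field below is invented for illustration and is not the paper's actual prompt; only the field names come from the framework.

```python
def crispe_prompt(capacity_role, insight, statement, personality, experiment):
    """Assemble the five CRISPE fields into a single prompt string."""
    sections = [
        ("Capacity and Role", capacity_role),
        ("Insight", insight),
        ("Statement", statement),
        ("Personality", personality),
        ("Experiment", experiment),
    ]
    return "\n\n".join(f"{name}: {text}" for name, text in sections)

# Hypothetical example for an adversarial-robustness NAS query.
prompt = crispe_prompt(
    capacity_role="You are an expert in neural architecture search.",
    insight="Architectures are encoded as lists of operations on a fixed cell.",
    statement="Propose an architecture balancing accuracy and adversarial robustness.",
    personality="Answer concisely, returning only the encoded architecture.",
    experiment="Provide three candidate encodings.",
)
print(prompt)
```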
arXiv Detail & Related papers (2024-06-08T10:45:07Z)
- LLaMA-NAS: Efficient Neural Architecture Search for Large Language Models [3.4070166891274263]
Large language models (LLMs) solve tasks in natural language processing, complex reasoning, sentiment analysis, and other domains.
These abilities come with very high memory and computational costs, which preclude the use of LLMs on most hardware platforms.
We propose an effective method of finding Pareto-optimal network architectures based on LLaMA2-7B using one-shot NAS.
We show that, for certain standard benchmark tasks, the pre-trained LLaMA2-7B network is unnecessarily large and complex.
arXiv Detail & Related papers (2024-05-28T17:20:44Z)
- Large Language Models (LLMs) Assisted Wireless Network Deployment in Urban Settings [0.21847754147782888]
Large Language Models (LLMs) have revolutionized language understanding and human-like text generation.
This paper explores new techniques to harness the power of LLMs for 6G (6th Generation) wireless communication technologies.
We introduce a novel Reinforcement Learning (RL) based framework that leverages LLMs for network deployment in wireless communications.
arXiv Detail & Related papers (2024-05-22T05:19:51Z)
- InfiBench: Evaluating the Question-Answering Capabilities of Code Large Language Models [56.723509505549536]
InfiBench is, to our knowledge, the first large-scale free-form question-answering (QA) benchmark for code.
It comprises 234 carefully selected high-quality Stack Overflow questions spanning 15 programming languages.
We conduct a systematic evaluation for over 100 latest code LLMs on InfiBench, leading to a series of novel and insightful findings.
arXiv Detail & Related papers (2024-03-11T02:06:30Z)
- LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models [56.25156596019168]
This paper introduces the LMRL-Gym benchmark for evaluating multi-turn RL for large language models (LLMs).
Our benchmark consists of 8 different language tasks, which require multiple rounds of language interaction and cover a range of tasks in open-ended dialogue and text games.
arXiv Detail & Related papers (2023-11-30T03:59:31Z)
- OFA$^2$: A Multi-Objective Perspective for the Once-for-All Neural Architecture Search [79.36688444492405]
Once-for-All (OFA) is a Neural Architecture Search (NAS) framework designed to address the problem of searching for efficient architectures for devices with different resource constraints.
We aim to go one step further in the search for efficiency by explicitly conceiving the search stage as a multi-objective optimization problem.
arXiv Detail & Related papers (2023-03-23T21:30:29Z)
- CATCH: Context-based Meta Reinforcement Learning for Transferrable Architecture Search [102.67142711824748]
CATCH is a novel Context-bAsed meTa reinforcement learning algorithm for transferrable arChitecture searcH.
The combination of meta-learning and RL allows CATCH to efficiently adapt to new tasks while being agnostic to search spaces.
It is also capable of handling cross-domain architecture search, as competitive networks on ImageNet, COCO, and Cityscapes are identified.
arXiv Detail & Related papers (2020-07-18T09:35:53Z)
- Local Search is a Remarkably Strong Baseline for Neural Architecture Search [0.0]
We consider, for the first time, a simple Local Search (LS) algorithm for Neural Architecture Search (NAS).
We release two benchmark datasets, named MacroNAS-C10 and MacroNAS-C100, containing 200K saved network evaluations for two established image classification tasks.
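The simple local-search baseline this entry refers to can be sketched as best-improvement hill climbing over a discrete architecture encoding. The encoding (a fixed-length tuple of operation choices) and the toy fitness function below are illustrative assumptions, not the paper's benchmark setup.

```python
import random

OPS = ("conv3x3", "conv1x1", "skip", "none")
TARGET = ("conv3x3", "skip", "conv3x3", "conv1x1", "skip")  # toy optimum

def fitness(arch):
    """Toy stand-in for validation accuracy: count of positions
    matching a hidden target architecture."""
    return sum(a == t for a, t in zip(arch, TARGET))

def neighbours(arch):
    """All architectures differing from `arch` in exactly one position."""
    for i in range(len(arch)):
        for op in OPS:
            if op != arch[i]:
                yield arch[:i] + (op,) + arch[i + 1:]

def local_search(seed=0):
    """Best-improvement hill climbing from a random starting point."""
    random.seed(seed)
    current = tuple(random.choice(OPS) for _ in TARGET)
    while True:
        best = max(neighbours(current), key=fitness)
        if fitness(best) <= fitness(current):
            return current  # local optimum reached
        current = best

arch = local_search()
print(arch, fitness(arch))
```

On this separable toy landscape, local search reaches the global optimum; on real NAS benchmarks it only guarantees a local optimum, which is what makes its reported strength as a baseline notable.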
arXiv Detail & Related papers (2020-04-20T00:08:34Z)
- NAS-Count: Counting-by-Density with Neural Architecture Search [74.92941571724525]
We automate the design of counting models with Neural Architecture Search (NAS).
We introduce an end-to-end searched encoder-decoder architecture, Automatic Multi-Scale Network (AMSNet).
arXiv Detail & Related papers (2020-02-29T09:18:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.