NNGPT: Rethinking AutoML with Large Language Models
- URL: http://arxiv.org/abs/2511.20333v1
- Date: Tue, 25 Nov 2025 14:10:44 GMT
- Title: NNGPT: Rethinking AutoML with Large Language Models
- Authors: Roman Kochnev, Waleed Khalid, Tolgay Atinc Uzun, Xi Zhang, Yashkumar Sanjaybhai Dhameliya, Furui Qin, Chandini Vysyaraju, Raghuvir Duvvuri, Avi Goyal, Dmitry Ignatov, Radu Timofte,
- Abstract summary: NNGPT is an open-source framework that turns a large language model (LLM) into a self-improving AutoML engine for neural network development. It integrates within one unified workflow five synergistic LLM-based pipelines: zero-shot architecture synthesis, hyperparameter optimization, code-aware accuracy/early-stop prediction, retrieval-augmented synthesis of PyTorch blocks (NN-RAG), and reinforcement learning. The system has already generated over 5K validated models, establishing NNGPT as an autonomous AutoML engine.
- Score: 36.90850535125572
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Building self-improving AI systems remains a fundamental challenge in the AI domain. We present NNGPT, an open-source framework that turns a large language model (LLM) into a self-improving AutoML engine for neural network development, primarily for computer vision. Unlike previous frameworks, NNGPT extends the dataset of neural networks by generating new models, enabling continuous fine-tuning of LLMs based on a closed-loop system of generation, assessment, and self-improvement. It integrates within one unified workflow five synergistic LLM-based pipelines: zero-shot architecture synthesis, hyperparameter optimization (HPO), code-aware accuracy/early-stop prediction, retrieval-augmented synthesis of scope-closed PyTorch blocks (NN-RAG), and reinforcement learning. Built on the LEMUR dataset as an audited corpus with reproducible metrics, NNGPT generates and validates, from a single prompt, a network architecture, preprocessing code, and hyperparameters, executes them end-to-end, and learns from the result. The PyTorch adapter makes NNGPT framework-agnostic, enabling strong performance: NN-RAG achieves 73% executability on 1,289 targets, 3-shot prompting boosts accuracy on common datasets, and hash-based deduplication saves hundreds of runs. One-shot prediction matches search-based AutoML, reducing the need for numerous trials. HPO on LEMUR achieves RMSE 0.60, outperforming Optuna (0.64), while the code-aware predictor reaches RMSE 0.14 with Pearson r=0.78. The system has already generated over 5K validated models, establishing NNGPT as an autonomous AutoML engine. Upon acceptance, the code, prompts, and checkpoints will be released for public access to enable reproducibility and facilitate community usage.
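The closed loop the abstract describes (generation, assessment, self-improvement) together with the hash-based deduplication can be sketched in a few lines of Python. This is an illustrative sketch only; the `llm`, `evaluate`, `generate`, and `fine_tune` interfaces are hypothetical stand-ins, not NNGPT's actual API.

```python
import hashlib


def model_hash(source_code: str) -> str:
    """Fingerprint a generated model so duplicate candidates skip training."""
    return hashlib.sha256(source_code.encode()).hexdigest()


def closed_loop(llm, evaluate, rounds=3):
    """One pass of the generate -> deduplicate -> evaluate -> learn cycle."""
    seen, results = set(), []
    for _ in range(rounds):
        code = llm.generate()           # zero-shot architecture synthesis
        h = model_hash(code)
        if h in seen:                   # hash-based deduplication: skip repeats
            continue
        seen.add(h)
        score = evaluate(code)          # end-to-end training and validation
        results.append((code, score))
        llm.fine_tune(code, score)      # self-improvement from the outcome
    return results
```

Deduplicating on a hash of the emitted source is what lets the loop "save hundreds of runs": identical candidates are recognized before any training budget is spent.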
Related papers
- From Memorization to Creativity: LLM as a Designer of Novel Neural-Architectures [48.83701310501069]
Large language models (LLMs) excel in program synthesis, yet their ability to autonomously navigate neural architecture design, balancing reliability, performance, and structural novelty, remains underexplored. We address this by placing a code-oriented LLM within a closed-loop synthesis framework, analyzing its evolution over 22 supervised fine-tuning cycles.
arXiv Detail & Related papers (2026-01-06T13:20:28Z) - LLM as a Neural Architect: Controlled Generation of Image Captioning Models Under Strict API Contracts [48.83701310501069]
We present NN-Caption, an LLM-guided neural architecture search pipeline that integrates prompt-based code generation with automatic evaluation. It generates runnable image-captioning models by composing CNN encoders from LEMUR's classification backbones.
arXiv Detail & Related papers (2025-12-07T10:47:28Z) - LLM-AR: LLM-powered Automated Reasoning Framework [0.0]
Large language models (LLMs) can already identify patterns and reason effectively, yet their variable accuracy hampers adoption in high-stakes decision-making applications. We introduce LLM-AR, a pipeline inspired by neural-symbolic systems that distils LLM-generated reasoning into probabilistic rules executed by the ProbLog automated-reasoning engine. On unseen folds, LLM-AR achieves 59.5% precision and 8.7% recall, 5.9x the random-baseline precision, while exposing every decision path for human inspection.
arXiv Detail & Related papers (2025-10-24T21:36:18Z) - QiMeng-CodeV-R1: Reasoning-Enhanced Verilog Generation [51.393569044134445]
Large language models (LLMs) trained via reinforcement learning with verifiable reward (RLVR) have achieved breakthroughs on tasks with explicit, automatable verification. Extending RLVR to the automatic generation of hardware description languages (HDLs) such as Verilog from natural-language (NL) specifications, however, poses three key challenges. We introduce CodeV-R1, an RLVR framework for training Verilog-generation LLMs.
arXiv Detail & Related papers (2025-05-30T03:51:06Z) - LLM-Based Emulation of the Radio Resource Control Layer: Towards AI-Native RAN Protocols [28.04609776570199]
Large AI Models (LAMs) are key enablers of the AI-Native Air Interface (AI-AI). This paper presents the first standards-compliant emulation of the Radio Resource Control layer using a decoder-only LAM. Results demonstrate that LAMs, when augmented with protocol-aware reasoning, can directly orchestrate control-plane procedures.
arXiv Detail & Related papers (2025-05-22T15:55:56Z) - GNN-Suite: a Graph Neural Network Benchmarking Framework for Biomedical Informatics [0.0]
We present GNN-Suite, a framework for constructing and benchmarking Graph Neural Network (GNN) architectures in computational biology. We demonstrate its utility in identifying cancer-driver genes by constructing molecular networks from protein-protein interaction (PPI) data. Our results show that a common framework for implementing and evaluating GNN architectures aids in identifying not only the best model but also the most effective means of incorporating complementary data.
arXiv Detail & Related papers (2025-05-15T21:14:30Z) - LEMUR Neural Network Dataset: Towards Seamless AutoML [35.57280723615144]
We introduce LEMUR, an open-source dataset and framework that provides a large collection of PyTorch-based neural networks. Each model follows a unified template, with configurations and results stored in a structured database to ensure consistency. LEMUR aims to accelerate AutoML research, enable fair benchmarking, and reduce barriers to large-scale neural network research.
arXiv Detail & Related papers (2025-04-14T09:08:00Z) - ProofAug: Efficient Neural Theorem Proving via Fine-grained Proof Structure Analysis [50.020850767257095]
We propose ProofAug, a procedure that equips LLMs with automation methods at various granularities. Our method is validated on the miniF2F benchmark using the open-source deepseek-math-7b-base model and the Isabelle proof assistant. We also implement a Lean 4 version of ProofAug that improves the pass@1 performance of Kimina-Prover-Preview-Distill-1.5B from 44.3% to 50.4%.
arXiv Detail & Related papers (2025-01-30T12:37:06Z) - Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL [53.40030379661183]
Auto-PyTorch is a framework to enable fully automated deep learning (AutoDL). It combines multi-fidelity optimization with portfolio construction for warm-starting and ensembling of deep neural networks (DNNs).
We show that Auto-PyTorch performs better than several state-of-the-art competitors on average.
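The multi-fidelity optimization that Auto-PyTorch builds on can be illustrated with a simplified successive-halving sketch: evaluate many configurations cheaply, then re-evaluate only the survivors at larger budgets. This is a generic textbook illustration, not Auto-PyTorch's implementation; `run` is a hypothetical train-and-score callback.

```python
def successive_halving(configs, run, min_budget=1, eta=2, max_budget=8):
    """Simplified multi-fidelity search.

    Trains every config at a small budget, keeps the best 1/eta fraction,
    and repeats at an eta-times larger budget until one config remains.
    """
    budget = min_budget
    while len(configs) > 1 and budget <= max_budget:
        scores = [(run(cfg, budget), cfg) for cfg in configs]
        scores.sort(reverse=True)                      # higher score = better
        configs = [cfg for _, cfg in scores[: max(1, len(configs) // eta)]]
        budget *= eta
    return configs[0]
```

The design point is that weak configurations are discarded after only a cheap, low-budget evaluation, so the expensive full-budget training is spent almost entirely on promising candidates.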
arXiv Detail & Related papers (2020-06-24T15:15:17Z) - Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.