Related papers: Can Large Language Models Invent Algorithms to Improve Themselves?

Can Large Language Models Invent Algorithms to Improve Themselves?

URL: http://arxiv.org/abs/2410.15639v2
Date: Tue, 22 Oct 2024 03:14:46 GMT
Title: Can Large Language Models Invent Algorithms to Improve Themselves?
Authors: Yoichi Ishibashi, Taro Yano, Masafumi Oyamada,
Abstract summary: Large Language Models (LLMs) have shown remarkable performance improvements and are rapidly gaining adoption in industry. We propose the Self-Developing framework, which enables LLMs to autonomously generate and learn model-improvement algorithms. In mathematical reasoning tasks, Self-Developing not only creates models that surpass the seed model but also consistently outperforms models created using human-designed algorithms.
Score: 3.6117068575553595
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) have shown remarkable performance improvements and are rapidly gaining adoption in industry. However, the methods for improving LLMs are still designed by humans, which restricts the invention of new model-improving algorithms to human expertise and imagination. To address this, we propose the Self-Developing framework, which enables LLMs to autonomously generate and learn model-improvement algorithms. In this framework, the seed model generates, applies, and learns model-improving algorithms, continuously improving both the seed model and the algorithms themselves. In mathematical reasoning tasks, Self-Developing not only creates models that surpass the seed model but also consistently outperforms models created using human-designed algorithms. Additionally, these LLM-discovered algorithms demonstrate strong effectiveness, including transferability to out-of-domain models.

Related papers

Algorithm Discovery With LLMs: Evolutionary Search Meets Reinforcement Learning [12.037588566211348]
We propose to refine large language models (LLMs) through reinforcement learning (RL) fine-tuning. Our experiments show that combining RL and evolutionary search improves efficiency of improved algorithms.
arXiv Detail & Related papers (2025-04-07T14:14:15Z)
LLM-Guided Evolution: An Autonomous Model Optimization for Object Detection [0.0]
In machine learning, Neural Architecture Search (NAS) requires domain knowledge of model design and a large amount of trial-and-error to achieve promising performance. The Large Language Model (LLM)-Guided Evolution (GE) framework transformed this approach by incorporating LLMs to directly modify model source code for image classification algorithms on CIFAR data. We show that LLM-GE produced variants with significant performance improvements, such as an increase in Mean Average Precision from 92.5% to 94.5%.
arXiv Detail & Related papers (2025-04-03T05:06:06Z)
RL-finetuning LLMs from on- and off-policy data with a single algorithm [53.70731390624718]
We introduce a novel reinforcement learning algorithm (AGRO) for fine-tuning large-language models. AGRO leverages the concept of generation consistency, which states that the optimal policy satisfies the notion of consistency across any possible generation of the model. We derive algorithms that find optimal solutions via the sample-based policy gradient and provide theoretical guarantees on their convergence.
arXiv Detail & Related papers (2025-03-25T12:52:38Z)
MLGym: A New Framework and Benchmark for Advancing AI Research Agents [51.9387884953294]
We introduce Meta MLGym and MLGym-Bench, a new framework and benchmark for evaluating and developing large language models on AI research tasks. This is the first Gym environment for machine learning (ML) tasks, enabling research on reinforcement learning (RL) algorithms for training such agents. We evaluate a number of frontier large language models (LLMs) on our benchmarks such as Claude-3.5-Sonnet, Llama-3.1 405B, GPT-4o, o1-preview, and Gemini-1.5 Pro.
arXiv Detail & Related papers (2025-02-20T12:28:23Z)
EVOLvE: Evaluating and Optimizing LLMs For Exploration [76.66831821738927]
Large language models (LLMs) remain under-studied in scenarios requiring optimal decision-making under uncertainty. We measure LLMs' (in)ability to make optimal decisions in bandits, a state-less reinforcement learning setting relevant to many applications. Motivated by the existence of optimal exploration algorithms, we propose efficient ways to integrate this algorithmic knowledge into LLMs.
arXiv Detail & Related papers (2024-10-08T17:54:03Z)
On the Modeling Capabilities of Large Language Models for Sequential Decision Making [52.128546842746246]
Large pretrained models are showing increasingly better performance in reasoning and planning tasks. We evaluate their ability to produce decision-making policies, either directly, by generating actions, or indirectly. In environments with unfamiliar dynamics, we explore how fine-tuning LLMs with synthetic data can significantly improve their reward modeling capabilities.
arXiv Detail & Related papers (2024-10-08T03:12:57Z)
On the Design and Analysis of LLM-Based Algorithms [74.7126776018275]
Large language models (LLMs) are used as sub-routines in algorithms. LLMs have achieved remarkable empirical success. Our proposed framework holds promise for advancing LLM-based algorithms.
arXiv Detail & Related papers (2024-07-20T07:39:07Z)
Large Language Models as Surrogate Models in Evolutionary Algorithms: A Preliminary Study [5.6787965501364335]
Surrogate-assisted selection is a core step in evolutionary algorithms to solve expensive optimization problems. Traditionally, this has relied on conventional machine learning methods, leveraging historical evaluated evaluations to predict the performance of new solutions. In this work, we propose a novel surrogate model based purely on LLM inference capabilities, eliminating the need for training.
arXiv Detail & Related papers (2024-06-15T15:54:00Z)
LLaMEA: A Large Language Model Evolutionary Algorithm for Automatically Generating Metaheuristics [0.023020018305241332]
This paper introduces a novel Large Language Model Evolutionary Algorithm (LLaMEA) framework. Given a set of criteria and a task definition (the search space), LLaMEA iteratively generates, mutates and selects algorithms. We show how this framework can be used to generate novel black-box metaheuristic optimization algorithms automatically.
arXiv Detail & Related papers (2024-05-30T15:10:59Z)
Designing Network Algorithms via Large Language Models [11.055072300500104]
We introduce NADA, the first framework to autonomously design network algorithms by leveraging the generative capabilities of large language models (LLMs) We demonstrate that NADA produces novel ABR algorithms that consistently outperform the original algorithm in diverse network environments, including broadband, satellite, 4G, and 5G.
arXiv Detail & Related papers (2024-04-02T03:43:55Z)
Evolutionary Optimization of Model Merging Recipes [21.41838972039297]
We present a novel application of evolutionary algorithms to automate the creation of powerful foundation models. We propose an evolutionary approach that overcomes the limitation by automatically discovering effective combinations of diverse open-source models. This work contributes new state-of-the-art models back to the open-source community, and also introduces a new paradigm for automated model composition.
arXiv Detail & Related papers (2024-03-19T22:56:53Z)
Algorithm Evolution Using Large Language Model [18.03090066194074]
We propose a novel approach called Evolution Algorithm using Large Language Model (AEL) AEL does algorithm-level evolution without model training. Human effort and requirements for domain knowledge can be significantly reduced.
arXiv Detail & Related papers (2023-11-26T09:38:44Z)
STORM: Efficient Stochastic Transformer based World Models for Reinforcement Learning [82.03481509373037]
Recently, model-based reinforcement learning algorithms have demonstrated remarkable efficacy in visual input environments. We introduce Transformer-based wORld Model (STORM), an efficient world model architecture that combines strong modeling and generation capabilities. Storm achieves a mean human performance of $126.7%$ on the Atari $100$k benchmark, setting a new record among state-of-the-art methods.
arXiv Detail & Related papers (2023-10-14T16:42:02Z)
When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL) Our follow-up derived bounds reveal the relationship between model shifts and performance improvement. A further example demonstrates that learning models from a dynamically-varying number of explorations benefit the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z)
Augmenting Interpretable Models with LLMs during Training [73.40079895413861]
We propose Augmented Interpretable Models (Aug-imodels) to build efficient and interpretable models. Aug-imodels use LLMs during fitting but not during inference, allowing complete transparency. We explore two instantiations of Aug-imodels in natural-language processing: (i) Aug-GAM, which augments a generalized additive model with decoupled embeddings from an LLM and (ii) Aug-Tree, which augments a decision tree with LLM feature expansions.
arXiv Detail & Related papers (2022-09-23T18:36:01Z)
Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms. The learned algorithms are domain-agnostic and can generalize to new environments not seen during training. We highlight two learned algorithms which obtain good generalization performance over other classical control tasks, gridworld type tasks, and Atari games.
arXiv Detail & Related papers (2021-01-08T18:55:07Z)
Efficient Model-Based Reinforcement Learning through Optimistic Policy Search and Planning [93.1435980666675]
We show how optimistic exploration can be easily combined with state-of-the-art reinforcement learning algorithms. Our experiments demonstrate that optimistic exploration significantly speeds-up learning when there are penalties on actions.
arXiv Detail & Related papers (2020-06-15T18:37:38Z)

This list is automatically generated from the titles and abstracts of the papers in this site.