Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents
- URL: http://arxiv.org/abs/2505.22954v2
- Date: Fri, 26 Sep 2025 16:36:03 GMT
- Title: Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents
- Authors: Jenny Zhang, Shengran Hu, Cong Lu, Robert Lange, Jeff Clune
- Abstract summary: We introduce the Darwin Gödel Machine (DGM), a self-improving system that iteratively modifies its own code and empirically validates each change using coding benchmarks. Inspired by Darwinian evolution and open-endedness research, the DGM maintains an archive of generated coding agents. It grows the archive by sampling an agent from it and using a foundation model to create a new, interesting version of the sampled agent.
- Score: 32.42616663576657
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Today's AI systems have human-designed, fixed architectures and cannot autonomously and continuously improve themselves. The advance of AI could itself be automated. If done safely, that would accelerate AI development and allow us to reap its benefits much sooner. Meta-learning can automate the discovery of novel algorithms, but is limited by first-order improvements and the human design of a suitable search space. The Gödel machine proposed a theoretical alternative: a self-improving AI that repeatedly modifies itself in a provably beneficial manner. Unfortunately, proving that most changes are net beneficial is impossible in practice. We introduce the Darwin Gödel Machine (DGM), a self-improving system that iteratively modifies its own code (thereby also improving its ability to modify its own codebase) and empirically validates each change using coding benchmarks. Inspired by Darwinian evolution and open-endedness research, the DGM maintains an archive of generated coding agents. It grows the archive by sampling an agent from it and using a foundation model to create a new, interesting version of the sampled agent. This open-ended exploration forms a growing tree of diverse, high-quality agents and allows the parallel exploration of many different paths through the search space. Empirically, the DGM automatically improves its coding capabilities (e.g., better code editing tools, long-context window management, peer-review mechanisms), increasing performance on SWE-bench from 20.0% to 50.0%, and on Polyglot from 14.2% to 30.7%. Furthermore, the DGM significantly outperforms baselines without self-improvement or open-ended exploration. All experiments were done with safety precautions (e.g., sandboxing, human oversight). The DGM is a significant step toward self-improving AI, capable of gathering its own stepping stones along paths that unfold into endless innovation.
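The archive-based loop described in the abstract (sample an agent, have a foundation model propose a modified version, validate it empirically, and add it to a growing tree of agents) can be sketched as follows. This is a minimal illustration only: `propose_modification` and `run_benchmark` are hypothetical stand-ins for a foundation-model call and a coding-benchmark evaluation, and the uniform `random.choice` sampling is a simplification of the paper's parent-selection scheme.

```python
import random

def propose_modification(agent_code: str) -> str:
    """Hypothetical stand-in for a foundation-model call that rewrites the agent's code."""
    return agent_code + "\n# (modified variant)"

def run_benchmark(agent_code: str) -> float:
    """Hypothetical stand-in for empirical validation on a coding benchmark."""
    return random.random()  # placeholder score in [0, 1]

def dgm_loop(initial_code: str, iterations: int = 10) -> list[tuple[str, float]]:
    """Grow an archive of agents: sample, modify, validate, and keep every variant."""
    archive = [(initial_code, run_benchmark(initial_code))]
    for _ in range(iterations):
        parent_code, _ = random.choice(archive)         # sample an agent from the archive
        child_code = propose_modification(parent_code)  # self-modification via foundation model
        score = run_benchmark(child_code)               # empirical validation of the change
        archive.append((child_code, score))             # archive grows into a tree of agents
    return archive

archive = dgm_loop("def solve(task): ...", iterations=5)
print(len(archive))  # initial agent plus 5 generated variants
```

Keeping every variant in the archive, rather than only the current best, is what allows the parallel exploration of many search paths that the abstract emphasizes.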
Related papers
- Will It Survive? Deciphering the Fate of AI-Generated Code in Open Source [3.6525095710982924]
A prevailing hypothesis suggests that code is "disposable", meaning it is merged quickly but discarded shortly thereafter. We investigate this hypothesis through survival analysis of 201 open-source projects, tracking over 200,000 code remediation units authored by AI agents versus humans.
arXiv Detail & Related papers (2026-01-23T15:00:46Z)
- Huxley-Gödel Machine: Human-Level Coding Agent Development by an Approximation of the Optimal Self-Improving Machine [31.795598366502166]
We identify a mismatch between the agent's self-improvement potential and its coding benchmark performance. Inspired by Huxley's concept of a clade, we propose a metric ($\mathrm{CMP}$) that aggregates the benchmark performances of the descendants of an agent. We introduce the Huxley-Gödel Machine (HGM), which, by estimating $\mathrm{CMP}$ and using it as guidance, searches the tree of self-modifications.
arXiv Detail & Related papers (2025-10-24T16:19:41Z)
- From Agentification to Self-Evolving Agentic AI for Wireless Networks: Concepts, Approaches, and Future Research Directions [70.72279728350763]
Self-evolving agentic artificial intelligence (AI) offers a new paradigm for future wireless systems. Unlike static AI models, self-evolving agents embed an autonomous evolution cycle that updates their models and tools in response to environmental dynamics. This paper presents a comprehensive overview of self-evolving agentic AI, highlighting its layered architecture, life cycle, and key techniques.
arXiv Detail & Related papers (2025-10-07T05:45:25Z)
- SciMaster: Towards General-Purpose Scientific AI Agents, Part I. X-Master as Foundation: Can We Lead on Humanity's Last Exam? [51.112225746095746]
We introduce X-Master, a tool-augmented reasoning agent designed to emulate human researchers. X-Masters sets a new state-of-the-art record on Humanity's Last Exam with a score of 32.1%.
arXiv Detail & Related papers (2025-07-07T17:50:52Z)
- From Reproduction to Replication: Evaluating Research Agents with Progressive Code Masking [48.90371827091671]
AutoExperiment is a benchmark that evaluates AI agents' ability to implement and run machine learning experiments. We evaluate state-of-the-art agents and find that performance degrades rapidly as $n$ increases. Our findings highlight critical challenges in long-horizon code generation, context retrieval, and autonomous experiment execution.
arXiv Detail & Related papers (2025-06-24T15:39:20Z)
- R&D-Agent: Automating Data-Driven AI Solution Building Through LLM-Powered Automated Research, Development, and Evolution [60.80016554091364]
R&D-Agent is a dual-agent framework for iterative exploration. The Researcher agent uses performance feedback to generate ideas, while the Developer agent refines code based on error feedback. R&D-Agent is evaluated on MLE-Bench and emerges as the top-performing machine learning engineering agent.
arXiv Detail & Related papers (2025-05-20T06:07:00Z)
- The Shift from Writing to Pruning Software: A Bonsai-Inspired IDE for Reshaping AI Generated Code [11.149764135999437]
The rise of AI-driven coding assistants signals a fundamental shift in how software is built. While AI coding assistants have been integrated into existing Integrated Development Environments, their full potential remains largely untapped. We propose a new approach to IDEs, where AI is allowed to generate code in its true, unconstrained form, free from traditional file structures.
arXiv Detail & Related papers (2025-03-04T17:57:26Z)
- AI Generations: From AI 1.0 to AI 4.0 [3.4440023363051266]
This paper proposes that Artificial Intelligence (AI) progresses through several overlapping generations. Each of these AI generations is driven by shifting priorities among algorithms, computing power, and data. It explores the profound ethical, regulatory, and philosophical challenges that arise when artificial systems approach (or aspire to) human-like autonomy.
arXiv Detail & Related papers (2025-02-16T23:19:44Z)
- Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement [117.94654815220404]
Gödel Agent is a self-evolving framework inspired by the Gödel machine. Gödel Agent can achieve continuous self-improvement, surpassing manually crafted agents in performance, efficiency, and generalizability.
arXiv Detail & Related papers (2024-10-06T10:49:40Z)
- The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery [14.465756130099091]
This paper presents the first comprehensive framework for fully automatic scientific discovery.
We introduce The AI Scientist, which generates novel research ideas, writes code, executes experiments, visualizes results, and describes its findings.
In principle, this process can be repeated to iteratively develop ideas in an open-ended fashion, acting like the human scientific community.
arXiv Detail & Related papers (2024-08-12T16:58:11Z)
- ProcTHOR: Large-Scale Embodied AI Using Procedural Generation [55.485985317538194]
ProcTHOR is a framework for procedural generation of Embodied AI environments.
We demonstrate state-of-the-art results across 6 embodied AI benchmarks for navigation, rearrangement, and arm manipulation.
arXiv Detail & Related papers (2022-06-14T17:09:35Z)
- An Initial Look at Self-Reprogramming Artificial Intelligence [0.0]
We develop and experimentally validate the first fully self-reprogramming AI system.
Applying AI-based computer code generation to AI itself, we implement an algorithm with the ability to continuously modify and rewrite its own neural network source code.
arXiv Detail & Related papers (2022-04-30T05:44:34Z)
- AutoML-Zero: Evolving Machine Learning Algorithms From Scratch [76.83052807776276]
We show that it is possible to automatically discover complete machine learning algorithms just using basic mathematical operations as building blocks.
We demonstrate this by introducing a novel framework that significantly reduces human bias through a generic search space.
We believe these preliminary successes in discovering machine learning algorithms from scratch indicate a promising new direction in the field.
arXiv Detail & Related papers (2020-03-06T19:00:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.