Related papers: AIDE: AI-Driven Exploration in the Space of Code

AIDE: AI-Driven Exploration in the Space of Code

URL: http://arxiv.org/abs/2502.13138v1
Date: Tue, 18 Feb 2025 18:57:21 GMT
Title: AIDE: AI-Driven Exploration in the Space of Code
Authors: Zhengyao Jiang, Dominik Schmidt, Dhruv Srikanth, Dixing Xu, Ian Kaplan, Deniss Jacenko, Yuxiang Wu,
Abstract summary: We introduce AI-Driven Exploration (AIDE), a machine learning engineering agent powered by large language models (LLMs)<n>AIDE frames machine learning engineering as a code optimization problem, and formulates trial-and-error as a tree search in the space of potential solutions.<n>By strategically reusing and refining promising solutions, AIDE effectively trades computational resources for enhanced performance.
Score: 6.401493599308353
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Machine learning, the foundation of modern artificial intelligence, has driven innovations that have fundamentally transformed the world. Yet, behind advancements lies a complex and often tedious process requiring labor and compute intensive iteration and experimentation. Engineers and scientists developing machine learning models spend much of their time on trial-and-error tasks instead of conceptualizing innovative solutions or research hypotheses. To address this challenge, we introduce AI-Driven Exploration (AIDE), a machine learning engineering agent powered by large language models (LLMs). AIDE frames machine learning engineering as a code optimization problem, and formulates trial-and-error as a tree search in the space of potential solutions. By strategically reusing and refining promising solutions, AIDE effectively trades computational resources for enhanced performance, achieving state-of-the-art results on multiple machine learning engineering benchmarks, including our Kaggle evaluations, OpenAI MLE-Bench and METRs RE-Bench.

Related papers

Personalized Artificial General Intelligence (AGI) via Neuroscience-Inspired Continuous Learning Systems [3.764721243654025]
Current approaches largely depend on expanding model parameters, which improves task-specific performance but falls short in enabling continuous, adaptable, and generalized learning. This paper reviews the state of continual learning and neuroscience-inspired AI, and proposes a novel architecture for Personalized AGI that integrates brain-like learning mechanisms for edge deployment. Building on these insights, we outline an AI architecture that features complementary fast-and-slow learning modules, synaptic self-optimization, and memory-efficient model updates to support on-device lifelong adaptation.
arXiv Detail & Related papers (2025-04-27T16:10:17Z)
Speeding up design and making to reduce time-to-project and time-to-market: an AI-Enhanced approach in engineering education [0.0]
This paper explores the integration of AI tools, such as ChatGPT and GitHub Copilot, in the Software Architecture for Embedded Systems course. Results demon-started enhanced problem-solving, faster development, and more sophisticated outcomes, with AI augmenting but not replacing human decision-making.
arXiv Detail & Related papers (2025-03-20T16:32:13Z)
MLGym: A New Framework and Benchmark for Advancing AI Research Agents [51.9387884953294]
We introduce Meta MLGym and MLGym-Bench, a new framework and benchmark for evaluating and developing large language models on AI research tasks. This is the first Gym environment for machine learning (ML) tasks, enabling research on reinforcement learning (RL) algorithms for training such agents. We evaluate a number of frontier large language models (LLMs) on our benchmarks such as Claude-3.5-Sonnet, Llama-3.1 405B, GPT-4o, o1-preview, and Gemini-1.5 Pro.
arXiv Detail & Related papers (2025-02-20T12:28:23Z)
Retrieval-Augmented Instruction Tuning for Automated Process Engineering Calculations : A Tool-Chaining Problem-Solving Framework with Attributable Reflection [0.0]
We introduce a novel autonomous agent framework leveraging Retrieval-Augmented Instruction-Tuning (RAIT) to enhance open, customizable small code language models (SLMs) By combining instruction tuned code SLMs with Retrieval-Augmented Code Generation (RACG) using external tools, the agent generates, debugs, and optimize code from natural language specifications. Our approach addresses the limitations of the current lack of a foundational AI model for specialized process engineering tasks and offers benefits of explainability, knowledge editing, and cost-effectiveness.
arXiv Detail & Related papers (2024-08-28T15:33:47Z)
AutoTRIZ: Artificial Ideation with TRIZ and Large Language Models [2.7624021966289605]
Theory of Inventive Problem Solving is widely applied for systematic innovation. The complexity of TRIZ resources and concepts, coupled with its reliance on users' knowledge, experience, and reasoning capabilities, limits its practicality. This paper proposes AutoTRIZ, an artificial ideation tool that uses LLMs to automate and enhance the TRIZ methodology.
arXiv Detail & Related papers (2024-03-13T02:53:36Z)
Machine Learning Insides OptVerse AI Solver: Design Principles and Applications [74.67495900436728]
We present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI solver. We showcase our methods for generating complex SAT and MILP instances utilizing generative models that mirror multifaceted structures of real-world problem. We detail the incorporation of state-of-the-art parameter tuning algorithms which markedly elevate solver performance.
arXiv Detail & Related papers (2024-01-11T15:02:15Z)
The Future of Fundamental Science Led by Generative Closed-Loop Artificial Intelligence [67.70415658080121]
Recent advances in machine learning and AI are disrupting technological innovation, product development, and society as a whole. AI has contributed less to fundamental science in part because large data sets of high-quality data for scientific practice and model discovery are more difficult to access. Here we explore and investigate aspects of an AI-driven, automated, closed-loop approach to scientific discovery.
arXiv Detail & Related papers (2023-07-09T21:16:56Z)
OpenAGI: When LLM Meets Domain Experts [51.86179657467822]
Human Intelligence (HI) excels at combining basic skills to solve complex tasks. This capability is vital for Artificial Intelligence (AI) and should be embedded in comprehensive AI Agents. We introduce OpenAGI, an open-source platform designed for solving multi-step, real-world tasks.
arXiv Detail & Related papers (2023-04-10T03:55:35Z)
Selected Trends in Artificial Intelligence for Space Applications [69.3474006357492]
This chapter focuses on differentiable intelligence and on-board machine learning. We discuss a few selected projects originating from the European Space Agency's (ESA) Advanced Concepts Team (ACT)
arXiv Detail & Related papers (2022-12-10T07:49:50Z)
Enabling Automated Machine Learning for Model-Driven AI Engineering [60.09869520679979]
We propose a novel approach to enable Model-Driven Software Engineering and Model-Driven AI Engineering. In particular, we support Automated ML, thus assisting software engineers without deep AI knowledge in developing AI-intensive systems.
arXiv Detail & Related papers (2022-03-06T10:12:56Z)
Flashlight: Enabling Innovation in Tools for Machine Learning [50.63188263773778]
We introduce Flashlight, an open-source library built to spur innovation in machine learning tools and systems. We see Flashlight as a tool enabling research that can benefit widely used libraries downstream and bring machine learning and systems researchers closer together.
arXiv Detail & Related papers (2022-01-29T01:03:29Z)
Agility in Software 2.0 -- Notebook Interfaces and MLOps with Buttresses and Rebars [9.327920030279586]
We discuss two contemporary development phenomena that are fundamental in machine learning development. First, we present a solution that can remedy some of the intrinsic weaknesses of working in notebooks by supporting easy transitions to integrated development environments. Second, we propose reinforced engineering of AI systems by introducing metaphorical buttresses and rebars in the MLOps context.
arXiv Detail & Related papers (2021-11-28T13:40:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.