A Multi-AI Agent System for Autonomous Optimization of Agentic AI Solutions via Iterative Refinement and LLM-Driven Feedback Loops
- URL: http://arxiv.org/abs/2412.17149v1
- Date: Sun, 22 Dec 2024 20:08:04 GMT
- Title: A Multi-AI Agent System for Autonomous Optimization of Agentic AI Solutions via Iterative Refinement and LLM-Driven Feedback Loops
- Authors: Kamer Ali Yuksel, Hassan Sawaf
- Abstract summary: This paper introduces a framework for autonomously optimizing Agentic AI solutions across industries. The framework achieves optimal performance without human input by autonomously generating and testing hypotheses. Case studies show significant improvements in output quality, relevance, and actionability.
- Score: 3.729242965449096
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Agentic AI systems use specialized agents to handle tasks within complex workflows, enabling automation and efficiency. However, optimizing these systems often requires labor-intensive, manual adjustments to refine roles, tasks, and interactions. This paper introduces a framework for autonomously optimizing Agentic AI solutions across industries, such as NLP-driven enterprise applications. The system employs agents for Refinement, Execution, Evaluation, Modification, and Documentation, leveraging iterative feedback loops powered by an LLM (Llama 3.2-3B). The framework achieves optimal performance without human input by autonomously generating and testing hypotheses to improve system configurations. This approach enhances scalability and adaptability, offering a robust solution for real-world applications in dynamic environments. Case studies across diverse domains illustrate the transformative impact of this framework, showcasing significant improvements in output quality, relevance, and actionability. All data for these case studies, including the original and evolved agent code along with their outputs, is available here: https://anonymous.4open.science/r/evolver-1D11/
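The optimization loop described in the abstract (Refinement, Execution, Evaluation, Modification, and Documentation agents driving an LLM-powered feedback cycle) can be pictured with a minimal sketch. Everything below is an illustrative assumption: the five agent roles are stubbed as plain functions and `call_llm` stands in for the Llama 3.2-3B backend rather than the authors' actual interface.

```python
# Minimal sketch of the autonomous optimization loop described above.
# The five agent roles are stubbed; `call_llm` is a placeholder for the
# LLM backend (e.g. Llama 3.2-3B), not the paper's actual interface.
from dataclasses import dataclass, field

@dataclass
class AgentSystem:
    config: str                      # current agent roles/tasks as text
    history: list = field(default_factory=list)

def call_llm(prompt: str) -> str:
    """Placeholder for the LLM backend."""
    return f"[LLM output for: {prompt[:40]}...]"

def refine(system: AgentSystem) -> str:
    # Refinement agent: propose a hypothesis for improving the configuration.
    return call_llm(f"Suggest one improvement to this agent system:\n{system.config}")

def execute(system: AgentSystem, hypothesis: str) -> str:
    # Execution agent: run the candidate configuration on the task.
    return call_llm(f"Run config with change '{hypothesis}' and report the output.")

def evaluate(output: str) -> float:
    # Evaluation agent: score output quality/relevance/actionability (0..1).
    return min(1.0, len(output) / 100.0)   # toy scoring stand-in

def modify(system: AgentSystem, hypothesis: str) -> AgentSystem:
    # Modification agent: apply the accepted hypothesis to the configuration.
    return AgentSystem(config=system.config + "\n# applied: " + hypothesis,
                       history=system.history)

def document(system: AgentSystem, hypothesis: str, score: float) -> None:
    # Documentation agent: record what was tried and how it scored.
    system.history.append({"hypothesis": hypothesis, "score": score})

def optimize(system: AgentSystem, iterations: int = 5) -> AgentSystem:
    best_score = float("-inf")
    for _ in range(iterations):
        hypothesis = refine(system)
        output = execute(system, hypothesis)
        score = evaluate(output)
        document(system, hypothesis, score)
        if score > best_score:          # keep only improvements
            system, best_score = modify(system, hypothesis), score
    return system

if __name__ == "__main__":
    evolved = optimize(AgentSystem(config="role: summarizer\ntask: draft report"))
    print(evolved.config, evolved.history, sep="\n")
```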
Related papers
- Review, Refine, Repeat: Understanding Iterative Decoding of AI Agents with Dynamic Evaluation and Selection [71.92083784393418]
Inference-time methods such as Best-of-N (BON) sampling offer a simple yet effective alternative to improve performance.
We propose Iterative Agent Decoding (IAD) which combines iterative refinement with dynamic candidate evaluation and selection guided by a verifier.
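A minimal sketch of what verifier-guided selection combined with iterative refinement can look like; `generate`, `verify`, and `refine` below are toy stand-ins, not components of IAD itself.

```python
# Hedged sketch of verifier-guided candidate selection in the spirit of
# Best-of-N sampling plus an iterative refinement step.
import random

def generate(prompt: str, seed: int) -> str:
    random.seed(seed)
    return f"{prompt} -> candidate answer #{random.randint(0, 999)}"

def verify(candidate: str) -> float:
    # Verifier assigns a scalar quality score to a candidate (toy scorer).
    return (sum(map(ord, candidate)) % 100) / 100.0

def refine(candidate: str, round_idx: int) -> str:
    # Iterative refinement: revise the current best candidate.
    return candidate + f" (refined r{round_idx})"

def iterative_decode(prompt: str, n: int = 8, rounds: int = 3) -> str:
    # Round 0: plain Best-of-N — sample n candidates, keep the verifier's pick.
    best = max((generate(prompt, s) for s in range(n)), key=verify)
    # Later rounds: refine the incumbent and re-check it with the verifier.
    for r in range(1, rounds):
        revised = refine(best, r)
        if verify(revised) > verify(best):
            best = revised
    return best

print(iterative_decode("Summarize the ticket"))
```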
arXiv Detail & Related papers (2025-04-02T17:40:47Z)
- DARS: Dynamic Action Re-Sampling to Enhance Coding Agent Performance by Adaptive Tree Traversal [55.13854171147104]
Large Language Models (LLMs) have revolutionized various domains, including natural language processing, data analysis, and software development.
We present Dynamic Action Re-Sampling (DARS), a novel inference-time compute scaling approach for coding agents.
We evaluate our approach on SWE-Bench Lite benchmark, demonstrating that this scaling strategy achieves a pass@k score of 55% with Claude 3.5 Sonnet V2.
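For reference, the pass@k metric quoted above is commonly estimated with the unbiased estimator 1 - C(n-c, k)/C(n, k), where n samples are drawn per task and c of them pass. The snippet below is illustrative and is not the DARS evaluation code.

```python
# Standard unbiased pass@k estimator (HumanEval-style methodology).
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Probability that at least one of k samples (out of n, c correct) passes."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 10 rollouts per task, 4 of them pass -> pass@1 and pass@5
print(round(pass_at_k(10, 4, 1), 3))   # 0.4
print(round(pass_at_k(10, 4, 5), 3))   # ~0.976
```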
arXiv Detail & Related papers (2025-03-18T14:02:59Z)
- Improving Retrospective Language Agents via Joint Policy Gradient Optimization [57.35348425288859]
RetroAct is a framework that jointly optimizes task-planning and self-reflective evolution capabilities in language agents.
We develop a two-stage joint optimization process that integrates imitation learning and reinforcement learning.
We conduct extensive experiments across various testing environments, demonstrating that RetroAct achieves substantial improvements in task performance and decision-making processes.
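A hedged sketch of a two-stage scheme in this spirit: stage one imitates expert actions (behavior cloning), stage two fine-tunes the same policy with a REINFORCE-style gradient. The two-action softmax policy and reward below are toy stand-ins, not RetroAct's actual training setup.

```python
# Toy two-stage optimization: imitation learning, then policy-gradient RL.
import math, random

logits = [0.0, 0.0]                 # tiny policy over two actions

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def grad_log_pi(action, probs):
    # d/d(logits) of log softmax(action) = one_hot(action) - probs
    return [(1.0 if i == action else 0.0) - p for i, p in enumerate(probs)]

# Stage 1: imitation learning — push probability mass toward expert action 1.
expert_action, lr = 1, 0.5
for _ in range(50):
    g = grad_log_pi(expert_action, softmax(logits))
    logits = [w + lr * gi for w, gi in zip(logits, g)]

# Stage 2: reinforcement learning — REINFORCE with a reward observed only
# through sampled interaction (reward also prefers action 1 here).
def reward(action):
    return 1.0 if action == 1 else 0.0

for _ in range(200):
    probs = softmax(logits)
    a = random.choices([0, 1], weights=probs)[0]
    g = grad_log_pi(a, probs)
    logits = [w + lr * reward(a) * gi for w, gi in zip(logits, g)]

print("final action probabilities:", [round(p, 3) for p in softmax(logits)])
```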
arXiv Detail & Related papers (2025-03-03T12:54:54Z)
- LADs: Leveraging LLMs for AI-Driven DevOps [3.240228178267042]
LADs is a principled approach to configuration optimization, built on an in-depth analysis of which optimizations work under which conditions.
By leveraging Retrieval-Augmented Generation, Few-Shot Learning, Chain-of-Thought, and Feedback-Based Prompt Chaining, LADs generates accurate configurations and learns from deployment failures to iteratively refine system settings.
Our findings reveal key insights into the trade-offs between performance, cost, and scalability, helping practitioners determine the right strategies for different deployment scenarios.
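A minimal, hypothetical sketch of feedback-based prompt chaining for configuration generation, assuming stub `call_llm` and `try_deploy` functions; it shows only the failure-to-prompt loop, not LADs itself.

```python
# Feedback-based prompt chaining: the deployment error from each attempt
# is fed back into the next prompt. All components are toy stand-ins.
def call_llm(prompt: str) -> str:
    # Placeholder LLM: emits a toy key=value configuration.
    mem = "512Mi" if "OOMKilled" not in prompt else "1Gi"
    return f"replicas=2\nmemory={mem}"

def try_deploy(config: str):
    # Placeholder deployment check: returns an error string or None.
    return "OOMKilled: container exceeded memory limit" if "512Mi" in config else None

def generate_config(task: str, max_attempts: int = 3) -> str:
    prompt = f"Generate a deployment config for: {task}"
    for _ in range(max_attempts):
        config = call_llm(prompt)
        error = try_deploy(config)
        if error is None:
            return config                       # accepted configuration
        # Chain the failure back into the next prompt (feedback loop).
        prompt = (f"Previous config failed with '{error}'. "
                  f"Revise it for: {task}")
    return config

print(generate_config("stateless web service"))
```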
arXiv Detail & Related papers (2025-02-28T08:12:08Z)
- Towards more Contextual Agents: An extractor-Generator Optimization Framework [0.0]
Large Language Model (LLM)-based agents have demonstrated remarkable success in solving complex tasks across a wide range of general-purpose applications.
However, their performance often degrades in context-specific scenarios, such as specialized industries or research domains.
To address this challenge, our work introduces a systematic approach to enhance the contextual adaptability of LLM-based agents.
arXiv Detail & Related papers (2025-02-18T15:07:06Z)
- SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning [21.94477076055433]
Multi-agent AI systems powered by large language models (LLMs) are increasingly applied to solve complex tasks.
We introduce SiriuS, a self-improving, reasoning-driven optimization framework for multi-agent systems.
We show that SiriuS enhances multi-agent performance while generating reusable data for self-correction and self-play enhancement.
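A small sketch of bootstrapped self-improvement in this spirit: reasoning traces that succeed are stored and reused as context for later attempts. All names and behaviors below are illustrative assumptions, not SiriuS components.

```python
# Bootstrapping reusable data: keep only verified traces and feed them
# back as examples for subsequent tasks.
experience_library: list = []

def solve(task: str, examples: list) -> dict:
    # Placeholder "agent": pretends harder tasks succeed once examples exist.
    success = task.endswith("easy") or len(examples) > 0
    return {"task": task, "trace": f"reasoning for {task}", "success": success}

def bootstrap(tasks: list) -> None:
    for task in tasks:
        result = solve(task, experience_library)
        if result["success"]:
            experience_library.append(result)   # keep only verified traces

bootstrap(["sort numbers easy", "schedule agents hard", "route tickets hard"])
print(f"{len(experience_library)} reusable traces collected")
```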
arXiv Detail & Related papers (2025-02-07T09:33:44Z)
- Agentic AI-Driven Technical Troubleshooting for Enterprise Systems: A Novel Weighted Retrieval-Augmented Generation Paradigm [0.0]
This paper presents a novel agentic AI solution built on a Weighted Retrieval-Augmented Generation (RAG) Framework tailored for enterprise technical troubleshooting.
By dynamically weighting retrieval sources such as product manuals, internal knowledge bases, FAQs, and troubleshooting guides, the framework prioritizes the most relevant data.
Preliminary evaluations on large enterprise datasets demonstrate the framework's efficacy in improving troubleshooting accuracy, reducing resolution times, and adapting to varied technical challenges.
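A minimal sketch of source-weighted retrieval under toy assumptions: each source's relevance score is scaled by a per-source weight before results are merged and ranked. The corpus, weights, and keyword-overlap scorer are illustrative only.

```python
# Weighted retrieval across heterogeneous sources (toy example).
SOURCE_WEIGHTS = {"product_manual": 1.0, "kb": 0.8, "faq": 0.6, "troubleshooting": 0.9}

CORPUS = {
    "product_manual": ["reset the controller by holding the power button"],
    "kb": ["error E42 indicates a failed firmware update"],
    "faq": ["how do I reset my password"],
    "troubleshooting": ["if error E42 appears, reflash the firmware"],
}

def score(query: str, doc: str) -> float:
    # Toy relevance: fraction of query keywords found in the document.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

def weighted_retrieve(query: str, top_k: int = 2):
    scored = [
        (SOURCE_WEIGHTS[src] * score(query, doc), src, doc)
        for src, docs in CORPUS.items()
        for doc in docs
    ]
    return sorted(scored, reverse=True)[:top_k]

for s, src, doc in weighted_retrieve("error E42 firmware"):
    print(f"{s:.2f}  [{src}]  {doc}")
```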
arXiv Detail & Related papers (2024-12-16T17:32:38Z)
- The BrowserGym Ecosystem for Web Agent Research [151.90034093362343]
The BrowserGym ecosystem addresses the growing need for efficient evaluation and benchmarking of web agents.
We conduct the first large-scale, multi-benchmark web agent experiment.
Results highlight a large discrepancy between OpenAI's and Anthropic's latest models.
arXiv Detail & Related papers (2024-12-06T23:43:59Z)
- An Empirical Study on LLM-based Agents for Automated Bug Fixing [2.433168823911037]
Large language models (LLMs) and LLM-based Agents have been applied to fix bugs automatically.
We examine seven proprietary and open-source systems on the SWE-bench Lite benchmark for automated bug fixing.
arXiv Detail & Related papers (2024-11-15T14:19:15Z)
- Deep Insights into Automated Optimization with Large Language Models and Evolutionary Algorithms [3.833708891059351]
Large Language Models (LLMs) and Evolutionary Algorithms (EAs) offer a promising new approach to overcoming limitations and making optimization more automated.
LLMs act as dynamic agents that can generate, refine, and interpret optimization strategies.
EAs efficiently explore complex solution spaces through evolutionary operators.
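A hedged sketch of this hybrid: a standard evolutionary loop in which the LLM plays the variation operator, proposing mutated strategies that fitness-based selection then filters. `llm_mutate` and the fitness function are stand-ins, not any specific published system.

```python
# Evolutionary loop with an "LLM" as the mutation/variation operator.
import random

def fitness(strategy: str) -> float:
    # Toy objective: prefer strategies that mention both "plan" and "verify".
    return ("plan" in strategy) + ("verify" in strategy) + random.random() * 0.1

def llm_mutate(strategy: str) -> str:
    # Stand-in for an LLM proposing a refined variant of the strategy.
    additions = ["then plan subtasks", "then verify the result", "then summarize"]
    return strategy + " " + random.choice(additions)

def evolve(seed: str, pop_size: int = 6, generations: int = 5) -> str:
    population = [seed] * pop_size
    for _ in range(generations):
        offspring = [llm_mutate(p) for p in population]
        # (mu + lambda) selection: keep the fittest individuals.
        population = sorted(population + offspring, key=fitness, reverse=True)[:pop_size]
    return population[0]

print(evolve("read the task"))
```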
arXiv Detail & Related papers (2024-10-28T09:04:49Z)
- Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement [117.94654815220404]
G"odel Agent is a self-evolving framework inspired by the G"odel machine.
G"odel Agent can achieve continuous self-improvement, surpassing manually crafted agents in performance, efficiency, and generalizability.
arXiv Detail & Related papers (2024-10-06T10:49:40Z)
- Agent-Oriented Planning in Multi-Agent Systems [54.429028104022066]
We propose AOP, a novel framework for agent-oriented planning in multi-agent systems.
In this study, we identify three critical design principles of agent-oriented planning, including solvability, completeness, and non-redundancy.
Extensive experiments demonstrate the advancement of AOP in solving real-world problems compared to both single-agent systems and existing planning strategies for multi-agent systems.
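An illustrative check of the three principles on a toy task decomposition, under assumed data structures (agent skill sets and sub-task dictionaries); this is not the AOP algorithm itself.

```python
# Checking solvability, completeness, and non-redundancy of a plan.
AGENT_SKILLS = {"coder": {"code"}, "analyst": {"analyze"}, "writer": {"write"}}

def validate_plan(subtasks: list, required: set) -> dict:
    covered = {t["skill"] for t in subtasks}
    solvable = all(
        any(t["skill"] in skills for skills in AGENT_SKILLS.values())
        for t in subtasks
    )
    return {
        "solvability": solvable,                    # every sub-task is assignable
        "completeness": required <= covered,        # all required skills covered
        "non_redundancy": len(subtasks) == len({t["name"] for t in subtasks}),
    }

plan = [
    {"name": "collect metrics", "skill": "analyze"},
    {"name": "implement fix", "skill": "code"},
    {"name": "write report", "skill": "write"},
]
print(validate_plan(plan, required={"analyze", "code", "write"}))
```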
arXiv Detail & Related papers (2024-10-03T04:07:51Z)
- ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems [80.69865295743149]
This work studies the use of LLM-based agents to design collaborative AI systems autonomously.
Based on ComfyBench, we develop ComfyAgent, a framework that empowers agents to autonomously design collaborative AI systems by generating workflows.
While ComfyAgent achieves a resolve rate comparable to o1-preview and significantly surpasses other agents on ComfyBench, it has resolved only 15% of creative tasks.
arXiv Detail & Related papers (2024-09-02T17:44:10Z)
- Optimizing Collaboration of LLM based Agents for Finite Element Analysis [1.5039745292757671]
This paper investigates the interactions between multiple Large Language Model (LLM)-based agents in the context of programming and coding tasks.
We utilize the AutoGen framework to facilitate communication among agents, evaluating different configurations based on the success rates from 40 random runs for each setup.
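A minimal sketch of the evaluation scaffolding implied here: estimate a success rate per configuration from repeated random runs (the paper reports 40 per setup). The multi-agent execution itself is stubbed, so nothing below depends on AutoGen.

```python
# Compare agent-team configurations by success rate over repeated runs.
import random

# Hypothetical per-configuration success probabilities (stand-ins only).
CONFIGS = {"two_agents": 0.55, "with_critic": 0.75, "with_planner": 0.70}

def run_once(config_name: str) -> bool:
    # Stand-in for one multi-agent coding run; True means the task succeeded.
    return random.random() < CONFIGS[config_name]

def success_rate(config_name: str, runs: int = 40) -> float:
    return sum(run_once(config_name) for _ in range(runs)) / runs

random.seed(0)
for name in CONFIGS:
    print(f"{name:13s} success rate over 40 runs: {success_rate(name):.2f}")
```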
arXiv Detail & Related papers (2024-08-23T23:11:08Z)
- EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms [55.77492625524141]
EvoAgent is a generic method to automatically extend specialized agents to multi-agent systems.
We show that EvoAgent can significantly enhance the task-solving capability of LLM-based agents.
arXiv Detail & Related papers (2024-06-20T11:49:23Z)
- AIOS: LLM Agent Operating System [39.59087894012381]
This paper proposes the architecture of AIOS (LLM-based AI Agent Operating System) under the context of managing LLM-based agents.
It introduces a novel architecture for serving LLM-based agents by isolating resources and LLM-specific services from agent applications into an AIOS kernel.
AIOS achieves up to 2.1x faster execution when serving agents built with various agent frameworks.
arXiv Detail & Related papers (2024-03-25T17:32:23Z)
- A Dynamic LLM-Powered Agent Network for Task-Oriented Agent Collaboration [55.35849138235116]
We propose automatically selecting a team of agents from candidates to collaborate in a dynamic communication structure toward different tasks and domains.
Specifically, we build a framework named Dynamic LLM-Powered Agent Network (DyLAN) for LLM-powered agent collaboration.
We demonstrate that DyLAN outperforms strong baselines in code generation, decision-making, general reasoning, and arithmetic reasoning tasks with moderate computational cost.
arXiv Detail & Related papers (2023-10-03T16:05:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.