Related papers: Engineering LLM Powered Multi-agent Framework for Autonomous CloudOps

Engineering LLM Powered Multi-agent Framework for Autonomous CloudOps

URL: http://arxiv.org/abs/2501.08243v1
Date: Tue, 14 Jan 2025 16:30:10 GMT
Title: Engineering LLM Powered Multi-agent Framework for Autonomous CloudOps
Authors: Kannan Parthasarathy, Karthik Vaidhyanathan, Rudra Dhar, Venkat Krishnamachari, Basil Muhammed, Adyansh Kakran, Sreemaee Akshathala, Shrikara Arun, Sumant Dubey, Mohan Veerubhotla, Amey Karan,
Abstract summary: We leveraged GenAI to develop a GenAI-based solution for autonomous CloudOps for the existing MontyCloud system.<n>We developed MOYA, a multi-agent framework that balances autonomy with the necessary human control.<n>This framework integrates various internal and external systems and is optimized for factors like task orchestration, security, and error mitigation.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Cloud Operations (CloudOps) is a rapidly growing field focused on the automated management and optimization of cloud infrastructure which is essential for organizations navigating increasingly complex cloud environments. MontyCloud Inc. is one of the major companies in the CloudOps domain that leverages autonomous bots to manage cloud compliance, security, and continuous operations. To make the platform more accessible and effective to the customers, we leveraged the use of GenAI. Developing a GenAI-based solution for autonomous CloudOps for the existing MontyCloud system presented us with various challenges such as i) diverse data sources; ii) orchestration of multiple processes; and iii) handling complex workflows to automate routine tasks. To this end, we developed MOYA, a multi-agent framework that leverages GenAI and balances autonomy with the necessary human control. This framework integrates various internal and external systems and is optimized for factors like task orchestration, security, and error mitigation while producing accurate, reliable, and relevant insights by utilizing Retrieval Augmented Generation (RAG). Evaluations of our multi-agent system with the help of practitioners as well as using automated checks demonstrate enhanced accuracy, responsiveness, and effectiveness over non-agentic approaches across complex workflows.

Related papers

SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience [71.82719117238307]
We propose SEAgent, an agentic self-evolving framework enabling computer-use agents to evolve through interactions with unfamiliar software.<n>We validate the effectiveness of SEAgent across five novel software environments within OS-World.<n>Our approach achieves a significant improvement of 23.2% in success rate, from 11.3% to 34.5%, over a competitive open-source CUA.
arXiv Detail & Related papers (2025-08-06T17:58:46Z)
Cloud Infrastructure Management in the Age of AI Agents [8.243598669679354]
We make a case for developing AI agents powered by large language models (LLMs) to automate cloud infrastructure management tasks.<n>In a preliminary study, we investigate the potential for AI agents to use different cloud/user interfaces.<n>We report takeaways on their effectiveness on different management tasks, and identify research challenges and potential solutions.
arXiv Detail & Related papers (2025-06-13T22:50:12Z)
Toward Super Agent System with Hybrid AI Routers [19.22599167969104]
Super agents can fulfill diverse user needs, such as summarization, coding, and research. To make such an agent viable for real-world deployment and accessible at scale, significant optimizations are required. This paper presents a design of the Super Agent System.
arXiv Detail & Related papers (2025-04-11T00:54:56Z)
AIOpsLab: A Holistic Framework to Evaluate AI Agents for Enabling Autonomous Clouds [12.464941027105306]
AI for IT Operations (AIOps) aims to automate complex operational tasks, such as fault localization and root cause analysis, to reduce human workload and minimize customer impact.<n>Recent advances in Large Language Models (LLMs) and AI agents are revolutionizing AIOps by enabling end-to-end and multitask automation.<n>We present AIOPSLAB, a framework that deploys microservice cloud environments, injects faults, generates workloads, and exports telemetry data but also orchestrates these components and provides interfaces for interacting with and evaluating agents.
arXiv Detail & Related papers (2025-01-12T04:17:39Z)
A Multi-AI Agent System for Autonomous Optimization of Agentic AI Solutions via Iterative Refinement and LLM-Driven Feedback Loops [3.729242965449096]
This paper introduces a framework for autonomously optimizing Agentic AI solutions across industries.<n>The framework achieves optimal performance without human input by autonomously generating and testing hypotheses.<n>Case studies show significant improvements in output quality, relevance, and actionability.
arXiv Detail & Related papers (2024-12-22T20:08:04Z)
Transforming the Hybrid Cloud for Emerging AI Workloads [81.15269563290326]
This white paper envisions transforming hybrid cloud systems to meet the growing complexity of AI workloads. The proposed framework addresses critical challenges in energy efficiency, performance, and cost-effectiveness. This joint initiative aims to establish hybrid clouds as secure, efficient, and sustainable platforms.
arXiv Detail & Related papers (2024-11-20T11:57:43Z)
AutoGLM: Autonomous Foundation Agents for GUIs [51.276965515952]
We present AutoGLM, a new series in the ChatGLM family, designed to serve as foundation agents for autonomous control of digital devices through Graphical User Interfaces (GUIs) We have developed AutoGLM as a practical foundation agent system for real-world GUI interactions. Our evaluations demonstrate AutoGLM's effectiveness across multiple domains.
arXiv Detail & Related papers (2024-10-28T17:05:10Z)
Very Large-Scale Multi-Agent Simulation in AgentScope [112.98986800070581]
We develop new features and components for AgentScope, a user-friendly multi-agent platform. We propose an actor-based distributed mechanism towards great scalability and high efficiency. We also provide a web-based interface for conveniently monitoring and managing a large number of agents.
arXiv Detail & Related papers (2024-07-25T05:50:46Z)
Building AI Agents for Autonomous Clouds: Challenges and Design Principles [17.03870042416836]
AI for IT Operations (AIOps) aims to automate complex operational tasks, like fault localization and root cause analysis, thereby reducing human intervention and customer impact. This vision paper lays the groundwork for such a framework by first framing the requirements and then discussing design decisions. We propose AIOpsLab, a prototype implementation leveraging agent-cloud-interface that orchestrates an application, injects real-time faults using chaos engineering, and interfaces with an agent to localize and resolve the faults.
arXiv Detail & Related papers (2024-07-16T20:40:43Z)
Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence [79.5316642687565]
Existing multi-agent frameworks often struggle with integrating diverse capable third-party agents. We propose the Internet of Agents (IoA), a novel framework that addresses these limitations. IoA introduces an agent integration protocol, an instant-messaging-like architecture design, and dynamic mechanisms for agent teaming and conversation flow control.
arXiv Detail & Related papers (2024-07-09T17:33:24Z)
EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms [55.77492625524141]
EvoAgent is a generic method to automatically extend specialized agents to multi-agent systems. We show that EvoAgent can significantly enhance the task-solving capability of LLM-based agents.
arXiv Detail & Related papers (2024-06-20T11:49:23Z)
AgentScope: A Flexible yet Robust Multi-Agent Platform [66.64116117163755]
AgentScope is a developer-centric multi-agent platform with message exchange as its core communication mechanism. The abundant syntactic tools, built-in agents and service functions, user-friendly interfaces for application demonstration and utility monitor, zero-code programming workstation, and automatic prompt tuning mechanism significantly lower the barriers to both development and deployment.
arXiv Detail & Related papers (2024-02-21T04:11:28Z)

This list is automatically generated from the titles and abstracts of the papers in this site.