CCA: Collaborative Competitive Agents for Image Editing
- URL: http://arxiv.org/abs/2401.13011v2
- Date: Sat, 15 Feb 2025 13:26:28 GMT
- Title: CCA: Collaborative Competitive Agents for Image Editing
- Authors: Tiankai Hang, Shuyang Gu, Dong Chen, Xin Geng, Baining Guo
- Abstract summary: This paper presents a novel generative model, Collaborative Competitive Agents (CCA).
It leverages multiple Large Language Model (LLM)-based agents to execute complex tasks.
The paper's main contributions include the introduction of a multi-agent-based generative model with controllable intermediate steps and iterative optimization.
- Score: 55.500493143796405
- Abstract: This paper presents a novel generative model, Collaborative Competitive Agents (CCA), which leverages multiple Large Language Model (LLM)-based agents to execute complex tasks. Drawing inspiration from Generative Adversarial Networks (GANs), the CCA system employs two equal-status generator agents and a discriminator agent. The generators independently process user instructions and produce results, while the discriminator evaluates the outputs and provides feedback that the generators use to reflect on and improve their results. Unlike previous generative models, our system exposes the intermediate steps of generation. This transparency allows each generator agent to learn from the other's successful executions, enabling a collaborative competition that enhances the quality and robustness of the system's results. The primary focus of this study is image editing, demonstrating the CCA's ability to handle intricate instructions robustly. The paper's main contributions include the introduction of a multi-agent-based generative model with controllable intermediate steps and iterative optimization, a detailed examination of agent relationships, and comprehensive experiments on image editing. Code is available at \href{https://github.com/TiankaiHang/CCA}{https://github.com/TiankaiHang/CCA}.
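To make the loop concrete, here is a minimal Python sketch of the collaborative-competitive cycle described in the abstract: two generator agents edit independently, a discriminator agent picks a winner and returns feedback, and both generators revise their plans while borrowing the winner's intermediate steps. The class names, placeholder strings, and dummy feedback are illustrative assumptions, not the authors' released API.

```python
# Minimal sketch of the CCA loop: generators edit independently, a discriminator
# judges and gives feedback, and the transparent plans are shared for reflection.
from dataclasses import dataclass, field

@dataclass
class EditResult:
    image: str                                   # handle to the edited image (placeholder)
    steps: list = field(default_factory=list)    # transparent intermediate tool calls

class GeneratorAgent:
    def __init__(self, name):
        self.name = name

    def generate(self, image, instruction, feedback=None, peer_steps=None):
        # A real agent would have an LLM plan an editing-tool chain here,
        # reflecting on the discriminator's feedback and the peer's steps.
        steps = [f"{self.name}: plan for '{instruction}'"]
        if feedback:
            steps.append(f"{self.name}: revise using feedback '{feedback}'")
        return EditResult(image=f"{image}::{self.name}", steps=steps)

class DiscriminatorAgent:
    def evaluate(self, instruction, result_a, result_b):
        # A real discriminator would be an LLM/VLM judge comparing both edits
        # against the instruction; here we return a dummy winner and feedback.
        return "A", {"A": "composition is fine, refine colors",
                     "B": "requested object was not removed"}

def cca_edit(image, instruction, rounds=3):
    gen_a, gen_b = GeneratorAgent("A"), GeneratorAgent("B")
    disc = DiscriminatorAgent()
    res = {"A": gen_a.generate(image, instruction),
           "B": gen_b.generate(image, instruction)}
    winner = "A"
    for _ in range(rounds):
        winner, feedback = disc.evaluate(instruction, res["A"], res["B"])
        peer_steps = res[winner].steps           # collaboration: share the winner's plan
        res["A"] = gen_a.generate(image, instruction, feedback["A"], peer_steps)
        res["B"] = gen_b.generate(image, instruction, feedback["B"], peer_steps)
    return res[winner]
```

In the actual system the generators would drive real editing tools and the discriminator would be an LLM/VLM judge; the point of the sketch is the iterative feedback-and-sharing structure.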
Related papers
- GenMAC: Compositional Text-to-Video Generation with Multi-Agent Collaboration [20.988801611785522]
We propose GenMAC, an iterative, multi-agent framework that enables compositional text-to-video generation.
The collaborative workflow includes three stages: Design, Generation, and Redesign.
To tackle diverse scenarios of compositional text-to-video generation, we design a self-routing mechanism to adaptively select the proper correction agent from a collection of correction agents, each specialized for one scenario.
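A rough sketch of how GenMAC's Design, Generation, and Redesign stages and the self-routing step could fit together; the callables (`designer`, `generator`, `router`) and the dictionary of correction agents are hypothetical stand-ins for the paper's components.

```python
# Hypothetical sketch of a Design -> Generation -> Redesign loop with self-routing.
def genmac_style_pipeline(prompt, designer, generator, router, correction_agents, rounds=2):
    plan = designer(prompt)                      # Design: decompose the compositional prompt
    video = generator(plan)                      # Generation: synthesize from the plan
    for _ in range(rounds):                      # Redesign: iterative correction
        scenario = router(prompt, video)         # self-routing picks the failure scenario
        corrector = correction_agents[scenario]  # one specialist agent per scenario
        plan = corrector(prompt, plan, video)
        video = generator(plan)
    return video

# Toy usage with dummy callables, just to show the control flow.
result = genmac_style_pipeline(
    "a red cube left of a blue sphere",
    designer=lambda p: {"layout": p},
    generator=lambda plan: f"video({plan})",
    router=lambda p, v: "attribute_binding",
    correction_agents={"attribute_binding": lambda p, plan, v: {"layout": p, "fix": "bind colors"}},
)
```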
arXiv Detail & Related papers (2024-12-05T18:56:05Z)
- Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement [117.94654815220404]
G"odel Agent is a self-evolving framework inspired by the G"odel machine.
G"odel Agent can achieve continuous self-improvement, surpassing manually crafted agents in performance, efficiency, and generalizability.
arXiv Detail & Related papers (2024-10-06T10:49:40Z)
- Textualized Agent-Style Reasoning for Complex Tasks by Multiple Round LLM Generation [49.27250832754313]
We present AgentCOT, an LLM-based autonomous agent framework.
At each step, AgentCOT selects an action and executes it to yield an intermediate result with supporting evidence.
We introduce two new strategies to enhance the performance of AgentCOT.
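As a loose illustration of AgentCOT's step-wise loop (not the paper's code), the following sketch selects an action, executes it, and records the intermediate result together with its supporting evidence until a stop condition is met; `select_action`, `execute`, and `is_final` are assumed callables.

```python
def agent_cot_loop(question, select_action, execute, is_final, max_steps=8):
    """Textualized agent-style reasoning: act, record result + evidence, repeat."""
    trace, state = [], question
    for step in range(max_steps):
        action = select_action(state, trace)       # LLM chooses the next action
        result, evidence = execute(action, state)  # run it to get an intermediate result
        trace.append({"step": step, "action": action,
                      "result": result, "evidence": evidence})
        state = result
        if is_final(result):                       # stop once the answer is reached
            break
    return state, trace
```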
arXiv Detail & Related papers (2024-09-19T02:20:06Z)
- Scaling Large-Language-Model-based Multi-Agent Collaboration [75.5241464256688]
Pioneering advancements in large language model-powered agents have underscored the design pattern of multi-agent collaboration.
Inspired by the neural scaling law, this study investigates whether a similar principle applies to increasing agents in multi-agent collaboration.
arXiv Detail & Related papers (2024-06-11T11:02:04Z)
- Concept Matching with Agent for Out-of-Distribution Detection [19.407364109506904]
We propose a new method that integrates the agent paradigm into the out-of-distribution (OOD) detection task.
Our proposed method, Concept Matching with Agent (CMA), employs neutral prompts as agents to augment the CLIP-based OOD detection process.
Our extensive experimental results showcase the superior performance of CMA over both zero-shot and training-required methods.
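One plausible reading of "neutral prompts as agents" in a CLIP-style zero-shot pipeline, offered purely as an illustration and not as the paper's implementation: similarity mass absorbed by neutral prompts lowers the confidence assigned to the in-distribution classes, which yields an OOD score. Embeddings are assumed to be precomputed.

```python
# Illustrative (assumed) scoring rule: add neutral prompts to the label set and
# treat probability mass they attract as evidence that the image is OOD.
import numpy as np

def ood_score(image_emb, class_embs, neutral_embs, temperature=0.01):
    def cosine(a, B):
        a = a / np.linalg.norm(a)
        B = B / np.linalg.norm(B, axis=1, keepdims=True)
        return B @ a
    sims = np.concatenate([cosine(image_emb, class_embs),
                           cosine(image_emb, neutral_embs)])
    probs = np.exp(sims / temperature)
    probs /= probs.sum()
    id_conf = probs[: len(class_embs)].max()   # confidence on real classes only
    return 1.0 - id_conf                        # higher means more likely OOD
```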
arXiv Detail & Related papers (2024-05-27T02:27:28Z)
- Divide and Conquer: Language Models can Plan and Self-Correct for Compositional Text-to-Image Generation [72.6168579583414]
CompAgent is a training-free approach for compositional text-to-image generation with a large language model (LLM) agent as its core.
Our approach achieves more than 10% improvement on T2I-CompBench, a comprehensive benchmark for open-world compositional T2I generation.
arXiv Detail & Related papers (2024-01-28T16:18:39Z)
- Graph Convolutional Value Decomposition in Multi-Agent Reinforcement Learning [9.774412108791218]
We propose a novel framework for value function factorization in deep reinforcement learning.
In particular, we consider the team of agents as the set of nodes of a complete directed graph.
We introduce a mixing GNN module, which is responsible for i) factorizing the team state-action value function into individual per-agent observation-action value functions, and ii) explicit credit assignment to each agent in terms of fractions of the global team reward.
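A rough PyTorch sketch of a mixing GNN over a complete agent graph: nodes exchange messages, then one head outputs per-agent observation-action values and another outputs softmax credit fractions over the team reward. The message-passing scheme and the additive mixing at the end are assumptions for illustration, not the paper's exact factorization.

```python
# Sketch of a mixing GNN: per-agent values plus credit fractions over a complete graph.
import torch
import torch.nn as nn

class MixingGNN(nn.Module):
    def __init__(self, feat_dim, hidden_dim):
        super().__init__()
        self.msg = nn.Linear(feat_dim, hidden_dim)       # message from each neighbor
        self.upd = nn.Linear(feat_dim + hidden_dim, hidden_dim)
        self.q_head = nn.Linear(hidden_dim, 1)           # per-agent Q_i(o_i, a_i)
        self.credit_head = nn.Linear(hidden_dim, 1)      # logits for reward fractions

    def forward(self, agent_feats):                      # (n_agents, feat_dim)
        n = agent_feats.size(0)
        msgs = self.msg(agent_feats)                     # (n, hidden)
        # Complete directed graph: each node aggregates all other nodes' messages.
        agg = (msgs.sum(0, keepdim=True) - msgs) / max(n - 1, 1)
        h = torch.relu(self.upd(torch.cat([agent_feats, agg], dim=-1)))
        q_i = self.q_head(h).squeeze(-1)                 # (n,) per-agent values
        credit = torch.softmax(self.credit_head(h).squeeze(-1), dim=0)  # fractions sum to 1
        q_team = q_i.sum()                               # simple additive mixing (assumption)
        return q_team, q_i, credit

# Toy usage: 4 agents with 16-dim observation-action features.
q_team, q_i, credit = MixingGNN(16, 32)(torch.randn(4, 16))
```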
arXiv Detail & Related papers (2020-10-09T18:01:01Z)
- Multi-Agent Interactions Modeling with Correlated Policies [53.38338964628494]
In this paper, we cast the multi-agent interactions modeling problem into a multi-agent imitation learning framework.
We develop a Decentralized Adversarial Imitation Learning algorithm with Correlated policies (CoDAIL).
Various experiments demonstrate that CoDAIL can better regenerate complex interactions close to the demonstrators.
arXiv Detail & Related papers (2020-01-04T17:31:53Z)