More Agents Is All You Need
- URL: http://arxiv.org/abs/2402.05120v1
- Date: Sat, 3 Feb 2024 05:55:24 GMT
- Title: More Agents Is All You Need
- Authors: Junyou Li, Qin Zhang, Yangbin Yu, Qiang Fu, Deheng Ye
- Abstract summary: We find that, simply via a sampling-and-voting method, the performance of large language models (LLMs) scales with the number of agents instantiated.
- Score: 17.564711490225612
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We find that, simply via a sampling-and-voting method, the performance of
large language models (LLMs) scales with the number of agents instantiated.
Also, this method is orthogonal to existing complicated methods to further
enhance LLMs, while the degree of enhancement is correlated to the task
difficulty. We conduct comprehensive experiments on a wide range of LLM
benchmarks to verify the presence of our finding, and to study the properties
that can facilitate its occurrence. Our code is publicly available at:
\url{https://anonymous.4open.science/r/more_agent_is_all_you_need}.
Related papers
- DARG: Dynamic Evaluation of Large Language Models via Adaptive Reasoning Graph [70.79413606968814]
We introduce Dynamic Evaluation of LLMs via Adaptive Reasoning Graph Evolvement (DARG) to dynamically extend current benchmarks with controlled complexity and diversity.
Specifically, we first extract the reasoning graphs of data points in current benchmarks and then perturb the reasoning graphs to generate novel testing data.
Such newly generated test samples can have different levels of complexity while maintaining linguistic diversity similar to the original benchmarks.
arXiv Detail & Related papers (2024-06-25T04:27:53Z) - AvaTaR: Optimizing LLM Agents for Tool-Assisted Knowledge Retrieval [93.96463520716759]
Large language model (LLM) agents have demonstrated impressive capability in utilizing external tools and knowledge to boost accuracy and reduce hallucinations.
Here, we introduce AvaTaR, a novel framework that optimize an LLM agent to effectively use the provided tools and improve its performance on a given task/domain.
We find AvaTaR consistently outperforms state-of-the-art approaches across all four challenging tasks and exhibits strong generalization ability when applied to novel cases.
arXiv Detail & Related papers (2024-06-17T04:20:02Z) - One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models [67.49462724595445]
Retrieval-augmented generation (RAG) is a promising way to improve large language models (LLMs)
We propose a novel method that involves learning scalable and pluggable virtual tokens for RAG.
arXiv Detail & Related papers (2024-05-30T03:44:54Z) - LLM-based Multi-Agent Reinforcement Learning: Current and Future Directions [8.55917897789612]
We focus on the cooperative tasks of multiple agents with a common goal and communication among them.
We also consider human-in/on-the-loop scenarios enabled by the language component in the framework.
arXiv Detail & Related papers (2024-05-17T22:10:23Z) - AgentQuest: A Modular Benchmark Framework to Measure Progress and Improve LLM Agents [19.439775106707344]
AgentQuest is a framework where benchmarks and metrics are modular and easily through well documented and easy-to-use APIs.
We offer two new evaluation metrics that can reliably track LLM agent progress while solving a task.
We exemplify the utility of the metrics on two use cases wherein we identify common failure points and refine the agent architecture to obtain a significant performance increase.
arXiv Detail & Related papers (2024-04-09T16:01:24Z) - Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs [60.40396361115776]
This paper introduces a novel collaborative approach, namely SlimPLM, that detects missing knowledge in large language models (LLMs) with a slim proxy model.
We employ a proxy model which has far fewer parameters, and take its answers as answers.
Heuristic answers are then utilized to predict the knowledge required to answer the user question, as well as the known and unknown knowledge within the LLM.
arXiv Detail & Related papers (2024-02-19T11:11:08Z) - LLMAuditor: A Framework for Auditing Large Language Models Using Human-in-the-Loop [7.77005079649294]
An effective method is to probe the Large Language Models using different versions of the same question.
To operationalize this auditing method at scale, we need an approach to create those probes reliably and automatically.
We propose the LLMAuditor framework, where one uses a different LLM along with human-in-the-loop (HIL)
This approach offers verifiability and transparency, while avoiding circular reliance on the same LLM.
arXiv Detail & Related papers (2024-02-14T17:49:31Z) - Large Language Model based Multi-Agents: A Survey of Progress and Challenges [44.92286030322281]
Large Language Models (LLMs) have achieved remarkable success across a wide array of tasks.
Recently, based on the development of using one LLM as a single planning or decision-making agent, LLM-based multi-agent systems have achieved considerable progress in complex problem-solving and world simulation.
arXiv Detail & Related papers (2024-01-21T23:36:14Z) - LaGR-SEQ: Language-Guided Reinforcement Learning with Sample-Efficient
Querying [71.86163159193327]
Large language models (LLMs) have recently demonstrated their impressive ability to provide context-aware responses via text.
This ability could potentially be used to predict plausible solutions in sequential decision making tasks pertaining to pattern completion.
We introduce LaGR, which uses this predictive ability of LLMs to propose solutions to tasks that have been partially completed by a primary reinforcement learning (RL) agent.
arXiv Detail & Related papers (2023-08-21T02:07:35Z) - Scaling Sentence Embeddings with Large Language Models [43.19994568210206]
In this work, we propose an in-context learning-based method aimed at improving sentence embeddings performance.
Our approach involves adapting the previous prompt-based representation method for autoregressive models.
By scaling model size, we find scaling to more than tens of billion parameters harms the performance on semantic textual similarity tasks.
arXiv Detail & Related papers (2023-07-31T13:26:03Z) - Do Embodied Agents Dream of Pixelated Sheep: Embodied Decision Making
using Language Guided World Modelling [101.59430768507997]
Reinforcement learning (RL) agents typically learn tabula rasa, without prior knowledge of the world.
We propose using few-shot large language models (LLMs) to hypothesize an Abstract World Model (AWM)
Our method of hypothesizing an AWM with LLMs and then verifying the AWM based on agent experience not only increases sample efficiency over contemporary methods by an order of magnitude.
arXiv Detail & Related papers (2023-01-28T02:04:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.