A Survey of Large Language Model Agents for Question Answering
- URL: http://arxiv.org/abs/2503.19213v1
- Date: Mon, 24 Mar 2025 23:39:44 GMT
- Title: A Survey of Large Language Model Agents for Question Answering
- Authors: Murong Yue
- Abstract summary: This paper surveys the development of large language model (LLM)-based agents for question answering (QA). Traditional agents face significant limitations, including substantial data requirements and difficulty in generalizing to new environments. LLM-based agents address these challenges by leveraging LLMs as their core reasoning engine.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper surveys the development of large language model (LLM)-based agents for question answering (QA). Traditional agents face significant limitations, including substantial data requirements and difficulty in generalizing to new environments. LLM-based agents address these challenges by leveraging LLMs as their core reasoning engine. These agents achieve superior QA results compared to traditional QA pipelines and naive LLM QA systems by enabling interaction with external environments. We systematically review the design of LLM agents in the context of QA tasks, organizing our discussion across key stages: planning, question understanding, information retrieval, and answer generation. Additionally, this paper identifies ongoing challenges and explores future research directions to enhance the performance of LLM agent QA systems.
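Concretely, the four stages can be read as a simple control loop. The following is a minimal sketch in Python, assuming hypothetical `call_llm` and `search` callables (the paper prescribes no specific API), of how planning, question understanding, information retrieval, and answer generation might be chained:

```python
from typing import Callable, List

def agent_qa(question: str,
             call_llm: Callable[[str], str],
             search: Callable[[str], List[str]]) -> str:
    # Stage 1: planning -- decompose the question into steps.
    plan = call_llm(f"Break this question into retrieval steps:\n{question}")
    # Stage 2: question understanding -- turn each step into a search query.
    queries = call_llm(f"Write one search query per line for these steps:\n{plan}")
    # Stage 3: information retrieval -- interact with the external environment.
    evidence = [doc for q in queries.splitlines() if q.strip() for doc in search(q)]
    # Stage 4: answer generation -- answer conditioned on retrieved evidence.
    context = "\n".join(evidence)
    return call_llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
```

Real agent systems add iteration on top of this, such as re-planning when retrieval fails, but these stage boundaries are the ones the survey uses to organize its discussion.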
Related papers
- SUNAR: Semantic Uncertainty based Neighborhood Aware Retrieval for Complex QA [2.7703990035016868]
We introduce SUNAR, a novel approach that leverages large language models to guide a Neighborhood Aware Retrieval process. We validate our approach through extensive experiments on two complex QA datasets. Our results show that SUNAR significantly outperforms existing retrieve-and-reason baselines, achieving up to a 31.84% improvement in performance.
arXiv Detail & Related papers (2025-03-23T08:50:44Z)
- AGENT-CQ: Automatic Generation and Evaluation of Clarifying Questions for Conversational Search with LLMs [53.6200736559742]
AGENT-CQ consists of two stages: a generation stage and an evaluation stage.
CrowdLLM simulates human crowdsourcing judgments to assess generated questions and answers.
Experiments on the ClariQ dataset demonstrate CrowdLLM's effectiveness in evaluating question and answer quality.
arXiv Detail & Related papers (2024-10-25T17:06:27Z)
- Seek and Solve Reasoning for Table Question Answering [49.006950918895306]
This paper reveals that the reasoning process during task simplification may be more valuable than the simplified tasks themselves. We propose a Seek-and-Solve pipeline that instructs the LLM to first seek relevant information and then answer questions. We distill a single-step TQA-solving prompt from this pipeline, using demonstrations with SS-CoT paths to guide the LLM in solving complex TQA tasks.
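As a rough illustration of the seek-then-solve idea (a sketch only; `call_llm` is a hypothetical stand-in, and the paper's actual SS-CoT prompts are not reproduced here):

```python
def seek_and_solve(table: str, question: str, call_llm) -> str:
    # Seek: have the LLM extract only the table cells relevant to the question.
    relevant = call_llm(
        f"Table:\n{table}\n\nQuestion: {question}\n"
        "List only the cells needed to answer the question."
    )
    # Solve: answer using the sought information rather than the full table.
    return call_llm(
        f"Relevant cells:\n{relevant}\n\nQuestion: {question}\n"
        "Reason step by step, then give the final answer."
    )
```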
arXiv Detail & Related papers (2024-09-09T02:41:00Z)
- Large Language Model-Based Agents for Software Engineering: A Survey [20.258244647363544]
Recent advances in Large Language Models (LLMs) have shaped a new paradigm of AI agents, i.e., LLM-based agents.
We collect 106 papers and categorize them from two perspectives, i.e., the SE and agent perspectives.
In addition, we discuss open challenges and future directions in this critical domain.
arXiv Detail & Related papers (2024-09-04T15:59:41Z)
- Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning [53.6472920229013]
Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks.
However, LLMs are prone to producing errors, hallucinations, and inconsistent statements when performing multi-step reasoning.
We introduce Q*, a framework for guiding LLMs' decoding process with deliberative planning.
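In spirit, deliberative planning frames decoding as best-first search over partial reasoning traces rather than greedy left-to-right generation. Below is a hedged sketch, where `propose_steps`, `q_value`, and `is_final` are hypothetical stand-ins for the paper's step proposer, value heuristic, and termination test:

```python
import heapq
from typing import Callable, List, Tuple

def deliberative_decode(question: str,
                        propose_steps: Callable[[str, List[str]], List[str]],
                        q_value: Callable[[str, List[str]], float],
                        is_final: Callable[[List[str]], bool],
                        max_expansions: int = 100) -> List[str]:
    # Frontier of partial reasoning traces, best estimated value first.
    frontier: List[Tuple[float, List[str]]] = [(0.0, [])]
    for _ in range(max_expansions):
        if not frontier:
            break
        _, trace = heapq.heappop(frontier)
        if trace and is_final(trace):
            return trace  # highest-value complete reasoning path found
        for step in propose_steps(question, trace):
            new_trace = trace + [step]
            # heapq is a min-heap, so negate the score for best-first order.
            heapq.heappush(frontier, (-q_value(question, new_trace), new_trace))
    return []
```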
arXiv Detail & Related papers (2024-06-20T13:08:09Z)
- Automatic Question-Answer Generation for Long-Tail Knowledge [65.11554185687258]
We propose an automatic approach to generate specialized QA datasets for tail entities.
We conduct extensive experiments by employing pretrained LLMs on our newly generated long-tail QA datasets.
arXiv Detail & Related papers (2024-03-03T03:06:31Z)
- Let LLMs Take on the Latest Challenges! A Chinese Dynamic Question Answering Benchmark [69.3415799675046]
We introduce CDQA, a Chinese Dynamic QA benchmark containing question-answer pairs related to the latest news on the Chinese Internet.
We obtain high-quality data through a pipeline that combines humans and models.
We have also evaluated and analyzed mainstream and advanced Chinese LLMs on CDQA.
arXiv Detail & Related papers (2024-02-29T15:22:13Z)
- Large Language Model based Multi-Agents: A Survey of Progress and Challenges [44.92286030322281]
Large Language Models (LLMs) have achieved remarkable success across a wide array of tasks.
Building on the use of a single LLM as a planning or decision-making agent, LLM-based multi-agent systems have recently achieved considerable progress in complex problem-solving and world simulation.
arXiv Detail & Related papers (2024-01-21T23:36:14Z)
- A Survey on Large Language Model based Autonomous Agents [105.2509166861984]
Large language models (LLMs) have demonstrated remarkable potential in achieving human-level intelligence. This paper delivers a systematic review of the field of LLM-based autonomous agents from a holistic perspective. We present a comprehensive overview of the diverse applications of LLM-based autonomous agents in the fields of social science, natural science, and engineering.
arXiv Detail & Related papers (2023-08-22T13:30:37Z)
- Enhancing Trust in LLM-Based AI Automation Agents: New Considerations and Future Challenges [2.6212127510234797]
In the field of process automation, a new generation of AI-based agents has emerged, enabling the execution of complex tasks.
This paper analyzes the main aspects of trust in AI agents discussed in existing literature, and identifies specific considerations and challenges relevant to this new generation of automation agents.
arXiv Detail & Related papers (2023-08-10T07:12:11Z)
- AgentBench: Evaluating LLMs as Agents [88.45506148281379]
Large Language Models (LLMs) are becoming increasingly smart and autonomous, targeting real-world pragmatic missions beyond traditional NLP tasks.
We present AgentBench, a benchmark that currently consists of 8 distinct environments to assess LLM-as-Agent's reasoning and decision-making abilities.
arXiv Detail & Related papers (2023-08-07T16:08:11Z)