Related papers: First Ask Then Answer: A Framework Design for AI Dialogue Based on Supplementary Questioning with Large Language Models

URL: http://arxiv.org/abs/2508.08308v1
Date: Fri, 08 Aug 2025 13:39:47 GMT
Title: First Ask Then Answer: A Framework Design for AI Dialogue Based on Supplementary Questioning with Large Language Models
Authors: Chuanruo Fu, Yuncheng Du
Abstract summary: First Ask Then Answer (FATA) generates multidimensional supplementary questions for users prior to response generation. In contrast to existing clarification approaches, FATA emphasizes completeness and user participation. Experimental results show that FATA outperforms B-Prompt by approximately 40% in aggregate metrics.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Models (LLMs) often struggle to deliver accurate and actionable answers when user-provided information is incomplete or ill-specified. We propose a new interaction paradigm, First Ask Then Answer (FATA), in which, through prompt words, LLMs are guided to proactively generate multidimensional supplementary questions for users prior to response generation. Subsequently, by integrating user-provided supplementary information with the original query through sophisticated prompting techniques, we achieve substantially improved response quality and relevance. In contrast to existing clarification approaches -- such as the CLAM framework oriented to ambiguity and the self-interrogation Self-Ask method -- FATA emphasizes completeness (beyond mere disambiguation) and user participation (inviting human input instead of relying solely on model-internal reasoning). It also adopts a single-turn strategy: all clarifying questions are produced at once, thereby reducing dialogue length and improving efficiency. Conceptually, FATA uses the reasoning power of LLMs to scaffold user expression, enabling non-expert users to formulate more comprehensive and contextually relevant queries. To evaluate FATA, we constructed a multi-domain benchmark and compared it with two controls: a baseline prompt (B-Prompt) and a context-enhanced expert prompt (C-Prompt). Experimental results show that FATA outperforms B-Prompt by approximately 40% in aggregate metrics and exhibits a coefficient of variation 8% lower than C-Prompt, indicating superior stability.
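The two-step flow described in the abstract (ask all supplementary questions in a single turn, then answer using the original query plus the user's replies) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prompt templates, the function names `ask_stage` and `answer_stage`, and the `fake_llm` stand-in are all hypothetical, and the paper's actual prompt wording is not reproduced here.

```python
from typing import Callable, List

# Hypothetical prompt templates (assumption: the paper's real prompts differ).
ASK_TEMPLATE = (
    "Before answering, list every supplementary question you need the user "
    "to answer, covering all relevant dimensions, one per line.\n"
    "User query: {query}"
)
ANSWER_TEMPLATE = (
    "Original query: {query}\n"
    "Supplementary information:\n{details}\n"
    "Now give a complete answer."
)


def ask_stage(query: str, llm: Callable[[str], str]) -> List[str]:
    """Stage 1: request all supplementary questions at once (single turn)."""
    raw = llm(ASK_TEMPLATE.format(query=query))
    # One question per non-empty line of the model output.
    return [line.strip() for line in raw.splitlines() if line.strip()]


def answer_stage(
    query: str,
    questions: List[str],
    answers: List[str],
    llm: Callable[[str], str],
) -> str:
    """Stage 2: merge the user's answers with the original query and respond."""
    if len(questions) != len(answers):
        raise ValueError("each question needs exactly one answer")
    details = "\n".join(f"Q: {q}\nA: {a}" for q, a in zip(questions, answers))
    return llm(ANSWER_TEMPLATE.format(query=query, details=details))


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call (assumption: any str -> str callable works)."""
    if prompt.startswith("Before answering"):
        return "What is your budget?\nHow many days will you travel?"
    return "ANSWER BASED ON:\n" + prompt
```

In use, `ask_stage` would be called once, the returned questions shown to the user, and the collected replies passed to `answer_stage`. Passing the model as a plain callable keeps the sketch independent of any particular LLM client.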

Related papers

ClarifyMT-Bench: Benchmarking and Improving Multi-Turn Clarification for Conversational Large Language Models [32.099137908375546]
ClarifyMT-Bench is a benchmark for multi-turn clarification in large language models (LLMs). We construct 6,120 multi-turn dialogues capturing diverse ambiguity sources and interaction patterns. We propose ClarifyAgent, an agentic approach that decomposes clarification into perception, forecasting, tracking, and planning.
arXiv Detail & Related papers (2025-12-24T11:39:00Z)
KnowMT-Bench: Benchmarking Knowledge-Intensive Long-Form Question Answering in Multi-Turn Dialogues [58.305425399644086]
Multi-Turn Long-Form Question Answering (MT-LFQA) is a key application paradigm of Large Language Models (LLMs) in knowledge-intensive domains. We introduce KnowMT-Bench, the first-ever benchmark designed to systematically evaluate MT-LFQA for LLMs across knowledge-intensive fields.
arXiv Detail & Related papers (2025-09-26T04:32:29Z)
Pathways of Thoughts: Multi-Directional Thinking for Long-form Personalized Question Answering [57.12316804290369]
Personalization is essential for adapting question answering systems to user-specific information needs. We propose Pathways of Thoughts (PoT), an inference-stage method that applies to any large language model (LLM) without requiring task-specific fine-tuning. PoT consistently outperforms competitive baselines, achieving up to a 13.1% relative improvement.
arXiv Detail & Related papers (2025-09-23T14:44:46Z)
Teaching Language Models To Gather Information Proactively [53.85419549904644]
Large language models (LLMs) are increasingly expected to function as collaborative partners. In this work, we introduce a new task paradigm: proactive information gathering. We design a scalable framework that generates partially specified, real-world tasks, masking key information. Within this setup, our core innovation is a reinforcement finetuning strategy that rewards questions that elicit genuinely new, implicit user information.
arXiv Detail & Related papers (2025-07-28T23:50:09Z)
Contextual Candor: Enhancing LLM Trustworthiness Through Hierarchical Unanswerability Detection [0.0]
This paper introduces Reinforced Unanswerability Learning (RUL), a novel hybrid training paradigm for large language models (LLMs). RUL integrates a discriminative unanswerability prediction head with the LLM's generative core, guided by a multi-stage learning strategy. Experiments demonstrate RUL's superior performance, achieving significantly higher accuracy in unanswerability detection across sentence, paragraph, and ranking levels.
arXiv Detail & Related papers (2025-06-01T17:59:27Z)
CLEAR-KGQA: Clarification-Enhanced Ambiguity Resolution for Knowledge Graph Question Answering [13.624962763072899]
KGQA systems typically assume user queries are unambiguous, an assumption that rarely holds in real-world applications. We propose a novel framework that dynamically handles both entity ambiguity (e.g., distinguishing between entities with similar names) and intent ambiguity (e.g., clarifying different interpretations of user queries) through interactive clarification.
arXiv Detail & Related papers (2025-04-13T17:34:35Z)
MARS: A Multi-Agent Framework Incorporating Socratic Guidance for Automated Prompt Optimization [30.748085697067154]
We propose a Multi-Agent framework incorporating Socratic guidance (MARS). MARS comprises seven agents, each with distinct functionalities, which autonomously use the Planner to devise an optimization path. We conduct extensive experiments on various datasets to validate the effectiveness of our method.
arXiv Detail & Related papers (2025-03-21T06:19:55Z)
Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training [33.57497419019826]
Action-Based Contrastive Self-Training (ACT) enables data-efficient dialogue policy learning in multi-turn conversation modeling. We demonstrate ACT's efficacy in data-efficient tuning scenarios, even when there is no action label available. We also propose evaluating LLMs' ability to function as conversational agents by examining whether they can implicitly recognize and reason about ambiguity in conversation.
arXiv Detail & Related papers (2024-05-31T22:44:48Z)
PICK: Polished & Informed Candidate Scoring for Knowledge-Grounded Dialogue Systems [59.1250765143521]
Current knowledge-grounded dialogue systems often fail to align the generated responses with human-preferred qualities. We propose Polished & Informed Candidate Scoring (PICK), a generation re-scoring framework. We demonstrate the effectiveness of PICK in generating responses that are more faithful while keeping them relevant to the dialogue history.
arXiv Detail & Related papers (2023-09-19T08:27:09Z)
Query-Dependent Prompt Evaluation and Optimization with Offline Inverse RL [62.824464372594576]
We aim to enhance arithmetic reasoning ability of Large Language Models (LLMs) through zero-shot prompt optimization. We identify a previously overlooked objective of query dependency in such optimization. We introduce Prompt-OIRL, which harnesses offline inverse reinforcement learning to draw insights from offline prompting demonstration data.
arXiv Detail & Related papers (2023-09-13T01:12:52Z)
Re-Reading Improves Reasoning in Large Language Models [87.46256176508376]
We introduce a simple yet general and effective prompting method, Re2, to enhance the reasoning capabilities of off-the-shelf Large Language Models (LLMs). Unlike most thought-eliciting prompting methods, such as Chain-of-Thought (CoT), Re2 shifts the focus to the input by processing questions twice, thereby enhancing the understanding process. We evaluate Re2 on extensive reasoning benchmarks across 14 datasets, spanning 112 experiments, to validate its effectiveness and generality.
arXiv Detail & Related papers (2023-09-12T14:36:23Z)
Cue-CoT: Chain-of-thought Prompting for Responding to In-depth Dialogue Questions with LLMs [59.74002011562726]
We propose a novel linguistic cue-based chain-of-thoughts (Cue-CoT) to provide a more personalized and engaging response. We build a benchmark with in-depth dialogue questions, consisting of 6 datasets in both Chinese and English. Empirical results demonstrate our proposed Cue-CoT method outperforms standard prompting methods in terms of both helpfulness and acceptability on all datasets.
arXiv Detail & Related papers (2023-05-19T16:27:43Z)

This list is automatically generated from the titles and abstracts of the papers on this site.