From Chat Logs to Collective Insights: Aggregative Question Answering
- URL: http://arxiv.org/abs/2505.23765v1
- Date: Thu, 29 May 2025 17:59:55 GMT
- Title: From Chat Logs to Collective Insights: Aggregative Question Answering
- Authors: Wentao Zhang, Woojeong Kim, Yuntian Deng
- Abstract summary: We introduce Aggregative Question Answering, a novel task requiring models to reason explicitly over thousands of user-chatbot interactions to answer aggregative queries. To enable research in this direction, we construct a benchmark, WildChat-AQA, comprising 6,027 aggregative questions derived from 182,330 real-world conversations.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Conversational agents powered by large language models (LLMs) are rapidly becoming integral to our daily interactions, generating unprecedented amounts of conversational data. Such datasets offer a powerful lens into societal interests, trending topics, and collective concerns. Yet, existing approaches typically treat these interactions as independent and miss critical insights that could emerge from aggregating and reasoning across large-scale conversation logs. In this paper, we introduce Aggregative Question Answering, a novel task requiring models to reason explicitly over thousands of user-chatbot interactions to answer aggregative queries, such as identifying emerging concerns among specific demographics. To enable research in this direction, we construct a benchmark, WildChat-AQA, comprising 6,027 aggregative questions derived from 182,330 real-world chatbot conversations. Experiments show that existing methods either struggle to reason effectively or incur prohibitive computational costs, underscoring the need for new approaches capable of extracting collective insights from large-scale conversational data.
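To make the task concrete, here is a minimal, hypothetical sketch (not the paper's method, and not WildChat-AQA's actual schema) of the kind of aggregation such queries require: counting topic mentions per demographic group across many conversation records to surface the most common concerns.

```python
# Hypothetical sketch of an aggregative query over chat logs; all field names
# and records below are illustrative assumptions, not the benchmark's format.
from collections import Counter

# Toy conversation records.
conversations = [
    {"demographic": "students", "topics": ["exam stress", "visa rules"]},
    {"demographic": "students", "topics": ["exam stress", "housing costs"]},
    {"demographic": "retirees", "topics": ["healthcare", "housing costs"]},
]

def top_concerns(records, demographic, k=2):
    """Aggregate topic counts for one demographic and return the k most common."""
    counts = Counter()
    for record in records:
        if record["demographic"] == demographic:
            counts.update(record["topics"])
    return counts.most_common(k)

print(top_concerns(conversations, "students"))
# [('exam stress', 2), ('visa rules', 1)]
```

Real aggregative questions in WildChat-AQA must be answered over far larger and noisier logs, which is why naive per-conversation processing is either ineffective or prohibitively expensive.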
Related papers
- REALTALK: A 21-Day Real-World Dataset for Long-Term Conversation [51.97224538045096]
We introduce REALTALK, a 21-day corpus of authentic messaging app dialogues.
We compare EI attributes and persona consistency to understand the challenges posed by real-world dialogues.
Our findings reveal that models struggle to simulate a user solely from dialogue history, while fine-tuning on specific user chats improves persona emulation.
arXiv Detail & Related papers (2025-02-18T20:29:01Z) - InfoQuest: Evaluating Multi-Turn Dialogue Agents for Open-Ended Conversations with Hidden Context [4.262907114077643]
Large language models excel at following explicit instructions, but they often struggle with ambiguous or incomplete user requests.
We introduce InfoQuest, a benchmark designed to evaluate how dialogue agents handle hidden context in open-ended user requests.
arXiv Detail & Related papers (2025-02-17T19:01:10Z) - ProCIS: A Benchmark for Proactive Retrieval in Conversations [21.23826888841565]
We introduce a large-scale dataset for proactive document retrieval that consists of over 2.8 million conversations.
We conduct crowdsourcing experiments to obtain high-quality and relatively complete relevance judgments.
We also collect annotations related to the parts of the conversation that are related to each document, enabling us to evaluate proactive retrieval systems.
arXiv Detail & Related papers (2024-05-10T13:11:07Z) - Generalizing Conversational Dense Retrieval via LLM-Cognition Data Augmentation [25.578440131793858]
Existing conversational dense retrieval models mostly view a conversation as a fixed sequence of questions and responses.
We propose a framework for generalizing Conversational dense retrieval via LLM-cognition data Augmentation (ConvAug)
Inspired by human cognition, we devise a cognition-aware process to mitigate the generation of false positives, false negatives, and hallucinations.
arXiv Detail & Related papers (2024-02-11T03:27:22Z) - Social Commonsense-Guided Search Query Generation for Open-Domain Knowledge-Powered Conversations [66.16863141262506]
We present a novel approach that focuses on generating internet search queries guided by social commonsense.
Our proposed framework addresses passive user interactions by integrating topic tracking, commonsense response generation and instruction-driven query generation.
arXiv Detail & Related papers (2023-10-22T16:14:56Z) - AutoConv: Automatically Generating Information-seeking Conversations with Large Language Models [74.10293412011455]
We propose AutoConv for synthetic conversation generation.
Specifically, we formulate the conversation generation problem as a language modeling task.
We finetune an LLM with a few human conversations to capture the characteristics of the information-seeking process.
arXiv Detail & Related papers (2023-08-12T08:52:40Z) - ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering [70.6359636116848]
We propose a new large-scale dataset, ConvFinQA, to study the chain of numerical reasoning in conversational question answering.
Our dataset poses a great challenge in modeling long-range, complex numerical reasoning paths in real-world conversations.
arXiv Detail & Related papers (2022-10-07T23:48:50Z) - INSCIT: Information-Seeking Conversations with Mixed-Initiative Interactions [47.90088587508672]
InSCIt is a dataset for Information-Seeking Conversations with mixed-initiative Interactions.
It contains 4.7K user-agent turns from 805 human-human conversations.
We report results of two systems based on state-of-the-art models of conversational knowledge identification and open-domain question answering.
arXiv Detail & Related papers (2022-07-02T06:18:12Z) - Disentangling Online Chats with DAG-Structured LSTMs [55.33014148383343]
DAG-LSTMs are a generalization of Tree-LSTMs that can handle directed acyclic dependencies.
We show that the proposed model achieves state-of-the-art performance on the task of recovering reply-to relations.
arXiv Detail & Related papers (2021-06-16T18:00:00Z) - Summary Grounded Conversation Generation [10.470157142861174]
We show how pre-trained language models can be used to generate entire conversations, given only a summary of a conversation as the input.
We also show that the accuracy of conversation summarization can be improved by augmenting a conversation summarization dataset with generated conversations.
arXiv Detail & Related papers (2021-06-07T04:46:31Z) - Conversations with Search Engines: SERP-based Conversational Response Generation [77.1381159789032]
We create a suitable dataset, the Search as a Conversation (SaaC) dataset, for the development of pipelines for conversations with search engines.
We also develop a state-of-the-art pipeline for conversations with search engines, Conversations with Search Engines (CaSE), using this dataset.
CaSE enhances the state of the art by introducing a supporting token identification module and a prior-aware pointer generator.
arXiv Detail & Related papers (2020-04-29T13:07:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.