CausalQuest: Collecting Natural Causal Questions for AI Agents
- URL: http://arxiv.org/abs/2405.20318v1
- Date: Thu, 30 May 2024 17:55:28 GMT
- Title: CausalQuest: Collecting Natural Causal Questions for AI Agents
- Authors: Roberto Ceraolo, Dmitrii Kharlapenko, Amélie Reymond, Rada Mihalcea, Mrinmaya Sachan, Bernhard Schölkopf, Zhijing Jin
- Abstract summary: CausalQuest is a dataset of 13,500 naturally occurring questions sourced from social networks, search engines, and AI assistants.
We formalize the definition of causal questions and establish a taxonomy for finer-grained classification.
We find that 42% of the questions humans ask are indeed causal, with the majority seeking to understand the causes behind given effects.
- Score: 95.34262362200695
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Humans have an innate drive to seek out causality. Whether fuelled by curiosity or specific goals, we constantly question why things happen, how they are interconnected, and many other related phenomena. To develop AI agents capable of addressing this natural human quest for causality, we urgently need a comprehensive dataset of natural causal questions. Unfortunately, existing datasets either contain only artificially-crafted questions that do not reflect real AI usage scenarios or have limited coverage of questions from specific sources. To address this gap, we present CausalQuest, a dataset of 13,500 naturally occurring questions sourced from social networks, search engines, and AI assistants. We formalize the definition of causal questions and establish a taxonomy for finer-grained classification. Through a combined effort of human annotators and large language models (LLMs), we carefully label the dataset. We find that 42% of the questions humans ask are indeed causal, with the majority seeking to understand the causes behind given effects. Using this dataset, we train efficient classifiers (up to 2.85B parameters) for the binary task of identifying causal questions, achieving high performance with F1 scores of up to 0.877. We conclude with a rich set of future research directions that can build upon our data and models.
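Below is a minimal sketch of the binary task described in the abstract (classifying questions as causal vs. non-causal), assuming the HuggingFace transformers/datasets stack with a small DistilBERT backbone as a stand-in; the example questions, model choice, and training settings are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch (not the authors' code): fine-tune a small text classifier
# for the paper's binary task of labeling questions as causal vs. non-causal.
# The backbone, hyperparameters, and toy data are illustrative assumptions.
from datasets import Dataset
from sklearn.metrics import f1_score
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Toy stand-in for CausalQuest's 13,500 labeled questions.
data = Dataset.from_dict({
    "text": ["Why does ice float on water?",   # causal (seeks a cause)
             "What time is it in Tokyo?"],     # non-causal
    "label": [1, 0],
})

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=64)

encoded = data.map(tokenize, batched=True)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = logits.argmax(axis=-1)
    # The paper reports F1 scores of up to 0.877 on this binary task.
    return {"f1": f1_score(labels, preds)}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="causal-question-clf",
                           num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=encoded,
    eval_dataset=encoded,  # toy reuse; use a held-out split in practice
    compute_metrics=compute_metrics,
)
trainer.train()
print(trainer.evaluate())
```

At full scale one would substitute a larger backbone (the paper trains classifiers of up to 2.85B parameters) and the real CausalQuest train/test splits.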
Related papers
- Exploring Human-LLM Conversations: Mental Models and the Originator of Toxicity [1.4003044924094596]
This study explores real-world human interactions with large language models (LLMs) in diverse, unconstrained settings.
Our findings show that although LLMs are rightfully accused of providing toxic content, it is mostly demanded or at least provoked by humans who actively seek such content.
arXiv Detail & Related papers (2024-07-08T14:20:05Z)
- Qsnail: A Questionnaire Dataset for Sequential Question Generation [76.616068047362]
We present the first dataset specifically constructed for the questionnaire generation task, which comprises 13,168 human-written questionnaires.
We conduct experiments on Qsnail, and the results reveal that questionnaires generated by retrieval models and traditional generative models do not fully align with the given research topic and intents.
Despite enhancements through the chain-of-thought prompt and finetuning, questionnaires generated by language models still fall short of human-written questionnaires.
arXiv Detail & Related papers (2024-02-22T04:14:10Z)
- A Comparative and Experimental Study on Automatic Question Answering Systems and its Robustness against Word Jumbling [0.49157446832511503]
Question answer generation is highly relevant because a frequently asked questions (FAQ) list can contain only a finite number of questions.
A model that can perform question answer generation would be able to answer entirely new questions that are within the scope of the data.
In commercial applications, it can be used to increase customer satisfaction and ease of use.
However, much of the data is generated by humans, so it is susceptible to human error, which can adversely affect the model's performance.
arXiv Detail & Related papers (2023-11-27T03:17:09Z)
- FOLLOWUPQG: Towards Information-Seeking Follow-up Question Generation [38.78216651059955]
We introduce the task of real-world information-seeking follow-up question generation (FQG).
We construct FOLLOWUPQG, a dataset of over 3K real-world (initial question, answer, follow-up question) tuples collected from a Reddit forum providing layman-friendly explanations for open-ended questions.
In contrast to existing datasets, questions in FOLLOWUPQG use more diverse pragmatic strategies to seek information, and they also show higher-order cognitive skills.
arXiv Detail & Related papers (2023-09-10T11:58:29Z)
- Overinformative Question Answering by Humans and Machines [26.31070412632125]
We show that overinformativeness in human answering is driven by considerations of relevance to the questioner's goals.
We show that GPT-3 is highly sensitive to the form of the prompt and produces human-like answer patterns only when guided by an example and a cognitively-motivated explanation.
arXiv Detail & Related papers (2023-05-11T21:41:41Z)
- WebCPM: Interactive Web Search for Chinese Long-form Question Answering [104.676752359777]
Long-form question answering (LFQA) aims at answering complex, open-ended questions with detailed, paragraph-length responses.
We introduce WebCPM, the first Chinese LFQA dataset.
We collect 5,500 high-quality question-answer pairs, together with 14,315 supporting facts and 121,330 web search actions.
arXiv Detail & Related papers (2023-05-11T14:47:29Z)
- Zero-shot Clarifying Question Generation for Conversational Search [25.514678546942754]
We propose a constrained clarifying question generation system that uses both question templates and query facets to guide effective and precise question generation.
Experiment results show that our method outperforms existing state-of-the-art zero-shot baselines by a large margin.
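As a rough illustration of the template-plus-facet idea summarized in this entry (the paper's actual system is not reproduced here; the templates and facets below are invented for the example):

```python
# Toy illustration of constrained clarifying-question generation:
# fill fixed question templates with query facets. The templates and
# facets below are invented examples, not the paper's actual resources.

TEMPLATES = [
    "Are you looking for {facet}?",
    "Do you want to know more about {facet} related to '{query}'?",
]

def clarifying_questions(query: str, facets: list[str]) -> list[str]:
    """Generate one clarifying question per (template, facet) pair."""
    return [t.format(facet=f, query=query) for t in TEMPLATES for f in facets]

# Example: disambiguating the ambiguous query "jaguar".
for q in clarifying_questions("jaguar", ["the animal", "the car brand"]):
    print(q)
```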
arXiv Detail & Related papers (2023-01-30T04:43:02Z)
- JECC: Commonsense Reasoning Tasks Derived from Interactive Fictions [75.42526766746515]
We propose a new commonsense reasoning dataset based on human players' Interactive Fiction (IF) gameplay walkthroughs.
Our dataset focuses on the assessment of functional commonsense knowledge rules rather than factual knowledge.
Experiments show that the introduced dataset is challenging for previous machine reading models as well as for new large language models.
arXiv Detail & Related papers (2022-10-18T19:20:53Z)
- ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering [70.6359636116848]
We propose a new large-scale dataset, ConvFinQA, to study the chain of numerical reasoning in conversational question answering.
Our dataset poses a great challenge for modeling long-range, complex numerical reasoning paths in real-world conversations.
arXiv Detail & Related papers (2022-10-07T23:48:50Z)
- Evaluating Mixed-initiative Conversational Search Systems via User Simulation [9.066817876491053]
We propose a conversational User Simulator, called USi, for automatic evaluation of such search systems.
We show that responses generated by USi are both in line with the underlying information need and comparable to human-generated answers.
arXiv Detail & Related papers (2022-04-17T16:27:33Z)
- A Dataset of Information-Seeking Questions and Answers Anchored in Research Papers [66.11048565324468]
We present a dataset of 5,049 questions over 1,585 Natural Language Processing papers.
Each question is written by an NLP practitioner who read only the title and abstract of the corresponding paper, and the question seeks information present in the full text.
We find that existing models that do well on other QA tasks do not perform well on answering these questions, underperforming humans by at least 27 F1 points when answering them from entire papers.
arXiv Detail & Related papers (2021-05-07T00:12:34Z)
- Inquisitive Question Generation for High Level Text Comprehension [60.21497846332531]
We introduce INQUISITIVE, a dataset of 19K questions that are elicited while a person is reading through a document.
We show that readers engage in a series of pragmatic strategies to seek information.
We evaluate question generation models based on GPT-2 and show that our model is able to generate reasonable questions.
arXiv Detail & Related papers (2020-10-04T19:03:39Z)