WikiHint: A Human-Annotated Dataset for Hint Ranking and Generation
- URL: http://arxiv.org/abs/2412.01626v3
- Date: Sun, 20 Apr 2025 19:43:24 GMT
- Title: WikiHint: A Human-Annotated Dataset for Hint Ranking and Generation
- Authors: Jamshid Mozafari, Florian Gerhold, Adam Jatowt
- Abstract summary: We first introduce a manually constructed hint dataset, WikiHint, which is based on Wikipedia and includes 5,000 hints created for 1,000 questions. We assess the effectiveness of the hints with human participants who answer questions with and without the aid of hints. Our findings show that (a) the dataset helps generate more effective hints, (b) including answer information along with questions generally improves the quality of generated hints, and (c) encoder-based models perform better than decoder-based models in hint ranking.
- Score: 15.144785147549713
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The use of Large Language Models (LLMs) has increased significantly, with users frequently asking questions of chatbots. At a time when information is readily accessible, it is crucial to stimulate and preserve human cognitive abilities and maintain strong reasoning skills. This paper addresses these challenges by promoting the use of hints as an alternative or a supplement to direct answers. We first introduce a manually constructed hint dataset, WikiHint, which is based on Wikipedia and includes 5,000 hints created for 1,000 questions. We then finetune open-source LLMs for hint generation in answer-aware and answer-agnostic contexts. We assess the effectiveness of the hints with human participants who answer questions with and without the aid of hints. Additionally, we introduce a lightweight evaluation method, HintRank, to evaluate and rank hints in both answer-aware and answer-agnostic settings. Our findings show that (a) the dataset helps generate more effective hints, (b) including answer information along with questions generally improves the quality of generated hints, and (c) encoder-based models perform better than decoder-based models in hint ranking.
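For intuition, below is a minimal sketch of encoder-based pairwise hint scoring in the spirit of finding (c). It is an assumption-based illustration using a generic off-the-shelf relevance cross-encoder from the Hugging Face Hub, not the paper's HintRank implementation.

```python
# Minimal sketch of encoder-based hint ranking (illustrating finding (c));
# NOT the paper's HintRank method. Assumes the sentence-transformers package
# and a generic relevance cross-encoder checkpoint from the Hugging Face Hub.
from sentence_transformers import CrossEncoder


def rank_hints(question: str, hints: list[str]) -> list[tuple[str, float]]:
    """Score each (question, hint) pair and return hints sorted best-first."""
    model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
    scores = model.predict([(question, hint) for hint in hints])
    return sorted(zip(hints, scores), key=lambda p: float(p[1]), reverse=True)


if __name__ == "__main__":
    question = "Which planet is known as the Red Planet?"
    hints = [
        "Its atmosphere is mostly carbon dioxide.",
        "It was named after the Roman god of war.",
        "It is the fourth planet from the Sun.",
    ]
    for hint, score in rank_hints(question, hints):
        print(f"{score:.3f}  {hint}")
```

Any similar encoder checkpoint could be swapped in; the point is only that a small cross-encoder suffices for pairwise ranking, consistent with the finding that encoder-based models outperform decoder-based ones here.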
Related papers
- HintEval: A Comprehensive Framework for Hint Generation and Evaluation for Questions [16.434748534272014]
HintEval is a Python library that makes it easy to access diverse datasets and provides multiple approaches to generate and evaluate hints.
By reducing barriers to entry, HintEval offers a major step forward for facilitating hint generation and analysis research within the NLP/IR community.
arXiv Detail & Related papers (2025-02-02T17:07:18Z)
- Optimizing Language Model's Reasoning Abilities with Weak Supervision [48.60598455782159]
We present PuzzleBen, a weakly supervised benchmark that comprises 25,147 complex questions, answers, and human-generated rationales.
A unique aspect of our dataset is the inclusion of 10,000 unannotated questions, enabling us to explore utilizing less supervised data to boost LLMs' inference capabilities.
arXiv Detail & Related papers (2024-05-07T07:39:15Z)
- TriviaHG: A Dataset for Automatic Hint Generation from Factoid Questions [20.510164413931577]
We introduce a framework for automatic hint generation for factoid questions.
We construct a novel large-scale dataset featuring 160,230 hints corresponding to 16,645 questions from the TriviaQA dataset.
To evaluate the TriviaHG dataset and the proposed evaluation method, we enlisted 10 individuals to annotate 2,791 hints and tasked 6 humans with answering questions using the provided hints.
arXiv Detail & Related papers (2024-03-27T10:27:28Z)
- KIWI: A Dataset of Knowledge-Intensive Writing Instructions for Answering Research Questions [63.307317584926146]
Large language models (LLMs) adapted to follow user instructions are now widely deployed as conversational agents.
In this work, we examine one increasingly common instruction-following task: providing writing assistance to compose a long-form answer.
We construct KIWI, a dataset of knowledge-intensive writing instructions in the scientific domain.
arXiv Detail & Related papers (2024-03-06T17:16:44Z)
- Don't Just Say "I don't know"! Self-aligning Large Language Models for Responding to Unknown Questions with Explanations [70.6395572287422]
The self-alignment method is capable of not only refusing to answer but also providing an explanation of why unknown questions are unanswerable.
We conduct disparity-driven self-curation to select qualified data for fine-tuning the LLM itself, aligning its responses to unknown questions as desired.
arXiv Detail & Related papers (2024-02-23T02:24:36Z)
- Large Language Models Meet Knowledge Graphs to Answer Factoid Questions [57.47634017738877]
We propose a method for exploring pre-trained Text-to-Text Language Models enriched with additional information from Knowledge Graphs.
We obtain easily interpretable information for Transformer-based models through the linearization of the extracted subgraphs.
Final re-ranking of the answer candidates with the extracted information boosts Hits@1 scores of the pre-trained text-to-text language models by 4-6%.
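As a rough illustration of the linearization step, here is a short sketch; the triple separators and the "question: ... context: ..." template below are assumptions, not the authors' exact scheme.

```python
# Illustrative sketch of subgraph linearization for a text-to-text model;
# the separators and input template are assumptions, not the paper's scheme.
def linearize_subgraph(triples: list[tuple[str, str, str]]) -> str:
    """Flatten (subject, relation, object) triples into a single string
    that a pre-trained text-to-text model (e.g. T5) can read as context."""
    return " ; ".join(f"{s} | {r} | {o}" for s, r, o in triples)


triples = [
    ("Barack Obama", "born in", "Honolulu"),
    ("Honolulu", "located in", "Hawaii"),
]
question = "Where was Barack Obama born?"
model_input = f"question: {question} context: {linearize_subgraph(triples)}"
print(model_input)
```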
arXiv Detail & Related papers (2023-10-03T15:57:00Z)
- What should I Ask: A Knowledge-driven Approach for Follow-up Questions Generation in Conversational Surveys [63.51903260461746]
We propose a novel task for knowledge-driven follow-up question generation in conversational surveys.
We constructed a new human-annotated dataset of human-written follow-up questions with dialogue history and labeled knowledge.
We then propose a two-staged knowledge-driven model for the task, which generates informative and coherent follow-up questions.
arXiv Detail & Related papers (2022-05-23T00:57:33Z)
- Question Answering Survey: Directions, Challenges, Datasets, Evaluation Matrices [0.0]
The research directions of the QA field are analyzed based on question type, answer type, source of evidence and answer, and modeling approach.
This is followed by a discussion of open challenges in the field, such as automatic question generation, similarity detection, and low resource availability for some languages.
arXiv Detail & Related papers (2021-12-07T08:53:40Z)
- Generating Usage-related Questions for Preference Elicitation in Conversational Recommender Systems [19.950705852361565]
We propose a novel approach to preference elicitation by asking implicit questions based on item usage.
We develop a high-quality labeled training dataset using crowdsourcing.
We show that our approaches are effective in generating elicitation questions, even with limited training data.
arXiv Detail & Related papers (2021-11-26T12:23:14Z)
- Retrieve, Program, Repeat: Complex Knowledge Base Question Answering via Alternate Meta-learning [56.771557756836906]
We present a novel method that automatically learns a retrieval model alternately with the programmer from weak supervision.
Our system leads to state-of-the-art performance on a large-scale task for complex question answering over knowledge bases.
arXiv Detail & Related papers (2020-10-29T18:28:16Z)
- Inquisitive Question Generation for High Level Text Comprehension [60.21497846332531]
We introduce INQUISITIVE, a dataset of 19K questions that are elicited while a person is reading through a document.
We show that readers engage in a series of pragmatic strategies to seek information.
We evaluate question generation models based on GPT-2 and show that our model is able to generate reasonable questions.
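As a toy illustration of GPT-2-based question generation, here is a generic generation call via Hugging Face transformers; the prompt format is an assumption, and the INQUISITIVE models are fine-tuned rather than prompted off the shelf as below.

```python
# Toy illustration of GPT-2-based question generation; the prompt format is
# an assumption, and the paper's models are fine-tuned, not prompted like this.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
passage = "The committee postponed the vote after new evidence surfaced."
prompt = f"Passage: {passage}\nA curious reader might ask:"
result = generator(prompt, max_new_tokens=20, do_sample=True, top_p=0.9)
print(result[0]["generated_text"])
```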
arXiv Detail & Related papers (2020-10-04T19:03:39Z)