No Stupid Questions: An Analysis of Question Query Generation for Citation Recommendation
- URL: http://arxiv.org/abs/2506.08196v1
- Date: Mon, 09 Jun 2025 20:13:32 GMT
- Title: No Stupid Questions: An Analysis of Question Query Generation for Citation Recommendation
- Authors: Brian D. Zimmerman, Julien Aubert-Béduchaud, Florian Boudin, Akiko Aizawa, Olga Vechtomova
- Abstract summary: GPT-4o-mini asks questions which, when answered, could expose new insights about an excerpt from a scientific article. We evaluate the utility of these questions as retrieval queries, measuring their effectiveness in retrieving and ranking masked target documents.
- Score: 29.419731388642393
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing techniques for citation recommendation are constrained by their adherence to article contents and metadata. We leverage GPT-4o-mini's latent expertise as an inquisitive assistant by instructing it to ask questions which, when answered, could expose new insights about an excerpt from a scientific article. We evaluate the utility of these questions as retrieval queries, measuring their effectiveness in retrieving and ranking masked target documents. In some cases, generated questions ended up being better queries than extractive keyword queries generated by the same model. We additionally propose MMR-RBO, a variation of Maximal Marginal Relevance (MMR) using Rank-Biased Overlap (RBO) to identify which questions will perform competitively with the keyword baseline. As all question queries yield unique result sets, we contend that there are no stupid questions.
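Two pieces of the abstract lend themselves to a concrete illustration: Rank-Biased Overlap (RBO) between two ranked result lists, and an MMR-style selection rule that uses RBO as its redundancy term. The sketch below shows one way such a combination could look; the function names, the lambda trade-off, the fixed-depth RBO lower bound, and the per-question relevance scores are our own assumptions for illustration, not the paper's implementation.

```python
from typing import Dict, List

def rbo(list_a: List[str], list_b: List[str], p: float = 0.9) -> float:
    """Rank-Biased Overlap (Webber et al., 2010), truncated at the shorter list's depth.

    Agreement at depth d is |A[:d] & B[:d]| / d; depths are weighted geometrically by p.
    This is the fixed-depth lower bound, not the extrapolated RBO_ext.
    """
    depth = min(len(list_a), len(list_b))
    seen_a, seen_b = set(), set()
    overlap, score = 0, 0.0
    for d in range(1, depth + 1):
        a, b = list_a[d - 1], list_b[d - 1]
        # Grow both prefixes by one item and count the new intersection members.
        if a == b:
            overlap += 1
        else:
            overlap += (a in seen_b) + (b in seen_a)
        seen_a.add(a)
        seen_b.add(b)
        score += (p ** (d - 1)) * (overlap / d)
    return (1 - p) * score

def mmr_rbo_select(
    relevance: Dict[str, float],     # hypothetical per-question relevance scores
    rankings: Dict[str, List[str]],  # result list retrieved by each question query
    k: int = 5,
    lam: float = 0.7,
) -> List[str]:
    """MMR-style selection where redundancy between two candidate questions is the RBO
    of their result lists: pick questions that score well but retrieve different documents."""
    selected: List[str] = []
    candidates = set(relevance)
    while candidates and len(selected) < k:
        def mmr_score(q: str) -> float:
            redundancy = max((rbo(rankings[q], rankings[s]) for s in selected), default=0.0)
            return lam * relevance[q] - (1 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

The intuition behind this combination: if two questions retrieve near-identical result lists, their RBO is high, so only the higher-relevance one earns a place in the selected set, while questions with distinctive result sets are penalised little. That fits the abstract's observation that every question query yields a unique result set.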
Related papers
- MinosEval: Distinguishing Factoid and Non-Factoid for Tailored Open-Ended QA Evaluation with LLMs [15.278241998033822]
Open-ended question answering (QA) is a key task for evaluating the capabilities of large language models (LLMs). We propose MinosEval, a novel evaluation method that first distinguishes factoid from non-factoid open-ended questions and then ranks candidate answers.
arXiv Detail & Related papers (2025-06-18T07:49:13Z)
- I Could've Asked That: Reformulating Unanswerable Questions [89.93173151422636]
We evaluate open-source and proprietary models for reformulating unanswerable questions.
GPT-4 and Llama2-7B successfully reformulate questions only 26% and 12% of the time, respectively.
We publicly release the benchmark and the code to reproduce the experiments.
arXiv Detail & Related papers (2024-07-24T17:59:07Z)
- Auto FAQ Generation [0.0]
We propose a system for generating FAQ documents that extracts the salient questions and their corresponding answers from sizeable text documents.
We use existing text summarization, sentence ranking via the TextRank algorithm, and question-generation tools to create an initial set of questions and answers.
arXiv Detail & Related papers (2024-05-13T03:30:27Z)
- CLARINET: Augmenting Language Models to Ask Clarification Questions for Retrieval [52.134133938779776]
We present CLARINET, a system that asks informative clarification questions by choosing questions whose answers would maximize certainty in the correct candidate.
Our approach works by augmenting a large language model (LLM) to condition on a retrieval distribution, finetuning end-to-end to generate the question that would have maximized the rank of the true candidate at each turn.
arXiv Detail & Related papers (2024-04-28T18:21:31Z)
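The CLARINET entry above describes choosing the clarification question whose answer would maximize certainty about the correct candidate. Below is a minimal, generic sketch of that decision rule as expected entropy reduction; the `answers_for` simulator, the Bayes update, and the belief representation are our assumptions for illustration, whereas the actual system finetunes an LLM conditioned on a retrieval distribution rather than running this explicit loop.

```python
import math
from typing import Callable, Dict, List, Tuple

# A "belief" is a probability distribution over candidate documents.
Belief = Dict[str, float]
# For one question: a list of (answer probability, per-candidate answer likelihood) pairs.
AnswerModel = Callable[[str], List[Tuple[float, Dict[str, float]]]]

def entropy(belief: Belief) -> float:
    """Shannon entropy of the candidate distribution; lower means more certain."""
    return -sum(p * math.log(p) for p in belief.values() if p > 0)

def bayes_update(prior: Belief, likelihood: Dict[str, float]) -> Belief:
    """Re-weight candidates by how well each explains the observed answer."""
    unnorm = {c: prior[c] * likelihood.get(c, 1e-9) for c in prior}
    z = sum(unnorm.values()) or 1e-9
    return {c: v / z for c, v in unnorm.items()}

def pick_clarification_question(prior: Belief, questions: List[str], answers_for: AnswerModel) -> str:
    """Return the question whose simulated answers leave the lowest expected posterior
    entropy over candidates, i.e. the question expected to make us most certain."""
    def expected_entropy(q: str) -> float:
        return sum(p_ans * entropy(bayes_update(prior, lik)) for p_ans, lik in answers_for(q))
    return min(questions, key=expected_entropy)
```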
- Researchy Questions: A Dataset of Multi-Perspective, Decompositional Questions for LLM Web Agents [22.023543164141504]
We present Researchy Questions, a dataset of search engine queries tediously filtered to be non-factoid, "decompositional" and multi-perspective.
We show that users spend "a lot of effort" on these questions in terms of signals like clicks and session length.
We also show that "slow thinking" answering techniques, like decomposition into sub-questions, show benefit over answering directly.
arXiv Detail & Related papers (2024-02-27T21:27:16Z)
- Answering Ambiguous Questions with a Database of Questions, Answers, and Revisions [95.92276099234344]
We present a new state-of-the-art for answering ambiguous questions that exploits a database of unambiguous questions generated from Wikipedia.
Our method improves performance by 15% on recall measures and 10% on measures which evaluate disambiguating questions from predicted outputs.
arXiv Detail & Related papers (2023-08-16T20:23:16Z)
- Allies: Prompting Large Language Model with Beam Search [107.38790111856761]
In this work, we propose a novel method called ALLIES.
Given an input query, ALLIES leverages LLMs to iteratively generate new queries related to the original query.
By iteratively refining and expanding the scope of the original query, ALLIES captures and utilizes hidden knowledge that may not be directly accessible through retrieval.
arXiv Detail & Related papers (2023-05-24T06:16:44Z)
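The Allies entry above describes iteratively generating related queries and keeping the most promising ones, i.e. a beam search over reformulations. The sketch below is a minimal, assumption-laden rendering of such a loop: `generate_related_queries`, `retrieve`, and `score` are hypothetical stand-ins for an LLM call, a retriever, and an evidence scorer, and the beam width and depth are illustrative defaults rather than values from the paper.

```python
from typing import Callable, Dict, List, Tuple

def iterative_query_expansion(
    query: str,
    generate_related_queries: Callable[[str, List[str]], List[str]],  # hypothetical LLM wrapper
    retrieve: Callable[[str], List[str]],                             # hypothetical retriever
    score: Callable[[str, List[str]], float],                         # hypothetical evidence scorer
    beam_width: int = 3,
    depth: int = 2,
) -> List[Tuple[str, float]]:
    """Beam search over generated reformulations of the original query.

    At each step, every query on the beam is expanded into related queries; all
    candidates are scored by the evidence they retrieve, and only the top
    `beam_width` survive to the next round.
    """
    beam = [(query, score(query, retrieve(query)))]
    for _ in range(depth):
        candidates = list(beam)
        for q, _ in beam:
            # Pass the current beam so the generator can avoid repeating existing queries.
            for new_q in generate_related_queries(q, [c for c, _ in beam]):
                candidates.append((new_q, score(new_q, retrieve(new_q))))
        # Deduplicate by query text, keeping the best score, then prune to the beam width.
        dedup: Dict[str, float] = {}
        for q, s in candidates:
            dedup[q] = max(s, dedup.get(q, float("-inf")))
        beam = sorted(dedup.items(), key=lambda c: c[1], reverse=True)[:beam_width]
    return beam
```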
- RQUGE: Reference-Free Metric for Evaluating Question Generation by Answering the Question [29.18544401904503]
We propose a new metric, RQUGE, based on the answerability of the candidate question given the context.
We demonstrate that RQUGE has a higher correlation with human judgment without relying on the reference question.
arXiv Detail & Related papers (2022-11-02T21:10:09Z)
- GooAQ: Open Question Answering with Diverse Answer Types [63.06454855313667]
We present GooAQ, a large-scale dataset with a variety of answer types.
This dataset contains over 5 million questions and 3 million answers collected from Google.
arXiv Detail & Related papers (2021-04-18T05:40:39Z)
- MS-Ranker: Accumulating Evidence from Potentially Correct Candidates for Answer Selection [59.95429407899612]
We propose a novel reinforcement learning based multi-step ranking model, named MS-Ranker.
We explicitly consider the potential correctness of candidates and update the evidence with a gating mechanism.
Our model significantly outperforms existing methods that do not rely on external resources.
arXiv Detail & Related papers (2020-10-10T10:36:58Z)
- Break It Down: A Question Understanding Benchmark [79.41678884521801]
We introduce a Question Decomposition Representation Meaning (QDMR) for questions.
QDMR constitutes the ordered list of steps, expressed through natural language, that are necessary for answering a question.
We release the Break dataset, containing over 83K pairs of questions and their QDMRs.
arXiv Detail & Related papers (2020-01-31T11:04:52Z)