Putting People in LLMs' Shoes: Generating Better Answers via Question Rewriter
- URL: http://arxiv.org/abs/2408.10573v1
- Date: Tue, 20 Aug 2024 06:24:47 GMT
- Title: Putting People in LLMs' Shoes: Generating Better Answers via Question Rewriter
- Authors: Junhao Chen, Bowen Wang, Zhouqiang Jiang, Yuta Nakashima
- Abstract summary: We introduce single-round instance-level prompt optimization, referred to as question rewriter.
By enhancing the intelligibility of human questions for black-box LLMs, our question rewriter improves the quality of generated answers.
- Score: 17.736962215696366
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have demonstrated significant capabilities, particularly in the domain of question answering (QA). However, their effectiveness in QA is often undermined by the vagueness of user questions. To address this issue, we introduce single-round instance-level prompt optimization, referred to as question rewriter. By enhancing the intelligibility of human questions for black-box LLMs, our question rewriter improves the quality of generated answers. The rewriter is optimized using direct preference optimization based on feedback collected from automatic criteria for evaluating generated answers; therefore, its training does not require costly human annotations. The experiments across multiple black-box LLMs and long-form question answering (LFQA) datasets demonstrate the efficacy of our method. This paper provides a practical framework for training question rewriters and sets a precedent for future explorations in prompt optimization within LFQA tasks. Code is available at https://github.com/3244we/Question-Rewriter.
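The training recipe in the abstract can be summarized concretely. Below is a minimal sketch (not the authors' released code; the model name, the `black_box_llm`, `automatic_criterion`, `sample_rewrites`, and `train_questions` helpers, and the exact trl API are assumptions) of how such a rewriter could be preference-tuned purely from automatic answer scores:

```python
# Hedged sketch of the recipe described above: sample several rewrites per
# question, score each rewrite by the quality of the answer it elicits from the
# black-box LLM under an automatic criterion, pair the best and worst rewrites,
# and fine-tune the rewriter with direct preference optimization (DPO).
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

REWRITER = "meta-llama/Llama-2-7b-hf"  # placeholder rewriter backbone

def answer_quality(question: str, rewrite: str) -> float:
    # Hypothetical helpers: send the rewrite to the black-box LLM and score the
    # resulting long-form answer with an automatic criterion (no human labels).
    answer = black_box_llm(rewrite)               # hypothetical API call
    return automatic_criterion(question, answer)  # hypothetical automatic metric

def build_preference_pairs(questions, sample_rewrites, k: int = 4) -> Dataset:
    rows = []
    for q in questions:
        rewrites = sample_rewrites(q, k)  # k stochastic rewrites of the same question
        ranked = sorted(rewrites, key=lambda r: answer_quality(q, r))
        rows.append({"prompt": q, "chosen": ranked[-1], "rejected": ranked[0]})
    return Dataset.from_list(rows)

tokenizer = AutoTokenizer.from_pretrained(REWRITER)
model = AutoModelForCausalLM.from_pretrained(REWRITER)
pairs = build_preference_pairs(train_questions, sample_rewrites)  # hypothetical inputs

trainer = DPOTrainer(
    model=model,
    args=DPOConfig(output_dir="rewriter-dpo", beta=0.1),
    train_dataset=pairs,
    tokenizer=tokenizer,  # newer trl versions take processing_class= instead
)
trainer.train()
```

At inference time the tuned rewriter is applied once per question (single-round, instance-level), and the rewritten question is sent to the black-box LLM in place of the original.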
Related papers
- Prompt Optimization with Human Feedback [69.95991134172282]
We study the problem of prompt optimization with human feedback (POHF)
We introduce our algorithm named automated POHF (APOHF)
The results demonstrate that our APOHF can efficiently find a good prompt using a small number of preference feedback instances.
arXiv Detail & Related papers (2024-05-27T16:49:29Z) - Aligning LLMs through Multi-perspective User Preference Ranking-based Feedback for Programming Question Answering [16.394601658945625]
Code Community Question Answering (CCQA) seeks to tackle programming-related issues, thereby boosting productivity in software engineering and academic research.
Recent advancements in Reinforcement Learning from Human Feedback (RLHF) have transformed the fine-tuning process of Large Language Models (LLMs) to produce responses that closely mimic human behavior.
We propose a novel framework called Aligning LLMs through Multi-perspective User Preference Ranking-based Feedback for Programming Question Answering (ALMupQA) to create user-focused responses.
arXiv Detail & Related papers (2024-05-27T14:21:31Z) - Crafting Interpretable Embeddings by Asking LLMs Questions [89.49960984640363]
Large language models (LLMs) have rapidly improved text embeddings for a growing array of natural-language processing tasks.
We introduce question-answering embeddings (QA-Emb), embeddings where each feature represents an answer to a yes/no question asked to an LLM.
We use QA-Emb to flexibly generate interpretable models for predicting fMRI voxel responses to language stimuli.
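A minimal sketch of the QA-Emb idea as summarized above, with an illustrative question list and a hypothetical `ask_llm` helper standing in for whatever LLM API is used:

```python
# Each embedding dimension is an LLM's yes/no answer to a natural-language
# question about the input text, so every feature is directly interpretable.
from typing import Callable, List

QUESTIONS = [
    "Does the text mention a person?",
    "Is the text about food?",
    "Does the text describe movement?",
]

def qa_embed(text: str, ask_llm: Callable[[str], str],
             questions: List[str] = QUESTIONS) -> List[float]:
    vec = []
    for q in questions:
        reply = ask_llm(f"Text: {text}\nQuestion: {q}\nAnswer yes or no.")
        vec.append(1.0 if reply.strip().lower().startswith("yes") else 0.0)
    return vec  # vec[i] is the answer to questions[i]
```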
arXiv Detail & Related papers (2024-05-26T22:30:29Z) - S-EQA: Tackling Situational Queries in Embodied Question Answering [48.43453390717167]
We present and tackle the problem of Embodied Question Answering with Situational Queries (S-EQA) in a household environment.
We first introduce a novel Prompt-Generate-Evaluate scheme that wraps around an LLM's output to create a dataset of unique situational queries and corresponding consensus object information.
We report a 15.31% improvement in accuracy when using queries framed from the generated object consensus for Visual Question Answering (VQA), compared to directly answering situational queries.
arXiv Detail & Related papers (2024-05-08T00:45:20Z) - Enhancing Answer Selection in Community Question Answering with Pre-trained and Large Language Models [0.9065034043031668]
We first propose the Question-Answer cross attention networks (QAN) with pre-trained models for answer selection.
We then utilize a large language model (LLM) to perform answer selection with knowledge augmentation.
Experiments show that the QAN model achieves state-of-the-art performance on two datasets, SemEval2015 and SemEval2017.
arXiv Detail & Related papers (2023-11-29T10:24:50Z) - Rescue: Ranking LLM Responses with Partial Ordering to Improve Response Generation [28.89786334298637]
We develop a novel method to optimize LLMs using ranking metrics.
Rather than a traditional full ordering, we advocate for a partial ordering.
We test our system's improved response generation ability using benchmark datasets.
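As an illustration of what a partial ordering buys over a full one, a pairwise margin loss can simply skip response pairs whose relative quality is unknown. The sketch below shows that general idea only; it is not Rescue's exact objective.

```python
# Illustrative pairwise margin loss over a *partial* ordering of candidate
# responses: only pairs with a known "better than" relation contribute.
from typing import List, Tuple
import torch

def partial_order_ranking_loss(scores: torch.Tensor,
                               better_than: List[Tuple[int, int]],
                               margin: float = 1.0) -> torch.Tensor:
    # scores: model scores for each candidate response, shape (num_candidates,)
    # better_than: pairs (i, j) meaning candidate i is known to beat candidate j;
    # unlisted pairs are incomparable and are simply skipped.
    losses = [torch.clamp(margin - (scores[i] - scores[j]), min=0.0)
              for i, j in better_than]
    return torch.stack(losses).mean()

# Toy usage: candidates 0 and 2 both beat candidate 1; 0 vs 2 is left unranked.
scores = torch.tensor([2.1, 0.3, 1.5], requires_grad=True)
loss = partial_order_ranking_loss(scores, better_than=[(0, 1), (2, 1)])
loss.backward()
```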
arXiv Detail & Related papers (2023-11-15T17:27:14Z) - Improving Zero-shot Visual Question Answering via Large Language Models with Reasoning Question Prompts [22.669502403623166]
We present Reasoning Question Prompts for VQA tasks, which can further activate the potential of Large Language Models.
We generate self-contained questions as reasoning question prompts via an unsupervised question edition module.
Each reasoning question prompt clearly indicates the intent of the original question.
Then, the candidate answers, together with their confidence scores, are fed into the LLMs.
arXiv Detail & Related papers (2023-11-15T15:40:46Z) - SQUARE: Automatic Question Answering Evaluation using Multiple Positive and Negative References [73.67707138779245]
We propose a new evaluation metric: SQuArE (Sentence-level QUestion AnsweRing Evaluation)
We evaluate SQuArE on both sentence-level extractive (Answer Selection) and generative (GenQA) QA systems.
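The summary does not give the metric's formula; the sketch below only illustrates the general idea of scoring a candidate against multiple positive and negative references with a pluggable sentence-level similarity, and is not the published SQuArE formulation.

```python
# Illustrative multi-reference scoring: reward closeness to good references and
# penalize closeness to bad ones. The similarity function is a placeholder.
from typing import Callable, List

def multi_reference_score(candidate: str, positives: List[str], negatives: List[str],
                          similarity: Callable[[str, str], float]) -> float:
    pos = max(similarity(candidate, r) for r in positives)
    neg = max(similarity(candidate, r) for r in negatives)
    return pos - neg  # higher when near positive references, lower when near negative ones

def token_f1(a: str, b: str) -> float:
    # Simple illustrative similarity: token-overlap F1.
    ta, tb = set(a.lower().split()), set(b.lower().split())
    overlap = len(ta & tb)
    if not ta or not tb or overlap == 0:
        return 0.0
    p, r = overlap / len(ta), overlap / len(tb)
    return 2 * p * r / (p + r)

score = multi_reference_score("the capital of france is paris",
                              positives=["paris is the capital of france"],
                              negatives=["the capital of france is lyon"],
                              similarity=token_f1)
```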
arXiv Detail & Related papers (2023-09-21T16:51:30Z) - Query Rewriting for Retrieval-Augmented Large Language Models [139.242907155883]
Large Language Models (LLMs) play the role of powerful, black-box readers in the retrieve-then-read pipeline.
This work introduces a new framework, Rewrite-Retrieve-Read, which replaces the previous retrieve-then-read pipeline for retrieval-augmented LLMs.
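A minimal sketch of such a pipeline as described, with hypothetical `llm` and `search` callables standing in for the LLM API and the retriever:

```python
# Rewrite-Retrieve-Read in miniature: rewrite the query first, retrieve with the
# rewrite, then let the reader LLM answer the original question from the context.
from typing import Callable, List

def rewrite_retrieve_read(query: str,
                          llm: Callable[[str], str],
                          search: Callable[[str], List[str]],
                          top_k: int = 3) -> str:
    rewrite = llm(f"Rewrite this question as a better search query: {query}")
    context = "\n".join(search(rewrite)[:top_k])
    return llm(f"Answer the question using the context.\n"
               f"Context:\n{context}\nQuestion: {query}")
```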
arXiv Detail & Related papers (2023-05-23T17:27:50Z) - Read before Generate! Faithful Long Form Question Answering with Machine Reading [77.17898499652306]
Long-form question answering (LFQA) aims to generate a paragraph-length answer for a given question.
We propose a new end-to-end framework that jointly models answer generation and machine reading.
arXiv Detail & Related papers (2022-03-01T10:41:17Z) - Exploring Question-Specific Rewards for Generating Deep Questions [42.243227323241584]
We design three different rewards that aim to improve the fluency, relevance, and answerability of generated questions.
We find that optimizing question-specific rewards generally leads to better performance in automatic evaluation metrics.
arXiv Detail & Related papers (2020-11-02T16:37:30Z)