Ask Again, Then Fail: Large Language Models' Vacillations in Judgment
- URL: http://arxiv.org/abs/2310.02174v5
- Date: Tue, 11 Jun 2024 15:22:07 GMT
- Title: Ask Again, Then Fail: Large Language Models' Vacillations in Judgment
- Authors: Qiming Xie, Zengzhi Wang, Yi Feng, Rui Xia,
- Abstract summary: We observe that current conversational language models often waver in their judgments when faced with follow-up questions.
We introduce a textscFollow-up Questioning Mechanism along with two metrics to quantify this inconsistency.
We develop a training-based framework textscUnwavering-FQ that teaches language models to maintain their originally correct judgments.
- Score: 28.74246375289661
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We observe that current conversational language models often waver in their judgments when faced with follow-up questions, even if the original judgment was correct. This wavering presents a significant challenge for generating reliable responses and building user trust. To comprehensively assess this issue, we introduce a \textsc{Follow-up Questioning Mechanism} along with two metrics to quantify this inconsistency, confirming its widespread presence in current language models. To mitigate this issue, we explore various prompting strategies for closed-source models; moreover, we develop a training-based framework \textsc{Unwavering-FQ} that teaches language models to maintain their originally correct judgments through synthesized high-quality preference data. Our experimental results confirm the effectiveness of our framework and its ability to enhance the general capabilities of models.
Related papers
- Preference Learning for AI Alignment: a Causal Perspective [55.2480439325792]
We frame this problem in a causal paradigm, providing the rich toolbox of causality to identify persistent challenges.<n>Inheriting from the literature of causal inference, we identify key assumptions necessary for reliable generalisation.<n>We illustrate failure modes of naive reward models and demonstrate how causally-inspired approaches can improve model robustness.
arXiv Detail & Related papers (2025-06-06T10:45:42Z) - LINGOLY-TOO: Disentangling Memorisation from Reasoning with Linguistic Templatisation and Orthographic Obfuscation [1.2576388595811496]
We introduce a framework for producing linguistic reasoning problems that reduces the effect of memorisation in model performance estimates.
We apply this framework to develop LINGOLY-TOO, a challenging benchmark for linguistic reasoning.
arXiv Detail & Related papers (2025-03-04T19:57:47Z) - Pointwise Mutual Information as a Performance Gauge for Retrieval-Augmented Generation [78.28197013467157]
We show that the pointwise mutual information between a context and a question is an effective gauge for language model performance.
We propose two methods that use the pointwise mutual information between a document and a question as a gauge for selecting and constructing prompts that lead to better performance.
arXiv Detail & Related papers (2024-11-12T13:14:09Z) - Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning [84.94709351266557]
We focus on the trustworthiness of language models with respect to retrieval augmentation.
We deem that retrieval-augmented language models have the inherent capabilities of supplying response according to both contextual and parametric knowledge.
Inspired by aligning language models with human preference, we take the first step towards aligning retrieval-augmented language models to a status where it responds relying merely on the external evidence.
arXiv Detail & Related papers (2024-10-22T09:25:21Z) - Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation [12.921225188504643]
We propose a novel Uncertainty-aware Reward Model (URM) that introduces a robust uncertainty estimation for the quality of paired responses.
Empirical results demonstrate significant benefits of incorporating the proposed proxy into language model training.
arXiv Detail & Related papers (2024-05-10T12:14:11Z) - Evidence from counterfactual tasks supports emergent analogical reasoning in large language models [3.9189409002585562]
We report evidence that large language models are capable of solving a wide range of text-based analogy problems in a zero-shot manner.
Two recent commentaries have challenged these results, citing evidence from so-called counterfactual' tasks in which the standard sequence of the alphabet is arbitrarily permuted.
Here, we reply to these critiques, clarifying some misunderstandings about the test materials used in our original work, and presenting evidence that language models are also capable of generalizing to these new counterfactual task variants.
arXiv Detail & Related papers (2024-04-14T21:51:02Z) - Calibrating the Confidence of Large Language Models by Eliciting Fidelity [52.47397325111864]
Large language models optimized with techniques like RLHF have achieved good alignment in being helpful and harmless.
Post-alignment, these language models often exhibit overconfidence, where the expressed confidence does not accurately calibrate with their correctness rate.
We propose a plug-and-play method to estimate the confidence of language models.
arXiv Detail & Related papers (2024-04-03T11:36:12Z) - Fine-tuning Language Models for Factuality [96.5203774943198]
Large pre-trained language models (LLMs) have led to their widespread use, sometimes even as a replacement for traditional search engines.
Yet language models are prone to making convincing but factually inaccurate claims, often referred to as 'hallucinations'
In this work, we fine-tune language models to be more factual, without human labeling.
arXiv Detail & Related papers (2023-11-14T18:59:15Z) - Improving the Reliability of Large Language Models by Leveraging
Uncertainty-Aware In-Context Learning [76.98542249776257]
Large-scale language models often face the challenge of "hallucination"
We introduce an uncertainty-aware in-context learning framework to empower the model to enhance or reject its output in response to uncertainty.
arXiv Detail & Related papers (2023-10-07T12:06:53Z) - Knowledge-Based Counterfactual Queries for Visual Question Answering [0.0]
We propose a systematic method for explaining the behavior and investigating the robustness of VQA models through counterfactual perturbations.
For this reason, we exploit structured knowledge bases to perform deterministic, optimal and controllable word-level replacements targeting the linguistic modality.
We then evaluate the model's response against such counterfactual inputs.
arXiv Detail & Related papers (2023-03-05T08:00:30Z) - Realistic Conversational Question Answering with Answer Selection based
on Calibrated Confidence and Uncertainty Measurement [54.55643652781891]
Conversational Question Answering (ConvQA) models aim at answering a question with its relevant paragraph and previous question-answer pairs that occurred during conversation multiple times.
We propose to filter out inaccurate answers in the conversation history based on their estimated confidences and uncertainties from the ConvQA model.
We validate our models, Answer Selection-based realistic Conversation Question Answering, on two standard ConvQA datasets.
arXiv Detail & Related papers (2023-02-10T09:42:07Z) - Towards Improving Faithfulness in Abstractive Summarization [37.19777407790153]
We propose a Faithfulness Enhanced Summarization model (FES) to improve fidelity in abstractive summarization.
Our model outperforms strong baselines in experiments on CNN/DM and XSum.
arXiv Detail & Related papers (2022-10-04T19:52:09Z) - Learning from Lexical Perturbations for Consistent Visual Question
Answering [78.21912474223926]
Existing Visual Question Answering (VQA) models are often fragile and sensitive to input variations.
We propose a novel approach to address this issue based on modular networks, which creates two questions related by linguistic perturbations.
We also present VQA Perturbed Pairings (VQA P2), a new, low-cost benchmark and augmentation pipeline to create controllable linguistic variations.
arXiv Detail & Related papers (2020-11-26T17:38:03Z) - Knowledge-Grounded Dialogue Generation with Pre-trained Language Models [74.09352261943911]
We study knowledge-grounded dialogue generation with pre-trained language models.
We propose equipping response generation defined by a pre-trained language model with a knowledge selection module.
arXiv Detail & Related papers (2020-10-17T16:49:43Z) - InfoBERT: Improving Robustness of Language Models from An Information
Theoretic Perspective [84.78604733927887]
Large-scale language models such as BERT have achieved state-of-the-art performance across a wide range of NLP tasks.
Recent studies show that such BERT-based models are vulnerable facing the threats of textual adversarial attacks.
We propose InfoBERT, a novel learning framework for robust fine-tuning of pre-trained language models.
arXiv Detail & Related papers (2020-10-05T20:49:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.