Reasoning about Ambiguous Definite Descriptions
- URL: http://arxiv.org/abs/2310.14657v1
- Date: Mon, 23 Oct 2023 07:52:38 GMT
- Title: Reasoning about Ambiguous Definite Descriptions
- Authors: Stefan F. Schouten, Peter Bloem, Ilia Markov, Piek Vossen
- Abstract summary: Natural language reasoning plays an important role in improving language models' ability to solve complex language understanding tasks.
No resources exist to evaluate how well Large Language Models can use explicit reasoning to resolve ambiguity in language.
We propose to use ambiguous definite descriptions for this purpose and create and publish the first benchmark dataset consisting of such phrases.
- Score: 2.5398014196797605
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Natural language reasoning plays an increasingly important role in improving
language models' ability to solve complex language understanding tasks. An
interesting use case for reasoning is the resolution of context-dependent
ambiguity. But no resources exist to evaluate how well Large Language Models
can use explicit reasoning to resolve ambiguity in language. We propose to use
ambiguous definite descriptions for this purpose and create and publish the
first benchmark dataset consisting of such phrases. Our method includes all
information required to resolve the ambiguity in the prompt, which means a
model does not require anything but reasoning to do well. We find this to be a
challenging task for recent LLMs. Code and data available at:
https://github.com/sfschouten/exploiting-ambiguity
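The key design choice in the abstract is that the prompt itself carries every fact needed to resolve the ambiguous definite description, so only reasoning is tested. The abstract does not spell out the dataset schema or prompt wording, so the Python sketch below is purely illustrative: the example item, field names, prompt template, and evaluation harness are assumptions for exposition, not the format of the released benchmark (see the repository above for the actual data).

```python
from typing import Callable

# Hypothetical prompt template: the context states all facts needed to pick
# out the intended referent of the ambiguous definite description.
PROMPT_TEMPLATE = (
    "Context: {context}\n\n"
    'Question: In the sentence "{sentence}", who does "{description}" refer to?\n'
    "Answer with exactly one name from the context."
)

# Hand-written illustrative item (not from the released dataset): "the winner
# of the race" is ambiguous between the announced winner (Alice) and the
# actual, officially ruled winner (Bob); the context disambiguates in favor of Bob.
example = {
    "context": (
        "Alice was announced as the winner of the race. "
        "Later, the judges discovered a timing error and ruled that Bob had in fact finished first. "
        "Only the officially ruled winner was allowed to give the post-race interview."
    ),
    "sentence": "The winner of the race gave an interview.",
    "description": "the winner of the race",
    "gold_answer": "Bob",
}


def evaluate(item: dict, query_model: Callable[[str], str]) -> bool:
    """Build the prompt, query a model, and score the answer by exact match."""
    prompt = PROMPT_TEMPLATE.format(
        context=item["context"],
        sentence=item["sentence"],
        description=item["description"],
    )
    answer = query_model(prompt)
    # A real evaluation would likely normalise or parse the answer more carefully.
    return answer.strip().lower() == item["gold_answer"].lower()


if __name__ == "__main__":
    # Trivial stand-in "model" that always answers with the first-mentioned name,
    # just to show the harness runs end to end; swap in a real LLM call here.
    def naive_model(prompt: str) -> str:
        return "Alice"

    print(evaluate(example, naive_model))  # False: the naive guess misses the intended referent
```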
Related papers
- Understanding and Mitigating Language Confusion in LLMs [76.96033035093204]
We evaluate 15 typologically diverse languages with existing and newly-created English and multilingual prompts.
We find that Llama Instruct and Mistral models exhibit high degrees of language confusion.
We find that language confusion can be partially mitigated via few-shot prompting, multilingual SFT and preference tuning.
arXiv Detail & Related papers (2024-06-28T17:03:51Z)
- Zero and Few-shot Semantic Parsing with Ambiguous Inputs [45.285508941560295]
We introduce AmP, a framework, dataset, and challenge for translating ambiguous natural language to formal representations like logic and code.
Using AmP, we investigate how several few-shot text-to-code systems handle ambiguity, introducing three new metrics.
We find that large pre-trained models perform poorly at capturing the distribution of possible meanings without deliberate instruction.
arXiv Detail & Related papers (2023-06-01T15:46:36Z)
- We're Afraid Language Models Aren't Modeling Ambiguity [136.8068419824318]
Managing ambiguity is a key part of human language understanding.
We characterize ambiguity in a sentence by its effect on entailment relations with another sentence.
We show that a multilabel NLI model can flag political claims in the wild that are misleading due to ambiguity.
arXiv Detail & Related papers (2023-04-27T17:57:58Z)
- Large Language Models Can Be Easily Distracted by Irrelevant Context [29.315230178997002]
We investigate how model problem-solving accuracy can be influenced by irrelevant context.
We use this benchmark to measure the distractibility of cutting-edge prompting techniques for large language models.
arXiv Detail & Related papers (2023-01-31T20:48:57Z)
- Python Code Generation by Asking Clarification Questions [57.63906360576212]
In this work, we introduce a novel and more realistic setup for this task.
We hypothesize that the under-specification of a natural language description can be resolved by asking clarification questions.
We collect and introduce a new dataset named CodeClarQA, containing pairs of natural language descriptions and code together with synthetic clarification questions and answers.
arXiv Detail & Related papers (2022-12-19T22:08:36Z)
- APOLLO: A Simple Approach for Adaptive Pretraining of Language Models for Logical Reasoning [73.3035118224719]
We propose APOLLO, an adaptively pretrained language model that has improved logical reasoning abilities.
APOLLO performs comparably on ReClor and outperforms baselines on LogiQA.
arXiv Detail & Related papers (2022-12-19T07:40:02Z)
- The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs [26.118193748582197]
We evaluate four categories of widely used state-of-the-art models.
We find that, despite being evaluated only on utterances requiring a binary inference, models in three of these categories perform close to random.
These results suggest that certain fine-tuning strategies are far better at inducing pragmatic understanding in models.
arXiv Detail & Related papers (2022-10-26T19:04:23Z)
- Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not well-represent natural language semantics.
arXiv Detail & Related papers (2022-10-14T02:35:19Z)
- Provable Limitations of Acquiring Meaning from Ungrounded Form: What will Future Language Models Understand? [87.20342701232869]
We investigate the abilities of ungrounded systems to acquire meaning.
We study whether assertions enable a system to emulate representations preserving semantic relations like equivalence.
We find that assertions enable semantic emulation if all expressions in the language are referentially transparent.
However, if the language uses non-transparent patterns like variable binding, we show that emulation can become an uncomputable problem.
arXiv Detail & Related papers (2021-04-22T01:00:17Z)