Incremental Comprehension of Garden-Path Sentences by Large Language Models: Semantic Interpretation, Syntactic Re-Analysis, and Attention
- URL: http://arxiv.org/abs/2405.16042v1
- Date: Sat, 25 May 2024 03:36:13 GMT
- Title: Incremental Comprehension of Garden-Path Sentences by Large Language Models: Semantic Interpretation, Syntactic Re-Analysis, and Attention
- Authors: Andrew Li, Xianle Feng, Siddhant Narang, Austin Peng, Tianle Cai, Raj Sanjay Shah, Sashank Varma
- Abstract summary: We investigate the processing of garden-path sentences and the fate of lingering misinterpretations using four large language models.
The overall goal is to evaluate whether humans and LLMs are aligned in their processing of garden-path sentences.
Experiments show promising alignment between humans and LLMs in the processing of garden-path sentences.
- Score: 11.073959609358088
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: When reading temporarily ambiguous garden-path sentences, misinterpretations sometimes linger past the point of disambiguation. This phenomenon has traditionally been studied in psycholinguistic experiments using online measures such as reading times and offline measures such as comprehension questions. Here, we investigate the processing of garden-path sentences and the fate of lingering misinterpretations using four large language models (LLMs): GPT-2, LLaMA-2, Flan-T5, and RoBERTa. The overall goal is to evaluate whether humans and LLMs are aligned in their processing of garden-path sentences and in the lingering misinterpretations past the point of disambiguation, especially when extra-syntactic information (e.g., a comma delimiting a clause boundary) is present to guide processing. We address this goal using 24 garden-path sentences that have optional transitive and reflexive verbs leading to temporary ambiguities. For each sentence, there is a pair of comprehension questions corresponding to the misinterpretation and the correct interpretation. In three experiments, we (1) measure the dynamic semantic interpretations of LLMs using the question-answering task; (2) track whether these models shift their implicit parse tree at the point of disambiguation (or by the end of the sentence); and (3) visualize the model components that attend to disambiguating information when processing the question probes. These experiments show promising alignment between humans and LLMs in the processing of garden-path sentences, especially when extra-syntactic information is available to guide processing.
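To make experiment (1)'s question-answering probe concrete, here is a minimal sketch of how such incremental probing might be implemented. It is not the authors' released code: the example sentence, the comprehension question, the prompt template, and the comparison of "Yes"/"No" next-token log-probabilities are all illustrative assumptions.

```python
# A minimal sketch of an incremental question-answering probe, assuming
# GPT-2 via HuggingFace transformers. Sentence, question, and prompt
# template are illustrative stand-ins, not the paper's exact protocol.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# Hypothetical garden-path item: "hunted" is optionally transitive, so
# "the deer" is temporarily parsed as its object until "ran" disambiguates.
sentence = "While the man hunted the deer ran into the woods."
question = "Did the man hunt the deer?"  # probes the misinterpretation

def answer_logprobs(prefix: str) -> dict:
    """Score 'Yes' vs. 'No' as the next token after a prompt built from
    an incremental sentence prefix plus the comprehension question."""
    prompt = f"{prefix}\nQuestion: {question}\nAnswer:"
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(input_ids).logits[0, -1]  # next-token logits
    logprobs = torch.log_softmax(logits, dim=-1)
    return {ans: logprobs[tokenizer.encode(" " + ans)[0]].item()
            for ans in ("Yes", "No")}

# Probe the interpretation word by word, through the disambiguating verb.
words = sentence.split()
for i in range(1, len(words) + 1):
    prefix = " ".join(words[:i])
    scores = answer_logprobs(prefix)
    print(f"{prefix!r}: Yes={scores['Yes']:.2f} No={scores['No']:.2f}")
```

Under this setup, human-like processing would show the preference for "Yes" rising while "the deer" is attached as the object of "hunted" and only partially receding after the disambiguating "ran", mirroring a lingering misinterpretation.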
Related papers
- A Multi-Task Text Classification Pipeline with Natural Language Explanations: A User-Centric Evaluation in Sentiment Analysis and Offensive Language Identification in Greek Tweets [8.846643533783205]
This work introduces an early concept for a novel pipeline that can be used in text classification tasks.
It comprises two models: a classifier that labels the text and an explanation generator that provides the explanation.
Experiments are centred around the tasks of sentiment analysis and offensive language identification in Greek tweets.
arXiv Detail & Related papers (2024-10-14T08:41:31Z) - Categorical Syllogisms Revisited: A Review of the Logical Reasoning Abilities of LLMs for Analyzing Categorical Syllogism [62.571419297164645]
This paper provides a systematic overview of prior works on the logical reasoning ability of large language models for analyzing categorical syllogisms.
We first investigate all the possible variations of categorical syllogisms from a purely logical perspective.
We then examine the underlying configurations (i.e., mood and figure) tested by the existing datasets.
arXiv Detail & Related papers (2024-06-26T21:17:20Z) - Crafting Interpretable Embeddings by Asking LLMs Questions [89.49960984640363]
Large language models (LLMs) have rapidly improved text embeddings for a growing array of natural-language processing tasks.
We introduce question-answering embeddings (QA-Emb), embeddings where each feature represents an answer to a yes/no question asked to an LLM.
We use QA-Emb to flexibly generate interpretable models for predicting fMRI voxel responses to language stimuli (a minimal sketch of the QA-Emb construction appears after this list).
arXiv Detail & Related papers (2024-05-26T22:30:29Z) - Do Pre-Trained Language Models Detect and Understand Semantic Underspecification? Ask the DUST! [4.1970767174840455]
We study whether pre-trained language models (LMs) correctly identify and interpret underspecified sentences.
Our experiments show that when interpreting underspecified sentences, LMs exhibit little uncertainty, contrary to what theoretical accounts of underspecification would predict.
arXiv Detail & Related papers (2024-02-19T19:49:29Z) - Clarify When Necessary: Resolving Ambiguity Through Interaction with LMs [58.620269228776294]
We propose a task-agnostic framework for resolving ambiguity by asking users clarifying questions.
We evaluate systems across three NLP applications: question answering, machine translation and natural language inference.
We find that the proposed method, intent-sim, is robust, demonstrating improvements across a wide range of NLP tasks and LMs.
arXiv Detail & Related papers (2023-11-16T00:18:50Z) - Are Representations Built from the Ground Up? An Empirical Examination of Local Composition in Language Models [91.3755431537592]
Representing compositional and non-compositional phrases is critical for language understanding.
We first formulate a problem of predicting the LM-internal representations of longer phrases given those of their constituents.
While we would expect the predictive accuracy to correlate with human judgments of semantic compositionality, we find this is largely not the case.
arXiv Detail & Related papers (2022-10-07T14:21:30Z) - The Language Model Understood the Prompt was Ambiguous: Probing Syntactic Uncertainty Through Generation [23.711953448400514]
We inspect the extent to which neural language models (LMs) exhibit uncertainty over competing syntactic analyses of temporarily ambiguous input.
We find that LMs can track multiple analyses simultaneously.
As a response to disambiguating cues, the LMs often select the correct interpretation, but occasional errors point to potential areas of improvement.
arXiv Detail & Related papers (2021-09-16T10:27:05Z) - Perturbing Inputs for Fragile Interpretations in Deep Natural Language Processing [18.91129968022831]
Interpretability methods need to be robust for trustworthy NLP applications in high-stakes areas like medicine or finance.
Our paper demonstrates how interpretations can be manipulated by making simple word perturbations on an input text.
arXiv Detail & Related papers (2021-08-11T02:07:21Z) - Narrative Incoherence Detection [76.43894977558811]
We propose the task of narrative incoherence detection as a new arena for inter-sentential semantic understanding.
Given a multi-sentence narrative, the task is to decide whether there are any semantic discrepancies in the narrative flow.
arXiv Detail & Related papers (2020-12-21T07:18:08Z) - Is Supervised Syntactic Parsing Beneficial for Language Understanding? An Empirical Investigation [71.70562795158625]
Traditional NLP has long held (supervised) syntactic parsing to be necessary for successful higher-level semantic language understanding (LU).
The recent advent of end-to-end neural models, self-supervised via language modeling (LM), and their success on a wide range of LU tasks call this belief into question.
We empirically investigate the usefulness of supervised parsing for semantic LU in the context of LM-pretrained transformer networks.
arXiv Detail & Related papers (2020-08-15T21:03:36Z) - SLAM-Inspired Simultaneous Contextualization and Interpreting for Incremental Conversation Sentences [0.0]
We propose a method to dynamically estimate the context and interpretations of polysemous words in sequential sentences.
By using the SCAIN algorithm, we can sequentially optimize the interdependence between context and word interpretation while obtaining new interpretations online.
arXiv Detail & Related papers (2020-05-29T16:40:27Z)
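As referenced above, the following is a minimal sketch of the QA-Emb construction from "Crafting Interpretable Embeddings by Asking LLMs Questions". The question list, prompt template, and the use of GPT-2 yes/no log-odds are illustrative assumptions, not that paper's exact protocol.

```python
# A minimal sketch of question-answering embeddings (QA-Emb): each
# feature is the model's answer to one yes/no question about the text.
# The questions and prompt format below are hypothetical examples.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

QUESTIONS = [  # hypothetical probe questions
    "Does the text mention an animal?",
    "Is the text about motion?",
    "Does the text describe a person?",
]

def qa_embed(text: str) -> torch.Tensor:
    """One feature per question: log-odds of 'Yes' over 'No'."""
    yes_id = tokenizer.encode(" Yes")[0]
    no_id = tokenizer.encode(" No")[0]
    feats = []
    for q in QUESTIONS:
        prompt = f"Text: {text}\nQuestion: {q}\nAnswer:"
        ids = tokenizer(prompt, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = model(ids).logits[0, -1]  # next-token logits
        feats.append((logits[yes_id] - logits[no_id]).item())
    return torch.tensor(feats)

print(qa_embed("The deer ran into the woods."))  # 3-dim embedding
```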