Ensemble Kalman filter for uncertainty in human language comprehension
- URL: http://arxiv.org/abs/2505.02590v1
- Date: Mon, 05 May 2025 11:56:12 GMT
- Title: Ensemble Kalman filter for uncertainty in human language comprehension
- Authors: Diksha Bhandari, Alessandro Lopopolo, Milena Rabovsky, Sebastian Reich
- Abstract summary: We propose a Bayesian framework for sentence comprehension, applying an extension of the ensemble Kalman filter (EnKF) for Bayesian inference to quantify uncertainty. By framing language comprehension as a Bayesian inverse problem, this approach enhances the SG model's ability to reflect human sentence processing with respect to the representation of uncertainty.
- Score: 39.781091151259766
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Artificial neural networks (ANNs) are widely used in modeling sentence processing but often exhibit deterministic behavior, contrasting with human sentence comprehension, which manages uncertainty during ambiguous or unexpected inputs. This is exemplified by reversal anomalies (sentences with unexpected role reversals that challenge syntax and semantics), highlighting the limitations of traditional ANN models, such as the Sentence Gestalt (SG) Model. To address these limitations, we propose a Bayesian framework for sentence comprehension, applying an extension of the ensemble Kalman filter (EnKF) for Bayesian inference to quantify uncertainty. By framing language comprehension as a Bayesian inverse problem, this approach enhances the SG model's ability to reflect human sentence processing with respect to the representation of uncertainty. Numerical experiments and comparisons with maximum likelihood estimation (MLE) demonstrate that Bayesian methods improve uncertainty representation, enabling the model to better approximate human cognitive processing when dealing with linguistic ambiguities.
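As a rough illustration of the core idea, the sketch below implements a generic perturbed-observation ensemble Kalman update of the kind such methods build on: an ensemble of parameter vectors is nudged toward an observation through empirical covariances, and the ensemble spread carries the uncertainty estimate that an MLE point estimate lacks. This is a minimal sketch under stated assumptions, not the paper's specific EnKF extension or the SG model; the forward map `G`, the noise covariance `R`, and all function names are illustrative.

```python
import numpy as np

def enkf_update(theta, G, y, R, rng):
    """One perturbed-observation ensemble Kalman update.

    theta: (J, p) ensemble of parameter vectors
    G:     forward map taking the ensemble to (J, d) predictions
    y:     (d,) observed data (e.g., an encoding of the current input)
    R:     (d, d) observation-noise covariance
    """
    J = theta.shape[0]
    g = G(theta)                           # (J, d) ensemble predictions
    dth = theta - theta.mean(axis=0)       # parameter anomalies
    dg = g - g.mean(axis=0)                # prediction anomalies
    C_tg = dth.T @ dg / (J - 1)            # (p, d) cross-covariance
    C_gg = dg.T @ dg / (J - 1)             # (d, d) prediction covariance
    K = C_tg @ np.linalg.inv(C_gg + R)     # (p, d) Kalman gain
    # Perturbing the observations keeps the posterior ensemble spread
    # consistent with the Bayesian update instead of collapsing it.
    y_pert = y + rng.multivariate_normal(np.zeros(y.size), R, size=J)
    return theta + (y_pert - g) @ K.T      # updated (J, p) ensemble

# Toy usage: linear forward map, one assimilation step.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 2))
ens = rng.normal(size=(100, 3))
ens = enkf_update(ens, lambda th: th @ A, np.array([1.0, -0.5]),
                  0.1 * np.eye(2), rng)
```

In a sequential-comprehension setting, one would presumably apply such an update once per incoming word; the variance of the ensemble's predictions then serves as a measure of the model's uncertainty about the unfolding sentence.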
Related papers
- Enhancing Uncertainty Estimation and Interpretability via Bayesian Non-negative Decision Layer [55.66973223528494]
We develop a Bayesian Non-negative Decision Layer (BNDL), which reformulates deep neural networks as a conditional Bayesian non-negative factor analysis. BNDL can model complex dependencies and provide robust uncertainty estimation. We also offer theoretical guarantees that BNDL can achieve effective disentangled learning.
arXiv Detail & Related papers (2025-05-28T10:23:34Z)
- A statistically consistent measure of semantic uncertainty using Language Models [3.4933610074113464]
We propose a novel measure of semantic uncertainty, semantic spectral entropy, that is statistically consistent under mild assumptions. This measure is implemented through a straightforward algorithm that relies solely on standard, pretrained language models.
arXiv Detail & Related papers (2025-02-01T17:55:58Z)
- On Subjective Uncertainty Quantification and Calibration in Natural Language Generation [2.622066970118316]
Applications of large language models often involve the generation of free-form responses, in which case uncertainty quantification becomes challenging.
This work addresses these challenges from the perspective of Bayesian decision theory.
We discuss how this framing enables principled quantification of the model's subjective uncertainty and its calibration.
The proposed methods can be applied to black-box language models.
arXiv Detail & Related papers (2024-06-07T18:54:40Z)
- Tuning-Free Accountable Intervention for LLM Deployment -- A Metacognitive Approach [55.613461060997004]
Large Language Models (LLMs) have catalyzed transformative advances across a spectrum of natural language processing tasks.
We propose an innovative metacognitive approach, dubbed CLEAR, to equip LLMs with capabilities for self-aware error identification and correction.
arXiv Detail & Related papers (2024-03-08T19:18:53Z)
- Uncertainty Quantification for Forward and Inverse Problems of PDEs via Latent Global Evolution [110.99891169486366]
We propose a method that integrates efficient and precise uncertainty quantification into a deep learning-based surrogate model.
Our method endows deep learning-based surrogate models with robust and efficient uncertainty quantification capabilities for both forward and inverse problems.
Our method excels at propagating uncertainty over extended auto-regressive rollouts, making it suitable for scenarios involving long-term predictions.
arXiv Detail & Related papers (2024-02-13T11:22:59Z)
- Modeling Uncertainty in Personalized Emotion Prediction with Normalizing Flows [6.32047610997385]
This work proposes a novel approach to capture the uncertainty of the forecast using conditional Normalizing Flows.
We validated our method on three challenging, subjective NLP tasks, including emotion recognition and hate speech detection.
The uncertainty information provided by the developed methods makes it possible to build hybrid models whose effectiveness surpasses that of classic solutions.
arXiv Detail & Related papers (2023-12-10T23:21:41Z)
- CUE: An Uncertainty Interpretation Framework for Text Classifiers Built on Pre-Trained Language Models [28.750894873827068]
We propose a novel framework, called CUE, which aims to interpret uncertainties inherent in the predictions of PLM-based models.
By comparing the difference in predictive uncertainty between the perturbed and the original text representations, we are able to identify the latent dimensions responsible for uncertainty.
arXiv Detail & Related papers (2023-06-06T11:37:46Z)
- Uncertainty-Aware Natural Language Inference with Stochastic Weight Averaging [8.752563431501502]
This paper introduces Bayesian uncertainty modeling using Stochastic Weight Averaging-Gaussian (SWAG) in Natural Language Understanding (NLU) tasks.
We demonstrate the effectiveness of the method in terms of prediction accuracy and correlation with human annotation disagreements.
arXiv Detail & Related papers (2023-04-10T17:37:23Z)
- Interpretable Social Anchors for Human Trajectory Forecasting in Crowds [84.20437268671733]
We propose a neural network-based system to predict human trajectories in crowds.
We learn interpretable rule-based intents, and then utilise the expressibility of neural networks to model the scene-specific residual.
Our architecture is tested on the interaction-centric benchmark TrajNet++.
arXiv Detail & Related papers (2021-05-07T09:22:34Z)
- Mechanisms for Handling Nested Dependencies in Neural-Network Language Models and Humans [75.15855405318855]
We studied whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing.
Although the network was solely trained to predict the next word in a large corpus, analysis showed the emergence of specialized units that successfully handled local and long-distance syntactic agreement.
We tested the model's predictions in a behavioral experiment where humans detected violations in number agreement in sentences with systematic variations in the singular/plural status of multiple nouns.
arXiv Detail & Related papers (2020-06-19T12:00:05Z)
- Discrete Variational Attention Models for Language Generation [51.88612022940496]
We propose a discrete variational attention model with a categorical distribution over the attention mechanism, owing to the discrete nature of language.
Thanks to this discreteness, training the proposed approach does not suffer from posterior collapse.
arXiv Detail & Related papers (2020-04-21T05:49:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.