Representing and Computing Uncertainty in Phonological Reconstruction
- URL: http://arxiv.org/abs/2310.12727v1
- Date: Thu, 19 Oct 2023 13:27:42 GMT
- Title: Representing and Computing Uncertainty in Phonological Reconstruction
- Authors: Johann-Mattis List, Nathan W. Hill, Robert Forkel, Frederic Blum
- Abstract summary: Despite the inherently fuzzy nature of reconstructions in historical linguistics, most scholars do not represent their uncertainty when proposing proto-forms.
We present a new framework that allows for the representation of uncertainty in linguistic reconstruction and also includes a workflow for the computation of fuzzy reconstructions from linguistic data.
- Score: 5.284425534494986
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the inherently fuzzy nature of reconstructions in historical
linguistics, most scholars do not represent their uncertainty when proposing
proto-forms. With the increasing success of recently proposed approaches to
automating certain aspects of the traditional comparative method, the formal
representation of proto-forms has also improved. This formalization makes it
possible to address both the representation and the computation of uncertainty.
Building on recent advances in supervised phonological reconstruction, during
which an algorithm learns how to reconstruct words in a given proto-language
relying on previously annotated data, and inspired by improved methods for
automated word prediction from cognate sets, we present a new framework that
allows for the representation of uncertainty in linguistic reconstruction and
also includes a workflow for the computation of fuzzy reconstructions from
linguistic data.
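As a rough illustration of what such a fuzzy reconstruction could look like in code, the sketch below aggregates several hypothetical, pre-aligned candidate proto-forms into per-segment distributions over proto-sounds; the data and helper names are invented and do not reproduce the authors' actual workflow.

```python
"""Minimal sketch: representing a fuzzy proto-form as per-segment
candidate distributions.  All data below are invented for illustration."""

from collections import Counter

# Hypothetical candidate reconstructions for one cognate set, e.g. produced
# by several runs (or cross-validation folds) of a supervised reconstruction
# model.  Each candidate is a list of proto-segments, already aligned.
candidates = [
    ["k", "a", "m"],
    ["g", "a", "m"],
    ["k", "a", "m"],
    ["k", "o", "m"],
]

def fuzzy_reconstruction(candidates):
    """Aggregate aligned candidate proto-forms into per-position
    distributions over proto-segments."""
    fuzzy = []
    for column in zip(*candidates):  # assumes equal-length, aligned candidates
        counts = Counter(column)
        total = sum(counts.values())
        fuzzy.append({segment: n / total for segment, n in counts.items()})
    return fuzzy

def render(fuzzy, threshold=0.0):
    """Render a fuzzy proto-form, listing alternative segments per slot."""
    slots = []
    for dist in fuzzy:
        ranked = sorted(dist.items(), key=lambda kv: -kv[1])
        slots.append("|".join(f"{s}:{p:.2f}" for s, p in ranked if p > threshold))
    return " ".join(slots)

if __name__ == "__main__":
    print(render(fuzzy_reconstruction(candidates)))
    # e.g. "k:0.75|g:0.25 a:0.75|o:0.25 m:1.00"
```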
Related papers
- Improved Neural Protoform Reconstruction via Reflex Prediction [11.105362395278142]
We argue that not only should protoforms be inferable from cognate sets (sets of related reflexes) but the reflexes should also be inferable from the protoforms.
We propose a system in which candidate protoforms from a reconstruction model are reranked by a reflex prediction model.
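A minimal sketch of the reranking idea follows; both scoring functions are placeholders for trained models, and the interpolation weight is an assumption rather than the paper's setup.

```python
"""Sketch of protoform reranking by reflex prediction.  The two scoring
functions are stand-ins for trained models; nothing here reproduces the
paper's actual systems."""

def reconstruction_score(protoform, cognate_set):
    """Placeholder: log-probability of `protoform` given the cognate set
    under a (hypothetical) reconstruction model."""
    raise NotImplementedError

def reflex_score(protoform, reflex, language):
    """Placeholder: log-probability of the attested `reflex` in `language`
    given `protoform` under a (hypothetical) reflex-prediction model."""
    raise NotImplementedError

def rerank(candidates, cognate_set, alpha=0.5):
    """Combine both directions -- protoform given reflexes and reflexes
    given protoform -- interpolated with weight `alpha`."""
    scored = []
    for protoform in candidates:
        forward = reconstruction_score(protoform, cognate_set)
        backward = sum(
            reflex_score(protoform, reflex, language)
            for language, reflex in cognate_set.items()
        )
        scored.append((alpha * forward + (1 - alpha) * backward, protoform))
    return [p for _, p in sorted(scored, key=lambda t: t[0], reverse=True)]
```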
arXiv Detail & Related papers (2024-03-27T17:13:38Z)
- On Robustness of Prompt-based Semantic Parsing with Large Pre-trained Language Model: An Empirical Study on Codex [48.588772371355816]
This paper presents the first empirical study on the adversarial robustness of a large prompt-based language model of code, Codex.
Our results demonstrate that the state-of-the-art (SOTA) code-language models are vulnerable to carefully crafted adversarial examples.
arXiv Detail & Related papers (2023-01-30T13:21:00Z)
- Reconstruction Probing [7.647452554776166]
We propose a new analysis method for contextualized representations based on reconstruction probabilities in masked language models.
We find that contextualization boosts the reconstructability of tokens close to the token being reconstructed in terms of linear and syntactic distance.
We extend our analysis to finer decomposition of contextualized representations, and we find that these boosts are largely attributable to static and positional embeddings at the input layer.
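A small sketch of how a token's reconstruction probability can be read off a masked language model with the Hugging Face transformers library; the model choice and the single-wordpiece simplification are assumptions made for illustration.

```python
"""Sketch: probability of reconstructing a masked token from its context
with a masked language model.  Model choice and the one-token-per-word
simplification are assumptions for illustration."""

import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def reconstruction_probability(sentence: str, target: str) -> float:
    """Mask one occurrence of `target` and return the probability the model
    assigns to restoring it (assumes `target` is a single wordpiece)."""
    masked = sentence.replace(target, tokenizer.mask_token, 1)
    inputs = tokenizer(masked, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
    probs = logits[0, mask_pos].softmax(dim=-1)
    target_id = tokenizer.convert_tokens_to_ids(target)
    return probs[target_id].item()

print(reconstruction_probability("the cat sat on the mat", "cat"))
```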
arXiv Detail & Related papers (2022-12-21T06:22:03Z)
- Neural Unsupervised Reconstruction of Protolanguage Word Forms [34.66200889614538]
We present a state-of-the-art neural approach to the unsupervised reconstruction of ancient word forms.
We extend previous work with neural models that can capture more complicated phonological and morphological changes.
arXiv Detail & Related papers (2022-11-16T05:38:51Z)
- Autoregressive Structured Prediction with Language Models [73.11519625765301]
We describe an approach to model structures as sequences of actions in an autoregressive manner with PLMs.
Our approach achieves the new state-of-the-art on all the structured prediction tasks we looked at.
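As a rough illustration of treating structures as action sequences, a labeled-span structure can be linearized into a bracketed target string that an autoregressive model generates token by token; the encoding below is invented for illustration and is not the paper's scheme.

```python
"""Sketch: linearizing a labeled-span structure into an action sequence that
an autoregressive language model could be trained to generate.  The bracket
encoding is invented for illustration."""

def linearize(tokens, spans):
    """spans: list of (start, end, label) with `end` exclusive."""
    actions = []
    for i, tok in enumerate(tokens):
        for start, end, label in spans:
            if i == start:
                actions.append(f"[{label}")   # action: open a labeled span
        actions.append(tok)                   # action: copy the next token
        for start, end, label in spans:
            if i == end - 1:
                actions.append("]")           # action: close the span
    return " ".join(actions)

tokens = ["Barack", "Obama", "visited", "Paris"]
spans = [(0, 2, "PER"), (3, 4, "LOC")]
print(linearize(tokens, spans))
# "[PER Barack Obama ] visited [LOC Paris ]"
```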
arXiv Detail & Related papers (2022-10-26T13:27:26Z)
- Bayesian Recurrent Units and the Forward-Backward Algorithm [91.39701446828144]
Using Bayes's theorem, we derive a unit-wise recurrence as well as a backward recursion similar to the forward-backward algorithm.
The resulting Bayesian recurrent units can be integrated as recurrent neural networks within deep learning frameworks.
Experiments on speech recognition indicate that adding the derived units at the end of state-of-the-art recurrent architectures can improve the performance at a very low cost in terms of trainable parameters.
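The forward-backward recursion referred to here is the classical one; a compact numpy version for a toy discrete HMM is sketched below (this is the standard algorithm, not the paper's Bayesian recurrent units).

```python
"""Toy forward-backward algorithm for a discrete HMM in numpy."""

import numpy as np

def forward_backward(pi, A, B, obs):
    """pi: (S,) initial state probs; A: (S, S) transitions;
    B: (S, O) emissions; obs: sequence of observation indices.
    Returns per-step posterior state marginals of shape (T, S)."""
    T, S = len(obs), len(pi)
    alpha = np.zeros((T, S))
    beta = np.zeros((T, S))

    # Forward pass: alpha[t, s] = P(o_1..o_t, state_t = s)
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]

    # Backward pass: beta[t, s] = P(o_{t+1}..o_T | state_t = s)
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])

    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)

# Tiny example: 2 hidden states, 2 observation symbols.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
print(forward_backward(pi, A, B, obs=[0, 1, 0]))
```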
arXiv Detail & Related papers (2022-07-21T14:00:52Z)
- Generative or Contrastive? Phrase Reconstruction for Better Sentence Representation Learning [86.01683892956144]
We propose a novel generative self-supervised learning objective based on phrase reconstruction.
Our generative objective can yield sentence representations that achieve performance on par with contrastive learning in Sentence Textual Similarity tasks.
arXiv Detail & Related papers (2022-04-20T10:00:46Z)
- A New Framework for Fast Automated Phonological Reconstruction Using Trimmed Alignments and Sound Correspondence Patterns [2.6212127510234797]
We present a new framework that combines state-of-the-art techniques for automated sequence comparison with novel techniques for phonetic alignment analysis and sound correspondence pattern detection.
Our method yields promising results while being fast, easy to apply, and easy to extend.
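To illustrate the basic idea behind correspondence-pattern detection, the sketch below counts recurring cross-language columns in pre-aligned cognate sets; the data are invented, and the alignment and trimming steps of the actual framework are not reproduced.

```python
"""Sketch: collecting sound correspondence patterns from aligned cognate
sets.  Data are invented; real workflows would first compute (and trim)
the alignments automatically."""

from collections import defaultdict

# Each cognate set maps a language to an aligned segment sequence
# ("-" marks a gap).  Alignments are assumed to be given.
cognate_sets = [
    {"LangA": ["k", "a", "m"], "LangB": ["x", "a", "m"], "LangC": ["k", "o", "m"]},
    {"LangA": ["k", "i", "t"], "LangB": ["x", "i", "t"], "LangC": ["k", "i", "-"]},
]

languages = ["LangA", "LangB", "LangC"]

def correspondence_patterns(cognate_sets, languages):
    """Count how often each cross-language column of segments recurs."""
    patterns = defaultdict(int)
    for cset in cognate_sets:
        for column in zip(*(cset[lang] for lang in languages)):
            patterns[column] += 1
    return patterns

for pattern, count in correspondence_patterns(cognate_sets, languages).items():
    print(dict(zip(languages, pattern)), "x", count)
# The recurring column {'LangA': 'k', 'LangB': 'x', 'LangC': 'k'} suggests a
# regular correspondence, e.g. one pointing to a single proto-sound.
```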
arXiv Detail & Related papers (2022-04-10T07:11:19Z)
- A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z)
- Towards a Theoretical Understanding of the Robustness of Variational Autoencoders [82.68133908421792]
We make inroads into understanding the robustness of Variational Autoencoders (VAEs) to adversarial attacks and other input perturbations.
We develop a novel criterion for robustness in probabilistic models: $r$-robustness.
We show that VAEs trained using disentangling methods score well under our robustness metrics.
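Loosely paraphrasing the criterion, $r$-robustness asks that reconstructions of a perturbed input stay within distance $r$ of the unperturbed reconstruction with high probability; the Monte Carlo sketch below estimates such a probability with a toy stand-in for the model, and is not the paper's formal definition.

```python
"""Monte Carlo sketch of an r-robustness style check for a stochastic
reconstruction model.  `model` is a placeholder for a VAE-like callable
mapping an input array to a (stochastic) reconstruction."""

import numpy as np

def estimate_r_robustness(model, x, delta, r, n_samples=1000):
    """Estimate the probability that a reconstruction of the perturbed
    input x + delta lies within distance r of a reconstruction of x."""
    within = 0
    for _ in range(n_samples):
        dist = np.linalg.norm(model(x + delta) - model(x))
        within += dist <= r
    return within / n_samples

# Toy stand-in "model": identity plus Gaussian decoder noise.
rng = np.random.default_rng(0)
toy_model = lambda x: x + 0.05 * rng.normal(size=x.shape)

x = np.zeros(8)
delta = 0.1 * np.ones(8)
print(estimate_r_robustness(toy_model, x, delta, r=0.5))
```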
arXiv Detail & Related papers (2020-07-14T21:22:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.