A New Framework for Fast Automated Phonological Reconstruction Using
Trimmed Alignments and Sound Correspondence Patterns
- URL: http://arxiv.org/abs/2204.04619v1
- Date: Sun, 10 Apr 2022 07:11:19 GMT
- Title: A New Framework for Fast Automated Phonological Reconstruction Using
Trimmed Alignments and Sound Correspondence Patterns
- Authors: Johann-Mattis List, Robert Forkel, Nathan W. Hill
- Abstract summary: We present a new framework that combines state-of-the-art techniques for automated sequence comparison with novel techniques for phonetic alignment analysis and sound correspondence pattern detection.
Our method yields promising results while at the same time being not only fast but also easy to apply and expand.
- Score: 2.6212127510234797
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Computational approaches in historical linguistics have been increasingly
applied during the past decade and many new methods that implement parts of the
traditional comparative method have been proposed. Despite these increased
efforts, few fast and easy-to-use approaches exist for the task of
phonological reconstruction. Here we present a new framework that combines
state-of-the-art techniques for automated sequence comparison with novel
techniques for phonetic alignment analysis and sound correspondence pattern
detection to allow for the supervised reconstruction of word forms in ancestral
languages. We test the method on a new dataset covering six groups from three
different language families. The results show that our method yields promising
results while at the same time being not only fast but also easy to apply and
expand.
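As a rough illustration of the idea (not the authors' actual implementation, which builds on state-of-the-art alignment and correspondence-pattern tooling), supervised reconstruction can be sketched as learning a mapping from each aligned column of daughter-language sounds (a sound correspondence pattern) to the proto-sound attested above it in the training data:

```python
from collections import Counter, defaultdict

def learn_patterns(training):
    """Map each sound-correspondence tuple (one aligned column across the
    daughter languages) to the proto-sound most frequently seen above it."""
    counts = defaultdict(Counter)
    for daughter_alignment, proto in training:
        # daughter_alignment: one equal-length sound list per daughter language
        for column, proto_sound in zip(zip(*daughter_alignment), proto):
            counts[column][proto_sound] += 1
    return {col: ctr.most_common(1)[0][0] for col, ctr in counts.items()}

def reconstruct(patterns, daughter_alignment):
    """Predict a proto-form column by column; '?' marks unseen patterns."""
    return [patterns.get(col, "?") for col in zip(*daughter_alignment)]

# Toy data: two daughter languages, aligned segment by segment.
training = [
    ((["p", "a"], ["f", "a"]), ["p", "a"]),
    ((["p", "i"], ["f", "i"]), ["p", "i"]),
]
patterns = learn_patterns(training)
print(reconstruct(patterns, (["p", "i"], ["f", "i"])))  # ['p', 'i']
```

Real systems must handle gaps, context, and unseen patterns far more carefully; the sketch only shows how correspondence patterns turn reconstruction into a supervised column-wise prediction task.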
Related papers
- Representing and Computing Uncertainty in Phonological Reconstruction [5.284425534494986]
Despite the inherently fuzzy nature of reconstructions in historical linguistics, most scholars do not represent their uncertainty when proposing proto-forms.
We present a new framework that allows for the representation of uncertainty in linguistic reconstruction and also includes a workflow for the computation of fuzzy reconstructions from linguistic data.
arXiv Detail & Related papers (2023-10-19T13:27:42Z)
- Trimming Phonetic Alignments Improves the Inference of Sound Correspondence Patterns from Multilingual Wordlists [3.096615629099617]
Methods for the automatic inference of correspondence patterns from phonetically aligned cognate sets have been proposed.
Since annotation is tedious and time-consuming, it would be desirable to find ways to improve aligned cognate data automatically.
We propose a workflow that trims phonetic alignments in comparative linguistics prior to the inference of correspondence patterns.
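The trimming idea can be sketched in a few lines (a simplified stand-in for the proposed workflow, which uses more refined criteria): drop alignment columns dominated by gaps before inferring correspondence patterns, since such columns carry little evidence.

```python
def trim_alignment(alignment, max_gap_ratio=0.5, gap="-"):
    """Drop alignment columns whose proportion of gap symbols exceeds the
    threshold; such columns rarely yield reliable correspondence patterns."""
    columns = list(zip(*alignment))
    kept = [col for col in columns
            if col.count(gap) / len(col) <= max_gap_ratio]
    return [list(row) for row in zip(*kept)]

# Three aligned cognates; the gap-heavy middle column is removed.
alignment = [
    ["t", "-", "a"],
    ["t", "-", "a"],
    ["d", "o", "a"],
]
print(trim_alignment(alignment))  # [['t', 'a'], ['t', 'a'], ['d', 'a']]
```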
arXiv Detail & Related papers (2023-03-31T09:55:48Z)
- Neural Unsupervised Reconstruction of Protolanguage Word Forms [34.66200889614538]
We present a state-of-the-art neural approach to the unsupervised reconstruction of ancient word forms.
We extend this work with neural models that can capture more complicated phonological and morphological changes.
arXiv Detail & Related papers (2022-11-16T05:38:51Z)
- Unsupervised Lexical Substitution with Decontextualised Embeddings [48.00929769805882]
We propose a new unsupervised method for lexical substitution using pre-trained language models.
Our method retrieves substitutes based on the similarity of contextualised and decontextualised word embeddings.
We conduct experiments in English and Italian, and show that our method substantially outperforms strong baselines.
arXiv Detail & Related papers (2022-09-17T03:51:47Z)
- Extract, Integrate, Compete: Towards Verification Style Reading Comprehension [66.2551168928688]
We present a new verification style reading comprehension dataset named VGaokao from Chinese Language tests of Gaokao.
To address the challenges in VGaokao, we propose a novel Extract-Integrate-Compete approach.
arXiv Detail & Related papers (2021-09-11T01:34:59Z)
- TEACHTEXT: CrossModal Generalized Distillation for Text-Video Retrieval [103.85002875155551]
We propose a novel generalized distillation method, TeachText, for exploiting large-scale language pretraining.
We extend our method to video side modalities and show that we can effectively reduce the number of used modalities at test time.
Our approach advances the state of the art on several video retrieval benchmarks by a significant margin and adds no computational overhead at test time.
arXiv Detail & Related papers (2021-04-16T17:55:28Z)
- Controllable Text Simplification with Explicit Paraphrasing [88.02804405275785]
Text Simplification improves the readability of sentences through several rewriting transformations, such as lexical paraphrasing, deletion, and splitting.
Current simplification systems are predominantly sequence-to-sequence models that are trained end-to-end to perform all these operations simultaneously.
We propose a novel hybrid approach that leverages linguistically-motivated rules for splitting and deletion, and couples them with a neural paraphrasing model to produce varied rewriting styles.
arXiv Detail & Related papers (2020-10-21T13:44:40Z)
- Automated and Formal Synthesis of Neural Barrier Certificates for Dynamical Models [70.70479436076238]
We introduce an automated, formal, counterexample-based approach to synthesise Barrier Certificates (BCs).
The approach is underpinned by an inductive framework, which manipulates a candidate BC structured as a neural network, and a sound verifier, which either certifies the candidate's validity or generates counter-examples.
The outcomes show that we can synthesise sound BCs up to two orders of magnitude faster, with a particularly stark speedup on the verification engine.
arXiv Detail & Related papers (2020-07-07T07:39:42Z)
- Improving Adversarial Text Generation by Modeling the Distant Future [155.83051741029732]
We consider a text planning scheme and present a model-based imitation-learning approach to alleviate the aforementioned issues.
We propose a novel guider network to focus on the generative process over a longer horizon, which can assist next-word prediction and provide intermediate rewards for generator optimization.
arXiv Detail & Related papers (2020-05-04T05:45:13Z)
- Stateful Premise Selection by Recurrent Neural Networks [0.7614628596146599]
We develop a new learning-based method for selecting facts (premises) when proving new goals over large formal libraries.
Our stateful architecture is based on recurrent neural networks, which have recently been very successful in stateful tasks such as language translation.
arXiv Detail & Related papers (2020-03-11T14:59:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.