Automating Sound Change Prediction for Phylogenetic Inference: A Tukanoan Case Study
- URL: http://arxiv.org/abs/2402.01582v1
- Date: Fri, 2 Feb 2024 17:20:16 GMT
- Title: Automating Sound Change Prediction for Phylogenetic Inference: A Tukanoan Case Study
- Authors: Kalvin Chang, Nathaniel R. Robinson, Anna Cai, Ting Chen, Annie Zhang, David R. Mortensen
- Abstract summary: We train a neural network on sound change data to predict intermediate sound change steps between historical protoforms and their modern descendants.
In our best experiments on Tukanoan languages, this method produces trees with a Generalized Quartet Distance of 0.12 from a tree that used expert annotations.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We describe a set of new methods to partially automate linguistic
phylogenetic inference given (1) cognate sets with their respective protoforms
and sound laws, (2) a mapping from phones to their articulatory features and
(3) a typological database of sound changes. We train a neural network on these
sound change data to weight articulatory distances between phones and predict
intermediate sound change steps between historical protoforms and their modern
descendants, replacing a linguistic expert in part of a parsimony-based
phylogenetic inference algorithm. In our best experiments on Tukanoan
languages, this method produces trees with a Generalized Quartet Distance of
0.12 from a tree that used expert annotations, a significant improvement over
other semi-automated baselines. We discuss potential benefits and drawbacks to
our neural approach and parsimony-based tree prediction. We also experiment
with a minimal generalization learner for automatic sound law induction,
finding it comparably effective to sound laws from expert annotation. Our code
is publicly available at https://github.com/cmu-llab/aiscp.
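To make the core idea concrete, here is a minimal sketch (not the released aiscp code) of a learned, weighted articulatory distance: each phone is represented as a vector of articulatory features, and a model learns per-feature weights so that attested sound changes correspond to short distances. The three-feature vectors and the tiny phone inventory below are toy assumptions for illustration.

```python
# Minimal sketch (not the released aiscp code) of a learned,
# weighted articulatory distance between phones.
import torch
import torch.nn as nn

# Toy 3-dimensional feature vectors standing in for real
# articulatory features (e.g. [voice], [continuant], [labial]);
# real systems use feature tables with ~20-40 dimensions per phone.
PHONE_FEATURES = {
    "p": torch.tensor([0.0, 0.0, 1.0]),
    "b": torch.tensor([1.0, 0.0, 1.0]),
    "f": torch.tensor([0.0, 1.0, 1.0]),
}

class WeightedArticulatoryDistance(nn.Module):
    """Distance between two phones as a learned, positive
    weighting of their per-feature disagreements."""

    def __init__(self, n_features: int):
        super().__init__()
        self.raw_weights = nn.Parameter(torch.zeros(n_features))

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        weights = nn.functional.softplus(self.raw_weights)  # keep weights > 0
        return (weights * (x - y).abs()).sum()

dist = WeightedArticulatoryDistance(n_features=3)
# Training on a typological database of attested changes would push
# the weights so that common steps (p > b, p > f, ...) come out cheap,
# letting a parsimony search prefer plausible intermediate steps.
print(float(dist(PHONE_FEATURES["p"], PHONE_FEATURES["b"])))
print(float(dist(PHONE_FEATURES["p"], PHONE_FEATURES["f"])))
```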
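The Generalized Quartet Distance used for evaluation compares two trees by the topologies they induce on every four-leaf subset. The sketch below implements one common variant under simplifying assumptions (unit branch lengths, quartets resolved via the four-point condition); production evaluations typically rely on optimized tools such as tqDist.

```python
# Sketch of Generalized Quartet Distance (one common variant):
# among quartets the reference tree resolves, the fraction the
# hypothesis tree resolves differently.
from itertools import combinations

def leaf_paths(tree, path=()):
    """Map each leaf to its path of child indices from the root.
    Trees are nested tuples of strings, e.g. (("A", "B"), "C")."""
    if isinstance(tree, str):
        return {tree: path}
    paths = {}
    for i, child in enumerate(tree):
        paths.update(leaf_paths(child, path + (i,)))
    return paths

def tree_distance(paths, a, b):
    """Path length between two leaves, with unit branch lengths."""
    shared = 0
    for x, y in zip(paths[a], paths[b]):
        if x != y:
            break
        shared += 1
    return len(paths[a]) + len(paths[b]) - 2 * shared

def quartet_topology(paths, a, b, c, d):
    """Resolve {a,b,c,d} by the four-point condition: the pairing
    with the strictly smallest distance sum is the induced split."""
    sums = {
        "ab|cd": tree_distance(paths, a, b) + tree_distance(paths, c, d),
        "ac|bd": tree_distance(paths, a, c) + tree_distance(paths, b, d),
        "ad|bc": tree_distance(paths, a, d) + tree_distance(paths, b, c),
    }
    best = min(sums.values())
    winners = [k for k, v in sums.items() if v == best]
    return winners[0] if len(winners) == 1 else None  # None = star quartet

def generalized_quartet_distance(tree_ref, tree_hyp):
    p_ref, p_hyp = leaf_paths(tree_ref), leaf_paths(tree_hyp)
    leaves = sorted(p_ref)
    assert sorted(p_hyp) == leaves, "trees must share a leaf set"
    differing = resolved = 0
    for q in combinations(leaves, 4):
        topo_ref = quartet_topology(p_ref, *q)
        if topo_ref is None:        # unresolved in the reference tree
            continue
        resolved += 1
        if quartet_topology(p_hyp, *q) != topo_ref:
            differing += 1
    return differing / resolved

expert = ((("A", "B"), "C"), ("D", "E"))
inferred = ((("A", "C"), "B"), ("D", "E"))
print(generalized_quartet_distance(expert, inferred))  # 0.4
```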
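For the sound law induction experiments, a minimal generalization learner in the spirit of Albright and Hayes merges observed changes that share a source and target, keeping only what their contexts have in common. The toy version below generalizes over raw context symbols; the actual learner operates over articulatory features, so treat the data and rule format here as illustrative assumptions.

```python
# Toy minimal-generalization sketch for sound law induction;
# the real experiments generalize over articulatory features.
from itertools import combinations

def common_suffix(x, y):
    i = 0
    while i < min(len(x), len(y)) and x[len(x) - 1 - i] == y[len(y) - 1 - i]:
        i += 1
    return x[len(x) - i:] if i else ""

def common_prefix(x, y):
    i = 0
    while i < min(len(x), len(y)) and x[i] == y[i]:
        i += 1
    return x[:i]

# Each attested change: (source, target, left context, right context),
# e.g. *p > h / #_a, extracted from aligned protoform-reflex pairs.
changes = [
    ("p", "h", "#", "a"),
    ("p", "h", "#", "o"),
    ("t", "s", "a", "i"),
]

rules = set(changes)
# Minimal generalization: merge any two rules with the same change
# by keeping only the shared edges of their contexts.
for (s1, t1, l1, r1), (s2, t2, l2, r2) in combinations(changes, 2):
    if (s1, t1) == (s2, t2):
        rules.add((s1, t1, common_suffix(l1, l2), common_prefix(r1, r2)))

for s, t, l, r in sorted(rules):
    print(f"{s} > {t} / {l} _ {r or '(any)'}")
```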
Related papers
- Disambiguation of Chinese Polyphones in an End-to-End Framework with Semantic Features Extracted by Pre-trained BERT [81.99600765234285]
We propose an end-to-end framework to predict the pronunciation of a polyphonic character.
The proposed method consists of a pre-trained bidirectional encoder representations from Transformers (BERT) model and a neural network (NN) based classifier.
arXiv Detail & Related papers (2025-01-02T06:51:52Z)
- Developmental Predictive Coding Model for Early Infancy Mono and Bilingual Vocal Continual Learning [69.8008228833895]
We propose a small-sized generative neural network equipped with a continual learning mechanism.
Our model prioritizes interpretability and demonstrates the advantages of online learning.
arXiv Detail & Related papers (2024-12-23T10:23:47Z)
- kNN For Whisper And Its Effect On Bias And Speaker Adaptation [10.174848090916669]
Token-level $k$ nearest neighbor search ($k$NN) is a non-parametric method that adapts a model using inference-time search in an external datastore.
We show that Whisper, a transformer end-to-end speech model, benefits from $k$NN.
arXiv Detail & Related papers (2024-10-24T15:32:52Z)
- Human-like Linguistic Biases in Neural Speech Models: Phonetic Categorization and Phonotactic Constraints in Wav2Vec2.0 [0.11510009152620666]
We study how Wav2Vec2 resolves phonotactic constraints.
We synthesize sounds on an acoustic continuum between /l/ and /r/ and embed them in controlled contexts.
Like humans, Wav2Vec2 models show a bias towards the phonotactically admissible category in processing such ambiguous sounds.
arXiv Detail & Related papers (2024-07-03T11:04:31Z)
- A unified one-shot prosody and speaker conversion system with self-supervised discrete speech units [94.64927912924087]
Existing systems ignore the correlation between prosody and language content, leading to degradation of naturalness in converted speech.
We devise a cascaded modular system leveraging self-supervised discrete speech units as language representation.
Experiments show that our system outperforms previous approaches in naturalness, intelligibility, speaker transferability, and prosody transferability.
arXiv Detail & Related papers (2022-11-12T00:54:09Z)
- Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages [58.43299730989809]
We introduce Wav2Seq, the first self-supervised approach to pre-train both parts of encoder-decoder models for speech data.
We induce a pseudo language as a compact discrete representation, and formulate a self-supervised pseudo speech recognition task.
This process stands on its own, or can be applied as low-cost second-stage pre-training.
arXiv Detail & Related papers (2022-05-02T17:59:02Z)
- Ctrl-P: Temporal Control of Prosodic Variation for Speech Synthesis [68.76620947298595]
Text does not fully specify the spoken form, so text-to-speech models must be able to learn from speech data that vary in ways not explained by the corresponding text.
We propose a model that generates speech explicitly conditioned on the three primary acoustic correlates of prosody.
arXiv Detail & Related papers (2021-06-15T18:03:48Z)
- Deep Sound Change: Deep and Iterative Learning, Convolutional Neural Networks, and Language Change [0.0]
This paper proposes a framework for modeling sound change that combines deep learning and iterative learning.
It argues that several properties of sound change emerge from the proposed architecture.
arXiv Detail & Related papers (2020-11-10T23:49:09Z)
- Syllabification of the Divine Comedy [0.0]
We provide a syllabification algorithm for the Divine Comedy using techniques from probabilistic and constraint programming.
We particularly focus on the synalephe, addressed in terms of the "propensity" of a word to take part in a synalephe with adjacent words.
We jointly provide an online vocabulary containing, for each word, information about its syllabification, the location of the tonic accent, and the aforementioned synalephe propensity.
arXiv Detail & Related papers (2020-10-26T12:14:14Z)
- One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech [3.42658286826597]
We introduce an approach to multilingual speech synthesis which uses the meta-learning concept of contextual parameter generation.
Our model is shown to effectively share information across languages and according to a subjective evaluation test, it produces more natural and accurate code-switching speech than the baselines.
arXiv Detail & Related papers (2020-08-03T10:43:30Z)
- Mechanisms for Handling Nested Dependencies in Neural-Network Language Models and Humans [75.15855405318855]
We studied whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing.
Although the network was solely trained to predict the next word in a large corpus, analysis showed the emergence of specialized units that successfully handled local and long-distance syntactic agreement.
We tested the model's predictions in a behavioral experiment where humans detected violations in number agreement in sentences with systematic variations in the singular/plural status of multiple nouns.
arXiv Detail & Related papers (2020-06-19T12:00:05Z)