Semisupervised Neural Proto-Language Reconstruction
- URL: http://arxiv.org/abs/2406.05930v2
- Date: Mon, 12 Aug 2024 15:10:00 GMT
- Title: Semisupervised Neural Proto-Language Reconstruction
- Authors: Liang Lu, Peirong Xie, David R. Mortensen
- Abstract summary: We propose a semisupervised historical reconstruction task in which the model is trained on only a small amount of labeled data.
We show that our proposed architecture (DPD-BiReconstructor) is able to leverage unlabeled cognate sets to outperform strong semisupervised baselines on this novel task.
- Score: 11.105362395278142
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing work implementing comparative reconstruction of ancestral languages (proto-languages) has usually required full supervision. However, historical reconstruction models are only of practical value if they can be trained with a limited amount of labeled data. We propose a semisupervised historical reconstruction task in which the model is trained on only a small amount of labeled data (cognate sets with proto-forms) and a large amount of unlabeled data (cognate sets without proto-forms). We propose a neural architecture for comparative reconstruction (DPD-BiReconstructor) incorporating an essential insight from linguists' comparative method: that reconstructed words should not only be reconstructable from their daughter words, but also deterministically transformable back into their daughter words. We show that this architecture is able to leverage unlabeled cognate sets to outperform strong semisupervised baselines on this novel task.
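The key constraint in the abstract (a reconstructed proto-form must also map deterministically back onto its daughter words) can be read as a round-trip training scheme. The sketch below is a minimal PyTorch illustration under stated assumptions, not the authors' DPD-BiReconstructor: the character-level GRU encoder-decoders, the concatenation of daughter forms into one tagged sequence, and the greedy pseudo-proto decoding for unlabeled sets are all placeholders; only the two-network, daughters-to-proto-to-daughters training signal is taken from the abstract.

```python
# Minimal sketch (assumptions throughout): a daughters -> proto reconstructor
# paired with a proto -> daughters reflex predictor. Labeled cognate sets give
# a supervised loss in both directions; unlabeled sets give a round-trip loss
# checking that the predicted proto-form maps back onto the attested daughters.
# Daughter forms are assumed concatenated (with language tags) into one
# phoneme-index sequence; none of this is the paper's actual code.
import torch
import torch.nn as nn

PAD, BOS, VOCAB, HID = 0, 1, 64, 128  # toy phoneme inventory and hidden size

class Seq2Seq(nn.Module):
    """Tiny GRU encoder-decoder over phoneme indices (teacher forcing)."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, HID, padding_idx=PAD)
        self.enc = nn.GRU(HID, HID, batch_first=True)
        self.dec = nn.GRU(HID, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, src, tgt_in):
        _, h = self.enc(self.emb(src))        # encode the source sequence
        y, _ = self.dec(self.emb(tgt_in), h)  # decode with teacher forcing
        return self.out(y)                    # logits over phonemes

recon = Seq2Seq()   # daughters (concatenated) -> proto-form
reflex = Seq2Seq()  # proto-form -> daughters
params = list(recon.parameters()) + list(reflex.parameters())
opt = torch.optim.Adam(params, lr=1e-3)
ce = nn.CrossEntropyLoss(ignore_index=PAD)

@torch.no_grad()
def greedy_proto(daughters, max_len=20):
    """Greedy-decode pseudo proto-forms for unlabeled cognate sets."""
    _, h = recon.enc(recon.emb(daughters))
    tok = torch.full((daughters.size(0), 1), BOS, dtype=torch.long)
    steps = [tok]
    for _ in range(max_len):
        y, h = recon.dec(recon.emb(steps[-1]), h)
        steps.append(recon.out(y).argmax(-1))
    return torch.cat(steps, dim=1)

def train_step(daughters, proto=None):
    """One step; pass proto=None for an unlabeled cognate set."""
    if proto is not None:
        # Supervised: daughters -> proto, and proto -> daughters.
        logits = recon(daughters, proto[:, :-1])
        loss = ce(logits.reshape(-1, VOCAB), proto[:, 1:].reshape(-1))
        logits = reflex(proto, daughters[:, :-1])
        loss = loss + ce(logits.reshape(-1, VOCAB), daughters[:, 1:].reshape(-1))
    else:
        # Round trip on unlabeled data: predicted proto must yield daughters.
        # (Greedy decoding is non-differentiable, so in this toy only the
        # reflex net is updated from unlabeled sets.)
        pseudo = greedy_proto(daughters)
        logits = reflex(pseudo, daughters[:, :-1])
        loss = ce(logits.reshape(-1, VOCAB), daughters[:, 1:].reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```

A training loop would call `train_step(daughters_batch, proto_batch)` for the small labeled portion and `train_step(daughters_batch)` for the large unlabeled portion.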
Related papers
- Improved Neural Protoform Reconstruction via Reflex Prediction [11.105362395278142]
We argue that not only should protoforms be inferable from cognate sets (sets of related reflexes) but the reflexes should also be inferable from the protoforms.
We propose a system in which candidate protoforms from a reconstruction model are reranked by a reflex prediction model.
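The reranking pipeline in this summary can be sketched with two model callables. The `reconstruct_candidates` and `reflex_logprob` interfaces and the interpolation weight `alpha` below are illustrative assumptions, not the paper's actual API or scoring rule.

```python
# Hypothetical sketch of reranking protoform candidates with a reflex
# prediction model. The two callables stand in for trained models; their
# names, signatures, and the linear score interpolation are assumptions.
from typing import Callable, Sequence

def rerank_protoforms(
    cognate_set: Sequence[str],
    reconstruct_candidates: Callable[[Sequence[str], int], list[tuple[str, float]]],
    reflex_logprob: Callable[[str, str], float],
    k: int = 10,
    alpha: float = 0.5,
) -> str:
    """Return the candidate protoform with the highest interpolated score.

    Score = alpha * reconstruction log-prob
          + (1 - alpha) * sum of reflex log-probs given the candidate.
    """
    candidates = reconstruct_candidates(cognate_set, k)  # e.g. top-k beam search

    def score(item: tuple[str, float]) -> float:
        proto, recon_lp = item
        reflex_lp = sum(reflex_logprob(proto, reflex) for reflex in cognate_set)
        return alpha * recon_lp + (1 - alpha) * reflex_lp

    return max(candidates, key=score)[0]
```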
arXiv Detail & Related papers (2024-03-27T17:13:38Z)
- Representing and Computing Uncertainty in Phonological Reconstruction [5.284425534494986]
Despite the inherently fuzzy nature of reconstructions in historical linguistics, most scholars do not represent their uncertainty when proposing proto-forms.
We present a new framework that allows for the representation of uncertainty in linguistic reconstruction and also includes a workflow for the computation of fuzzy reconstructions from linguistic data.
arXiv Detail & Related papers (2023-10-19T13:27:42Z)
- Contextualising Implicit Representations for Semantic Tasks [5.453372578880444]
Prior works have demonstrated that implicit representations trained only for reconstruction tasks typically generate encodings not useful for semantic tasks.
We propose a method that contextualises the encodings of implicit representations, enabling their use in downstream tasks.
arXiv Detail & Related papers (2023-05-22T17:59:58Z)
- Understanding Reconstruction Attacks with the Neural Tangent Kernel and Dataset Distillation [110.61853418925219]
We build a stronger version of the dataset reconstruction attack and show how it can provably recover the entire training set in the infinite width regime.
We show, both theoretically and empirically, that reconstructed images tend to be outliers in the dataset.
These reconstruction attacks can be used for dataset distillation, that is, we can retrain on reconstructed images and obtain high predictive accuracy.
arXiv Detail & Related papers (2023-02-02T21:41:59Z)
- Neural Unsupervised Reconstruction of Protolanguage Word Forms [34.66200889614538]
We present a state-of-the-art neural approach to the unsupervised reconstruction of ancient word forms.
We extend this work with neural models that can capture more complicated phonological and morphological changes.
arXiv Detail & Related papers (2022-11-16T05:38:51Z)
- Autoregressive Structured Prediction with Language Models [73.11519625765301]
We describe an approach to model structures as sequences of actions in an autoregressive manner with PLMs.
Our approach achieves the new state-of-the-art on all the structured prediction tasks we looked at.
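As a rough illustration of what "structures as sequences of actions" can mean, the toy below linearizes labeled spans into per-token actions that an autoregressive model could predict left to right; the action inventory is invented here and is not the paper's.

```python
# Toy linearization of labeled spans into an action sequence. The OPEN/SHIFT/
# CLOSE inventory is illustrative only, not the action set used in the paper.
def spans_to_actions(tokens: list[str], spans: list[tuple[int, int, str]]) -> list[str]:
    """Turn (start, end, label) spans over tokens into per-token actions."""
    actions = []
    for i, tok in enumerate(tokens):
        for start, _end, label in spans:
            if i == start:
                actions.append(f"OPEN-{label}")
        actions.append(f"SHIFT({tok})")
        for _start, end, _label in spans:
            if i == end:
                actions.append("CLOSE")
    return actions

print(spans_to_actions(
    ["Proto", "Tai", "was", "reconstructed"],
    [(0, 1, "LANG")],
))
# ['OPEN-LANG', 'SHIFT(Proto)', 'SHIFT(Tai)', 'CLOSE', 'SHIFT(was)', 'SHIFT(reconstructed)']
```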
arXiv Detail & Related papers (2022-10-26T13:27:26Z)
- Sentence Representation Learning with Generative Objective rather than Contrastive Objective [86.01683892956144]
We propose a novel generative self-supervised learning objective based on phrase reconstruction.
Our generative learning achieves substantial performance improvements and outperforms current state-of-the-art contrastive methods.
arXiv Detail & Related papers (2022-10-16T07:47:46Z)
- Generative or Contrastive? Phrase Reconstruction for Better Sentence Representation Learning [86.01683892956144]
We propose a novel generative self-supervised learning objective based on phrase reconstruction.
Our generative learning yields sufficiently powerful sentence representations, achieving performance on Semantic Textual Similarity (STS) tasks on par with contrastive learning.
arXiv Detail & Related papers (2022-04-20T10:00:46Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
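Only the name Dynamic Blocking and its role as a decoding algorithm come from the summary above; the toy decoder below illustrates one plausible decoding-time constraint that discourages copying the source verbatim, and should be read as an assumption, not the paper's exact procedure.

```python
# Toy greedy decoder with a source-copy blocking constraint, loosely in the
# spirit of Dynamic Blocking: when the previously generated token also occurs
# in the source, the token that follows it in the source may be blocked at the
# next step, nudging the output away from a verbatim copy. The scoring
# callable and the exact blocking rule here are illustrative assumptions.
import random
from typing import Callable, Sequence

def blocked_greedy_decode(
    source: Sequence[str],
    next_token_scores: Callable[[list[str]], dict[str, float]],
    max_len: int = 30,
    block_prob: float = 1.0,
    eos: str = "</s>",
) -> list[str]:
    output: list[str] = []
    while len(output) < max_len:
        blocked: set[str] = set()
        if output:
            prev = output[-1]
            for i, tok in enumerate(source[:-1]):
                if tok == prev and random.random() < block_prob:
                    blocked.add(source[i + 1])  # block the source's next token
        scores = next_token_scores(output)
        allowed = {t: s for t, s in scores.items() if t not in blocked} or scores
        best = max(allowed, key=allowed.get)
        if best == eos:
            break
        output.append(best)
    return output
```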
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- Exemplar-Controllable Paraphrasing and Translation using Bitext [57.92051459102902]
We adapt models from prior work so that they can learn solely from bilingual text (bitext).
Our single proposed model can perform four tasks: controlled paraphrase generation in both languages and controlled machine translation in both language directions.
arXiv Detail & Related papers (2020-10-12T17:02:50Z)
- Do sequence-to-sequence VAEs learn global features of sentences? [13.43800646539014]
We study the Variational Autoencoder (VAE) for natural language with the sequence-to-sequence architecture.
We find that VAEs are prone to memorizing the first words and the sentence length, producing local features of limited usefulness.
Alternative model variants learn latent variables that are more global, i.e., more predictive of topic or sentiment labels.
arXiv Detail & Related papers (2020-04-16T14:43:27Z)