Computer-Aided Modelling of the Bilingual Word Indices to the
Ninth-Century Uchitel'noe evangelie
- URL: http://arxiv.org/abs/2211.05579v1
- Date: Tue, 25 Oct 2022 10:16:39 GMT
- Authors: Martin Ruskov and Lora Taseva
- Abstract summary: We show how we model various types of asymmetric translation correlates and the variability resulting from the pluralism of sources.
Our approach is designed with generalisation in mind and is intended to be applicable to other translations from Greek into Old Church Slavonic as well.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The development of bilingual dictionaries to medieval translations presents
diverse difficulties. These result from two types of philological
circumstances: a) the asymmetry between the source language and the target
language; and b) the varying available sources of both the original and
translated texts. In particular, Tihova's full critical edition of
Constantine of Preslav's Uchitel'noe evangelie ('Didactic Gospel') gives a
relatively good idea of the Old Church Slavonic translation but not of its
Greek source text. This is because Cramer's edition of the catenae,
used as the parallel text in it, is based on several codices whose text does
not fully coincide with the Slavonic. This leads to the addition of the
newly-discovered parallels from Byzantine manuscripts and John Chrysostom's
homilies. Our approach to these issues is a step-wise process with two main
goals: a) to facilitate the philological annotation of input data and b) to
consider the manifestations of these challenges first separately, in
order to simplify their resolution, and then in combination. We
demonstrate how we model various types of asymmetric translation correlates and
the variability resulting from the pluralism of sources. We also demonstrate
how all these constructions are being modelled and processed into the final
indices. Our approach is designed with generalisation in mind and is intended
to be applicable to other translations from Greek into Old Church
Slavonic as well.
Related papers
- Puzzle Pieces Picker: Deciphering Ancient Chinese Characters with Radical Reconstruction [73.26364649572237]
Oracle Bone Inscriptions is one of the oldest existing forms of writing in the world.
A large number of Oracle Bone Inscriptions (OBI) remain undeciphered, making it one of the global challenges in paleography today.
This paper introduces a novel approach, namely Puzzle Pieces Picker (P$^3$), to decipher these enigmatic characters through radical reconstruction.
arXiv Detail & Related papers (2024-06-05T07:34:39Z)
- The Problem of Alignment [1.2277343096128712]
Large Language Models produce sequences learned as statistical patterns from large corpora.
After initial training, models must be aligned with human values, preferring certain continuations over others.
We examine this practice of structuration as a two-way interaction between users and models.
arXiv Detail & Related papers (2023-12-30T11:44:59Z)
- Graecia capta ferum victorem cepit. Detecting Latin Allusions to Ancient Greek Literature [23.786649328915097]
We introduce SPhilBERTa, a trilingual Sentence-RoBERTa model tailored for Classical Philology.
It excels at cross-lingual semantic comprehension and identification of identical sentences across Ancient Greek, Latin, and English.
We generate new training data by automatically translating English texts into Ancient Greek.
arXiv Detail & Related papers (2023-08-23T08:54:05Z)
- Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval [109.62363167257664]
We propose a generative model for learning multilingual text embeddings.
Our model operates on parallel data in $N$ languages.
We evaluate this method on a suite of tasks including semantic similarity, bitext mining, and cross-lingual question retrieval.
arXiv Detail & Related papers (2022-12-21T02:41:40Z)
- Lexical semantic change for Ancient Greek and Latin [61.69697586178796]
Associating a word's correct meaning in its historical context is a central challenge in diachronic research.
We build on a recent computational approach to semantic change based on a dynamic Bayesian mixture model.
We provide a systematic comparison of dynamic Bayesian mixture models for semantic change with state-of-the-art embedding-based models.
arXiv Detail & Related papers (2021-01-22T12:04:08Z)
- Learning Contextualised Cross-lingual Word Embeddings and Alignments for Extremely Low-Resource Languages Using Parallel Corpora [63.5286019659504]
We propose a new approach for learning contextualised cross-lingual word embeddings based on a small parallel corpus.
Our method obtains word embeddings via an LSTM encoder-decoder model that simultaneously translates and reconstructs an input sentence.
arXiv Detail & Related papers (2020-10-27T22:24:01Z)
- Deciphering Undersegmented Ancient Scripts Using Phonetic Prior [31.707254394215283]
Most undeciphered lost languages exhibit two characteristics that pose significant decipherment challenges.
We propose a model that handles both of these challenges by building on rich linguistic constraints.
We evaluate the model on both deciphered languages (Gothic, Ugaritic) and an undeciphered one (Iberian).
arXiv Detail & Related papers (2020-10-21T15:03:52Z)
- Constructing a Family Tree of Ten Indo-European Languages with Delexicalized Cross-linguistic Transfer Patterns [57.86480614673034]
We formalize the delexicalized transfer as interpretable tree-to-string and tree-to-tree patterns.
This allows us to quantitatively probe cross-linguistic transfer and extend inquiries of Second Language Acquisition.
arXiv Detail & Related papers (2020-07-17T15:56:54Z)
- The Frankfurt Latin Lexicon: From Morphological Expansion and Word Embeddings to SemioGraphs [97.8648124629697]
The article argues for a more comprehensive understanding of lemmatization, encompassing classical machine learning as well as intellectual post-corrections and, in particular, human interpretation processes based on graph representations of the underlying lexical resources.
arXiv Detail & Related papers (2020-05-21T17:16:53Z)
- HELFI: a Hebrew-Greek-Finnish Parallel Bible Corpus with Cross-Lingual Morpheme Alignment [0.0]
Twenty-five years ago, morphologically aligned Hebrew-Finnish and Greek-Finnish bitexts were constructed manually.
This paper describes a nontrivial editorial process starting from the creation of the original one-purpose database.
It ends with its reconstruction using only freely available text editions and annotations.
arXiv Detail & Related papers (2020-03-16T22:10:35Z)