Computational Typology
- URL: http://arxiv.org/abs/2504.15642v2
- Date: Mon, 28 Apr 2025 19:03:34 GMT
- Title: Computational Typology
- Authors: Gerhard Jäger
- Abstract summary: Typology focuses on the study and classification of languages based on their structural features. Computational methods have played an increasingly important role in typological research.
- Score: 0.21756081703275998
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Typology is a subfield of linguistics that focuses on the study and classification of languages based on their structural features. Unlike genealogical classification, which examines the historical relationships between languages, typology seeks to understand the diversity of human languages by identifying common properties and patterns, known as universals. In recent years, computational methods have played an increasingly important role in typological research, enabling the analysis of large-scale linguistic data and the testing of hypotheses about language structure and evolution. This article provides an illustration of the benefits of computational statistical modeling in typology.
Related papers
- Morphological Typology in BPE Subword Productivity and Language Modeling [0.0]
We focus on languages with synthetic and analytical morphological structures and examine their productivity when tokenized.
Experiments reveal that languages with synthetic features exhibit greater subword regularity and productivity with BPE tokenization.
arXiv Detail & Related papers (2024-10-31T06:13:29Z) - Explainability of machine learning approaches in forensic linguistics: a case study in geolinguistic authorship profiling [46.58131072375399]
We explore the explainability of machine learning approaches considering the forensic context.
We focus on variety classification as a means of geolinguistic profiling of unknown texts based on social media data from the German-speaking area.
We find that the extracted lexical features are indeed representative of their respective varieties and note that the trained models also rely on place names for classifications.
arXiv Detail & Related papers (2024-04-29T08:52:52Z) - On the Transferability of Neural Models of Morphological Analogies [7.89271130004391]
In this paper, we focus on morphological tasks and we propose a deep learning approach to detect morphological analogies.
We present an empirical study to see how our framework transfers across languages, and that highlights interesting similarities and differences between these languages.
In view of these results, we also discuss the possibility of building a multilingual morphological model.
arXiv Detail & Related papers (2021-08-09T11:08:33Z) - SIGTYP 2020 Shared Task: Prediction of Typological Features [78.95376120154083]
A major drawback hampering broader adoption of typological KBs is that they are sparsely populated.
As typological features often correlate with one another, it is possible to predict them and thus automatically populate typological KBs.
Overall, the task attracted 8 submissions from 5 teams, out of which the most successful methods make use of such feature correlations.
arXiv Detail & Related papers (2020-10-16T08:47:24Z) - The Typology of Polysemy: A Multilingual Distributional Framework [6.753781783859273]
We present a novel framework that quantifies semantic affinity, the cross-linguistic similarity of lexical semantics for a concept.
Our results reveal an intricate interaction between semantic domains and extra-linguistic factors, beyond language phylogeny.
arXiv Detail & Related papers (2020-06-02T22:31:40Z) - Linguistic Typology Features from Text: Inferring the Sparse Features of World Atlas of Language Structures [73.06435180872293]
We construct a recurrent neural network predictor based on byte embeddings and convolutional layers.
We show that some features from various linguistic types can be predicted reliably.
arXiv Detail & Related papers (2020-04-30T21:00:53Z) - Bridging Linguistic Typology and Multilingual Machine Translation with Multi-View Language Representations [83.27475281544868]
We use singular vector canonical correlation analysis to study what kind of information is induced from each source.
We observe that our representations embed typology and strengthen correlations with language relationships.
We then take advantage of our multi-view language vector space for multilingual machine translation, where we achieve competitive overall translation accuracy.
arXiv Detail & Related papers (2020-04-30T16:25:39Z) - Evaluating Transformer-Based Multilingual Text Classification [55.53547556060537]
We argue that NLP tools perform unequally across languages with different syntactic and morphological structures.
We calculate word order and morphological similarity indices to aid our empirical study.
arXiv Detail & Related papers (2020-04-29T03:34:53Z) - Deep Learning Based Text Classification: A Comprehensive Review [75.8403533775179]
We provide a review of more than 150 deep learning based models for text classification developed in recent years.
We also provide a summary of more than 40 popular datasets widely used for text classification.
arXiv Detail & Related papers (2020-04-06T02:00:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.