Morphology and Syntax of the Tamil Language
- URL: http://arxiv.org/abs/2401.08367v1
- Date: Tue, 16 Jan 2024 13:52:25 GMT
- Title: Morphology and Syntax of the Tamil Language
- Authors: Kengatharaiyer Sarveswaran
- Abstract summary: The paper highlights the complexity and richness of Tamil in terms of its morphological and syntactic features.
Its usefulness is demonstrated by a rule-based morphological analyser cum generator and a computational grammar for Tamil, both of which have already been developed based on this paper.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: This paper provides an overview of the morphology and syntax of the Tamil
language, focusing on its contemporary usage. The paper also highlights the
complexity and richness of Tamil in terms of its morphological and syntactic
features, which will be useful for linguists analysing the language and
conducting comparative studies. In addition, the paper will be useful for those
developing computational resources for the Tamil language. Its usefulness is
demonstrated by a rule-based morphological analyser cum generator and a
computational grammar for Tamil, both of which have already been developed
based on this paper. To enhance accessibility
for a broader audience, the analysis is conducted without relying on any
specific grammatical formalism.
Related papers
- IruMozhi: Automatically classifying diglossia in Tamil [4.329125081222602]
Spoken Tamil is under-supported in modern NLP systems.
We release IruMozhi, a human-annotated dataset of parallel text in Literary and Spoken Tamil.
arXiv Detail & Related papers (2023-11-13T23:36:35Z)
- Teacher Perception of Automatically Extracted Grammar Concepts for L2 Language Learning [66.79173000135717]
We apply this work to teaching two Indian languages, Kannada and Marathi, which do not have well-developed resources for second language learning.
We extract descriptions from a natural text corpus that answer questions about morphosyntax (learning of word order, agreement, case marking, or word formation) and semantics (learning of vocabulary).
We enlist language educators from schools in North America to perform a manual evaluation; they find that the materials have potential for use in lesson preparation and learner evaluation.
arXiv Detail & Related papers (2023-10-27T18:17:29Z)
- Teacher Perception of Automatically Extracted Grammar Concepts for L2 Language Learning [91.49622922938681]
We present a framework that automatically discovers and visualizes descriptions of different aspects of grammar.
Specifically, we extract descriptions from a natural text corpus that answer questions about morphosyntax and semantics.
We apply this method for teaching the Indian languages, Kannada and Marathi, which, unlike English, do not have well-developed pedagogical resources.
arXiv Detail & Related papers (2022-06-10T14:52:22Z)
- Quantifying Synthesis and Fusion and their Impact on Machine Translation [79.61874492642691]
Work in Natural Language Processing (NLP) typically labels a whole language with a strict type of morphology, e.g. fusional or agglutinative.
In this work, we propose to reduce the rigidity of such claims, by quantifying morphological typology at the word and segment level.
For computing synthesis, we test unsupervised and supervised morphological segmentation methods for English, German and Turkish, whereas for fusion, we propose a semi-automatic method using Spanish as a case study.
Then, we analyse the relationship between machine translation quality and the degree of synthesis and fusion at word (nouns and verbs for English-Turkish,
arXiv Detail & Related papers (2022-05-06T17:04:58Z)
- Urdu Morphology, Orthography and Lexicon Extraction [0.0]
This paper describes an implementation of the Urdu language as a software API.
We deal with orthography, morphology and the extraction of the lexicon.
arXiv Detail & Related papers (2022-04-06T20:14:01Z)
- Morpheme Boundary Detection & Grammatical Feature Prediction for Gujarati : Dataset & Model [0.0]
We have used a Bi-Directional LSTM based approach to perform morpheme boundary detection and grammatical feature tagging.
This is the first dataset and morph analyzer model for the Gujarati language which performs both grammatical feature tagging and morpheme boundary detection tasks.
arXiv Detail & Related papers (2021-12-18T06:58:36Z)
- Evaluating the Morphosyntactic Well-formedness of Generated Texts [88.20502652494521]
We propose L'AMBRE -- a metric to evaluate the morphosyntactic well-formedness of text.
We show the effectiveness of our metric on the task of machine translation through a diachronic study of systems translating into morphologically-rich languages.
arXiv Detail & Related papers (2021-03-30T18:02:58Z)
- A Benchmark Corpus and Neural Approach for Sanskrit Derivative Nouns Analysis [0.755972004983746]
This paper presents the first benchmark corpus of Sanskrit Pratyaya (suffix) and inflectional words (padas) formed due to suffixes.
In this work, we prepared a Sanskrit suffix benchmark corpus called Pratyaya-Kosh to evaluate the performance of tools.
We also present our own neural approach for derivative noun analysis and evaluate it alongside the most prominent Sanskrit morphological analysis tools.
arXiv Detail & Related papers (2020-10-24T17:22:44Z)
- Linguistic Typology Features from Text: Inferring the Sparse Features of World Atlas of Language Structures [73.06435180872293]
We construct a recurrent neural network predictor based on byte embeddings and convolutional layers.
We show that some features from various linguistic types can be predicted reliably.
arXiv Detail & Related papers (2020-04-30T21:00:53Z)
- Evaluating Transformer-Based Multilingual Text Classification [55.53547556060537]
We argue that NLP tools perform unequally across languages with different syntactic and morphological structures.
We calculate word order and morphological similarity indices to aid our empirical study.
arXiv Detail & Related papers (2020-04-29T03:34:53Z)
- A Finite State Transducer Based Morphological Analyzer of Maithili Language [2.752817022620644]
We present a finite state transducer based inflectional morphological analyzer for Maithili, a resource-poor language of India.
Maithili is an eastern Indo-Aryan language spoken in the eastern and northern regions of Bihar in India and in the southeastern plains of Nepal, known as the Tarai.
arXiv Detail & Related papers (2020-02-29T11:00:15Z)