Syntactic Variation Across the Grammar: Modelling a Complex Adaptive System
- URL: http://arxiv.org/abs/2309.11869v1
- Date: Thu, 21 Sep 2023 08:14:34 GMT
- Title: Syntactic Variation Across the Grammar: Modelling a Complex Adaptive System
- Authors: Jonathan Dunn
- Abstract summary: We model dialectal variation across 49 local populations of English speakers in 16 countries.
Results show that an important part of syntactic variation consists of interactions between different parts of the grammar.
New Zealand English could be more similar to Australian English in phrasal verbs but at the same time more similar to UK English in dative phrases.
- Score: 0.76146285961466
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While language is a complex adaptive system, most work on syntactic variation
observes a few individual constructions in isolation from the rest of the
grammar. This means that the grammar, a network which connects thousands of
structures at different levels of abstraction, is reduced to a few disconnected
variables. This paper quantifies the impact of such reductions by
systematically modelling dialectal variation across 49 local populations of
English speakers in 16 countries. We perform dialect classification using both
the entire grammar and isolated nodes within the grammar in order to
characterize the syntactic differences between these dialects. The results
show, first, that many individual nodes within the grammar are subject to
variation but, in isolation, none perform as well as the grammar as a whole.
This indicates that an important part of syntactic variation consists of
interactions between different parts of the grammar. Second, the results show
that the similarity between dialects depends heavily on the subset of the
grammar being observed: for example, New Zealand English could be more similar
to Australian English in phrasal verbs but at the same time more similar to UK
English in dative phrases.
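As a concrete illustration of the experimental design, the whole-grammar versus isolated-node comparison can be sketched as below, assuming documents are already encoded as construction-frequency vectors (one column per grammar node). The classifier choice, the synthetic data, and the node groupings are illustrative assumptions, not the paper's actual pipeline.

```python
# Minimal sketch: dialect classification with the full grammar vs. isolated nodes.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.poisson(2.0, size=(1000, 300)).astype(float)  # documents x grammar nodes (synthetic counts)
y = rng.integers(0, 49, size=1000)                    # 49 dialect labels (synthetic)

# (1) Classify with the entire grammar.
full_acc = cross_val_score(LinearSVC(), X, y, cv=5).mean()

# (2) Classify with each node in isolation (hypothetical column groupings).
node_groups = {"phrasal_verbs": [0, 1, 2], "dative_phrases": [3, 4]}
for name, cols in node_groups.items():
    node_acc = cross_val_score(LinearSVC(), X[:, cols], y, cv=5).mean()
    print(f"{name}: {node_acc:.3f} vs. full grammar {full_acc:.3f}")
```

Comparing the per-node accuracies against the full-grammar accuracy reproduces the shape of the paper's comparison, not its results.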
Related papers
- Syntactic Language Change in English and German: Metrics, Parsers, and Convergences [56.47832275431858]
The current paper looks at diachronic trends in syntactic language change in both English and German, using corpora of parliamentary debates from the last c. 160 years.
We base our observations on five dependency parsers, including the widely used Stanford CoreNLP as well as four newer alternatives.
We show that changes in syntactic measures seem to be more frequent at the tails of sentence length distributions.
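A hedged sketch of one such syntactic metric, assuming spaCy as the parser (the paper compares five parsers, which this sketch does not reproduce): mean dependency length per sentence, tracked across years.

```python
# Illustrative diachronic syntactic metric: mean dependency length,
# computed with spaCy as one possible parser.
import spacy

nlp = spacy.load("en_core_web_sm")

def mean_dependency_length(text: str) -> float:
    doc = nlp(text)
    # Distance in tokens between each word and its syntactic head (root excluded).
    dists = [abs(tok.i - tok.head.i) for tok in doc if tok.head is not tok]
    return sum(dists) / len(dists) if dists else 0.0

# Applied per year, trends in this metric approximate the kind of
# diachronic curves the paper studies (texts here are placeholders).
by_year = {1900: "The honourable member raised the question of trade.",
           2020: "Trade was raised."}
for year, speech in by_year.items():
    print(year, round(mean_dependency_length(speech), 2))
```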
arXiv Detail & Related papers (2024-02-18T11:46:16Z)
- DADA: Dialect Adaptation via Dynamic Aggregation of Linguistic Rules [64.93179829965072]
DADA is a modular approach to imbue SAE-trained models with multi-dialectal robustness.
We show that DADA is effective for both single-task and instruction fine-tuned language models.
arXiv Detail & Related papers (2023-05-22T18:43:31Z)
- Quantifying the Roles of Visual, Linguistic, and Visual-Linguistic Complexity in Verb Acquisition [8.183763443800348]
We employ visual and linguistic representations of words sourced from pre-trained artificial neural networks.
We find that the representation of verbs is generally more variable and less discriminable within domain than the representation of nouns.
Visual variability is the strongest factor that internally drives verb learning, followed by visual-linguistic alignment and linguistic variability.
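One plausible way to operationalize "variability within domain" is the mean pairwise cosine distance among a category's representations; the sketch below uses random placeholder vectors in place of the pre-trained visual or linguistic embeddings the paper draws on.

```python
# Variability of a word category measured as mean pairwise cosine distance.
import numpy as np

def mean_pairwise_cosine_distance(vectors: np.ndarray) -> float:
    normed = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = normed @ normed.T
    iu = np.triu_indices(len(vectors), k=1)  # unique pairs only
    return float(np.mean(1.0 - sims[iu]))

rng = np.random.default_rng(0)
verb_vecs = rng.normal(size=(50, 300))   # placeholder verb embeddings
noun_vecs = rng.normal(size=(50, 300))   # placeholder noun embeddings
print("verb variability:", mean_pairwise_cosine_distance(verb_vecs))
print("noun variability:", mean_pairwise_cosine_distance(noun_vecs))
```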
arXiv Detail & Related papers (2023-04-05T15:08:21Z)
- Cross-Linguistic Syntactic Difference in Multilingual BERT: How Good is It and How Does It Affect Transfer? [50.48082721476612]
Multilingual BERT (mBERT) has demonstrated considerable cross-lingual syntactic ability.
We investigate the distributions of grammatical relations induced from mBERT in the context of 24 typologically different languages.
arXiv Detail & Related papers (2022-12-21T09:44:08Z)
- Multi-VALUE: A Framework for Cross-Dialectal English NLP [49.55176102659081]
Multi-VALUE is a controllable rule-based translation system spanning 50 English dialects.
Stress tests reveal significant performance disparities for leading models on non-standard dialects.
We partner with native speakers of Chicano and Indian English to release new gold-standard variants of the popular CoQA task.
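In the spirit of a rule-based dialect translation system, a toy sketch of pattern-driven transformation follows; the two rules are invented for illustration and are not Multi-VALUE's actual, linguistically attested rules.

```python
# Toy rule-based dialect transformation: each rule is a (pattern, replacement)
# pair applied in sequence. Both rules here are hypothetical examples.
import re

TOY_RULES = [
    (re.compile(r"\bisn't it\b"), "innit"),                # hypothetical tag-question rule
    (re.compile(r"\byou all\b", re.IGNORECASE), "y'all"),  # hypothetical pronoun rule
]

def translate(sentence: str) -> str:
    for pattern, replacement in TOY_RULES:
        sentence = pattern.sub(replacement, sentence)
    return sentence

print(translate("You all agree, isn't it?"))  # -> "y'all agree, innit?"
```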
arXiv Detail & Related papers (2022-12-15T18:17:01Z)
- Stability of Syntactic Dialect Classification Over Space and Time [0.0]
This paper constructs a test set for 12 dialects of English that spans three years at monthly intervals with a fixed spatial distribution across 1,120 cities.
The decay rate of classification performance for each dialect over time allows us to identify regions undergoing syntactic change.
And the distribution of classification accuracy within dialect regions allows us to identify the degree to which the grammar of a dialect is internally heterogeneous.
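A minimal sketch of the decay-rate idea, assuming an exponential decay fit over monthly accuracies (the accuracy series below is fabricated for illustration only):

```python
# Fit an exponential decay to monthly classification accuracy; the decay
# constant k summarizes how quickly a dialect model goes stale.
import numpy as np
from scipy.optimize import curve_fit

months = np.arange(36)
accuracy = (0.90 * np.exp(-0.01 * months)
            + np.random.default_rng(0).normal(0, 0.005, 36))  # synthetic series

def decay(t, a, k):
    return a * np.exp(-k * t)

(a, k), _ = curve_fit(decay, months, accuracy, p0=(0.9, 0.01))
print(f"initial accuracy {a:.3f}, decay rate {k:.4f}/month")
# A larger k would flag a dialect region as undergoing faster syntactic change.
```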
arXiv Detail & Related papers (2022-09-11T23:14:59Z)
- Do Not Fire the Linguist: Grammatical Profiles Help Language Models Detect Semantic Change [6.7485485663645495]
We first compare the performance of grammatical profiles against that of a multilingual neural language model (XLM-R) on 10 datasets, covering 7 languages.
Our results show that ensembling grammatical profiles with XLM-R improves semantic change detection performance for most datasets and languages.
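One simple ensembling strategy consistent with this description, though not necessarily the paper's exact method, is to average rank-normalized change scores from the two systems:

```python
# Combine per-word semantic-change scores from two detectors by averaging
# their rank-normalized values. Scores below are placeholders.
import numpy as np
from scipy.stats import rankdata

profile_scores = np.array([0.12, 0.80, 0.33, 0.55])  # grammatical-profile scores
xlmr_scores = np.array([0.40, 0.70, 0.20, 0.90])     # XLM-R based scores

def rank_normalize(scores: np.ndarray) -> np.ndarray:
    return (rankdata(scores) - 1) / (len(scores) - 1)  # map ranks to [0, 1]

ensembled = (rank_normalize(profile_scores) + rank_normalize(xlmr_scores)) / 2
print(ensembled)  # higher = stronger evidence of semantic change
```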
arXiv Detail & Related papers (2022-04-12T11:20:42Z)
- Controlled Evaluation of Grammatical Knowledge in Mandarin Chinese Language Models [22.57309958548928]
We investigate whether structural supervision improves language models' ability to learn grammatical dependencies in typologically different languages.
We train LSTMs, Recurrent Neural Network Grammars, Transformer language models, and generative parsing models on datasets of different sizes.
We find suggestive evidence that structural supervision helps with representing syntactic state across intervening content and improves performance in low-data settings.
arXiv Detail & Related papers (2021-09-22T22:11:30Z)
- Word Frequency Does Not Predict Grammatical Knowledge in Language Models [2.1984302611206537]
We investigate whether there are systematic sources of variation in the language models' accuracy.
We find that certain nouns are systematically understood better than others, an effect which is robust across grammatical tasks and different language models.
We find that a novel noun's grammatical properties can be few-shot learned from various types of training data.
arXiv Detail & Related papers (2020-10-26T19:51:36Z)
- Mechanisms for Handling Nested Dependencies in Neural-Network Language Models and Humans [75.15855405318855]
We studied whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing.
Although the network was solely trained to predict the next word in a large corpus, analysis showed the emergence of specialized units that successfully handled local and long-distance syntactic agreement.
We tested the model's predictions in a behavioral experiment where humans detected violations in number agreement in sentences with systematic variations in the singular/plural status of multiple nouns.
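The behavioral paradigm has a natural computational analogue: compare the probability a language model assigns to the agreeing versus non-agreeing verb. The sketch below uses GPT-2 as a stand-in for the paper's network (an assumption; the paper's model and stimuli differ).

```python
# Targeted agreement test: the LM should assign a higher log-probability
# to the verb form that agrees with the subject "keys".
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def continuation_logprob(prefix: str, word: str) -> float:
    ids = tokenizer(prefix, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]  # next-token distribution
    word_id = tokenizer(" " + word).input_ids[0]  # first subtoken of continuation
    return torch.log_softmax(logits, dim=-1)[word_id].item()

prefix = "The keys to the cabinet"
print("are:", continuation_logprob(prefix, "are"))  # grammatical
print("is: ", continuation_logprob(prefix, "is"))   # agreement violation
```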
arXiv Detail & Related papers (2020-06-19T12:00:05Z)
- A Simple Joint Model for Improved Contextual Neural Lemmatization [60.802451210656805]
We present a simple joint neural model for lemmatization and morphological tagging that achieves state-of-the-art results on 20 languages.
Our paper describes the model in addition to training and decoding procedures.
arXiv Detail & Related papers (2019-04-04T02:03:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.