Usage-based learning of grammatical categories
- URL: http://arxiv.org/abs/2204.10201v1
- Date: Thu, 14 Apr 2022 07:44:25 GMT
- Title: Usage-based learning of grammatical categories
- Authors: Luc Steels, Paul Van Eecke, Katrien Beuls
- Abstract summary: This paper raises the question of how these categories can be acquired and where they come from.
We show that a categorial type network whose scores are based on success in language interactions leads to the spontaneous formation of grammatical categories.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human languages use a wide range of grammatical categories to constrain which
words or phrases can fill certain slots in grammatical patterns and to express
additional meanings, such as tense or aspect, through morpho-syntactic means.
These grammatical categories, which are most often language-specific and
change over time, are difficult to define and learn. This paper raises the
question of how these categories can be acquired and where they come from. We
explore a usage-based approach. This means that categories and grammatical
constructions are selected and aligned by their success in language
interactions. We report on a multi-agent experiment in which agents are endowed
with mechanisms for understanding and producing utterances as well as
mechanisms for expanding their inventories using a meta-level learning process
based on pro- and anti-unification. We show that a categorial type network
whose scores are based on success in language interactions leads to the
spontaneous formation of grammatical categories in tandem with the formation of
grammatical patterns.
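The score-based alignment dynamic described in the abstract can be illustrated with a minimal sketch. The snippet below is not the authors' implementation (their experiments rely on a meta-level learning process with pro- and anti-unification, which is not modelled here); it only sketches, under simplifying assumptions, how per-link scores between words and emergent categories might be reinforced after a successful interaction and laterally inhibited otherwise. The class and parameter names (`CategorialTypeNetwork`, `reward`, `inhibition`) are hypothetical.

```python
from collections import defaultdict

class CategorialTypeNetwork:
    """Hypothetical sketch (not the authors' code): words are linked to
    emergent categories, and link scores are adjusted based on
    communicative success in language interactions."""

    def __init__(self, reward=0.1, inhibition=0.1):
        self.scores = defaultdict(float)   # (word, category) -> score
        self.reward = reward
        self.inhibition = inhibition

    def best_category(self, word):
        """Return the highest-scoring category linked to a word, if any."""
        links = [(c, s) for (w, c), s in self.scores.items() if w == word]
        return max(links, key=lambda cs: cs[1])[0] if links else None

    def invent_category(self, word):
        """Create a fresh category when no usable link exists yet."""
        category = f"cat-{len(self.scores)}"
        self.scores[(word, category)] = 0.5
        return category

    def align(self, word, used_category, success):
        """Usage-based alignment: reinforce the used link on success and
        inhibit competing links for the same word; punish the used link
        on failure."""
        key = (word, used_category)
        if success:
            self.scores[key] = min(1.0, self.scores[key] + self.reward)
            for (w, c) in list(self.scores):
                if w == word and c != used_category:
                    self.scores[(w, c)] = max(0.0, self.scores[(w, c)] - self.inhibition)
        else:
            self.scores[key] = max(0.0, self.scores[key] - self.inhibition)
```

In one round of such a simulated interaction, a speaker would retrieve or invent a category for a slot filler, the hearer would attempt interpretation, and both agents would call `align` with the observed outcome; under these assumptions, repeated games would be expected to let one category per distributional class come to dominate, in tandem with the grammatical patterns that use it.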
Related papers
- Investigating Idiomaticity in Word Representations [9.208145117062339]
We focus on noun compounds of varying levels of idiomaticity in two languages (English and Portuguese).
We present a dataset of minimal pairs containing human idiomaticity judgments for each noun compound at both type and token levels.
We define a set of fine-grained metrics of Affinity and Scaled Similarity to determine how sensitive the models are to perturbations that may lead to changes in idiomaticity.
arXiv Detail & Related papers (2024-11-04T21:05:01Z)
- Principles of semantic and functional efficiency in grammatical patterning [1.6267479602370545]
Grammatical features such as number and gender serve two central functions in human languages.
Number and gender encode salient semantic attributes like numerosity and animacy, but offload sentence processing cost by predictably linking words together.
Grammars exhibit consistent organizational patterns across diverse languages, invariably rooted in a semantic foundation.
arXiv Detail & Related papers (2024-10-21T10:49:54Z)
- Review of Unsupervised POS Tagging and Its Implications on Language Acquisition [0.0]
An ability that underlies human syntactic knowledge is determining which words can appear in similar structures.
In exploring this process, we review various engineering approaches whose goal is similar to that of a child.
We will discuss common themes that support the advances in the models and their relevance for language acquisition.
arXiv Detail & Related papers (2023-12-15T19:31:00Z)
- Prompting Language Models for Linguistic Structure [73.11488464916668]
We present a structured prompting approach for linguistic structured prediction tasks.
We evaluate this approach on part-of-speech tagging, named entity recognition, and sentence chunking.
We find that while PLMs contain significant prior knowledge of task labels due to task leakage into the pretraining corpus, structured prompting can also retrieve linguistic structure with arbitrary labels.
arXiv Detail & Related papers (2022-11-15T01:13:39Z)
- CCPrefix: Counterfactual Contrastive Prefix-Tuning for Many-Class Classification [57.62886091828512]
We propose a brand-new prefix-tuning method, Counterfactual Contrastive Prefix-tuning (CCPrefix) for many-class classification.
Basically, an instance-dependent soft prefix, derived from fact-counterfactual pairs in the label space, is leveraged to complement the language verbalizers in many-class classification.
arXiv Detail & Related papers (2022-11-11T03:45:59Z)
- AUTOLEX: An Automatic Framework for Linguistic Exploration [93.89709486642666]
We propose an automatic framework that aims to ease linguists' discovery and extraction of concise descriptions of linguistic phenomena.
Specifically, we apply this framework to extract descriptions for three phenomena: morphological agreement, case marking, and word order.
We evaluate the descriptions with the help of language experts and propose a method for automated evaluation when human evaluation is infeasible.
arXiv Detail & Related papers (2022-03-25T20:37:30Z)
- Deep Subjecthood: Higher-Order Grammatical Features in Multilingual BERT [7.057643880514415]
We investigate how Multilingual BERT (mBERT) encodes grammar by examining how the high-order grammatical feature of morphosyntactic alignment is manifested across the embedding spaces of different languages.
arXiv Detail & Related papers (2021-01-26T19:21:59Z)
- Word Frequency Does Not Predict Grammatical Knowledge in Language Models [2.1984302611206537]
We investigate whether there are systematic sources of variation in the language models' accuracy.
We find that certain nouns are systematically understood better than others, an effect which is robust across grammatical tasks and different language models.
We find that a novel noun's grammatical properties can be few-shot learned from various types of training data.
arXiv Detail & Related papers (2020-10-26T19:51:36Z)
- Leveraging Adversarial Training in Self-Learning for Cross-Lingual Text Classification [52.69730591919885]
We present a semi-supervised adversarial training process that minimizes the maximal loss for label-preserving input perturbations.
We observe significant gains in effectiveness on document and intent classification for a diverse set of languages.
arXiv Detail & Related papers (2020-07-29T19:38:35Z)
- Bridging Linguistic Typology and Multilingual Machine Translation with Multi-View Language Representations [83.27475281544868]
We use singular vector canonical correlation analysis to study what kind of information is induced from each source.
We observe that our representations embed typology and strengthen correlations with language relationships.
We then take advantage of our multi-view language vector space for multilingual machine translation, where we achieve competitive overall translation accuracy.
arXiv Detail & Related papers (2020-04-30T16:25:39Z)
- A Benchmark for Systematic Generalization in Grounded Language Understanding [61.432407738682635]
Humans easily interpret expressions that describe unfamiliar situations composed from familiar parts.
Modern neural networks, by contrast, struggle to interpret novel compositions.
We introduce a new benchmark, gSCAN, for evaluating compositional generalization in situated language understanding.
arXiv Detail & Related papers (2020-03-11T08:40:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.