Understanding the Use of Quantifiers in Mandarin
- URL: http://arxiv.org/abs/2209.11977v1
- Date: Sat, 24 Sep 2022 10:43:07 GMT
- Title: Understanding the Use of Quantifiers in Mandarin
- Authors: Guanyi Chen, Kees van Deemter
- Abstract summary: We introduce a corpus of short texts in Mandarin, in which quantified expressions figure prominently.
We examine the hypothesis that speakers of East Asian languages speak more briefly but less informatively than speakers of West-European languages.
- Score: 7.249126423531564
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a corpus of short texts in Mandarin, in which quantified
expressions figure prominently. We illustrate the significance of the corpus by
examining the hypothesis (known as Huang's "coolness" hypothesis) that speakers
of East Asian languages tend to speak more briefly but less informatively than,
for example, speakers of West-European languages. The corpus results from an
elicitation experiment in which participants were asked to describe abstract
visual scenes. We compare the resulting corpus, called MQTUNA, with an English
corpus that was collected using the same experimental paradigm. The comparison
reveals that some, though not all, aspects of quantifier use support the
above-mentioned hypothesis. Implications of these findings for the generation
of quantified noun phrases are discussed.
Related papers
- To Drop or Not to Drop? Predicting Argument Ellipsis Judgments: A Case Study in Japanese [26.659122101710068]
We study whether and why a particular argument should be omitted, across over 2,000 data points in a balanced corpus of Japanese.
The data indicate that native speakers overall share common criteria for such judgments.
The gap between the systems' prediction and human judgments in specific linguistic aspects is revealed.
arXiv Detail & Related papers (2024-04-17T12:26:52Z)
- Computational Modelling of Plurality and Definiteness in Chinese Noun Phrases [13.317456093426808]
We focus on the omission of plurality and definiteness markers in Chinese noun phrases (NPs).
We build a corpus of Chinese NPs, each of which is accompanied by its corresponding context, and by labels indicating its singularity/plurality and definiteness/indefiniteness.
We train a bank of computational models using both classic machine learning models and state-of-the-art pre-trained language models to predict the plurality and definiteness of each NP.
arXiv Detail & Related papers (2024-03-07T10:06:54Z) - Discourse Representation Structure Parsing for Chinese [8.846860617823005]
We explore the feasibility of Chinese semantic parsing in the absence of labeled data for Chinese meaning representations.
We propose a test suite designed explicitly for Chinese semantic parsing, which provides fine-grained evaluation for parsing performance.
Our experimental results show that the difficulty of Chinese semantic parsing is mainly caused by adverbs.
arXiv Detail & Related papers (2023-06-16T09:47:45Z) - BabySLM: language-acquisition-friendly benchmark of self-supervised
spoken language models [56.93604813379634]
Self-supervised techniques for learning speech representations have been shown to develop linguistic competence from exposure to speech without the need for human labels.
We propose a language-acquisition-friendly benchmark to probe spoken language models at the lexical and syntactic levels.
We highlight two exciting challenges that need to be addressed for further progress: bridging the gap between text and speech and between clean speech and in-the-wild speech.
arXiv Detail & Related papers (2023-06-02T12:54:38Z) - Large Language Models Are Partially Primed in Pronoun Interpretation [6.024776891570197]
Recent psycholinguistic studies suggest that humans adapt their referential biases with recent exposure to referential patterns.
We investigate whether large language models (LLMs) display similar human-like referential biases, using stimuli and procedures from real psycholinguistic experiments.
We find that InstructGPT adapts its pronominal interpretations in response to the frequency of referential patterns in the local discourse.
arXiv Detail & Related papers (2023-05-26T13:30:48Z) - Natural Language Decompositions of Implicit Content Enable Better Text
Representations [56.85319224208865]
We introduce a method for the analysis of text that takes implicitly communicated content explicitly into account.
We use a large language model to produce sets of propositions that are inferentially related to the text that has been observed.
Our results suggest that modeling the meanings behind observed language, rather than the literal text alone, is a valuable direction for NLP.
arXiv Detail & Related papers (2023-05-23T23:45:20Z) - Prompting Large Language Model for Machine Translation: A Case Study [87.88120385000666]
We offer a systematic study on prompting strategies for machine translation.
We examine factors for prompt template and demonstration example selection.
We explore the use of monolingual data and the feasibility of cross-lingual, cross-domain, and sentence-to-document transfer learning.
arXiv Detail & Related papers (2023-01-17T18:32:06Z) - Testing the Ability of Language Models to Interpret Figurative Language [69.59943454934799]
Figurative and metaphorical language are commonplace in discourse.
It remains an open question to what extent modern language models can interpret nonliteral phrases.
We introduce Fig-QA, a Winograd-style nonliteral language understanding task.
arXiv Detail & Related papers (2022-04-26T23:42:22Z) - Explaining Latent Representations with a Corpus of Examples [72.50996504722293]
We propose SimplEx: a user-centred method that provides example-based explanations with reference to a freely selected corpus of examples.
SimplEx uses the corpus to improve the user's understanding of the latent space with post-hoc explanations.
We show that SimplEx empowers the user by highlighting relevant patterns in the corpus that explain model representations.
arXiv Detail & Related papers (2021-10-28T17:59:06Z) - Investigating Cross-Linguistic Adjective Ordering Tendencies with a
Latent-Variable Model [66.84264870118723]
We present the first purely corpus-driven model of multilingual adjective ordering in the form of a latent-variable model.
We provide strong converging evidence for the existence of universal, cross-linguistic, hierarchical adjective ordering tendencies.
arXiv Detail & Related papers (2020-10-09T18:27:55Z)
- A Corpus of Adpositional Supersenses for Mandarin Chinese [15.757892250956715]
This paper presents a corpus in which all adpositions have been semantically annotated in Mandarin Chinese.
Our approach adapts a framework that defines a general set of supersenses according to ostensibly language-independent semantic criteria.
We find that the supersense categories are well-suited to Chinese adpositions despite syntactic differences from English.
arXiv Detail & Related papers (2020-03-18T18:59:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.