The correlation between nativelike selection and prototypicality: a multilingual onomasiological case study using semantic embedding
- URL: http://arxiv.org/abs/2405.13529v1
- Date: Wed, 22 May 2024 10:55:26 GMT
- Title: The correlation between nativelike selection and prototypicality: a multilingual onomasiological case study using semantic embedding
- Authors: Huasheng Zhang
- Abstract summary: This study examines the possibility of analyzing the semantic motivation and deducibility behind some nativelike selection (NLS).
To account for the NLS in question, cluster analysis and behavioral profile analysis are conducted to uncover a language-specific prototype for the Chinese verb shang 'harm'.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In native speakers' lexical choices, a concept can be more readily expressed by one expression over another grammatical one, a phenomenon known as nativelike selection (NLS). In previous research, arbitrary chunks such as collocations have been considered crucial for this phenomenon. However, this study examines the possibility of analyzing the semantic motivation and deducibility behind some NLSs by exploring the correlation between NLS and prototypicality, specifically the onomasiological hypothesis of Grondelaers and Geeraerts (2003, Towards a pragmatic model of cognitive onomasiology. In Hubert Cuyckens, René Dirven & John R. Taylor (eds.), Cognitive approaches to lexical semantics, 67-92. Berlin: De Gruyter Mouton). They hypothesized that "[a] referent is more readily named by a lexical item if it is a salient member of the category denoted by that item". To provide a preliminary investigation of this important but rarely explored phenomenon, a series of innovative methods and procedures, including the use of semantic embedding and interlingual comparisons, is designed. Specifically, potential NLSs are efficiently discovered through an automatic exploratory analysis using topic modeling techniques, and then confirmed by manual inspection through frame semantics. Finally, to account for the NLS in question, cluster analysis and behavioral profile analysis are conducted to uncover a language-specific prototype for the Chinese verb shang 'harm', providing supporting evidence for the correlation between NLS and prototypicality.
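The link between embeddings and prototypicality described in the abstract can be illustrated with a minimal sketch. It is not the author's pipeline: it assumes that each usage of a word has already been mapped to an embedding vector, and it simply treats usages closer to the category centroid as more prototypical. The function names, usage labels, and toy 3-dimensional vectors below are all invented for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def rank_by_prototypicality(usages):
    """Sort usages by similarity to the category centroid, highest first.

    `usages` maps a hypothetical usage label to its embedding vector.
    """
    center = centroid(list(usages.values()))
    scored = [(label, cosine(vec, center)) for label, vec in usages.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data (assumption): three usages of one verb with made-up embeddings.
toy_usages = {
    "usage_a": [0.9, 0.1, 0.0],
    "usage_b": [0.8, 0.2, 0.1],
    "usage_c": [0.1, 0.9, 0.3],
}

ranking = rank_by_prototypicality(toy_usages)
for label, score in ranking:
    print(f"{label}: {score:.3f}")
```

In this toy setup the usage nearest the centroid ranks first and the outlying usage ranks last; a real study would use embeddings from a trained model and a proper clustering procedure rather than a single centroid.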
Related papers
- Large Language Models as Neurolinguistic Subjects: Identifying Internal Representations for Form and Meaning [49.60849499134362]
This study investigates the linguistic understanding of Large Language Models (LLMs) regarding signifier (form) and signified (meaning).
Traditional psycholinguistic evaluations often reflect statistical biases that may misrepresent LLMs' true linguistic capabilities.
We introduce a neurolinguistic approach, utilizing a novel method that combines minimal pair and diagnostic probing to analyze activation patterns across model layers.
arXiv Detail & Related papers (2024-11-12T04:16:44Z)
- Holmes: A Benchmark to Assess the Linguistic Competence of Language Models [59.627729608055006]
We introduce Holmes, a new benchmark designed to assess the linguistic competence of language models (LMs).
We use computation-based probing to examine LMs' internal representations regarding distinct linguistic phenomena.
As a result, we meet recent calls to disentangle LMs' linguistic competence from other cognitive abilities.
arXiv Detail & Related papers (2024-04-29T17:58:36Z)
- Large Language Models Are Partially Primed in Pronoun Interpretation [6.024776891570197]
We investigate whether large language models (LLMs) display human-like referential biases using stimuli and procedures from real psycholinguistic experiments.
Recent psycholinguistic studies suggest that humans adapt their referential biases with recent exposure to referential patterns.
We find that InstructGPT adapts its pronominal interpretations in response to the frequency of referential patterns in the local discourse.
arXiv Detail & Related papers (2023-05-26T13:30:48Z)
- Interventional Probing in High Dimensions: An NLI Case Study [2.1028463367241033]
Probing strategies have been shown to detect semantic features intermediate to the "natural logic" fragment of the Natural Language Inference task (NLI).
In this work, we carry out new and existing representation-level interventions to investigate the effect of these semantic features on NLI classification.
arXiv Detail & Related papers (2023-04-20T14:34:31Z)
- Testing Pre-trained Language Models' Understanding of Distributivity via Causal Mediation Analysis [13.07356367140208]
We introduce DistNLI, a new diagnostic dataset for natural language inference.
We find that the extent of models' understanding is associated with model size and vocabulary size.
arXiv Detail & Related papers (2022-09-11T00:33:28Z)
- A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z)
- Schrödinger's Tree -- On Syntax and Neural Language Models [10.296219074343785]
Language models have emerged as NLP's workhorse, displaying increasingly fluent generation capabilities.
We observe a lack of clarity across numerous dimensions, which influences the hypotheses that researchers form.
We outline the implications of the different types of research questions exhibited in studies on syntax.
arXiv Detail & Related papers (2021-10-17T18:25:23Z)
- A Comparative Study of Lexical Substitution Approaches based on Neural Language Models [117.96628873753123]
We present a large-scale comparative study of popular neural language and masked language models.
We show that already competitive results achieved by SOTA LMs/MLMs can be further improved if information about the target word is injected properly.
arXiv Detail & Related papers (2020-05-29T18:43:22Z)
- A Matter of Framing: The Impact of Linguistic Formalism on Probing Results [69.36678873492373]
Deep pre-trained contextualized encoders like BERT (Devlin et al.) demonstrate remarkable performance on a range of downstream tasks.
Recent research in probing investigates the linguistic knowledge implicitly learned by these models during pre-training.
Can the choice of formalism affect probing results?
We find linguistically meaningful differences in the encoding of semantic role- and proto-role information by BERT depending on the formalism.
arXiv Detail & Related papers (2020-04-30T17:45:16Z)
- Where New Words Are Born: Distributional Semantic Analysis of Neologisms and Their Semantic Neighborhoods [51.34667808471513]
We investigate the importance of two factors, semantic sparsity and frequency growth rates of semantic neighbors, formalized in the distributional semantics paradigm.
We show that both factors are predictive of word emergence, although we find more support for the latter hypothesis.
arXiv Detail & Related papers (2020-01-21T19:09:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.