The Better Your Syntax, the Better Your Semantics? Probing Pretrained
Language Models for the English Comparative Correlative
- URL: http://arxiv.org/abs/2210.13181v1
- Date: Mon, 24 Oct 2022 13:01:24 GMT
- Title: The Better Your Syntax, the Better Your Semantics? Probing Pretrained
Language Models for the English Comparative Correlative
- Authors: Leonie Weissweiler, Valentin Hofmann, Abdullatif Köksal, Hinrich Schütze
- Abstract summary: Construction Grammar (CxG) is a paradigm from cognitive linguistics emphasising the connection between syntax and semantics.
We present an investigation of the capability of pretrained language models (PLMs) to classify and understand one of the most commonly studied constructions, the English comparative correlative (CC).
Our results show that all three investigated PLMs are able to recognise the structure of the CC but fail to use its meaning.
- Score: 7.03497683558609
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Construction Grammar (CxG) is a paradigm from cognitive linguistics
emphasising the connection between syntax and semantics. Rather than rules that
operate on lexical items, it posits constructions as the central building
blocks of language, i.e., linguistic units of different granularity that
combine syntax and semantics. As a first step towards assessing the
compatibility of CxG with the syntactic and semantic knowledge demonstrated by
state-of-the-art pretrained language models (PLMs), we present an investigation
of their capability to classify and understand one of the most commonly studied
constructions, the English comparative correlative (CC). We conduct experiments
examining the classification accuracy of a syntactic probe on the one hand and
the models' behaviour in a semantic application task on the other, with BERT,
RoBERTa, and DeBERTa as the example PLMs. Our results show that all three
investigated PLMs are able to recognise the structure of the CC but fail to use
its meaning. While human-like performance of PLMs on many NLP tasks has been
alleged, this indicates that PLMs still suffer from substantial shortcomings in
central domains of linguistic knowledge.
Related papers
- Interpretability of Language Models via Task Spaces [14.543168558734001]
We present an alternative approach to interpreting language models (LMs).
We focus on the quality of LM processing, in particular their language abilities.
We construct 'linguistic task spaces' that shed light on the connections LMs draw between language phenomena.
arXiv Detail & Related papers (2024-06-10T16:34:30Z)
- Holmes: A Benchmark to Assess the Linguistic Competence of Language Models [59.627729608055006]
We introduce Holmes, a new benchmark designed to assess the linguistic competence of language models (LMs).
We use computation-based probing to examine LMs' internal representations regarding distinct linguistic phenomena.
As a result, we meet recent calls to disentangle LMs' linguistic competence from other cognitive abilities.
arXiv Detail & Related papers (2024-04-29T17:58:36Z)
- Probing LLMs for Joint Encoding of Linguistic Categories [10.988109020181563]
We propose a framework for testing the joint encoding of linguistic categories in Large Language Models (LLMs).
We find evidence of joint encoding both at the same (related part-of-speech (POS) classes) and different (POS classes and related syntactic dependency relations) levels of linguistic hierarchy.
arXiv Detail & Related papers (2023-10-28T12:46:40Z)
- Syntax and Semantics Meet in the "Middle": Probing the Syntax-Semantics Interface of LMs Through Agentivity [68.8204255655161]
We present the semantic notion of agentivity as a case study for probing such interactions.
This suggests LMs may potentially serve as more useful tools for linguistic annotation, theory testing, and discovery.
arXiv Detail & Related papers (2023-05-29T16:24:01Z)
- Simple Linguistic Inferences of Large Language Models (LLMs): Blind Spots and Blinds [59.71218039095155]
We evaluate language understanding capacities on simple inference tasks that most humans find trivial.
We target (i) grammatically-specified entailments, (ii) premises with evidential adverbs of uncertainty, and (iii) monotonicity entailments.
The models exhibit moderate to low performance on these evaluation sets.
arXiv Detail & Related papers (2023-05-24T06:41:09Z)
- Embracing Ambiguity: Improving Similarity-oriented Tasks with Contextual Synonym Knowledge [30.010315144903885]
Contextual synonym knowledge is crucial for similarity-oriented tasks.
Most Pre-trained Language Models (PLMs) lack synonym knowledge due to inherent limitations of their pre-training objectives.
We propose PICSO, a flexible framework that supports the injection of contextual synonym knowledge from multiple domains into PLMs.
arXiv Detail & Related papers (2022-11-20T15:25:19Z)
- Improving Pre-trained Language Models with Syntactic Dependency Prediction Task for Chinese Semantic Error Recognition [52.55136323341319]
Existing Chinese text error detection mainly focuses on spelling and simple grammatical errors.
Chinese semantic errors are understudied and more complex, to the point that humans cannot easily recognize them.
arXiv Detail & Related papers (2022-04-15T13:55:32Z)
- Integrating Language Guidance into Vision-based Deep Metric Learning [78.18860829585182]
We propose to learn metric spaces which encode semantic similarities as embedding space distances.
These spaces should be transferable to classes beyond those seen during training.
This causes learned embedding spaces to encode incomplete semantic context and misrepresent the semantic relation between classes.
arXiv Detail & Related papers (2022-03-16T11:06:50Z)
- Controlled Evaluation of Grammatical Knowledge in Mandarin Chinese Language Models [22.57309958548928]
We investigate whether structural supervision improves language models' ability to learn grammatical dependencies in typologically different languages.
We train LSTMs, Recurrent Neural Network Grammars, Transformer language models, and generative parsing models on datasets of different sizes.
We find suggestive evidence that structural supervision helps with representing syntactic state across intervening content and improves performance in low-data settings.
arXiv Detail & Related papers (2021-09-22T22:11:30Z)
- ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning [97.10875695679499]
We propose a novel contrastive learning framework named ERICA, applied in the pre-training phase, to obtain a deeper understanding of the entities and their relations in text.
Experimental results demonstrate that our proposed ERICA framework achieves consistent improvements on several document-level language understanding tasks.
arXiv Detail & Related papers (2020-12-30T03:35:22Z)
- Influence Paths for Characterizing Subject-Verb Number Agreement in LSTM Language Models [22.826154706036995]
LSTM-based recurrent neural networks are the state-of-the-art for many natural language processing (NLP) tasks.
Lacking this understanding, the generality of LSTM performance on this task and their suitability for related tasks remains uncertain.
We introduce *influence paths*, a causal account of structural properties as carried by paths across gates and neurons of a recurrent neural network.
arXiv Detail & Related papers (2020-05-03T21:10:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.