Context vs Target Word: Quantifying Biases in Lexical Semantic Datasets
- URL: http://arxiv.org/abs/2112.06733v1
- Date: Mon, 13 Dec 2021 15:37:05 GMT
- Title: Context vs Target Word: Quantifying Biases in Lexical Semantic Datasets
- Authors: Qianchu Liu, Diana McCarthy, Anna Korhonen
- Abstract summary: State-of-the-art contextualized models such as BERT use tasks such as WiC and WSD to evaluate their word-in-context representations.
This study presents the first quantitative analysis (using probing baselines) of the context-word interaction tested in major contextual lexical semantic tasks.
- Score: 18.754562380068815
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: State-of-the-art contextualized models such as BERT use tasks such as WiC and
WSD to evaluate their word-in-context representations. This inherently assumes
that performance in these tasks reflects how well a model represents the coupled
word and context semantics. This study investigates this assumption by
presenting the first quantitative analysis (using probing baselines) of the
context-word interaction tested in major contextual lexical semantic
tasks. Specifically, based on probing baseline performance, we propose
measures to calculate the degree of context or word biases in a dataset, and
plot existing datasets on a continuum. The analysis shows that most existing
datasets fall at the extreme ends of the continuum (i.e., they are either
heavily context-biased or target-word-biased), while only AM$^2$iCo and Sense
Retrieval challenge a model to represent both the context and target words. Our
case study on WiC reveals that human subjects do not share models' strong
context biases in the dataset (humans found semantic judgments much more
difficult when the target word was missing) and that models are learning
spurious correlations from context alone. This study demonstrates that models
are usually not being tested for word-in-context representations as such in
these tasks, and the results are therefore open to misinterpretation. We
recommend our framework as a sanity check for context and target-word biases
in future task design and applications in lexical semantics.
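The abstract does not spell out the bias measures, but the underlying idea, scoring a dataset by how far a context-only or word-only probing baseline closes the gap to a full-input model, can be sketched in a few lines. The Python sketch below is illustrative only: the function name bias_scores and the normalization over a random baseline are assumptions, not the paper's actual formulas.

```python
# Illustrative sketch only: the paper's exact bias measures are not
# given in the abstract. Here, "bias" is the fraction of the full
# model's headroom over a random baseline that a single-input
# probing baseline recovers on its own.

def bias_scores(context_only_acc, word_only_acc, random_acc, full_acc):
    """Return hypothetical context-bias and word-bias scores."""
    headroom = full_acc - random_acc
    if headroom <= 0:
        raise ValueError("full model must beat the random baseline")
    return {
        "context_bias": (context_only_acc - random_acc) / headroom,
        "word_bias": (word_only_acc - random_acc) / headroom,
    }

# Example: a WiC-like binary task (random baseline 0.5). A context-only
# probe that nearly matches the full model marks the dataset as
# heavily context-biased on the continuum.
print(bias_scores(context_only_acc=0.82, word_only_acc=0.55,
                  random_acc=0.50, full_acc=0.85))
```

By this reading, datasets such as AM$^2$iCo and Sense Retrieval would sit near the middle of the continuum, with both single-input scores well below the full model's.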
Related papers
- How Well Do Text Embedding Models Understand Syntax? [50.440590035493074]
The ability of text embedding models to generalize across a wide range of syntactic contexts remains under-explored.
Our findings reveal that existing text embedding models have not sufficiently addressed these syntactic understanding challenges.
We propose strategies to augment the generalization ability of text embedding models in diverse syntactic scenarios.
arXiv Detail & Related papers (2023-11-14T08:51:00Z)
- A Novel Multidimensional Reference Model For Heterogeneous Textual Datasets Using Context, Semantic And Syntactic Clues [4.453735522794044]
This study aims to produce a novel multidimensional reference model using categories for heterogeneous datasets.
The main contribution of the MRM is that it checks each token against each term based on an indexing of linguistic categories such as synonym, antonym, formal, lexical word order and co-occurrence.
arXiv Detail & Related papers (2023-11-10T17:02:25Z)
- Probing Physical Reasoning with Counter-Commonsense Context [34.8562766828087]
This study investigates how physical commonsense affects the contextualized size comparison task.
This dataset tests the ability of language models to predict the size relationship between objects under various contexts.
arXiv Detail & Related papers (2023-06-04T04:24:43Z)
- Syntax and Semantics Meet in the "Middle": Probing the Syntax-Semantics Interface of LMs Through Agentivity [68.8204255655161]
We present the semantic notion of agentivity as a case study for probing such interactions.
This suggests LMs may potentially serve as more useful tools for linguistic annotation, theory testing, and discovery.
arXiv Detail & Related papers (2023-05-29T16:24:01Z)
- Can NLP Models Correctly Reason Over Contexts that Break the Common Assumptions? [14.991565484636745]
We investigate the ability of NLP models to correctly reason over contexts that break the common assumptions.
We show that while doing fairly well on contexts that follow the common assumptions, the models struggle to correctly reason over contexts that break those assumptions.
Specifically, the performance gap is as high as 20 absolute percentage points.
arXiv Detail & Related papers (2023-05-20T05:20:37Z)
- Did the Cat Drink the Coffee? Challenging Transformers with Generalized Event Knowledge [59.22170796793179]
Transformer Language Models (TLMs) were tested on a benchmark for the dynamic estimation of thematic fit.
Our results show that TLMs can reach performance comparable to that achieved by SDM.
However, additional analysis consistently suggests that TLMs do not capture important aspects of event knowledge.
arXiv Detail & Related papers (2021-07-22T20:52:26Z)
- Words aren't enough, their order matters: On the Robustness of Grounding Visual Referring Expressions [87.33156149634392]
We critically examine RefCOCOg, a standard benchmark for visual referring expression recognition.
We show that 83.7% of test instances do not require reasoning on linguistic structure.
We propose two methods, one based on contrastive learning and the other based on multi-task learning, to increase the robustness of ViLBERT.
arXiv Detail & Related papers (2020-05-04T17:09:15Z)
- How Far are We from Effective Context Modeling? An Exploratory Study on Semantic Parsing in Context [59.13515950353125]
We present a grammar-based decoding semantic parser and adapt typical context modeling methods on top of it.
We evaluate 13 context modeling methods on two large cross-domain datasets, and our best model achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-02-03T11:28:10Z) - Don't Judge an Object by Its Context: Learning to Overcome Contextual
Bias [113.44471186752018]
Existing models often leverage co-occurrences between objects and their context to improve recognition accuracy.
This work focuses on addressing such contextual biases to improve the robustness of the learnt feature representations.
arXiv Detail & Related papers (2020-01-09T18:31:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.