BERT Knows Punta Cana is not just beautiful, it's gorgeous: Ranking
Scalar Adjectives with Contextualised Representations
- URL: http://arxiv.org/abs/2010.02686v1
- Date: Tue, 6 Oct 2020 13:05:47 GMT
- Title: BERT Knows Punta Cana is not just beautiful, it's gorgeous: Ranking
Scalar Adjectives with Contextualised Representations
- Authors: Aina Garí Soler, Marianna Apidianaki
- Abstract summary: We propose a novel BERT-based approach to intensity detection for scalar adjectives.
We model intensity by vectors directly derived from contextualised representations and show they can successfully rank scalar adjectives.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adjectives like pretty, beautiful and gorgeous describe positive properties
of the nouns they modify but with different intensity. These differences are
important for natural language understanding and reasoning. We propose a novel
BERT-based approach to intensity detection for scalar adjectives. We model
intensity by vectors directly derived from contextualised representations and
show they can successfully rank scalar adjectives. We evaluate our models both
intrinsically, on gold standard datasets, and on an Indirect Question Answering
task. Our results demonstrate that BERT encodes rich knowledge about the
semantics of scalar adjectives, and is able to provide better quality intensity
rankings than static embeddings and previous models with access to dedicated
resources.
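The ranking method described in the abstract can be illustrated with a minimal sketch. This is not the authors' released code: the toy 4-dimensional vectors below are hypothetical stand-ins for BERT's contextualised representations (in the paper, intensity is modelled by vectors derived from contextualised embeddings of the adjectives). The sketch derives an intensity direction as the difference between an extreme and a mild scale member, then ranks adjectives by their projection onto it.

```python
import numpy as np

# Hypothetical stand-ins for averaged contextualised embeddings;
# with real BERT, each vector would come from the model's hidden states.
embeddings = {
    "pretty":    np.array([0.9, 0.1, 0.3, 0.2]),
    "beautiful": np.array([0.8, 0.4, 0.5, 0.3]),
    "gorgeous":  np.array([0.7, 0.9, 0.6, 0.4]),
}

def intensity_direction(mild: str, extreme: str) -> np.ndarray:
    """Unit difference vector pointing from a mild to an extreme scale member."""
    d = embeddings[extreme] - embeddings[mild]
    return d / np.linalg.norm(d)

def rank_scale(adjectives, mild, extreme):
    """Order adjectives from mildest to strongest by projection onto the axis."""
    axis = intensity_direction(mild, extreme)
    return sorted(adjectives, key=lambda a: float(embeddings[a] @ axis))

print(rank_scale(["gorgeous", "pretty", "beautiful"], "pretty", "gorgeous"))
# → ['pretty', 'beautiful', 'gorgeous']
```

With these toy vectors the projection recovers the expected ordering pretty < beautiful < gorgeous; in the paper, such rankings are evaluated against gold standard datasets and on an Indirect Question Answering task.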
Related papers
- Conjugated Semantic Pool Improves OOD Detection with Pre-trained Vision-Language Models [70.82728812001807]
A straightforward pipeline for zero-shot out-of-distribution (OOD) detection involves selecting potential OOD labels from an extensive semantic pool.
We theorize that enhancing performance requires expanding the semantic pool.
We show that expanding OOD label candidates with the Conjugated Semantic Pool (CSP) satisfies these requirements and outperforms existing works by 7.89% in FPR95.
arXiv Detail & Related papers (2024-10-11T08:24:11Z)
- Probing Large Language Models for Scalar Adjective Lexical Semantics and Scalar Diversity Pragmatics [11.79778723980276]
We probe different families of Large Language Models for their knowledge of the lexical semantics of scalar adjectives.
We find that they encode rich lexical-semantic information about scalar adjectives.
We also compare current models of different sizes and complexities and find that larger models are not always better.
arXiv Detail & Related papers (2024-04-04T08:52:25Z)
- Instruction-following Evaluation through Verbalizer Manipulation [64.73188776428799]
We propose a novel instruction-following evaluation protocol called verbalizer manipulation.
It instructs the model to verbalize the task label with words aligning with model priors to different extents.
We observe that the instruction-following abilities of models, across different families and scales, are significantly distinguished by their performance on less natural verbalizers.
arXiv Detail & Related papers (2023-07-20T03:54:24Z)
- Not wacky vs. definitely wacky: A study of scalar adverbs in pretrained language models [0.0]
Modern pretrained language models, such as BERT, RoBERTa and GPT-3, hold the promise of performing better on logical tasks than classic static word embeddings.
We investigate the extent to which BERT, RoBERTa, GPT-2 and GPT-3 exhibit general, human-like knowledge of these common scalar adverbs.
We find that despite capturing some aspects of logical meaning, the models fall far short of human performance.
arXiv Detail & Related papers (2023-05-25T18:56:26Z)
- Visualizing the Obvious: A Concreteness-based Ensemble Model for Noun Property Prediction [34.37730333491428]
Properties of nouns are more challenging to extract than other types of knowledge because they are rarely explicitly stated in texts.
We propose to extract these properties from images and use them in an ensemble model, in order to complement the information that is extracted from language models.
Our results show that the proposed combination of text and images greatly improves noun property prediction compared to powerful text-based language models.
arXiv Detail & Related papers (2022-10-24T01:25:21Z)
- Disentangled Action Recognition with Knowledge Bases [77.77482846456478]
We aim to improve the generalization ability of the compositional action recognition model to novel verbs or novel nouns.
Previous work utilizes verb-noun compositional action nodes in the knowledge graph, which makes it inefficient to scale.
We propose our approach: Disentangled Action Recognition with Knowledge-bases (DARK), which leverages the inherent compositionality of actions.
arXiv Detail & Related papers (2022-07-04T20:19:13Z)
- On Guiding Visual Attention with Language Specification [76.08326100891571]
We use high-level language specification as advice for constraining the classification evidence to task-relevant features, instead of distractors.
We show that supervising spatial attention in this way improves performance on classification tasks with biased and noisy data.
arXiv Detail & Related papers (2022-02-17T22:40:19Z)
- ALL Dolphins Are Intelligent and SOME Are Friendly: Probing BERT for Nouns' Semantic Properties and their Prototypicality [4.915907527975786]
We probe BERT (Devlin et al.) for the semantic properties of English nouns as expressed by adjectives that do not restrict the reference scope.
We base our study on psycholinguistics datasets that capture the association strength between nouns and their semantic features.
We show that when tested in a fine-tuning setting addressing entailment, BERT successfully leverages the information needed for reasoning about the meaning of adjective-noun constructions.
arXiv Detail & Related papers (2021-10-12T21:43:37Z)
- Scalar Adjective Identification and Multilingual Ranking [4.915907527975786]
We introduce a new multilingual dataset in order to promote research on scalar adjectives in new languages.
We perform a series of experiments and set performance baselines on this dataset, using monolingual and multilingual contextual language models.
We introduce a new binary classification task for English scalar adjective identification.
arXiv Detail & Related papers (2021-05-03T21:32:41Z)
- Investigating Cross-Linguistic Adjective Ordering Tendencies with a Latent-Variable Model [66.84264870118723]
We present the first purely corpus-driven model of multilingual adjective ordering in the form of a latent-variable model.
We provide strong converging evidence for the existence of universal, cross-linguistic, hierarchical adjective ordering tendencies.
arXiv Detail & Related papers (2020-10-09T18:27:55Z)
- BURT: BERT-inspired Universal Representation from Twin Structure [89.82415322763475]
BURT (BERT inspired Universal Representation from Twin Structure) is capable of generating universal, fixed-size representations for input sequences of any granularity.
Our proposed BURT adopts a Siamese network, learning sentence-level representations from a natural language inference dataset and word/phrase-level representations from a paraphrasing dataset.
We evaluate BURT across different granularities of text similarity tasks, including STS tasks, SemEval2013 Task 5(a) and some commonly used word similarity tasks.
arXiv Detail & Related papers (2020-04-29T04:01:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.