Scalar Adjective Identification and Multilingual Ranking
- URL: http://arxiv.org/abs/2105.01180v1
- Date: Mon, 3 May 2021 21:32:41 GMT
- Title: Scalar Adjective Identification and Multilingual Ranking
- Authors: Aina Garí Soler and Marianna Apidianaki
- Abstract summary: We introduce a new multilingual dataset in order to promote research on scalar adjectives in new languages.
We perform a series of experiments and set performance baselines on this dataset, using monolingual and multilingual contextual language models.
We introduce a new binary classification task for English scalar adjective identification.
- Score: 4.915907527975786
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The intensity relationship that holds between scalar adjectives (e.g., nice <
great < wonderful) is highly relevant for natural language inference and
common-sense reasoning. Previous research on scalar adjective ranking has
focused on English, mainly due to the availability of datasets for evaluation.
We introduce a new multilingual dataset in order to promote research on scalar
adjectives in new languages. We perform a series of experiments and set
performance baselines on this dataset, using monolingual and multilingual
contextual language models. Additionally, we introduce a new binary
classification task for English scalar adjective identification which examines
the models' ability to distinguish scalar from relational adjectives. We probe
contextualised representations and report baseline results for future
comparison on this task.
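As a rough illustration of what evaluating a scalar adjective ranking can look like, the sketch below scores a predicted ordering against a gold scale such as nice < great < wonderful using pairwise accuracy. The abstract does not say which metrics the authors use, so the choice of metric is an assumption here; the function name `pairwise_accuracy` and the toy scores are hypothetical and are not results from the paper.

```python
from itertools import combinations


def pairwise_accuracy(gold_order, scores):
    """Fraction of adjective pairs whose predicted order matches the gold order.

    gold_order: adjectives listed from weakest to strongest.
    scores: hypothetical mapping from adjective to a predicted intensity score.
    """
    # combinations() preserves list order, so each pair is (weaker, stronger).
    pairs = list(combinations(gold_order, 2))
    correct = sum(1 for weaker, stronger in pairs if scores[weaker] < scores[stronger])
    return correct / len(pairs)


gold = ["nice", "great", "wonderful"]
# Made-up scores for illustration only; a real system would derive them
# from a model (for example, from contextualised representations).
toy_scores = {"nice": 0.2, "great": 0.7, "wonderful": 0.5}
print(pairwise_accuracy(gold, toy_scores))
```

With these toy scores, two of the three pairs are ordered correctly; only the pair great/wonderful is inverted.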
Related papers
- GradSim: Gradient-Based Language Grouping for Effective Multilingual Training [13.730907708289331]
We propose GradSim, a language grouping method based on gradient similarity.
Our experiments on three diverse multilingual benchmark datasets show that it leads to the largest performance gains.
Besides linguistic features, the topics of the datasets play an important role for language grouping.
arXiv Detail & Related papers (2023-10-23T18:13:37Z)
- Syntax and Semantics Meet in the "Middle": Probing the Syntax-Semantics Interface of LMs Through Agentivity [68.8204255655161]
We present the semantic notion of agentivity as a case study for probing such interactions.
This suggests LMs may potentially serve as more useful tools for linguistic annotation, theory testing, and discovery.
arXiv Detail & Related papers (2023-05-29T16:24:01Z) - Neural Label Search for Zero-Shot Multi-Lingual Extractive Summarization [80.94424037751243]
In zero-shot multilingual extractive text summarization, a model is typically trained on an English dataset and then applied to summarization datasets in other languages.
We propose NLS (Neural Label Search for Summarization), which jointly learns hierarchical weights for different sets of labels together with our summarization model.
We conduct multilingual zero-shot summarization experiments on MLSUM and WikiLingua datasets, and we achieve state-of-the-art results using both human and automatic evaluations.
arXiv Detail & Related papers (2022-04-28T14:02:16Z)
- A Data Bootstrapping Recipe for Low Resource Multilingual Relation Classification [38.83366564843953]
IndoRE is a dataset with 21K entity and relation tagged gold sentences in three Indian languages, plus English.
We start with a multilingual BERT (mBERT) based system that captures entity span positions and type information.
We study the accuracy-efficiency tradeoff between expensive gold instances and translated and aligned 'silver' instances.
arXiv Detail & Related papers (2021-10-18T18:40:46Z)
- XL-WiC: A Multilingual Benchmark for Evaluating Semantic Contextualization [98.61159823343036]
We present the Word-in-Context dataset (WiC) for assessing the ability to correctly model distinct meanings of a word.
We put forward a large multilingual benchmark, XL-WiC, featuring gold standards in 12 new languages.
Experimental results show that even when no tagged instances are available for a target language, models trained solely on the English data can attain competitive performance.
arXiv Detail & Related papers (2020-10-13T15:32:00Z)
- Investigating Cross-Linguistic Adjective Ordering Tendencies with a Latent-Variable Model [66.84264870118723]
We present the first purely corpus-driven model of multi-lingual adjective ordering in the form of a latent-variable model.
We provide strong converging evidence for the existence of universal, cross-linguistic, hierarchical adjective ordering tendencies.
arXiv Detail & Related papers (2020-10-09T18:27:55Z)
- BERT Knows Punta Cana is not just beautiful, it's gorgeous: Ranking Scalar Adjectives with Contextualised Representations [6.167728295758172]
We propose a novel BERT-based approach to intensity detection for scalar adjectives.
We model intensity by vectors directly derived from contextualised representations and show they can successfully rank scalar adjectives.
arXiv Detail & Related papers (2020-10-06T13:05:47Z)
- On the Language Neutrality of Pre-trained Multilingual Representations [70.93503607755055]
We investigate the language-neutrality of multilingual contextual embeddings directly and with respect to lexical semantics.
Our results show that contextual embeddings are more language-neutral and, in general, more informative than aligned static word-type embeddings.
We show how to reach state-of-the-art accuracy on language identification and match the performance of statistical methods for word alignment of parallel sentences.
arXiv Detail & Related papers (2020-04-09T19:50:32Z)
- On the Importance of Word Order Information in Cross-lingual Sequence Labeling [80.65425412067464]
Cross-lingual models that fit into the word order of the source language might fail to handle target languages.
We investigate whether making models insensitive to the word order of the source language can improve the adaptation performance in target languages.
arXiv Detail & Related papers (2020-01-30T03:35:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.