Polish Natural Language Inference and Factivity -- an Expert-based
Dataset and Benchmarks
- URL: http://arxiv.org/abs/2201.03521v1
- Date: Mon, 10 Jan 2022 18:32:55 GMT
- Title: Polish Natural Language Inference and Factivity -- an Expert-based
Dataset and Benchmarks
- Authors: Daniel Ziembicki, Anna Wr\'oblewska, Karolina Seweryn
- Abstract summary: The dataset contains entirely natural language utterances in Polish.
It is a representative sample in regards to frequency of main verbs and other linguistic features.
BERT-based models consuming only the input sentences show that they capture most of the complexity of NLI/factivity.
- Score: 0.07734726150561087
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite recent breakthroughs in Machine Learning for Natural Language
Processing, the Natural Language Inference (NLI) problems still constitute a
challenge. To this purpose we contribute a new dataset that focuses exclusively
on the factivity phenomenon; however, our task remains the same as other NLI
tasks, i.e. prediction of entailment, contradiction or neutral (ECN). The
dataset contains entirely natural language utterances in Polish and gathers
2,432 verb-complement pairs and 309 unique verbs. The dataset is based on the
National Corpus of Polish (NKJP) and is a representative sample in regards to
frequency of main verbs and other linguistic features (e.g. occurrence of
internal negation). We found that transformer BERT-based models working on
sentences obtained relatively good results ($\approx89\%$ F1 score). Even
though better results were achieved using linguistic features ($\approx91\%$ F1
score), this model requires more human labour (humans in the loop) because
features were prepared manually by expert linguists. BERT-based models
consuming only the input sentences show that they capture most of the
complexity of NLI/factivity. Complex cases in the phenomenon - e.g. cases with
entitlement (E) and non-factive verbs - remain an open issue for further
research.
Related papers
- ViANLI: Adversarial Natural Language Inference for Vietnamese [1.907126872483548]
We introduce the adversarial NLI dataset to the NLP research community with the name ViANLI.
This data set contains more than 10K premise-hypothesis pairs.
The accuracy of the most powerful model on the test set only reached 48.4%.
arXiv Detail & Related papers (2024-06-25T16:58:19Z) - Natural Language Processing for Dialects of a Language: A Survey [56.93337350526933]
State-of-the-art natural language processing (NLP) models are trained on massive training corpora, and report a superlative performance on evaluation datasets.
This survey delves into an important attribute of these datasets: the dialect of a language.
Motivated by the performance degradation of NLP models for dialectic datasets and its implications for the equity of language technologies, we survey past research in NLP for dialects in terms of datasets, and approaches.
arXiv Detail & Related papers (2024-01-11T03:04:38Z) - A deep Natural Language Inference predictor without language-specific
training data [44.26507854087991]
We present a technique of NLP to tackle the problem of inference relation (NLI) between pairs of sentences in a target language of choice without a language-specific training dataset.
We exploit a generic translation dataset, manually translated, along with two instances of the same pre-trained model.
The model has been evaluated over machine translated Stanford NLI test dataset, machine translated Multi-Genre NLI test dataset, and manually translated RTE3-ITA test dataset.
arXiv Detail & Related papers (2023-09-06T10:20:59Z) - Towards preserving word order importance through Forced Invalidation [80.33036864442182]
We show that pre-trained language models are insensitive to word order.
We propose Forced Invalidation to help preserve the importance of word order.
Our experiments demonstrate that Forced Invalidation significantly improves the sensitivity of the models to word order.
arXiv Detail & Related papers (2023-04-11T13:42:10Z) - mFACE: Multilingual Summarization with Factual Consistency Evaluation [79.60172087719356]
Abstractive summarization has enjoyed renewed interest in recent years, thanks to pre-trained language models and the availability of large-scale datasets.
Despite promising results, current models still suffer from generating factually inconsistent summaries.
We leverage factual consistency evaluation models to improve multilingual summarization.
arXiv Detail & Related papers (2022-12-20T19:52:41Z) - Multi-Scales Data Augmentation Approach In Natural Language Inference
For Artifacts Mitigation And Pre-Trained Model Optimization [0.0]
We provide a variety of techniques for analyzing and locating dataset artifacts inside the crowdsourced Stanford Natural Language Inference corpus.
To mitigate dataset artifacts, we employ a unique multi-scale data augmentation technique with two distinct frameworks.
Our combination method enhances our model's resistance to perturbation testing, enabling it to continuously outperform the pre-trained baseline.
arXiv Detail & Related papers (2022-12-16T23:37:44Z) - ArNLI: Arabic Natural Language Inference for Entailment and
Contradiction Detection [1.8275108630751844]
We have created a data set of more than 12k sentences and named ArNLI, that will be publicly available.
We proposed an approach to detect contradictions between pairs of sentences in Arabic language using contradiction vector combined with language model vector as an input to machine learning model.
Best results achieved using Random Forest classifier with an accuracy of 99%, 60%, 75% on PHEME, SICK and ArNLI respectively.
arXiv Detail & Related papers (2022-09-28T09:37:16Z) - WANLI: Worker and AI Collaboration for Natural Language Inference
Dataset Creation [101.00109827301235]
We introduce a novel paradigm for dataset creation based on human and machine collaboration.
We use dataset cartography to automatically identify examples that demonstrate challenging reasoning patterns, and instruct GPT-3 to compose new examples with similar patterns.
The resulting dataset, WANLI, consists of 108,357 natural language inference (NLI) examples that present unique empirical strengths.
arXiv Detail & Related papers (2022-01-16T03:13:49Z) - Automatically Identifying Semantic Bias in Crowdsourced Natural Language
Inference Datasets [78.6856732729301]
We introduce a model-driven, unsupervised technique to find "bias clusters" in a learned embedding space of hypotheses in NLI datasets.
interventions and additional rounds of labeling can be performed to ameliorate the semantic bias of the hypothesis distribution of a dataset.
arXiv Detail & Related papers (2021-12-16T22:49:01Z) - NL-Augmenter: A Framework for Task-Sensitive Natural Language
Augmentation [91.97706178867439]
We present NL-Augmenter, a new participatory Python-based natural language augmentation framework.
We describe the framework and an initial set of 117 transformations and 23 filters for a variety of natural language tasks.
We demonstrate the efficacy of NL-Augmenter by using several of its transformations to analyze the robustness of popular natural language models.
arXiv Detail & Related papers (2021-12-06T00:37:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.