Abstraction not Memory: BERT and the English Article System
- URL: http://arxiv.org/abs/2206.04184v1
- Date: Wed, 8 Jun 2022 22:36:54 GMT
- Title: Abstraction not Memory: BERT and the English Article System
- Authors: Harish Tayyar Madabushi, Dagmar Divjak, Petar Milin
- Abstract summary: We compare the performance of native English speakers and pre-trained models on the task of article prediction set up as a three-way choice (a/an, the, zero).
Our experiments show that BERT outperforms humans on this task across all articles.
- Score: 1.1064955465386002
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Article prediction is a task that has long defied accurate linguistic description. As such, this task is ideally suited to evaluate models on their ability to emulate native-speaker intuition. To this end, we compare the performance of native English speakers and pre-trained models on the task of article prediction set up as a three-way choice (a/an, the, zero). Our experiments with BERT show that BERT outperforms humans on this task across all articles. In particular, BERT is far superior to humans at detecting the zero article, possibly because zero articles are inserted using rules that the deep neural model can easily pick up. More interestingly, we find that BERT tends to agree more with annotators than with the corpus when inter-annotator agreement is high, but switches to agreeing more with the corpus as inter-annotator agreement drops. We contend that this alignment with annotators, despite being trained on the corpus, suggests that BERT is not memorising article use, but captures a high-level generalisation of article use akin to human intuition.
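To make the task setup concrete, below is a minimal sketch of how such a three-way article-prediction probe could be run with an off-the-shelf masked language model. The pseudo-log-likelihood scoring, the prompt construction, and the helper names (pseudo_log_likelihood, predict_article) are illustrative assumptions, not the authors' released pipeline; in particular, the paper's rule-based insertion of explicit zero-article markers is not reproduced here.

```python
# Hedged sketch: score each article candidate (a/an, the, zero) by filling the
# article slot and comparing BERT's pseudo-log-likelihood of the full sentence.
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def pseudo_log_likelihood(sentence: str) -> float:
    """Mask each token in turn and sum its log-probability under BERT."""
    input_ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, input_ids.size(0) - 1):  # skip [CLS] and [SEP]
        masked = input_ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[input_ids[i]].item()
    return total

def predict_article(before: str, after: str) -> str:
    """Three-way choice at one article slot: 'a/an', 'the', or 'zero'."""
    # Note: summed pseudo-log-likelihood favours shorter variants, which biases
    # towards the zero article; a length-normalised score may be preferable.
    a_or_an = max(f"{before} a {after}", f"{before} an {after}",
                  key=pseudo_log_likelihood)
    candidates = {"a/an": a_or_an,
                  "the": f"{before} the {after}",
                  "zero": f"{before} {after}"}
    scores = {label: pseudo_log_likelihood(s) for label, s in candidates.items()}
    return max(scores, key=scores.get)

print(predict_article("She plays", "piano every evening."))  # likely 'the'
```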
Related papers
- Subject Verb Agreement Error Patterns in Meaningless Sentences: Humans vs. BERT [64.40111510974957]
We test whether meaning interferes with subject-verb number agreement in English.
We generate semantically well-formed and nonsensical items.
We find that BERT and humans are both sensitive to our semantic manipulation.
arXiv Detail & Related papers (2022-09-21T17:57:23Z)
- PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS [27.20479869682578]
PnG BERT is a new encoder model for neural TTS.
It can be pre-trained on a large text corpus in a self-supervised manner.
arXiv Detail & Related papers (2021-03-28T06:24:00Z)
- On the Sentence Embeddings from Pre-trained Language Models [78.45172445684126]
In this paper, we argue that the semantic information in the BERT embeddings is not fully exploited.
We find that BERT always induces a non-smooth, anisotropic semantic space of sentences, which harms its performance on semantic similarity tasks.
We propose to transform the anisotropic sentence embedding distribution to a smooth and isotropic Gaussian distribution through normalizing flows that are learned with an unsupervised objective.
arXiv Detail & Related papers (2020-11-02T13:14:57Z)
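The entry above attributes the weak semantic-similarity performance of raw BERT sentence embeddings to an anisotropic embedding space. As a rough illustration of that diagnosis (not the paper's normalizing-flow model), the sketch below measures anisotropy via average pairwise cosine similarity and uses whitening as a simple closed-form stand-in for the learned invertible mapping to an isotropic Gaussian; the sentence list is a toy example.

```python
# Hedged sketch: mean-pooled BERT embeddings, anisotropy check, and whitening
# as a simplified stand-in for the paper's learned normalizing flow.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval()

def embed(sentences):
    enc = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state            # (B, L, H)
    mask = enc["attention_mask"].unsqueeze(-1)             # (B, L, 1)
    return (hidden * mask).sum(1) / mask.sum(1)            # mean pooling

def avg_cosine(x):
    x = torch.nn.functional.normalize(x, dim=-1)
    sim = x @ x.T
    n = x.size(0)
    return (sim.sum() - n) / (n * (n - 1))                 # mean off-diagonal similarity

def whiten(x, eps=1e-6):
    # In practice, fit the mean and covariance on a large corpus of sentences.
    mu = x.mean(0, keepdim=True)
    cov = torch.cov((x - mu).T)
    eigvals, eigvecs = torch.linalg.eigh(cov)
    w = eigvecs @ torch.diag((eigvals + eps).rsqrt())
    return (x - mu) @ w

sentences = ["A cat sat on the mat.", "Stock prices fell sharply.", "He plays the piano."]
emb = embed(sentences)
print("anisotropy before:", avg_cosine(emb).item())
print("anisotropy after :", avg_cosine(whiten(emb)).item())
```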
- Understanding Pre-trained BERT for Aspect-based Sentiment Analysis [71.40586258509394]
This paper analyzes the pre-trained hidden representations that BERT learns from reviews for tasks in aspect-based sentiment analysis (ABSA).
It is not clear how the general proxy task of (masked) language modelling, trained on an unlabeled corpus without annotations of aspects or opinions, can provide important features for downstream ABSA tasks.
arXiv Detail & Related papers (2020-10-31T02:21:43Z)
- Deep Clustering of Text Representations for Supervision-free Probing of Syntax [51.904014754864875]
We consider part-of-speech induction (POSI) and constituency labelling (CoLab) in this work.
We find that Multilingual BERT (mBERT) contains a surprising amount of syntactic knowledge of English.
We report competitive performance of our probe on 45-tag English POSI, state-of-the-art performance on 12-tag POSI across 10 languages, and competitive results on CoLab.
arXiv Detail & Related papers (2020-10-24T05:06:29Z)
- "Listen, Understand and Translate": Triple Supervision Decouples End-to-end Speech-to-text Translation [49.610188741500274]
An end-to-end speech-to-text translation (ST) system takes audio in a source language and outputs text in a target language.
Existing methods are limited by the amount of parallel data available.
We build a system to fully utilize signals in a parallel ST corpus.
arXiv Detail & Related papers (2020-09-21T09:19:07Z)
- CERT: Contrastive Self-supervised Learning for Language Understanding [20.17416958052909]
We propose CERT: Contrastive self-supervised Representations from Transformers.
CERT pretrains language representation models using contrastive self-supervised learning at the sentence level.
We evaluate CERT on 11 natural language understanding tasks in the GLUE benchmark where CERT outperforms BERT on 7 tasks, achieves the same performance as BERT on 2 tasks, and performs worse than BERT on 2 tasks.
arXiv Detail & Related papers (2020-05-16T16:20:38Z)
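CERT's sentence-level contrastive objective can be illustrated with a generic NT-Xent (SimCLR-style) loss over two augmented views of each sentence. This is a simplified sketch, not CERT's actual setup, which uses back-translation augmentation with a momentum-contrast (MoCo) mechanism; the toy embeddings below stand in for encoder outputs.

```python
# Hedged sketch: SimCLR-style NT-Xent loss over two views of each sentence,
# illustrating sentence-level contrastive learning in spirit only.
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.1):
    """z1, z2: (B, H) embeddings of two augmented views of the same sentences."""
    B = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=-1)   # (2B, H)
    sim = z @ z.T / temperature                           # (2B, 2B) similarity logits
    sim.fill_diagonal_(float("-inf"))                     # exclude self-similarity
    # the positive for row i is its counterpart in the other view
    targets = torch.cat([torch.arange(B) + B, torch.arange(B)])
    return F.cross_entropy(sim, targets)

# toy usage with random "sentence embeddings"
z1, z2 = torch.randn(8, 128), torch.randn(8, 128)
print(nt_xent(z1, z2).item())
```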
- Incorporating BERT into Neural Machine Translation [251.54280200353674]
We propose a new algorithm, the BERT-fused model, in which we first use BERT to extract representations for an input sequence.
We conduct experiments on supervised (including sentence-level and document-level translations), semi-supervised and unsupervised machine translation, and achieve state-of-the-art results on seven benchmark datasets.
arXiv Detail & Related papers (2020-02-17T08:13:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.