Topic Aware Probing: From Sentence Length Prediction to Idiom
Identification how reliant are Neural Language Models on Topic?
- URL: http://arxiv.org/abs/2403.02009v1
- Date: Mon, 4 Mar 2024 13:10:08 GMT
- Title: Topic Aware Probing: From Sentence Length Prediction to Idiom
Identification how reliant are Neural Language Models on Topic?
- Authors: Vasudevan Nedumpozhimana, John D. Kelleher
- Abstract summary: We study the relationship between Transformer-based models' (BERT and RoBERTa's) performance on a range of probing tasks in English and the sensitivity of these tasks to topic information.
Our results indicate that Transformer-based models encode both topic and non-topic information in their intermediate layers.
Our analysis of these models' performance on other standard probing tasks suggests that tasks that are relatively insensitive to topic information are also relatively difficult for these models.
- Score: 1.816169926868157
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transformer-based Neural Language Models achieve state-of-the-art performance
on various natural language processing tasks. However, an open question is the
extent to which these models rely on word-order/syntactic or word
co-occurrence/topic-based information when processing natural language. This
work contributes to this debate by addressing the question of whether these
models primarily use topic as a signal, by exploring the relationship between
Transformer-based models' (BERT and RoBERTa's) performance on a range of
probing tasks in English, from simple lexical tasks such as sentence length
prediction to complex semantic tasks such as idiom token identification, and
the sensitivity of these tasks to topic information. To this end, we
propose a novel probing method which we call topic-aware probing. Our initial
results indicate that Transformer-based models encode both topic and non-topic
information in their intermediate layers, but also that the facility of these
models to distinguish idiomatic usage is primarily based on their ability to
identify and encode topic. Furthermore, our analysis of these models'
performance on other standard probing tasks suggests that tasks that are
relatively insensitive to topic information are also relatively difficult for
these models.
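
As a rough illustration of what layer-wise probing with a topic-sensitive evaluation might look like, the sketch below trains a simple logistic-regression probe on mean-pooled representations from several BERT layers and tests it on a topic-disjoint split. The toy corpus, the long/short sentence-length target, and the split strategy are illustrative assumptions, not the authors' exact topic-aware probing protocol.

```python
# Minimal layer-wise probing sketch (not the authors' exact protocol).
# Assumes transformers, torch, and scikit-learn are installed; topic labels
# are illustrative and would normally come from a topic model such as LDA.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

MODEL_NAME = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
model.eval()

# Toy corpus of (sentence, topic) pairs.
corpus = [
    ("The striker scored a late goal in the second half.", "sport"),
    ("The midfielder was booked for a reckless tackle.", "sport"),
    ("The central bank raised interest rates again.", "finance"),
    ("Quarterly earnings beat analyst expectations.", "finance"),
    ("The orchestra opened the season with a symphony.", "music"),
    ("The band released an acoustic version of the song.", "music"),
]
sentences = [s for s, _ in corpus]
topics = [t for _, t in corpus]

# Illustrative probing target: is the sentence longer than the median length?
lengths = [len(s.split()) for s in sentences]
median = sorted(lengths)[len(lengths) // 2]
labels = [int(n > median) for n in lengths]

def layer_embeddings(sents, layer):
    """Mean-pooled token embeddings from one hidden layer."""
    enc = tokenizer(sents, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        out = model(**enc)
    hidden = out.hidden_states[layer]              # (batch, seq, dim)
    mask = enc["attention_mask"].unsqueeze(-1)     # (batch, seq, 1)
    summed = (hidden * mask).sum(dim=1)
    return (summed / mask.sum(dim=1)).numpy()

# Topic-disjoint split: the probe is trained on some topics and tested on a
# held-out topic, so it cannot succeed by memorising topic cues alone.
train_idx = [i for i, t in enumerate(topics) if t != "music"]
test_idx = [i for i, t in enumerate(topics) if t == "music"]

for layer in (1, 6, 12):                           # a few BERT layers
    X = layer_embeddings(sentences, layer)
    probe = LogisticRegression(max_iter=1000)
    probe.fit(X[train_idx], [labels[i] for i in train_idx])
    preds = probe.predict(X[test_idx])
    acc = accuracy_score([labels[i] for i in test_idx], preds)
    print(f"layer {layer:2d}  topic-disjoint probe accuracy: {acc:.2f}")
```

Comparing accuracy on topic-disjoint versus topic-matched splits, layer by layer, is one way to gauge how much of a probe's apparent performance actually comes from topic cues.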
Related papers
- Agentività e telicità in GilBERTo: implicazioni cognitive (Agentivity and telicity in GilBERTo: cognitive implications) [77.71680953280436]
The goal of this study is to investigate whether a Transformer-based neural language model infers lexical semantics.
The semantic properties considered are telicity (also combined with definiteness) and agentivity.
arXiv Detail & Related papers (2023-07-06T10:52:22Z)
- Large Language Models Can Be Easily Distracted by Irrelevant Context [29.315230178997002]
We investigate how model problem-solving accuracy can be influenced by irrelevant context.
We use a benchmark to measure the distractibility of cutting-edge prompting techniques for large language models.
arXiv Detail & Related papers (2023-01-31T20:48:57Z)
- Topics in Contextualised Attention Embeddings [7.6650522284905565]
Recent work has demonstrated that clustering the word-level contextual representations from a language model yields word clusters similar to the latent topics discovered by Latent Dirichlet Allocation.
The important question is how such topical word clusters are automatically formed through clustering in the language model, when it has not been explicitly designed to model latent topics.
Using BERT and DistilBERT, we find that the attention framework plays a key role in modelling such word topic clusters.
arXiv Detail & Related papers (2023-01-11T07:26:19Z)
- Analyzing the Limits of Self-Supervision in Handling Bias in Language [52.26068057260399]
We evaluate how well language models capture the semantics of four bias-related tasks: diagnosis, identification, extraction, and rephrasing.
Our analyses indicate that language models are capable of performing these tasks to widely varying degrees across different bias dimensions, such as gender and political affiliation.
arXiv Detail & Related papers (2021-12-16T05:36:08Z)
- Did the Cat Drink the Coffee? Challenging Transformers with Generalized Event Knowledge [59.22170796793179]
Transformer-based Language Models (TLMs) were tested on a benchmark for the dynamic estimation of thematic fit.
Our results show that TLMs can reach performance comparable to that achieved by a structured distributional model (SDM).
However, additional analysis consistently suggests that TLMs do not capture important aspects of event knowledge.
arXiv Detail & Related papers (2021-07-22T20:52:26Z)
- Unnatural Language Inference [48.45003475966808]
We find that state-of-the-art NLI models, such as RoBERTa and BART, are invariant to, and sometimes even perform better on, examples with randomly reordered words.
Our findings call into question the idea that our natural language understanding models, and the tasks used for measuring their progress, genuinely require a human-like understanding of syntax.
arXiv Detail & Related papers (2020-12-30T20:40:48Z)
- Exploring Software Naturalness through Neural Language Models [56.1315223210742]
The Software Naturalness hypothesis argues that programming languages can be understood through the same techniques used in natural language processing.
We explore this hypothesis through the use of a pre-trained transformer-based language model to perform code analysis tasks.
arXiv Detail & Related papers (2020-06-22T21:56:14Z)
- Probing the Probing Paradigm: Does Probing Accuracy Entail Task Relevance? [27.64235687067883]
We show that models can learn to encode linguistic properties even if they are not needed for the task on which the model was trained.
We demonstrate that models can encode these properties considerably above chance level even when the properties are distributed in the data as random noise.
arXiv Detail & Related papers (2020-05-02T06:19:20Z)
- Information-Theoretic Probing for Linguistic Structure [74.04862204427944]
We propose an information-theoretic operationalization of probing as estimating mutual information.
We evaluate on a set of ten typologically diverse languages often underrepresented in NLP research.
arXiv Detail & Related papers (2020-04-07T01:06:36Z)
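
For the information-theoretic probing entry above, the core idea is to read a probe's held-out cross-entropy as an upper bound on H(Y | R), so that H(Y) minus that cross-entropy gives a lower-bound estimate of the mutual information I(R; Y) between representations and a linguistic property. Below is a minimal sketch under that reading; the synthetic data and the logistic-regression probe are illustrative stand-ins for real representations and labels.

```python
# Hypothetical sketch: estimate I(R; Y) >= H(Y) - CE(probe), where the probe's
# held-out cross-entropy approximates H(Y | R) from above. Data is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, dim = 2000, 16
Y = rng.integers(0, 2, size=n)                    # binary "linguistic property"
R = rng.normal(size=(n, dim)) + 0.8 * Y[:, None]  # representations carry some signal

R_tr, R_te, Y_tr, Y_te = train_test_split(R, Y, test_size=0.5, random_state=0)

# Plug-in estimate of the label entropy H(Y), in nats.
p = np.bincount(Y_tr, minlength=2) / len(Y_tr)
H_y = -np.sum(p[p > 0] * np.log(p[p > 0]))

# Probe cross-entropy on held-out data, also in nats (log_loss uses natural log).
probe = LogisticRegression(max_iter=1000).fit(R_tr, Y_tr)
ce = log_loss(Y_te, probe.predict_proba(R_te))

mi_lower_bound = H_y - ce
print(f"H(Y) ~ {H_y:.3f} nats, probe CE ~ {ce:.3f} nats, MI estimate ~ {mi_lower_bound:.3f}")
```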