Intrinsic Probing through Dimension Selection
- URL: http://arxiv.org/abs/2010.02812v1
- Date: Tue, 6 Oct 2020 15:21:08 GMT
- Title: Intrinsic Probing through Dimension Selection
- Authors: Lucas Torroba Hennigen, Adina Williams, Ryan Cotterell
- Abstract summary: Most modern NLP systems make use of pre-trained contextual representations that attain astonishingly high performance on a variety of tasks.
Such high performance should not be possible unless some form of linguistic structure inheres in these representations, and a wealth of research has sprung up on probing for it.
In this paper, we draw a distinction between intrinsic probing, which examines how linguistic information is structured within a representation, and the extrinsic probing popular in prior work, which only argues for the presence of such information by showing that it can be successfully extracted.
- Score: 69.52439198455438
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most modern NLP systems make use of pre-trained contextual representations
that attain astonishingly high performance on a variety of tasks. Such high
performance should not be possible unless some form of linguistic structure
inheres in these representations, and a wealth of research has sprung up on
probing for it. In this paper, we draw a distinction between intrinsic probing,
which examines how linguistic information is structured within a
representation, and the extrinsic probing popular in prior work, which only
argues for the presence of such information by showing that it can be
successfully extracted. To enable intrinsic probing, we propose a novel
framework based on a decomposable multivariate Gaussian probe that allows us to
determine whether the linguistic information in word embeddings is dispersed or
focal. We then probe fastText and BERT for various morphosyntactic attributes
across 36 languages. We find that most attributes are reliably encoded by only
a few neurons, with fastText concentrating its linguistic structure more than
BERT.
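As a hedged illustration of the probe (not the authors' implementation), the sketch below fits a class-conditional Gaussian over a chosen subset of embedding dimensions and greedily grows that subset to find where an attribute is encoded; the diagonal covariance, the refitting per candidate subset, and all names are simplifying assumptions.

```python
import numpy as np
from scipy.stats import multivariate_normal

def fit_gaussian_probe(X, y, dims):
    """Fit one Gaussian per attribute value over the selected dimensions."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c][:, dims]
        mu = Xc.mean(axis=0)
        cov = np.diag(Xc.var(axis=0) + 1e-6)  # diagonal covariance for stability
        params[c] = (mu, cov, np.log(len(Xc) / len(X)))
    return params

def probe_log_likelihood(X, y, dims, params):
    """Mean joint log-probability log p(attribute, embedding[dims])."""
    total = 0.0
    for x, c in zip(X, y):
        mu, cov, log_prior = params[c]
        total += log_prior + multivariate_normal.logpdf(x[dims], mean=mu, cov=cov)
    return total / len(X)

def greedy_dimension_selection(X, y, k):
    """Greedily add the dimension that most improves the probe's fit."""
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(k):
        def score(d):
            dims = selected + [d]
            return probe_log_likelihood(X, y, dims, fit_gaussian_probe(X, y, dims))
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

For example, greedy_dimension_selection(bert_embeddings, number_labels, k=10) (array names assumed) would return the ten dimensions a Gaussian probe finds most informative for grammatical number; strong performance from very few dimensions corresponds to what the abstract calls focal encoding.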
Related papers
- Discovering Low-rank Subspaces for Language-agnostic Multilingual Representations [38.56175462620892]
Large pretrained multilingual language models (ML-LMs) have shown remarkable zero-shot cross-lingual transfer capabilities.
We present a novel view of projecting away language-specific factors from a multilingual embedding space.
We show that applying our method consistently leads to improvements over commonly used ML-LMs.
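The summary above does not spell out the construction; as one hedged illustration of "projecting away language-specific factors" (an assumption about the mechanics, not the paper's recipe), the sketch below discards the top singular directions of the per-language mean embeddings.

```python
import numpy as np

def project_away_language_subspace(embeddings_by_lang, rank=2):
    """Remove a low-rank, language-identifying subspace from all embeddings.

    embeddings_by_lang: dict mapping language code -> (n_i, d) array.
    rank: number of language-specific directions to drop (assumed knob).
    """
    # Per-language mean vectors roughly encode "which language is this?"
    means = np.stack([E.mean(axis=0) for E in embeddings_by_lang.values()])
    means -= means.mean(axis=0, keepdims=True)
    # Their top singular directions span a candidate language subspace.
    _, _, Vt = np.linalg.svd(means, full_matrices=False)
    V = Vt[:rank]                                # (rank, d)
    projector = np.eye(V.shape[1]) - V.T @ V     # project onto the complement
    return {lang: E @ projector for lang, E in embeddings_by_lang.items()}
```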
arXiv Detail & Related papers (2024-01-11T09:54:11Z)
- Probing via Prompting [71.7904179689271]
This paper introduces a novel model-free approach to probing, by formulating probing as a prompting task.
We conduct experiments on five probing tasks and show that our approach is comparable to or better than diagnostic probes at extracting information.
We then examine the usefulness of a specific linguistic property for pre-training by removing the heads that are essential to that property and evaluating the resulting model's performance on language modeling.
arXiv Detail & Related papers (2022-07-04T22:14:40Z)
- A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z)
- A Simple and Efficient Probabilistic Language model for Code-Mixed Text [0.0]
We present a simple probabilistic approach for building efficient word embeddings for code-mixed text.
We examine its efficacy for the classification task using bidirectional LSTMs and SVMs.
arXiv Detail & Related papers (2021-06-29T05:37:57Z)
- Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
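The summary only names "convolutional graph encoders"; the snippet below is a generic graph-convolution step over a semantic dependency graph, offered as a rough sketch rather than the paper's architecture (the normalization and ReLU are assumptions).

```python
import numpy as np

def graph_conv_step(H, A, W):
    """One graph-convolution pass over a semantic dependency graph.

    H: (n, d) token representations; A: (n, n) adjacency from the semantic
    parse; W: (d, d_out) projection, assumed learned during finetuning.
    """
    A_hat = A + np.eye(A.shape[0])               # add self-loops
    deg = A_hat.sum(axis=1, keepdims=True)
    return np.maximum((A_hat / deg) @ H @ W, 0)  # row-normalize, project, ReLU
```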
arXiv Detail & Related papers (2020-12-10T01:27:24Z)
- Linguistic Profiling of a Neural Language Model [1.0552465253379135]
We investigate the linguistic knowledge learned by a Neural Language Model (NLM) before and after a fine-tuning process.
We show that BERT is able to encode a wide range of linguistic characteristics, but it tends to lose this information when trained on specific downstream tasks.
arXiv Detail & Related papers (2020-10-05T09:09:01Z)
- A Comparative Study on Structural and Semantic Properties of Sentence Embeddings [77.34726150561087]
We propose a set of experiments using a widely-used large-scale data set for relation extraction.
We show that different embedding spaces have different degrees of strength for the structural and semantic properties.
These results provide useful information for developing embedding-based relation extraction methods.
arXiv Detail & Related papers (2020-09-23T15:45:32Z)
- Information-Theoretic Probing for Linguistic Structure [74.04862204427944]
We propose an information-theoretic operationalization of probing as estimating mutual information.
We evaluate on a set of ten typologically diverse languages often underrepresented in NLP research.
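In that operationalization, the information a representation R carries about a property T is I(T; R) = H(T) - H(T | R), and a probe's held-out cross-entropy upper-bounds H(T | R), so subtracting it from H(T) gives a lower bound on the mutual information. A minimal sketch with illustrative argument names:

```python
import math

def mutual_information_lower_bound(label_counts, probe_cross_entropy_bits):
    """I(T; R) >= H(T) - probe cross-entropy, because the cross-entropy of any
    probe upper-bounds the true conditional entropy H(T | R)."""
    total = sum(label_counts)
    h_t = -sum((c / total) * math.log2(c / total) for c in label_counts if c)
    return h_t - probe_cross_entropy_bits
```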
arXiv Detail & Related papers (2020-04-07T01:06:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.