How to Probe Sentence Embeddings in Low-Resource Languages: On
Structural Design Choices for Probing Task Evaluation
- URL: http://arxiv.org/abs/2006.09109v2
- Date: Wed, 28 Oct 2020 12:38:37 GMT
- Title: How to Probe Sentence Embeddings in Low-Resource Languages: On
Structural Design Choices for Probing Task Evaluation
- Authors: Steffen Eger and Johannes Daxenberger and Iryna Gurevych
- Abstract summary: We investigate sensitivity of probing task results to structural design choices.
We probe embeddings in a multilingual setup with design choices that lie in a'stable region', as we identify for English.
We find that results on English do not transfer to other languages.
- Score: 82.96358326053115
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sentence encoders map sentences to real valued vectors for use in downstream
applications. To peek into these representations - e.g., to increase
interpretability of their results - probing tasks have been designed which
query them for linguistic knowledge. However, designing probing tasks for
lesser-resourced languages is tricky, because these often lack large-scale
annotated data or (high-quality) dependency parsers as a prerequisite of
probing task design in English. To investigate how to probe sentence embeddings
in such cases, we investigate sensitivity of probing task results to structural
design choices, conducting the first such large scale study. We show that
design choices like size of the annotated probing dataset and type of
classifier used for evaluation do (sometimes substantially) influence probing
outcomes. We then probe embeddings in a multilingual setup with design choices
that lie in a 'stable region', as we identify for English, and find that
results on English do not transfer to other languages. Fairer and more
comprehensive sentence-level probing evaluation should thus be carried out on
multiple languages in the future.
Related papers
Err
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.