DirectProbe: Studying Representations without Classifiers
- URL: http://arxiv.org/abs/2104.05904v1
- Date: Tue, 13 Apr 2021 02:40:26 GMT
- Title: DirectProbe: Studying Representations without Classifiers
- Authors: Yichu Zhou and Vivek Srikumar
- Abstract summary: DirectProbe studies the geometry of a representation by building upon the notion of a version space for a task.
Experiments with several linguistic tasks and contextualized embeddings show that, even without training classifiers, DirectProbe can shed light on how an embedding space represents labels.
- Score: 21.23284793831221
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding how linguistic structures are encoded in contextualized
embeddings could help explain their impressive performance across NLP. Existing
approaches for probing them usually call for training classifiers and use the
accuracy, mutual information, or complexity as a proxy for the representation's
goodness. In this work, we argue that doing so can be unreliable because
different representations may need different classifiers. We develop a
heuristic, DirectProbe, that directly studies the geometry of a representation
by building upon the notion of a version space for a task. Experiments with
several linguistic tasks and contextualized embeddings show that, even without
training classifiers, DirectProbe can shine light into how an embedding space
represents labels, and also anticipate classifier performance for the
representation.
Related papers
- Representation Of Lexical Stylistic Features In Language Models' Embedding Space [28.60690854046176]
We show that it is possible to derive a vector representation for each of these stylistic notions from only a small number of seed pairs.
We conduct experiments on five datasets and find that static embeddings encode these features more accurately at the level of words and phrases.
The lower performance of contextualized representations at the word level is partially attributable to the anisotropy of their vector space.
arXiv Detail & Related papers (2023-05-29T23:44:26Z)
- Learning Context-aware Classifier for Semantic Segmentation [88.88198210948426]
In this paper, contextual hints are exploited via learning a context-aware classifier.
Our method is model-agnostic and can be easily applied to generic segmentation models.
With only negligible additional parameters and +2% inference time, decent performance gain has been achieved on both small and large models.
arXiv Detail & Related papers (2023-03-21T07:00:35Z)
- High-dimensional distributed semantic spaces for utterances [0.2907403645801429]
This paper describes a model for high-dimensional representation for utterance and text level data.
It is based on a mathematically principled and behaviourally plausible approach to representing linguistic information.
The paper shows how the implemented model is able to represent a broad range of linguistic features in a common integral framework of fixed dimensionality.
arXiv Detail & Related papers (2021-04-01T12:09:47Z)
- Prototypical Representation Learning for Relation Extraction [56.501332067073065]
This paper aims to learn predictive, interpretable, and robust relation representations from distantly-labeled data.
We learn prototypes for each relation from contextual information to best explore the intrinsic semantics of relations.
Results on several relation learning tasks show that our model significantly outperforms the previous state-of-the-art relational models.
arXiv Detail & Related papers (2021-03-22T08:11:43Z)
- Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
arXiv Detail & Related papers (2020-12-10T01:27:24Z)
- Unsupervised Distillation of Syntactic Information from Contextualized Word Representations [62.230491683411536]
We tackle the task of unsupervised disentanglement between semantics and structure in neural language representations.
To this end, we automatically generate groups of sentences which are structurally similar but semantically different.
We demonstrate that our transformation clusters vectors in space by structural properties, rather than by lexical semantics.
arXiv Detail & Related papers (2020-10-11T15:13:18Z)
- Learning Universal Representations from Word to Sentence [89.82415322763475]
This work introduces and explores universal representation learning, i.e., embedding linguistic units of different levels in a uniform vector space.
We present our approach of constructing analogy datasets in terms of words, phrases and sentences.
We empirically verify that well pre-trained Transformer models incorporated with appropriate training settings may effectively yield universal representation.
arXiv Detail & Related papers (2020-09-10T03:53:18Z)
- Predicting What You Already Know Helps: Provable Self-Supervised Learning [60.27658820909876]
Self-supervised representation learning solves auxiliary prediction tasks (known as pretext tasks) without requiring labeled data.
We show a mechanism exploiting the statistical connections between certain reconstruction-based pretext tasks that guarantee to learn a good representation.
We prove that the linear layer yields a small approximation error even for a complex ground-truth function class.
arXiv Detail & Related papers (2020-08-03T17:56:13Z)
- Structured (De)composable Representations Trained with Neural Networks [21.198279941828112]
A template representation refers to the generic representation that captures the characteristics of an entire class.
The proposed technique uses end-to-end deep learning to learn structured and composable representations from input images and discrete labels.
We prove that the representations have a clear structure that allows them to be decomposed into factors representing classes and environments.
arXiv Detail & Related papers (2020-07-07T10:20:31Z)