Introducing Orthogonal Constraint in Structural Probes
- URL: http://arxiv.org/abs/2012.15228v1
- Date: Wed, 30 Dec 2020 17:14:25 GMT
- Title: Introducing Orthogonal Constraint in Structural Probes
- Authors: Tomasz Limisiewicz and David Mareček
- Abstract summary: We decompose a linear projection of the language vector space into an isomorphic space rotation and a linear scaling of the most relevant directions.
We experimentally show that our approach can be performed in a multitask setting.
- Score: 0.2538209532048867
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: With the recent success of pre-trained models in NLP, a significant focus was
put on interpreting their representations. One of the most prominent approaches
is structural probing (Hewitt and Manning, 2019), where a linear projection of
language vector space is performed in order to approximate the topology of
linguistic structures. In this work, we decompose this mapping into (1) an
isomorphic space rotation and (2) a linear scaling that identifies and scales
the most relevant directions. We introduce novel structural tasks to examine our
method's ability to disentangle information hidden in the embeddings. We
experimentally show that our approach can be performed in a multitask setting.
Moreover, the orthogonal constraint identifies embedding subspaces encoding
specific linguistic features and makes the probe less vulnerable to
memorization.
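A minimal sketch of this decomposition, assuming PyTorch (the class and parameter names below are ours, not the authors' code): the probe's projection factors as B = D @ V, where V is a rotation kept orthogonal by a parametrization and D is a diagonal scaling, and pairwise squared distances between projected vectors are fit to parse-tree distances as in Hewitt and Manning (2019).

import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import orthogonal

class OrthogonalStructuralProbe(nn.Module):
    """Structural probe with projection B = D @ V (V orthogonal, D diagonal)."""

    def __init__(self, dim: int):
        super().__init__()
        # The parametrization keeps this weight orthogonal throughout training.
        self.rotation = orthogonal(nn.Linear(dim, dim, bias=False))
        # Diagonal scaling: dimensions whose scale shrinks toward zero are
        # irrelevant to the probed structure, which is how the orthogonal
        # constraint exposes feature-specific subspaces.
        self.scale = nn.Parameter(torch.ones(dim))

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (seq_len, dim) contextual embeddings.
        projected = self.rotation(h) * self.scale       # rotate, then scale
        diff = projected.unsqueeze(1) - projected.unsqueeze(0)
        return diff.pow(2).sum(-1)                      # (seq_len, seq_len)

Training would fit these squared distances to gold tree distances (e.g. with an L1 loss, as in the original structural probe); one natural reading of the multitask result is a single shared rotation with a separate scaling vector per task.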
Related papers
- Decoding Diffusion: A Scalable Framework for Unsupervised Analysis of Latent Space Biases and Representations Using Natural Language Prompts [68.48103545146127]
This paper proposes a novel framework for unsupervised exploration of diffusion latent spaces.
We directly leverage natural language prompts and image captions to map latent directions.
Our method provides a more scalable and interpretable understanding of the semantic knowledge encoded within diffusion models.
arXiv Detail & Related papers (2024-10-25T21:44:51Z)
- N2F2: Hierarchical Scene Understanding with Nested Neural Feature Fields [112.02885337510716]
Nested Neural Feature Fields (N2F2) is a novel approach that employs hierarchical supervision to learn a single feature field.
We leverage a 2D class-agnostic segmentation model to provide semantically meaningful pixel groupings at arbitrary scales in the image space.
Our approach outperforms the state-of-the-art feature field distillation methods on tasks such as open-vocabulary 3D segmentation and localization.
arXiv Detail & Related papers (2024-03-16T18:50:44Z)
- Investigating semantic subspaces of Transformer sentence embeddings through linear structural probing [2.5002227227256864]
We present experiments with semantic structural probing, a method for studying sentence-level representations.
We apply our method to language models from different families (encoder-only, decoder-only, encoder-decoder) and of different sizes in the context of two tasks.
We find that model families differ substantially in their performance and layer dynamics, but that the results are largely model-size invariant.
arXiv Detail & Related papers (2023-10-18T12:32:07Z)
- Interpretability at Scale: Identifying Causal Mechanisms in Alpaca [62.65877150123775]
We use Boundless DAS to efficiently search for interpretable causal structure in large language models while they follow instructions.
Our findings mark a first step toward faithfully understanding the inner workings of our ever-growing and most widely deployed language models.
arXiv Detail & Related papers (2023-05-15T17:15:40Z)
- A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z)
- Cross-Lingual BERT Contextual Embedding Space Mapping with Isotropic and Isometric Conditions [7.615096161060399]
We investigate a context-aware and dictionary-free mapping approach by leveraging parallel corpora.
Our findings reveal a tight relationship between isotropy, isometry, and isomorphism in normalized contextual embedding spaces.
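The summary above does not spell out the mapping itself; a standard isometric (orthogonal) map between paired embedding sets is the Procrustes solution, sketched here under that assumption.

import torch

def procrustes_map(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    # x, y: (n, dim) paired source/target contextual embeddings extracted
    # from a parallel corpus.  Returns the orthogonal W minimizing
    # ||x @ W - y||_F.  Row-normalizing first relates to the isotropy
    # condition the paper studies (our assumption, not its stated pipeline).
    x = torch.nn.functional.normalize(x, dim=-1)
    y = torch.nn.functional.normalize(y, dim=-1)
    u, _, vh = torch.linalg.svd(x.T @ y)
    return u @ vh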
arXiv Detail & Related papers (2021-07-19T22:57:36Z)
- A Non-Linear Structural Probe [43.50268085775569]
We study the case of a structural probe, which aims to investigate the encoding of syntactic structure in contextual representations.
By observing that the structural probe learns a metric, we are able to kernelize it and develop a novel non-linear variant.
We test on 6 languages and find that the radial-basis function (RBF) kernel, in conjunction with regularization, achieves a statistically significant improvement.
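Kernelizing the probe's metric means the squared distance between two words becomes d_k(i, j) = k(h_i, h_i) + k(h_j, h_j) - 2 k(h_i, h_j); with an RBF kernel k(x, x) = 1, so the distance reduces to the sketch below (gamma and the placement of the learned projection are our assumptions, not details from the abstract).

import torch

def rbf_probe_distances(h: torch.Tensor, proj: torch.Tensor, gamma: float = 1.0) -> torch.Tensor:
    # h: (seq_len, dim) embeddings; proj: (rank, dim) learned projection.
    # With k(x, y) = exp(-gamma * ||x - y||^2), the kernel-induced squared
    # distance is 2 - 2 * k(B h_i, B h_j).
    z = h @ proj.T
    sq = torch.cdist(z, z).pow(2)               # ||B h_i - B h_j||^2
    return 2.0 - 2.0 * torch.exp(-gamma * sq)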
arXiv Detail & Related papers (2021-05-21T07:53:10Z)
- The Low-Dimensional Linear Geometry of Contextualized Word Representations [27.50785941238007]
We study the linear geometry of contextualized word representations in ELMo and BERT.
We show that a variety of linguistic features are encoded in low-dimensional subspaces.
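One way to make the low-dimensional claim concrete (our illustration, not the paper's procedure) is a rank-k probe: if a linguistic feature lives in a k-dimensional subspace, a classifier restricted to a learned k-dimensional projection should match a full-rank probe for small k.

import torch.nn as nn

class RankKProbe(nn.Module):
    # Classifier restricted to a k-dimensional subspace of the embeddings.
    def __init__(self, dim: int, k: int, n_classes: int):
        super().__init__()
        self.project = nn.Linear(dim, k, bias=False)    # learned k-dim subspace
        self.classify = nn.Linear(k, n_classes)

    def forward(self, h):
        return self.classify(self.project(h))

Sweeping k and watching accuracy saturate at small k would be evidence of the kind the summary describes.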
arXiv Detail & Related papers (2021-05-15T00:58:08Z)
- Intrinsic Probing through Dimension Selection [69.52439198455438]
Most modern NLP systems make use of pre-trained contextual representations that attain astonishingly high performance on a variety of tasks.
Such high performance should not be possible unless some form of linguistic structure inheres in these representations, and a wealth of research has sprung up on probing for it.
In this paper, we draw a distinction between intrinsic probing, which examines how linguistic information is structured within a representation, and the extrinsic probing popular in prior work, which only argues for the presence of such information by showing that it can be successfully extracted.
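A toy version of dimension selection (the cited paper uses its own probabilistic probe; this greedy sketch with a scikit-learn classifier is only illustrative): grow a set of embedding dimensions, at each step adding the one that most improves a simple probe of the target property.

from sklearn.linear_model import LogisticRegression

def greedy_dimension_selection(X, y, budget: int = 10):
    # X: (n_samples, dim) NumPy array of embeddings; y: property labels.
    # In practice the score should be computed on held-out data.
    chosen, remaining = [], list(range(X.shape[1]))
    for _ in range(budget):
        def score(d):
            cols = chosen + [d]
            clf = LogisticRegression(max_iter=200).fit(X[:, cols], y)
            return clf.score(X[:, cols], y)
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return chosen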
arXiv Detail & Related papers (2020-10-06T15:21:08Z)
- LNMap: Departures from Isomorphic Assumption in Bilingual Lexicon Induction Through Non-Linear Mapping in Latent Space [17.49073364781107]
We propose a novel semi-supervised method to learn cross-lingual word embeddings for bilingual lexicon induction.
Our model is independent of the isomorphic assumption and uses nonlinear mapping in the latent space of two independently trained auto-encoders.
arXiv Detail & Related papers (2020-04-28T23:28:26Z)
- Deep Metric Structured Learning For Facial Expression Recognition [58.7528672474537]
We propose a deep metric learning model to create embedded subspaces with a well-defined structure.
A new loss function that imposes Gaussian structures on the output space is introduced to create these subspaces.
We experimentally demonstrate that the learned embedding can be successfully used for various applications including expression retrieval and emotion recognition.
arXiv Detail & Related papers (2020-01-18T06:23:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.