Hitting "Probe"rty with Non-Linearity, and More
- URL: http://arxiv.org/abs/2402.16168v1
- Date: Sun, 25 Feb 2024 18:33:25 GMT
- Title: Hitting "Probe"rty with Non-Linearity, and More
- Authors: Avik Pal, Madhura Pawar
- Abstract summary: We reformulate the design of non-linear structural probes, making them simpler yet effective.
We qualitatively assess how strongly two words in a sentence are connected in the predicted dependency tree.
We find that the radial basis function (RBF) is an effective non-linear probe for the BERT model.
- Score: 2.1756081703276
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Structural probes learn a linear transformation to find how dependency trees
are embedded in the hidden states of language models. This simple design may
not allow for full exploitation of the structure of the encoded information.
Hence, to investigate the structure of the encoded information to its full
extent, we incorporate non-linear structural probes. We reformulate the design
of non-linear structural probes introduced by White et al., making their design
simpler yet effective. We also design a visualization framework that lets us
qualitatively assess how strongly two words in a sentence are connected in the
predicted dependency tree. We use this technique to understand which non-linear
probe variant is good at encoding syntactical information. Additionally, we
also use it to qualitatively investigate the structure of dependency trees that
BERT encodes in each of its layers. We find that the radial basis function
(RBF) is a more effective non-linear probe for the BERT model than the linear
probe.
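To make the abstract's setup concrete, the core probe computation can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the probe matrix `B`, the RBF bandwidth `sigma`, and the vector dimensions are placeholders, and in practice `B` is trained so that probe distances between word representations match pairwise distances in gold dependency trees.

```python
import numpy as np

def linear_probe_sq_distance(h_i, h_j, B):
    """Squared distance between two hidden states under a learned
    linear transformation B (the standard structural-probe design)."""
    diff = B @ (h_i - h_j)
    return float(diff @ diff)

def rbf_probe_sq_distance(h_i, h_j, sigma=1.0):
    """Squared distance implied by an RBF kernel in feature space:
    d^2 = k(x,x) + k(y,y) - 2*k(x,y) = 2 - 2*exp(-||x-y||^2 / (2*sigma^2)).
    This replaces the linear map with an implicit non-linear feature map."""
    sq = float((h_i - h_j) @ (h_i - h_j))
    return 2.0 - 2.0 * np.exp(-sq / (2.0 * sigma ** 2))

# Toy usage with random stand-ins for BERT hidden states (dim 768 assumed)
rng = np.random.default_rng(0)
h_i, h_j = rng.normal(size=768), rng.normal(size=768)
B = rng.normal(size=(128, 768))  # a rank-128 probe, for illustration only
print(linear_probe_sq_distance(h_i, h_j, B))
print(rbf_probe_sq_distance(h_i, h_j, sigma=10.0))
```

Note that the RBF distance is bounded in [0, 2) and shrinks to 0 for identical inputs, whereas the linear probe's distance is unbounded; the paper's claim is that the kernelized variant recovers the encoded tree structure more faithfully.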
Related papers
- Learning to Model Graph Structural Information on MLPs via Graph Structure Self-Contrasting [50.181824673039436]
We propose a Graph Structure Self-Contrasting (GSSC) framework that learns graph structural information without message passing.
The proposed framework is based purely on Multi-Layer Perceptrons (MLPs), where the structural information is only implicitly incorporated as prior knowledge.
It first applies structural sparsification to remove potentially uninformative or noisy edges in the neighborhood, and then performs structural self-contrasting in the sparsified neighborhood to learn robust node representations.
arXiv Detail & Related papers (2024-09-09T12:56:02Z)
- On Linearizing Structured Data in Encoder-Decoder Language Models: Insights from Text-to-SQL [8.57550491437633]
This work investigates the linear handling of structured data in encoder-decoder language models, specifically T5.
Our findings reveal the model's ability to mimic human-designed processes such as schema linking and syntax prediction.
We also uncover insights into the model's internal mechanisms, including the ego-centric nature of structure node encodings.
arXiv Detail & Related papers (2024-04-03T01:16:20Z)
- StrAE: Autoencoding for Pre-Trained Embeddings using Explicit Structure [5.2869308707704255]
StrAE is a Structured Autoencoder framework that, through strict adherence to explicit structure, enables effective learning of multi-level representations.
We show that our results are directly attributable to the informativeness of the structure provided as input, and show that this is not the case for existing tree models.
We then extend StrAE to allow the model to define its own compositions using a simple localised-merge algorithm.
arXiv Detail & Related papers (2023-05-09T16:20:48Z)
- LasUIE: Unifying Information Extraction with Latent Adaptive Structure-aware Generative Language Model [96.889634747943]
Universally modeling all typical information extraction tasks (UIE) with one generative language model (GLM) has revealed great potential.
We propose a novel structure-aware GLM, fully unleashing the power of syntactic knowledge for UIE.
Over 12 IE benchmarks across 7 tasks, our system shows significant improvements over the baseline UIE system.
arXiv Detail & Related papers (2023-04-13T04:01:14Z)
- SE-GSL: A General and Effective Graph Structure Learning Framework through Structural Entropy Optimization [67.28453445927825]
Graph Neural Networks (GNNs) are de facto solutions to structural data learning.
Existing graph structure learning (GSL) frameworks still lack robustness and interpretability.
This paper proposes a general GSL framework, SE-GSL, through structural entropy and the graph hierarchy abstracted in the encoding tree.
arXiv Detail & Related papers (2023-03-17T05:20:24Z)
- Incorporating Constituent Syntax for Coreference Resolution [50.71868417008133]
We propose a graph-based method to incorporate constituent syntactic structures.
We also explore utilising higher-order neighbourhood information to encode rich structures in constituent trees.
Experiments on the English and Chinese portions of OntoNotes 5.0 benchmark show that our proposed model either beats a strong baseline or achieves new state-of-the-art performance.
arXiv Detail & Related papers (2022-02-22T07:40:42Z)
- A Non-Linear Structural Probe [43.50268085775569]
We study the case of a structural probe, which aims to investigate the encoding of syntactic structure in contextual representations.
By observing that the structural probe learns a metric, we are able to kernelize it and develop a novel non-linear variant.
We test on 6 languages and find that the radial-basis function (RBF) kernel, in conjunction with regularization, achieves a statistically significant improvement.
arXiv Detail & Related papers (2021-05-21T07:53:10Z)
- Visualizing hierarchies in scRNA-seq data using a density tree-biased autoencoder [50.591267188664666]
We propose an approach for identifying a meaningful tree structure from high-dimensional scRNA-seq data.
We then introduce DTAE, a tree-biased autoencoder that emphasizes the tree structure of the data in low dimensional space.
arXiv Detail & Related papers (2021-02-11T08:48:48Z)
- Variational Autoencoder with Learned Latent Structure [4.41370484305827]
We introduce the Variational Autoencoder with Learned Latent Structure (VAELLS).
VAELLS incorporates a learnable manifold model into the latent space of a VAE.
We validate our model on examples with known latent structure and also demonstrate its capabilities on a real-world dataset.
arXiv Detail & Related papers (2020-06-18T14:59:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.