Oracle Linguistic Graphs Complement a Pretrained Transformer Language
Model: A Cross-formalism Comparison
- URL: http://arxiv.org/abs/2112.07874v1
- Date: Wed, 15 Dec 2021 04:29:02 GMT
- Title: Oracle Linguistic Graphs Complement a Pretrained Transformer Language
Model: A Cross-formalism Comparison
- Authors: Jakob Prange, Nathan Schneider, Lingpeng Kong
- Abstract summary: We examine the extent to which, in principle, linguistic graph representations can complement and improve neural language modeling.
We find that, overall, semantic constituency structures are most useful to language modeling performance.
- Score: 13.31232311913236
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We examine the extent to which, in principle, linguistic graph
representations can complement and improve neural language modeling. With an
ensemble setup consisting of a pretrained Transformer and ground-truth graphs
from one of 7 different formalisms, we find that, overall, semantic
constituency structures are most useful to language modeling performance --
outpacing syntactic constituency structures as well as syntactic and semantic
dependency structures. Further, effects vary greatly depending on
part-of-speech class. In sum, our findings point to promising tendencies in
neuro-symbolic language modeling and invite future research quantifying the
design choices made by different formalisms.
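The ensemble setup described above can be pictured as interpolating the pretrained Transformer's next-token distribution with a distribution conditioned on an oracle linguistic graph. The sketch below is a minimal illustration of such a mixture, not the paper's actual implementation; the function names, the toy vocabulary, and the scalar gate are all hypothetical.

```python
import math

def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def ensemble_next_token(lm_logits, graph_logits, gate):
    """Interpolate a pretrained LM's next-token distribution with a
    distribution conditioned on an oracle linguistic graph.
    `gate` in [0, 1] controls how much the graph expert contributes."""
    lm_p = softmax(lm_logits)
    graph_p = softmax(graph_logits)
    return [gate * g + (1.0 - gate) * l for g, l in zip(graph_p, lm_p)]

# Toy 5-token vocabulary: the graph expert strongly prefers token 1.
mixed = ensemble_next_token([2.0, 0.5, 0.1, -1.0, 0.0],
                            [0.0, 3.0, 0.0, 0.0, 0.0],
                            gate=0.5)
print(round(sum(mixed), 6))  # 1.0 -- still a valid distribution
```

With `gate=0` the ensemble reduces to the plain LM; raising the gate shifts probability mass toward tokens the graph-conditioned expert prefers, which is one simple way such an oracle signal could complement the language model.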
Related papers
- Analyzing The Language of Visual Tokens [48.62180485759458]
We take a natural-language-centric approach to analyzing discrete visual languages.
We show that higher token innovation drives greater entropy and lower compression, with tokens predominantly representing object parts.
We also show that visual languages lack cohesive grammatical structures, leading to higher perplexity and weaker hierarchical organization compared to natural languages.
arXiv Detail & Related papers (2024-11-07T18:59:28Z)
- Contextualized word senses: from attention to compositionality [0.10878040851637999]
We propose a transparent, interpretable, and linguistically motivated strategy for encoding the contextual sense of words.
Particular attention is given to dependency relations and semantic notions such as selection preferences and paradigmatic classes.
arXiv Detail & Related papers (2023-12-01T16:04:00Z)
- Probabilistic Transformer: A Probabilistic Dependency Model for Contextual Word Representation [52.270712965271656]
We propose a new model of contextual word representation, not from a neural perspective, but from a purely syntactic and probabilistic perspective.
We find that the graph of our model resembles transformers, with correspondences between dependencies and self-attention.
Experiments show that our model performs competitively to transformers on small to medium sized datasets.
arXiv Detail & Related papers (2023-11-26T06:56:02Z)
- Learning Disentangled Representations for Natural Language Definitions [0.0]
We argue that recurrent syntactic and semantic regularities in textual data can be used to provide the models with both structural biases and generative factors.
We leverage the semantic structures present in a representative and semantically dense category of sentence types, definitional sentences, for training a Variational Autoencoder to learn disentangled representations.
arXiv Detail & Related papers (2022-09-22T14:31:55Z)
- Syntax-informed Question Answering with Heterogeneous Graph Transformer [2.139714421848487]
We present a linguistics-informed question answering approach that extends and fine-tunes a pre-trained neural language model.
We illustrate the approach by adding syntactic information in the form of dependency and constituency graph structures connecting tokens and virtual tokens.
arXiv Detail & Related papers (2022-04-01T07:48:03Z)
- Transformer Grammars: Augmenting Transformer Language Models with Syntactic Inductive Biases at Scale [31.293175512404172]
We introduce Transformer Grammars -- a class of Transformer language models that combine the expressive power, scalability, and strong performance of Transformers with recursive syntactic composition.
We find that Transformer Grammars outperform various strong baselines on multiple syntax-sensitive language modeling evaluation metrics.
arXiv Detail & Related papers (2022-03-01T17:22:31Z)
- Syntactic Persistence in Language Models: Priming as a Window into Abstract Language Representations [0.38498574327875945]
We investigate the extent to which modern neural language models are susceptible to syntactic priming.
We introduce a novel metric and release Prime-LM, a large corpus where we control for various linguistic factors which interact with priming strength.
We report surprisingly strong priming effects when priming with multiple sentences, each with different words and meaning but with identical syntactic structure.
arXiv Detail & Related papers (2021-09-30T10:38:38Z)
- Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
arXiv Detail & Related papers (2020-12-10T01:27:24Z)
- Unsupervised Distillation of Syntactic Information from Contextualized Word Representations [62.230491683411536]
We tackle the task of unsupervised disentanglement between semantics and structure in neural language representations.
To this end, we automatically generate groups of sentences which are structurally similar but semantically different.
We demonstrate that our transformation clusters vectors in space by structural properties, rather than by lexical semantics.
arXiv Detail & Related papers (2020-10-11T15:13:18Z)
- Learning Universal Representations from Word to Sentence [89.82415322763475]
This work introduces and explores universal representation learning, i.e., embeddings of different levels of linguistic units in a uniform vector space.
We present our approach of constructing analogy datasets in terms of words, phrases and sentences.
We empirically verify that well pre-trained Transformer models with appropriate training settings can effectively yield universal representations.
arXiv Detail & Related papers (2020-09-10T03:53:18Z)
- Evaluating Transformer-Based Multilingual Text Classification [55.53547556060537]
We argue that NLP tools perform unequally across languages with different syntactic and morphological structures.
We calculate word order and morphological similarity indices to aid our empirical study.
arXiv Detail & Related papers (2020-04-29T03:34:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.