Syntax-informed Question Answering with Heterogeneous Graph Transformer
- URL: http://arxiv.org/abs/2204.09655v1
- Date: Fri, 1 Apr 2022 07:48:03 GMT
- Title: Syntax-informed Question Answering with Heterogeneous Graph Transformer
- Authors: Fangyi Zhu, Lok You Tan, See-Kiong Ng, St\'ephane Bressan
- Abstract summary: We present a linguistics-informed question answering approach that extends and fine-tunes a pre-trained neural language model.
We illustrate the approach by the addition of syntactic information in the form of dependency and constituency graphic structures connecting tokens and virtual tokens.
- Score: 2.139714421848487
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large neural language models are steadily contributing state-of-the-art
performance to question answering and other natural language and information
processing tasks. These models are expensive to train. We propose to evaluate
whether such pre-trained models can benefit from the addition of explicit
linguistics information without requiring retraining from scratch.
We present a linguistics-informed question answering approach that extends
and fine-tunes a pre-trained transformer-based neural language model with
symbolic knowledge encoded with a heterogeneous graph transformer. We
illustrate the approach by the addition of syntactic information in the form of
dependency and constituency graphic structures connecting tokens and virtual
vertices.
A comparative empirical performance evaluation with BERT as its baseline and
with Stanford Question Answering Dataset demonstrates the competitiveness of
the proposed approach. We argue, in conclusion and in the light of further
results of preliminary experiments, that the approach is extensible to further
linguistics information including semantics and pragmatics.
Related papers
- Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI have made it possible to mitigate limitations by leveraging improved explanations for Transformers.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z) - Probabilistic Transformer: A Probabilistic Dependency Model for
Contextual Word Representation [52.270712965271656]
We propose a new model of contextual word representation, not from a neural perspective, but from a purely syntactic and probabilistic perspective.
We find that the graph of our model resembles transformers, with correspondences between dependencies and self-attention.
Experiments show that our model performs competitively to transformers on small to medium sized datasets.
arXiv Detail & Related papers (2023-11-26T06:56:02Z) - An Empirical Investigation of Commonsense Self-Supervision with
Knowledge Graphs [67.23285413610243]
Self-supervision based on the information extracted from large knowledge graphs has been shown to improve the generalization of language models.
We study the effect of knowledge sampling strategies and sizes that can be used to generate synthetic data for adapting language models.
arXiv Detail & Related papers (2022-05-21T19:49:04Z) - Transformer Grammars: Augmenting Transformer Language Models with
Syntactic Inductive Biases at Scale [31.293175512404172]
We introduce Transformer Grammars -- a class of Transformer language models that combine expressive power, scalability, and strong performance of Transformers.
We find that Transformer Grammars outperform various strong baselines on multiple syntax-sensitive language modeling evaluation metrics.
arXiv Detail & Related papers (2022-03-01T17:22:31Z) - A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z) - Zero-shot Commonsense Question Answering with Cloze Translation and
Consistency Optimization [20.14487209460865]
We investigate four translation methods that can translate natural questions into cloze-style sentences.
We show that our methods are complementary datasets to a knowledge base improved model, and combining them can lead to state-of-the-art zero-shot performance.
arXiv Detail & Related papers (2022-01-01T07:12:49Z) - Oracle Linguistic Graphs Complement a Pretrained Transformer Language
Model: A Cross-formalism Comparison [13.31232311913236]
We examine the extent to which, in principle, linguistic graph representations can complement and improve neural language modeling.
We find that, overall, semantic constituency structures are most useful to language modeling performance.
arXiv Detail & Related papers (2021-12-15T04:29:02Z) - Learning Better Sentence Representation with Syntax Information [0.0]
We propose a novel approach to combining syntax information with a pre-trained language model.
Our model achieves 91.2% accuracy, outperforming the baseline model by 37.8% on sentence completion task.
arXiv Detail & Related papers (2021-01-09T12:15:08Z) - Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
arXiv Detail & Related papers (2020-12-10T01:27:24Z) - Data Augmentation for Spoken Language Understanding via Pretrained
Language Models [113.56329266325902]
Training of spoken language understanding (SLU) models often faces the problem of data scarcity.
We put forward a data augmentation method using pretrained language models to boost the variability and accuracy of generated utterances.
arXiv Detail & Related papers (2020-04-29T04:07:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.