Is Supervised Syntactic Parsing Beneficial for Language Understanding? An Empirical Investigation
- URL: http://arxiv.org/abs/2008.06788v2
- Date: Wed, 28 Apr 2021 12:05:00 GMT
- Title: Is Supervised Syntactic Parsing Beneficial for Language Understanding? An Empirical Investigation
- Authors: Goran Glavaš and Ivan Vulić
- Abstract summary: Traditional NLP has long held (supervised) syntactic parsing necessary for successful higher-level semantic language understanding (LU).
The recent advent of end-to-end neural models, self-supervised via language modeling (LM), and their success on a wide range of LU tasks, however, questions this belief.
We empirically investigate the usefulness of supervised parsing for semantic LU in the context of LM-pretrained transformer networks.
- Score: 71.70562795158625
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Traditional NLP has long held (supervised) syntactic parsing necessary for
successful higher-level semantic language understanding (LU). The recent advent
of end-to-end neural models, self-supervised via language modeling (LM), and
their success on a wide range of LU tasks, however, questions this belief. In
this work, we empirically investigate the usefulness of supervised parsing for
semantic LU in the context of LM-pretrained transformer networks. Relying on
the established fine-tuning paradigm, we first couple a pretrained transformer
with a biaffine parsing head, aiming to infuse explicit syntactic knowledge
from Universal Dependencies treebanks into the transformer. We then fine-tune
the model for LU tasks and measure the effect of the intermediate parsing
training (IPT) on downstream LU task performance. Results from both monolingual
English and zero-shot language transfer experiments (with intermediate
target-language parsing) show that explicit formalized syntax, injected into
transformers through IPT, has very limited and inconsistent effect on
downstream LU performance. Our results, coupled with our analysis of
transformers' representation spaces before and after intermediate parsing, make
a significant step towards providing answers to an essential question: how
(un)availing is supervised parsing for high-level semantic natural language
understanding in the era of large neural models?
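To make the IPT setup above concrete, the following is a minimal sketch (in PyTorch with the Hugging Face transformers library) of a pretrained transformer coupled with a biaffine arc-scoring head in the style of Dozat and Manning. It illustrates the general recipe rather than the authors' released code: the encoder name, projection size, and the restriction to unlabeled arc scoring are simplifying assumptions.

```python
import torch
import torch.nn as nn
from transformers import AutoModel


class BiaffineParsingHead(nn.Module):
    """Scores, for every token i, how likely each token j is to be its syntactic head."""

    def __init__(self, hidden_size: int, arc_dim: int = 512):
        super().__init__()
        # Separate projections for tokens acting as dependents vs. as heads.
        self.dep_mlp = nn.Sequential(nn.Linear(hidden_size, arc_dim), nn.ReLU())
        self.head_mlp = nn.Sequential(nn.Linear(hidden_size, arc_dim), nn.ReLU())
        # Biaffine scorer: bilinear term plus a head-only bias term.
        self.U = nn.Parameter(torch.empty(arc_dim, arc_dim))
        nn.init.xavier_uniform_(self.U)
        self.head_bias = nn.Linear(arc_dim, 1, bias=False)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        dep = self.dep_mlp(hidden_states)    # (batch, seq, arc_dim)
        head = self.head_mlp(hidden_states)  # (batch, seq, arc_dim)
        # arc_scores[b, i, j] = score of token j being the head of token i
        bilinear = torch.einsum("bid,de,bje->bij", dep, self.U, head)
        bias = self.head_bias(head).squeeze(-1).unsqueeze(1)  # broadcast over i
        return bilinear + bias


class TransformerWithParsingHead(nn.Module):
    """Pretrained transformer + biaffine head, used only during the IPT stage."""

    def __init__(self, model_name: str = "bert-base-multilingual-cased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        self.parser = BiaffineParsingHead(self.encoder.config.hidden_size)

    def forward(self, input_ids, attention_mask, head_indices=None):
        hidden = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state
        arc_scores = self.parser(hidden)  # (batch, seq, seq)
        loss = None
        if head_indices is not None:
            # head_indices[b, i] = gold head position of token i from a UD treebank,
            # with -100 marking padding and subword continuations.
            loss = nn.functional.cross_entropy(
                arc_scores.reshape(-1, arc_scores.size(-1)),
                head_indices.reshape(-1),
                ignore_index=-100,
            )
        return arc_scores, loss
```

After intermediate parsing training on a Universal Dependencies treebank, the parsing head would be dropped and the encoder fine-tuned with a standard task head on the downstream LU task; that second stage is where the paper measures the (limited and inconsistent) effect of IPT.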
Related papers
- TasTe: Teaching Large Language Models to Translate through Self-Reflection [82.83958470745381]
Large language models (LLMs) have exhibited remarkable performance in various natural language processing tasks.
We propose the TasTe framework, which stands for translating through self-reflection.
The evaluation results in four language directions on the WMT22 benchmark reveal the effectiveness of our approach compared to existing methods.
arXiv Detail & Related papers (2024-06-12T17:21:21Z)
- Deep Natural Language Feature Learning for Interpretable Prediction [1.6114012813668932]
We propose a method to break down a complex main task into a set of easier intermediary sub-tasks.
Our method allows for representing each example by a vector consisting of the answers to these questions.
We have successfully applied this method to two completely different tasks: detecting incoherence in students' answers to open-ended mathematics exam questions, and screening abstracts for a systematic literature review of scientific papers on climate change and agroecology.
arXiv Detail & Related papers (2023-11-09T21:43:27Z)
- Towards Effective Disambiguation for Machine Translation with Large Language Models [65.80775710657672]
We study the capabilities of large language models to translate "ambiguous sentences".
Experiments show that our methods can match or outperform state-of-the-art systems such as DeepL and NLLB in four out of five language directions.
arXiv Detail & Related papers (2023-09-20T22:22:52Z)
- The Interpreter Understands Your Meaning: End-to-end Spoken Language Understanding Aided by Speech Translation [13.352795145385645]
Speech translation (ST) is a good means of pretraining speech models for end-to-end spoken language understanding.
We show that our models achieve higher performance than baselines on monolingual and multilingual intent classification.
We also create new benchmark datasets for speech summarization and low-resource/zero-shot transfer from English to French or Spanish.
arXiv Detail & Related papers (2023-05-16T17:53:03Z)
- Beyond Distributional Hypothesis: Let Language Models Learn Meaning-Text Correspondence [45.9949173746044]
We show that large-size pre-trained language models (PLMs) do not satisfy the logical negation property (LNP).
We propose a novel intermediate training task, named meaning-matching, designed to directly learn a meaning-text correspondence.
We find that the task enables PLMs to learn lexical semantic information.
arXiv Detail & Related papers (2022-05-08T08:37:36Z)
- SML: a new Semantic Embedding Alignment Transformer for efficient cross-lingual Natural Language Inference [71.57324258813674]
The ability of Transformers to perform a variety of tasks with precision, such as question answering, Natural Language Inference (NLI), or summarisation, has enabled them to be ranked among the best paradigms for addressing these kinds of tasks at present.
NLI is one of the best scenarios to test these architectures, due to the knowledge required to understand complex sentences and establish a relation between a hypothesis and a premise.
In this paper, we propose a new architecture, siamese multilingual transformer, to efficiently align multilingual embeddings for Natural Language Inference.
arXiv Detail & Related papers (2021-03-17T13:23:53Z)
- Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
arXiv Detail & Related papers (2020-12-10T01:27:24Z)
- Adapting Pretrained Transformer to Lattices for Spoken Language Understanding [39.50831917042577]
It has been shown that encoding lattices, as opposed to 1-best results generated by an automatic speech recognizer (ASR), boosts the performance of spoken language understanding (SLU).
This paper aims at adapting pretrained transformers to lattice inputs in order to perform understanding tasks specifically for spoken language.
arXiv Detail & Related papers (2020-11-02T07:14:34Z)
- Syntactic Structure Distillation Pretraining For Bidirectional Encoders [49.483357228441434]
We introduce a knowledge distillation strategy for injecting syntactic biases into BERT pretraining.
We distill the approximate marginal distribution over words in context from the syntactic LM.
Our findings demonstrate the benefits of syntactic biases, even in representation learners that exploit large amounts of data.
arXiv Detail & Related papers (2020-05-27T16:44:01Z)
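The distillation objective summarized in the last entry (pushing a BERT-style student toward a syntactic LM's approximate marginal distribution over words in context) can be written as a simple KL term. The sketch below is a hedged reading of that summary, not the paper's implementation; the teacher marginals are assumed to be precomputed, and the weighting against the regular MLM loss is an assumption.

```python
import torch
import torch.nn.functional as F


def syntactic_distillation_loss(student_logits: torch.Tensor,
                                teacher_probs: torch.Tensor,
                                mask_positions: torch.Tensor) -> torch.Tensor:
    """KL(teacher || student) averaged over masked positions.

    student_logits: (batch, seq, vocab) raw MLM logits from the student encoder.
    teacher_probs:  (batch, seq, vocab) normalized word-in-context marginals
                    from the syntactic LM teacher (assumed precomputed).
    mask_positions: (batch, seq) boolean mask selecting the masked tokens.
    """
    log_q = F.log_softmax(student_logits, dim=-1)
    # Elementwise p * (log p - log q), summed over the vocabulary.
    kl = F.kl_div(log_q, teacher_probs, reduction="none").sum(-1)
    kl = kl * mask_positions.float()
    return kl.sum() / mask_positions.float().sum().clamp(min=1.0)


# During pretraining, this term would typically be added to the standard MLM
# cross-entropy with some weight alpha (an assumption, not the paper's value):
#   loss = mlm_loss + alpha * syntactic_distillation_loss(logits, teacher_probs, mask)
```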