Explicitly Modeling Syntax in Language Models with Incremental Parsing
and a Dynamic Oracle
- URL: http://arxiv.org/abs/2011.07960v2
- Date: Mon, 10 May 2021 18:13:41 GMT
- Title: Explicitly Modeling Syntax in Language Models with Incremental Parsing
and a Dynamic Oracle
- Authors: Yikang Shen, Shawn Tan, Alessandro Sordoni, Siva Reddy, Aaron
Courville
- Abstract summary: We propose a new syntax-aware language model: Syntactic Ordered Memory (SOM).
The model explicitly models the structure with an incremental parser and maintains the conditional probability setting of a standard language model.
Experiments show that SOM can achieve strong results in language modeling, incremental parsing and syntactic generalization tests.
- Score: 88.65264818967489
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Syntax is fundamental to our thinking about language. Failing to capture the
structure of input language could lead to generalization problems and
over-parametrization. In the present work, we propose a new syntax-aware
language model: Syntactic Ordered Memory (SOM). The model explicitly models the
structure with an incremental parser and maintains the conditional probability
setting of a standard language model (left-to-right). To train the incremental
parser and avoid exposure bias, we also propose a novel dynamic oracle, so that
SOM is more robust to wrong parsing decisions. Experiments show that SOM can
achieve strong results in language modeling, incremental parsing and syntactic
generalization tests, while using fewer parameters than other models.
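The listing includes no code, but the training setup the abstract outlines (a left-to-right language model conditioned on an incremental parser's state, with a dynamic oracle supplying action targets so the model learns to recover from its own parsing mistakes) can be sketched roughly in Python. Every name below is an illustrative assumption, not the authors' implementation:

    # Hypothetical sketch in the spirit of SOM as described above; the
    # model, parser, and oracle objects are assumed, not the authors' code.
    import torch
    import torch.nn.functional as F

    def training_step(model, parser, oracle, tokens, gold_tree):
        """One left-to-right pass over `tokens` (a 1-D LongTensor of ids)."""
        loss = torch.tensor(0.0)
        state = parser.initial_state()
        for t in range(len(tokens) - 1):
            # Condition on both the token prefix and the current parse state.
            hidden = model.encode(tokens[: t + 1], state)

            # Predict the incremental parser's next action (e.g. shift/reduce).
            action_logits = model.action_head(hidden)

            # Dynamic oracle: the best action given the *current* (possibly
            # wrong) parser state, not a gold action history.
            best_action = oracle.best_action(state, gold_tree)
            loss = loss + F.cross_entropy(action_logits.unsqueeze(0),
                                          best_action.view(1))

            # Advance the parser with the model's own prediction, exposing
            # the model to its own mistakes during training.
            state = parser.apply(state, int(action_logits.argmax()))

            # Standard LM objective: predict the next token.
            token_logits = model.token_head(hidden)
            loss = loss + F.cross_entropy(token_logits.unsqueeze(0),
                                          tokens[t + 1].view(1))
        return loss

The point the oracle captures is that action targets are computed from the parser state the model actually reached, rather than assuming an error-free parsing history; per the abstract, this is what mitigates exposure bias.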
Related papers
- Using Grammar Masking to Ensure Syntactic Validity in LLM-based Modeling Tasks [0.996023506058745]
Grammar masking is used to guide large language models toward producing syntactically correct models for a given context-free grammar.
We show that grammar masking can dramatically improve the modeling capabilities of several language models.
arXiv Detail & Related papers (2024-07-08T17:19:59Z)
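To make the grammar-masking idea above concrete, here is a minimal Python sketch of constrained decoding, under the stated assumption that a CFG-derived helper can enumerate the token ids that keep the output syntactically valid; the helper name is hypothetical, not the paper's API:

    # Minimal sketch of grammar masking during decoding; the
    # `grammar.valid_next_tokens` helper is an assumption for illustration.
    import math

    def masked_decode_step(logits, generated_ids, grammar):
        """Pick the best next token among those the CFG can still accept."""
        valid = grammar.valid_next_tokens(generated_ids)  # set of token ids
        masked = [
            score if token_id in valid else -math.inf
            for token_id, score in enumerate(logits)
        ]
        # Greedy choice; grammatically invalid continuations get -inf.
        return max(range(len(masked)), key=masked.__getitem__)
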
- Split and Rephrase with Large Language Models [2.499907423888049]
The Split and Rephrase (SPRP) task consists of splitting complex sentences into a sequence of shorter grammatical sentences.
We evaluate large language models on the task, showing that they can provide large improvements over the state of the art on the main metrics.
arXiv Detail & Related papers (2023-12-18T10:16:37Z)
- Benchmarking Language Models for Code Syntax Understanding [79.11525961219591]
Pre-trained language models have demonstrated impressive performance in both natural language processing and program understanding.
In this work, we perform the first thorough benchmarking of the state-of-the-art pre-trained models for identifying the syntactic structures of programs.
Our findings point out key limitations of existing pre-training methods for programming languages, and suggest the importance of modeling code syntactic structures.
arXiv Detail & Related papers (2022-10-26T04:47:18Z)
- Language Model Cascades [72.18809575261498]
Repeated interactions at test time with a single model, or the composition of multiple models, further expands capabilities.
Cases with control flow and dynamic structure require techniques from probabilistic programming.
We formalize several existing techniques from this perspective, including scratchpads / chain of thought, verifiers, STaR, selection-inference, and tool use.
arXiv Detail & Related papers (2022-07-21T07:35:18Z)
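The cascade idea above can be illustrated with a small sketch: chaining calls so that an intermediate scratchpad (chain-of-thought) generation conditions the final answer, with a verifier acting as an acceptance test. The `lm` and `verifier` callables are assumptions, not an interface from the paper:

    # Sketch of a two-stage cascade: scratchpad generation plus a verifier.
    # `lm` (prompt -> text) and `verifier` (text -> bool) are assumed callables.

    def cascade(lm, verifier, question, max_tries=5):
        """Sample reasoning, then an answer; keep the first verified answer."""
        for _ in range(max_tries):
            # Stage 1: intermediate reasoning conditioned on the question.
            scratchpad = lm(f"Q: {question}\nLet's think step by step:")
            # Stage 2: answer conditioned on the question and the reasoning.
            answer = lm(f"Q: {question}\nReasoning: {scratchpad}\nA:")
            # The verifier gates which composed samples are accepted.
            if verifier(answer):
                return answer
        return None  # no verified answer within the sampling budget
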
- BenchCLAMP: A Benchmark for Evaluating Language Models on Syntactic and Semantic Parsing [55.058258437125524]
We introduce BenchCLAMP, a Benchmark to evaluate Constrained LAnguage Model Parsing.
We benchmark eight language models, including two GPT-3 variants available only through an API.
Our experiments show that encoder-decoder pretrained language models can achieve similar performance or surpass state-of-the-art methods for syntactic and semantic parsing when the model output is constrained to be valid.
arXiv Detail & Related papers (2022-06-21T18:34:11Z)
- Multilingual Generative Language Models for Zero-Shot Cross-Lingual Event Argument Extraction [80.61458287741131]
We present a study on leveraging multilingual pre-trained generative language models for zero-shot cross-lingual event argument extraction (EAE).
By formulating EAE as a language generation task, our method effectively encodes event structures and captures the dependencies between arguments.
Our proposed model finetunes multilingual pre-trained generative language models to generate sentences that fill in the language-agnostic template with arguments extracted from the input passage.
arXiv Detail & Related papers (2022-03-15T23:00:32Z)
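As a purely illustrative rendering of the formulation above, EAE-as-generation can be sketched as prompting a model to fill a language-agnostic template with spans copied from the passage; the template wording and `generate` callable are assumptions, not the paper's exact format:

    # Illustrative sketch of EAE cast as generation with a template.
    # The template text and `generate` callable are assumptions.

    ATTACK_TEMPLATE = "<arg1> attacked <arg2> using <arg3> at <arg4>"

    def extract_arguments(generate, passage, template=ATTACK_TEMPLATE):
        """Return the template with <argN> slots filled from `passage`."""
        prompt = (
            f"Passage: {passage}\n"
            f"Fill in the template using spans from the passage:\n{template}"
        )
        # A downstream step would align filled slots back to the <argN>
        # positions to recover a role label for each extracted argument.
        return generate(prompt)
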
- Constrained Language Models Yield Few-Shot Semantic Parsers [73.50960967598654]
We explore the use of large pretrained language models as few-shot semantic parsers.
The goal in semantic parsing is to generate a structured meaning representation given a natural language input.
We use language models to paraphrase inputs into a controlled sublanguage resembling English that can be automatically mapped to a target meaning representation.
arXiv Detail & Related papers (2021-04-18T08:13:06Z)
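A minimal sketch of the two-step recipe described above, under toy assumptions: the LM paraphrases the input into a controlled, English-like canonical form, and a deterministic rule maps that form to the target meaning representation. The single mapping rule and `lm` callable are illustrative only, not the paper's grammar:

    # Sketch of few-shot semantic parsing via a controlled sublanguage;
    # `lm` is an assumed callable and the one rule below is a toy example.
    import re

    CANONICAL_TO_MR = [
        # canonical pattern -> meaning-representation template
        (re.compile(r"list flights from (\w+) to (\w+)"),
         "SELECT * FROM flights WHERE src = '{0}' AND dst = '{1}'"),
    ]

    def parse(lm, utterance, exemplars):
        """Paraphrase into the sublanguage, then map deterministically."""
        prompt = f"{exemplars}\nInput: {utterance}\nCanonical:"
        canonical = lm(prompt).strip().lower()
        for pattern, template in CANONICAL_TO_MR:
            match = pattern.fullmatch(canonical)
            if match:
                return template.format(*match.groups())
        return None  # paraphrase fell outside the controlled sublanguage
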
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.