A Systematic Study of Compositional Syntactic Transformer Language Models
- URL: http://arxiv.org/abs/2506.22978v1
- Date: Sat, 28 Jun 2025 18:32:23 GMT
- Title: A Systematic Study of Compositional Syntactic Transformer Language Models
- Authors: Yida Zhao, Hao Xve, Xiang Hu, Kewei Tu
- Abstract summary: This paper focuses on compositional SLMs that are based on constituency parse trees and contain explicit bottom-up composition of constituent representations. We identify key aspects of design choices in existing compositional SLMs and propose a unified framework encompassing both existing models and novel variants.
- Score: 37.38087762297668
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Syntactic language models (SLMs) enhance Transformers by incorporating syntactic biases through the modeling of linearized syntactic parse trees alongside surface sentences. This paper focuses on compositional SLMs that are based on constituency parse trees and contain explicit bottom-up composition of constituent representations. We identify key aspects of design choices in existing compositional SLMs and propose a unified framework encompassing both existing models and novel variants. We conduct a comprehensive empirical evaluation of all the variants in our framework across language modeling, syntactic generalization, summarization, dialogue, and inference efficiency. Based on the experimental results, we make multiple recommendations on the design of compositional SLMs. Our code is released at https://github.com/zhaoyd1/compositional_SLMs.
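The abstract describes two mechanisms: linearizing a constituency parse tree so it can be modeled alongside the surface sentence, and explicit bottom-up composition of constituent representations. The following minimal Python sketch illustrates both ideas in isolation; the Tree structure, the bracketed action vocabulary, and the mean-pooling compose function are illustrative assumptions, not the released implementation (see the repository linked above for the actual models).

```python
# Minimal sketch of (1) tree linearization and (2) bottom-up composition.
# All names and the pooling-based composition are illustrative assumptions.
from dataclasses import dataclass
from typing import List, Union
import numpy as np

@dataclass
class Tree:
    label: str                           # nonterminal label, e.g. "NP"
    children: List[Union["Tree", str]]   # subtrees or terminal words

def linearize(t: Union[Tree, str]) -> List[str]:
    """Flatten a parse tree into an action/word sequence that can be
    interleaved with the surface sentence, e.g. (S (NP the dog NP) ... S)."""
    if isinstance(t, str):
        return [t]
    seq = [f"({t.label}"]
    for c in t.children:
        seq += linearize(c)
    seq.append(f"{t.label})")
    return seq

def compose(child_vecs: List[np.ndarray]) -> np.ndarray:
    """Stand-in composition function: pool children into one constituent
    vector. Real compositional SLMs use a learned composition network."""
    return np.mean(child_vecs, axis=0)

def bottom_up(t: Union[Tree, str], embed) -> np.ndarray:
    """Recursively build a single representation for each constituent."""
    if isinstance(t, str):
        return embed(t)
    return compose([bottom_up(c, embed) for c in t.children])

# Toy usage with random word embeddings.
rng = np.random.default_rng(0)
vocab = {}
def embed(w, dim=8):
    if w not in vocab:
        vocab[w] = rng.normal(size=dim)
    return vocab[w]

tree = Tree("S", [Tree("NP", ["the", "dog"]), Tree("VP", ["barks"])])
print(linearize(tree))               # ['(S', '(NP', 'the', 'dog', 'NP)', '(VP', 'barks', 'VP)', 'S)']
print(bottom_up(tree, embed).shape)  # (8,)
```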
Related papers
- Towards a Comparative Framework for Compositional AI Models [0.0]
We show how models can learn to compositionally generalise using the DisCoCirc framework for natural language processing. We compare both quantum circuit based models and classical neural networks on a dataset derived from one of the bAbI tasks. Both architectures score within 5% of one another on the productivity and substitutivity tasks, but differ by at least 10% on the systematicity task.
arXiv Detail & Related papers (2025-06-27T15:59:14Z) - Consistency of Compositional Generalization across Multiple Levels [31.77432446850103]
We propose a meta-learning based framework for achieving consistent compositional generalization across multiple levels. We build a GQA-CCG dataset to quantitatively evaluate the consistency.
arXiv Detail & Related papers (2024-12-18T09:09:41Z) - Align-SLM: Textless Spoken Language Models with Reinforcement Learning from AI Feedback [50.84142264245052]
This work introduces the Align-SLM framework to enhance the semantic understanding of textless Spoken Language Models (SLMs). Our approach generates multiple speech continuations from a given prompt and uses semantic metrics to create preference data for Direct Preference Optimization (DPO). We evaluate the framework using ZeroSpeech 2021 benchmarks for lexical and syntactic modeling, the spoken version of the StoryCloze dataset for semantic coherence, and other speech generation metrics, including the GPT4-o score and human evaluation.
arXiv Detail & Related papers (2024-11-04T06:07:53Z) - Weak-to-Strong Compositional Learning from Generative Models for Language-based Object Detection [19.610781457283966]
We introduce a novel method for enhancing the compositional understanding of vision-language (VL) models in language-based object detection.
Our framework generates densely paired positive and negative triplets (image, text descriptions, and bounding boxes) in both image and text domains.
We propose a new compositional contrastive learning formulation that discovers semantics and structures in complex descriptions from synthetic triplets.
arXiv Detail & Related papers (2024-07-21T23:43:24Z) - Linguistic Structure Induction from Language Models [1.8130068086063336]
This thesis focuses on producing constituency and dependency structures from Language Models (LMs) in an unsupervised setting.
I present a detailed study on StructFormer (SF), which retrofits a transformer architecture with an encoder network to produce constituency and dependency structures.
I present six experiments to analyze and address this field's challenges.
arXiv Detail & Related papers (2024-03-11T16:54:49Z) - Linear Spaces of Meanings: Compositional Structures in Vision-Language Models [110.00434385712786]
We investigate compositional structures in data embeddings from pre-trained vision-language models (VLMs).
We first present a framework for understanding compositional structures from a geometric perspective.
We then explain what these structures entail probabilistically in the case of VLM embeddings, providing intuitions for why they arise in practice.
arXiv Detail & Related papers (2023-02-28T08:11:56Z) - Syntax-guided Neural Module Distillation to Probe Compositionality in Sentence Embeddings [0.0]
For each sentence, we construct a neural module net based on its syntax parse and train it end-to-end to approximate the sentence's embedding.
We find differences in the distillability of various sentence embedding models that broadly correlate with their performance.
We find preliminary evidence that much of the syntax-guided composition in sentence embedding models is linear.
arXiv Detail & Related papers (2023-01-21T19:42:02Z) - Transformer Grammars: Augmenting Transformer Language Models with Syntactic Inductive Biases at Scale [31.293175512404172]
We introduce Transformer Grammars, a class of Transformer language models that combine the expressive power, scalability, and strong performance of Transformers with recursive syntactic composition.
We find that Transformer Grammars outperform various strong baselines on multiple syntax-sensitive language modeling evaluation metrics.
arXiv Detail & Related papers (2022-03-01T17:22:31Z) - Compositional Generalization Requires Compositional Parsers [69.77216620997305]
We compare sequence-to-sequence models and models guided by compositional principles on the recent COGS corpus.
We show that structural generalization is a key measure of compositional generalization and requires models that are aware of complex structure.
arXiv Detail & Related papers (2022-02-24T07:36:35Z) - Enriching Transformers with Structured Tensor-Product Representations for Abstractive Summarization [131.23966358405767]
We adapt TP-TRANSFORMER with the explicitly compositional Tensor-Product Representation (TPR) for the task of abstractive summarization.
A key feature of our model is a structural bias introduced by encoding two separate representations for each token; a minimal sketch of this tensor-product binding appears after the related-papers list.
We show that our TP-TRANSFORMER significantly outperforms the Transformer and the original TP-TRANSFORMER on several abstractive summarization datasets.
arXiv Detail & Related papers (2021-06-02T17:32:33Z) - Syntactic representation learning for neural network based TTS with syntactic parse tree traversal [49.05471750563229]
We propose a syntactic representation learning method based on syntactic parse trees to automatically utilize syntactic structure information.
Experimental results demonstrate the effectiveness of our proposed approach.
For sentences with multiple syntactic parse trees, prosodic differences can be clearly perceived in the synthesized speech.
arXiv Detail & Related papers (2020-12-13T05:52:07Z)
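The Enriching Transformers entry above builds on tensor-product representations, in which each token carries a separate content ("filler") vector and structural ("role") vector. Below is a minimal sketch of TPR binding and unbinding under simplifying assumptions (random fillers, orthonormal one-hot roles, illustrative dimensions); it is not the TP-TRANSFORMER architecture itself, only the underlying binding operation.

```python
# Minimal sketch of tensor-product representation (TPR) binding/unbinding.
# Dimensions and vectors are illustrative assumptions, not the paper's setup.
import numpy as np

rng = np.random.default_rng(1)
d_filler, d_role = 8, 4

def bind(filler: np.ndarray, role: np.ndarray) -> np.ndarray:
    """Outer-product binding of one filler (content) with one role (structure)."""
    return np.outer(filler, role)            # shape (d_filler, d_role)

def encode(fillers: np.ndarray, roles: np.ndarray) -> np.ndarray:
    """Sum of bindings over all tokens -> one structured representation."""
    return sum(bind(f, r) for f, r in zip(fillers, roles))

def unbind(tpr: np.ndarray, role: np.ndarray) -> np.ndarray:
    """Recover (approximately) the filler bound to a given role.
    Exact when the roles are orthonormal."""
    return tpr @ role

# Toy usage with one-hot (orthonormal) roles, so unbinding is exact.
roles = np.eye(d_role)[:3]                   # 3 tokens, one role each
fillers = rng.normal(size=(3, d_filler))
tpr = encode(fillers, roles)
print(np.allclose(unbind(tpr, roles[1]), fillers[1]))  # True
```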