Related papers: Learning the Wrong Lessons: Syntactic-Domain Spurious Correlations in Language Models

Learning the Wrong Lessons: Syntactic-Domain Spurious Correlations in Language Models

URL: http://arxiv.org/abs/2509.21155v2
Date: Fri, 26 Sep 2025 02:37:41 GMT
Title: Learning the Wrong Lessons: Syntactic-Domain Spurious Correlations in Language Models
Authors: Chantal Shaib, Vinith M. Suriyakumar, Levent Sagun, Byron C. Wallace, Marzyeh Ghassemi,
Abstract summary: syntactic templates -- frequent sequences of Part-of-Speech (PoS) tags -- are prevalent in training data and often appear in model outputs.<n>In this work we characterize syntactic templates, domain, and semantics in task-instruction pairs.
Score: 31.57185140334229
License: http://creativecommons.org/licenses/by/4.0/
Abstract: For an LLM to correctly respond to an instruction it must understand both the semantics and the domain (i.e., subject area) of a given task-instruction pair. However, syntax can also convey implicit information Recent work shows that syntactic templates -- frequent sequences of Part-of-Speech (PoS) tags -- are prevalent in training data and often appear in model outputs. In this work we characterize syntactic templates, domain, and semantics in task-instruction pairs. We identify cases of spurious correlations between syntax and domain, where models learn to associate a domain with syntax during training; this can sometimes override prompt semantics. Using a synthetic training dataset, we find that the syntactic-domain correlation can lower performance (mean 0.51 +/- 0.06) on entity knowledge tasks in OLMo-2 models (1B-13B). We introduce an evaluation framework to detect this phenomenon in trained models, and show that it occurs on a subset of the FlanV2 dataset in open (OLMo-2-7B; Llama-4-Maverick), and closed (GPT-4o) models. Finally, we present a case study on the implications for safety finetuning, showing that unintended syntactic-domain correlations can be used to bypass refusals in OLMo-2-7B Instruct and GPT-4o. Our findings highlight two needs: (1) to explicitly test for syntactic-domain correlations, and (2) to ensure syntactic diversity in training data, specifically within domains, to prevent such spurious correlations.

Related papers

Bridging the Semantic Gap for Categorical Data Clustering via Large Language Models [64.58262227709842]
ARISE (Attention-weighted Representation with Integrated Semantic Embeddings) is presented.<n>It builds semantic-aware representations that complement the metric space of categorical data for accurate clustering.<n>Experiments on eight benchmark datasets demonstrate consistent improvements over seven representative counterparts.
arXiv Detail & Related papers (2026-01-03T11:37:46Z)
Beyond Memorization: Assessing Semantic Generalization in Large Language Models Using Phrasal Constructions [3.0906699069248806]
Construction Grammar (CxG) is a psycholinguistically grounded framework for testing generalization.<n>Our dataset consists of English phrasal constructions, for which speakers are known to be able to abstract over commonplace instantiations.<n>Our results demonstrate that state-of-the-art models, including GPT-o1, exhibit a performance drop of over 40% on our second task.
arXiv Detail & Related papers (2025-01-08T18:15:10Z)
Detection and Measurement of Syntactic Templates in Generated Text [58.111650675717414]
We offer an analysis of syntactic features to characterize general repetition in models. We find that models tend to produce templated text in downstream tasks at a higher rate than what is found in human-reference texts.
arXiv Detail & Related papers (2024-06-28T19:34:23Z)
A Hybrid Approach To Aspect Based Sentiment Analysis Using Transfer Learning [3.30307212568497]
We propose a hybrid approach for Aspect Based Sentiment Analysis using transfer learning. The approach focuses on generating weakly-supervised annotations by exploiting the strengths of both large language models (LLM) and traditional syntactic dependencies.
arXiv Detail & Related papers (2024-03-25T23:02:33Z)
Grasping the Essentials: Tailoring Large Language Models for Zero-Shot Relation Extraction [33.528688487954454]
Relation extraction (RE) aims to identify semantic relationships between entities within text. Few-shot learning, aiming to lessen annotation demands, typically provides incomplete and biased supervision for target relations. We introduce REPaL, comprising three stages: (1) We leverage large language models (LLMs) to generate initial seed instances from relation definitions and an unlabeled corpus.
arXiv Detail & Related papers (2024-02-17T00:20:06Z)
Exploiting Contextual Target Attributes for Target Sentiment Classification [53.30511968323911]
Existing PTLM-based models for TSC can be categorized into two groups: 1) fine-tuning-based models that adopt PTLM as the context encoder; 2) prompting-based models that transfer the classification task to the text/word generation task. We present a new perspective of leveraging PTLM for TSC: simultaneously leveraging the merits of both language modeling and explicit target-context interactions via contextual target attributes.
arXiv Detail & Related papers (2023-12-21T11:45:28Z)
Syntax and Semantics Meet in the "Middle": Probing the Syntax-Semantics Interface of LMs Through Agentivity [68.8204255655161]
We present the semantic notion of agentivity as a case study for probing such interactions. This suggests LMs may potentially serve as more useful tools for linguistic annotation, theory testing, and discovery.
arXiv Detail & Related papers (2023-05-29T16:24:01Z)
Graph Adaptive Semantic Transfer for Cross-domain Sentiment Classification [68.06496970320595]
Cross-domain sentiment classification (CDSC) aims to use the transferable semantics learned from the source domain to predict the sentiment of reviews in the unlabeled target domain. We present Graph Adaptive Semantic Transfer (GAST) model, an adaptive syntactic graph embedding method that is able to learn domain-invariant semantics from both word sequences and syntactic graphs.
arXiv Detail & Related papers (2022-05-18T07:47:01Z)
ASPER: Attention-based Approach to Extract Syntactic Patterns denoting Semantic Relations in Sentential Context [2.175490119265481]
We propose an attention-based supervised deep learning model, ASPER, which extracts syntactic patterns between entities exhibiting a given semantic relation in the sentential context. We validate the performance of ASPER on three distinct semantic relations -- hyponym-hypernym, cause-effect, and meronym-holonym on six datasets. For all these semantic relations, ASPER can automatically identify a collection of syntactic patterns reflecting the existence of such a relation between a pair of entities in a sentence.
arXiv Detail & Related papers (2021-04-04T02:36:19Z)
Temporal Embeddings and Transformer Models for Narrative Text Understanding [72.88083067388155]
We present two approaches to narrative text understanding for character relationship modelling. The temporal evolution of these relations is described by dynamic word embeddings, that are designed to learn semantic changes over time. A supervised learning approach based on the state-of-the-art transformer model BERT is used instead to detect static relations between characters.
arXiv Detail & Related papers (2020-03-19T14:23:12Z)

This list is automatically generated from the titles and abstracts of the papers in this site.