X-Blocks: Linguistic Building Blocks of Natural Language Explanations for Automated Vehicles
- URL: http://arxiv.org/abs/2602.13248v1
- Date: Mon, 02 Feb 2026 07:18:25 GMT
- Title: X-Blocks: Linguistic Building Blocks of Natural Language Explanations for Automated Vehicles
- Authors: Ashkan Y. Zadeh, Xiaomeng Li, Andry Rakotonirainy, Ronald Schroeter, Sebastien Glaser, Zishuo Zhu
- Abstract summary: Natural language explanations play a critical role in establishing trust and acceptance of automated vehicles (AVs). This paper introduces X-Blocks, a hierarchical analytical framework that identifies the linguistic building blocks of natural language explanations for AVs at three levels: context, syntax, and lexicon. Its context-level classifier, RACE, achieves 91.45 percent accuracy and a Cohen's kappa of 0.91 on cases with human annotator agreement, indicating near-human reliability for context classification.
- Score: 14.815119135668247
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Natural language explanations play a critical role in establishing trust and acceptance of automated vehicles (AVs), yet existing approaches lack systematic frameworks for analysing how humans linguistically construct driving rationales across diverse scenarios. This paper introduces X-Blocks (eXplanation Blocks), a hierarchical analytical framework that identifies the linguistic building blocks of natural language explanations for AVs at three levels: context, syntax, and lexicon. At the context level, we propose RACE (Reasoning-Aligned Classification of Explanations), a multi-LLM ensemble framework that combines Chain-of-Thought reasoning with self-consistency mechanisms to robustly classify explanations into 32 scenario-aware categories. Applied to human-authored explanations from the Berkeley DeepDrive-X dataset, RACE achieves 91.45 percent accuracy and a Cohen's kappa of 0.91 on cases with human annotator agreement, indicating near-human reliability for context classification. At the lexical level, log-odds analysis with informative Dirichlet priors reveals context-specific vocabulary patterns that distinguish driving scenarios. At the syntactic level, dependency parsing and template extraction show that explanations draw from a limited repertoire of reusable grammar families, with systematic variation in predicate types and causal constructions across contexts. The X-Blocks framework is dataset-agnostic and task-independent, offering broad applicability to other automated driving datasets and safety-critical domains. Overall, our findings provide evidence-based linguistic design principles for generating scenario-aware explanations that support transparency, user trust, and cognitive accessibility in automated driving systems.
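The lexical-level analysis uses weighted log-odds with an informative Dirichlet prior, a standard technique for finding words that distinguish two corpora (commonly attributed to Monroe, Colaresi and Quinn, 2008). The paper does not publish code, so the following is a minimal illustrative sketch only; the toy corpora, the `prior_scale` value, and the function name are assumptions for demonstration, not the authors' implementation.

```python
import math
from collections import Counter

def log_odds_with_prior(counts_a, counts_b, prior_scale=0.01):
    """Weighted log-odds ratio with an informative Dirichlet prior.
    Returns a z-score per word: positive values are characteristic
    of corpus A, negative values of corpus B."""
    prior = Counter(counts_a) + Counter(counts_b)  # prior from the combined corpus
    n_prior = sum(prior.values())
    alpha0 = prior_scale * n_prior                 # total prior mass
    n_a, n_b = sum(counts_a.values()), sum(counts_b.values())
    scores = {}
    for w in prior:
        a_w = alpha0 * prior[w] / n_prior          # informative prior for word w
        y_a, y_b = counts_a.get(w, 0), counts_b.get(w, 0)
        # log-odds of w in each corpus, smoothed by the prior
        la = math.log((y_a + a_w) / (n_a + alpha0 - y_a - a_w))
        lb = math.log((y_b + a_w) / (n_b + alpha0 - y_b - a_w))
        # approximate variance of the difference (Monroe et al., Eq. 20)
        var = 1.0 / (y_a + a_w) + 1.0 / (y_b + a_w)
        scores[w] = (la - lb) / math.sqrt(var)
    return scores

# Toy example: explanations from two hypothetical driving contexts
stopping = Counter("the car stops because the light is red the car brakes".split())
merging = Counter("the car merges because the lane ends the car accelerates".split())
z = log_odds_with_prior(stopping, merging)
```

In this sketch, context-specific words such as "stops" receive positive z-scores and "merges" negative ones, while shared function words like "the" stay near zero, which is the behaviour the lexical analysis relies on to surface scenario-distinguishing vocabulary.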
Related papers
- The Algorithmic Unconscious: Structural Mechanisms and Implicit Biases in Large Language Models [0.0]
This article introduces the concept of the algorithmic unconscious to designate the set of structural determinations that operate within large language models (LLMs). We argue that a significant class of biases emerges directly from the technical mechanisms of the models themselves: tokenization, attention, statistical optimization, and alignment procedures. We propose a framework for a technical clinic of models, grounded in the audit of tokenization regimes, latent space topology, and alignment systems.
arXiv Detail & Related papers (2026-02-08T16:03:43Z)
- Deep networks learn to parse uniform-depth context-free languages from local statistics [12.183764229746926]
Understanding how the structure of language can be learned from sentences alone is a central question in both cognitive science and machine learning. We introduce a class of probabilistic context-free grammars (PCFGs) in which both the degree of ambiguity and the correlation structure across scales can be controlled. We propose a unifying framework where correlations at different scales lift local ambiguities, enabling the emergence of hierarchical representations of the data.
arXiv Detail & Related papers (2026-01-31T17:35:06Z)
- Robust Hypothesis Generation: LLM-Automated Language Bias for Inductive Logic Programming [3.641087660577424]
We introduce a novel framework integrating a multi-agent system, powered by Large Language Models (LLMs), with Inductive Logic Programming (ILP). Our system's LLM agents autonomously define a structured symbolic vocabulary (predicates) and relational templates. Experiments in diverse, challenging scenarios validate superior performance, paving a new path for automated, explainable, and verifiable hypothesis generation.
arXiv Detail & Related papers (2025-05-27T17:53:38Z)
- AGENT-X: Adaptive Guideline-based Expert Network for Threshold-free AI-generated teXt detection [44.66668435489055]
AGENT-X is a zero-shot multi-agent framework for AI-generated text detection. We organize detection guidelines into semantic, stylistic, and structural dimensions, each independently evaluated by specialized linguistic agents. A meta agent integrates these assessments through confidence-aware aggregation, enabling threshold-free, interpretable classification. Experiments on diverse datasets demonstrate that AGENT-X substantially surpasses state-of-the-art supervised and zero-shot approaches in accuracy, interpretability, and generalization.
arXiv Detail & Related papers (2025-05-21T08:39:18Z)
- Data2Concept2Text: An Explainable Multilingual Framework for Data Analysis Narration [42.95840730800478]
This paper presents a complete explainable system that interprets a set of data, abstracts the underlying features and describes them in a natural language of choice. The system relies on two crucial stages: (i) identifying emerging properties from data and transforming them into abstract concepts, and (ii) converting these concepts into natural language.
arXiv Detail & Related papers (2025-02-13T11:49:48Z)
- How Well Do Text Embedding Models Understand Syntax? [50.440590035493074]
The ability of text embedding models to generalize across a wide range of syntactic contexts remains under-explored.
Our findings reveal that existing text embedding models have not sufficiently addressed these syntactic understanding challenges.
We propose strategies to augment the generalization ability of text embedding models in diverse syntactic scenarios.
arXiv Detail & Related papers (2023-11-14T08:51:00Z)
- Modeling Hierarchical Reasoning Chains by Linking Discourse Units and Key Phrases for Reading Comprehension [80.99865844249106]
We propose a holistic graph network (HGN) which deals with context at both discourse level and word level, as the basis for logical reasoning.
Specifically, node-level and type-level relations, which can be interpreted as bridges in the reasoning process, are modeled by a hierarchical interaction mechanism.
arXiv Detail & Related papers (2023-06-21T07:34:27Z)
- Physics of Language Models: Part 1, Learning Hierarchical Language Structures [51.68385617116854]
Transformer-based language models are effective but complex, and understanding their inner workings and reasoning mechanisms is a significant challenge. We introduce a family of synthetic CFGs that produce hierarchical rules, capable of generating lengthy sentences. We demonstrate that generative models like GPT can accurately learn and reason over CFG-defined hierarchies and generate sentences based on them.
arXiv Detail & Related papers (2023-05-23T04:28:16Z)
- Semantic Role Labeling Meets Definition Modeling: Using Natural Language to Describe Predicate-Argument Structures [104.32063681736349]
We present an approach to describe predicate-argument structures using natural language definitions instead of discrete labels.
Our experiments and analyses on PropBank-style and FrameNet-style, dependency-based and span-based SRL also demonstrate that a flexible model with an interpretable output does not necessarily come at the expense of performance.
arXiv Detail & Related papers (2022-12-02T11:19:16Z)
- AUTOLEX: An Automatic Framework for Linguistic Exploration [93.89709486642666]
We propose an automatic framework that aims to ease linguists' discovery and extraction of concise descriptions of linguistic phenomena.
Specifically, we apply this framework to extract descriptions for three phenomena: morphological agreement, case marking, and word order.
We evaluate the descriptions with the help of language experts and propose a method for automated evaluation when human evaluation is infeasible.
arXiv Detail & Related papers (2022-03-25T20:37:30Z)
- Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
arXiv Detail & Related papers (2020-12-10T01:27:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.