Component-Level Lesioning of Language Models Reveals Clinically Aligned Aphasia Phenotypes
- URL: http://arxiv.org/abs/2601.19723v1
- Date: Tue, 27 Jan 2026 15:47:22 GMT
- Title: Component-Level Lesioning of Language Models Reveals Clinically Aligned Aphasia Phenotypes
- Authors: Yifan Wang, Jichen Zheng, Jingyuan Sun, Yunhao Zhang, Chunyu Ye, Jixing Li, Chengqing Zong, Shaonan Wang
- Abstract summary: We introduce a component-level framework that simulates aphasia by selectively perturbing functional components in large language models. Our pipeline identifies subtype-linked components for Broca's and Wernicke's aphasia and induces graded impairments by progressively perturbing the top-k subtype-linked components. Across architectures and lesioning strategies, subtype-targeted perturbations yield more systematic, aphasia-like regressions than size-matched random perturbations.
- Score: 40.41503864764337
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) increasingly exhibit human-like linguistic behaviors and internal representations, suggesting that they could serve as computational simulators of language cognition. We ask whether LLMs can be systematically manipulated to reproduce language-production impairments characteristic of aphasia following focal brain lesions. Such models could provide scalable proxies for testing rehabilitation hypotheses, and offer a controlled framework for probing the functional organization of language. We introduce a clinically grounded, component-level framework that simulates aphasia by selectively perturbing functional components in LLMs, and apply it to both modular Mixture-of-Experts models and dense Transformers using a unified intervention interface. Our pipeline (i) identifies subtype-linked components for Broca's and Wernicke's aphasia, (ii) interprets these components with linguistic probing tasks, and (iii) induces graded impairments by progressively perturbing the top-k subtype-linked components, evaluating outcomes with Western Aphasia Battery (WAB) subtests summarized by Aphasia Quotient (AQ). Across architectures and lesioning strategies, subtype-targeted perturbations yield more systematic, aphasia-like regressions than size-matched random perturbations, and MoE modularity supports more localized and interpretable phenotype-to-component mappings. These findings suggest that modular LLMs, combined with clinically informed component perturbations, provide a promising platform for simulating aphasic language production and studying how distinct language functions degrade under targeted disruptions.
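The graded-lesioning step of the pipeline (rank subtype-linked components by an importance score, then progressively damp the top-k) can be sketched in miniature. This is a toy illustration with made-up component functions and importance scores, not the paper's actual intervention interface:

```python
# Toy sketch of graded, component-level lesioning. A "model" is the sum of
# component contributions; lesioning scales the top-k subtype-linked
# components toward zero (strength=0.0 means full ablation).

def forward(x, components, scales):
    """Toy forward pass: output is the scaled sum of component outputs."""
    return sum(s * f(x) for f, s in zip(components, scales))

def lesion_topk(importance, k, strength=0.0):
    """Per-component scales damping the k highest-importance components."""
    ranked = sorted(range(len(importance)), key=lambda i: -importance[i])
    scales = [1.0] * len(importance)
    for i in ranked[:k]:
        scales[i] = strength
    return scales

# Three toy components with hypothetical subtype-importance scores
# (in the paper these would come from linguistic probing / attribution).
components = [lambda x: 2 * x, lambda x: x + 1, lambda x: -x]
importance = [0.9, 0.1, 0.5]

healthy = forward(3, components, [1.0, 1.0, 1.0])                # 7
lesioned = forward(3, components, lesion_topk(importance, k=1))  # 1
```

Increasing `k` (or raising `strength` partway toward 1.0) yields the progressively graded impairments the abstract describes; in a real Transformer the scaling would be applied to attention heads, MLP neurons, or MoE experts rather than toy functions.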
Related papers
- Stroke Lesions as a Rosetta Stone for Language Model Interpretability [6.528508321422611]
We present the Brain-LLM Unified Model (BLUM) as an external reference structure for evaluating large language models. Using data from individuals with chronic post-stroke aphasia, we trained symptom-to-lesion models that predict brain damage location from behavioral error profiles. BLUM error profiles were sufficiently similar to human error profiles that predicted lesions corresponded to actual lesions in error-matched humans above chance.
arXiv Detail & Related papers (2026-02-03T23:22:37Z)
- A Monosemantic Attribution Framework for Stable Interpretability in Clinical Neuroscience Large Language Models [9.694820939059339]
Interpretability remains a key challenge for deploying large language models (LLMs) in clinical settings such as Alzheimer's disease progression diagnosis. We introduce a unified interpretability framework that integrates attributional and mechanistic perspectives.
arXiv Detail & Related papers (2026-01-25T19:03:04Z)
- E^2-LLM: Bridging Neural Signals and Interpretable Affective Analysis [54.763420895859035]
We present E^2-LLM (EEG-to-Emotion Large Language Model), the first MLLM framework for interpretable emotion analysis from EEG. E^2-LLM integrates a pretrained EEG encoder with Q-based LLMs through learnable projection layers, employing a multi-stage training pipeline. Experiments across seven emotion categories demonstrate that E^2-LLM achieves excellent performance on emotion classification.
arXiv Detail & Related papers (2026-01-11T13:21:20Z)
- Generative Human-Object Interaction Detection via Differentiable Cognitive Steering of Multi-modal LLMs [85.69785384599827]
Human-object interaction (HOI) detection aims to localize human-object pairs and the interactions between them. Existing methods operate under a closed-world assumption, treating the task as a classification problem over a small, predefined verb set. We propose GRASP-HO, a novel Generative Reasoning And Steerable Perception framework that reformulates HOI detection from a closed-set classification task to an open-vocabulary generation problem.
arXiv Detail & Related papers (2025-12-19T14:41:50Z)
- Multi-View Attention Multiple-Instance Learning Enhanced by LLM Reasoning for Cognitive Distortion Detection [1.660734109310745]
We propose a novel framework that combines Large Language Models (LLMs) with a Multiple-Instance Learning (MIL) architecture to enhance interpretability and expression-level reasoning. Our results suggest a psychologically grounded and generalizable approach for fine-grained reasoning in mental health NLP.
arXiv Detail & Related papers (2025-09-22T00:18:58Z)
- Mechanistic Understanding and Mitigation of Language Confusion in English-Centric Large Language Models [56.61984030508691]
We present the first mechanistic interpretability study of language confusion. We show that confusion points (CPs) are central to this phenomenon, and that editing a small set of critical neurons, identified via comparative analysis with a multilingual-tuned counterpart, substantially mitigates confusion.
arXiv Detail & Related papers (2025-05-22T11:29:17Z)
- Mitigating Hallucinations of Large Language Models in Medical Information Extraction via Contrastive Decoding [92.32881381717594]
We introduce ALternate Contrastive Decoding (ALCD) to solve hallucination issues in medical information extraction tasks.
ALCD demonstrates significant improvements in resolving hallucination issues compared to conventional decoding methods.
arXiv Detail & Related papers (2024-10-21T07:19:19Z)
- A Methodology for Explainable Large Language Models with Integrated Gradients and Linguistic Analysis in Text Classification [2.556395214262035]
Neurological disorders that affect speech production, such as Alzheimer's Disease (AD), significantly impact the lives of both patients and caregivers.
Recent advancements in Large Language Model (LLM) architectures have developed many tools to identify representative features of neurological disorders through spontaneous speech.
This paper presents an explainable LLM method, named SLIME, capable of identifying lexical components representative of AD.
arXiv Detail & Related papers (2024-09-30T21:45:02Z)
- Investigating the Timescales of Language Processing with EEG and Language Models [0.0]
This study explores the temporal dynamics of language processing by examining the alignment between word representations from a pre-trained language model and EEG data.
Using a Temporal Response Function (TRF) model, we investigate how neural activity corresponds to model representations across different layers.
Our analysis reveals patterns in TRFs from distinct layers, highlighting varying contributions to lexical and compositional processing.
arXiv Detail & Related papers (2024-06-28T12:49:27Z)
- Large Language Models are Interpretable Learners [53.56735770834617]
In this paper, we show a combination of Large Language Models (LLMs) and symbolic programs can bridge the gap between expressiveness and interpretability.
The pretrained LLM with natural language prompts provides a massive set of interpretable modules that can transform raw input into natural language concepts.
As the knowledge learned by LSP is a combination of natural language descriptions and symbolic rules, it is easily transferable to humans (interpretable) and other LLMs.
arXiv Detail & Related papers (2024-06-25T02:18:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.