Multi-Scale Manifold Alignment for Interpreting Large Language Models: A Unified Information-Geometric Framework
- URL: http://arxiv.org/abs/2505.20333v2
- Date: Mon, 13 Oct 2025 16:09:37 GMT
- Title: Multi-Scale Manifold Alignment for Interpreting Large Language Models: A Unified Information-Geometric Framework
- Authors: Yukun Zhang, Qi Dong
- Abstract summary: We present Multi-Scale Manifold Alignment (MSMA), an information-geometric framework that decomposes LLM representations into local, intermediate, and global manifolds. We observe consistent hierarchical patterns and find that MSMA improves alignment metrics under multiple estimators. Controlled interventions at different scales yield distinct and architecture-dependent effects on lexical diversity, sentence structure, and discourse coherence.
- Score: 4.935224714809964
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present Multi-Scale Manifold Alignment (MSMA), an information-geometric framework that decomposes LLM representations into local, intermediate, and global manifolds and learns cross-scale mappings that preserve geometry and information. Across GPT-2, BERT, RoBERTa, and T5, we observe consistent hierarchical patterns and find that MSMA improves alignment metrics under multiple estimators (e.g., relative KL reduction and MI gains with statistical significance across seeds). Controlled interventions at different scales yield distinct and architecture-dependent effects on lexical diversity, sentence structure, and discourse coherence. While our theoretical analysis relies on idealized assumptions, the empirical results suggest that multi-objective alignment offers a practical lens for analyzing cross-scale information flow and guiding representation-level control.
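As a rough illustration of the kind of cross-scale alignment diagnostic the abstract describes, the sketch below groups per-layer hidden states into local, intermediate, and global scales and scores each pair with a diagonal-Gaussian KL estimate. The equal-thirds layer grouping, the Gaussian approximation, and all array shapes are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def kl_gaussian(x, y):
    """KL divergence between diagonal-Gaussian fits of two sample sets."""
    mu_x, var_x = x.mean(0), x.var(0) + 1e-8
    mu_y, var_y = y.mean(0), y.var(0) + 1e-8
    return 0.5 * np.sum(np.log(var_y / var_x) + (var_x + (mu_x - mu_y) ** 2) / var_y - 1.0)

def scale_alignment(hidden_states):
    """Group per-layer hidden states into three scales and score each pair.

    hidden_states: list of (tokens, dim) arrays, one per layer.
    Returns a dict mapping scale pairs to a KL-based misalignment score
    (lower means the two scales are distributionally closer).
    """
    n = len(hidden_states)
    groups = [hidden_states[: n // 3],
              hidden_states[n // 3 : 2 * n // 3],
              hidden_states[2 * n // 3 :]]
    # One representative per scale: the layer-averaged activations.
    scales = [np.mean(np.stack(g), axis=0) for g in groups]
    names = ["local", "intermediate", "global"]
    return {(names[i], names[j]): kl_gaussian(scales[i], scales[j])
            for i in range(3) for j in range(i + 1, 3)}

# Toy demo: 12 "layers" whose mean drifts with depth, 64 tokens, dim 16.
rng = np.random.default_rng(0)
layers = [rng.normal(loc=0.1 * l, size=(64, 16)) for l in range(12)]
scores = scale_alignment(layers)
```

In this toy setup, scales that are farther apart in depth show larger KL scores, mimicking the hierarchical drift the paper reports.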
Related papers
- MS-ISSM: Objective Quality Assessment of Point Clouds Using Multi-scale Implicit Structural Similarity [65.85858856481131]
The unstructured and irregular nature of point clouds poses a significant challenge for objective point cloud quality assessment (PCQA). We propose the Multi-scale Implicit Structural Similarity Measurement (MS-ISSM).
arXiv Detail & Related papers (2026-01-03T14:58:52Z) - SIGMMA: Hierarchical Graph-Based Multi-Scale Multi-modal Contrastive Alignment of Histopathology Image and Spatial Transcriptome [0.5748432401788427]
We propose Sigmma, a multi-modal contrastive alignment framework for learning hierarchical representations of H&E images and spatial transcriptome profiles. By representing cell interactions as a graph, our approach effectively captures cell-cell interactions, ranging from fine to coarse, within the tissue microenvironment. We demonstrate that Sigmma learns representations that better capture cross-modal correspondences, leading to an average improvement of 9.78% in the gene-expression prediction task and 26.93% in the cross-modal retrieval task across datasets.
arXiv Detail & Related papers (2025-11-19T14:22:23Z) - Graft: Integrating the Domain Knowledge via Efficient Parameter Synergy for MLLMs [56.76586846269894]
Multimodal Large Language Models (MLLMs) have achieved success across various domains. Despite its importance, the study of knowledge sharing among domain-specific MLLMs remains largely underexplored. We propose a unified parameter integration framework that enables modular composition of expert capabilities.
arXiv Detail & Related papers (2025-06-30T15:07:41Z) - EarthMind: Leveraging Cross-Sensor Data for Advanced Earth Observation Interpretation with a Unified Multimodal LLM [103.7537991413311]
Earth Observation (EO) data analysis is vital for monitoring environmental and human dynamics. Recent Multimodal Large Language Models (MLLMs) show potential in EO understanding but remain restricted to single-sensor inputs. We propose EarthMind, a unified vision-language framework that handles both single- and cross-sensor inputs.
arXiv Detail & Related papers (2025-06-02T13:36:05Z) - The Shape of Adversarial Influence: Characterizing LLM Latent Spaces with Persistent Homology [4.280045926995889]
This study focuses on how adversarial inputs systematically affect the internal representation spaces of Large Language Models. By quantifying the shape of activations and neuronal information flow, our architecture-agnostic framework reveals fundamental invariants of representational change.
arXiv Detail & Related papers (2025-05-26T18:31:49Z) - Multimodal LLM-Guided Semantic Correction in Text-to-Image Diffusion [52.315729095824906]
MLLM Semantic-Corrected Ping-Pong-Ahead Diffusion (PPAD) is a novel framework that introduces a Multimodal Large Language Model (MLLM) as a semantic observer during inference. It performs real-time analysis on intermediate generations, identifies latent semantic inconsistencies, and translates feedback into controllable signals that actively guide the remaining denoising steps. Extensive experiments demonstrate PPAD's significant improvements.
arXiv Detail & Related papers (2025-05-26T14:42:35Z) - LatentLLM: Attention-Aware Joint Tensor Compression [50.33925662486034]
Large language models (LLMs) and large multi-modal models (LMMs) require a massive amount of computational and memory resources. We propose a new framework to convert such LLMs/LMMs into a reduced-dimension latent structure.
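The paper's attention-aware joint tensor compression is more elaborate, but the basic idea of replacing a weight matrix with a reduced-dimension factorization can be sketched with plain truncated SVD. Everything here (shapes, the chosen rank, and the SVD baseline itself) is an illustrative assumption, not the proposed method.

```python
import numpy as np

def low_rank_compress(w, rank):
    """Factor a weight matrix into two smaller matrices via truncated SVD.

    Replaces one (out, in) matrix with (out, rank) @ (rank, in),
    cutting parameters when rank << min(out, in).
    """
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    a = u[:, :rank] * s[:rank]   # (out, rank), singular values folded in
    b = vt[:rank]                # (rank, in)
    return a, b

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))
a, b = low_rank_compress(w, rank=8)
approx = a @ b   # low-rank reconstruction of w
```

At full rank the factorization reconstructs `w` exactly; lower ranks trade reconstruction error for memory and compute savings.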
arXiv Detail & Related papers (2025-05-23T22:39:54Z) - Multi-Scale Probabilistic Generation Theory: A Hierarchical Framework for Interpreting Large Language Models [1.2027959564488593]
Large Transformer-based language models achieve remarkable performance but remain opaque in how they plan, structure, and realize text. We introduce Multi-Scale Probabilistic Generation Theory (MSPGT), a hierarchical framework that factorizes generation into three semantic scales: global context, intermediate structure, and local word choices.
arXiv Detail & Related papers (2025-05-23T16:55:35Z) - Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo [90.78001821963008]
A wide range of LM applications require generating text that conforms to syntactic or semantic constraints. We develop an architecture for controlled LM generation based on sequential Monte Carlo (SMC). Our system builds on the framework of Lew et al. (2023) and integrates with its language model probabilistic programming language.
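To make the SMC idea concrete, here is a toy particle filter over token sequences: propose tokens from a stand-in model, reweight particles by a soft constraint score, and resample. The uniform stand-in model, the `prefers_a` constraint, and all hyperparameters are made up for illustration; the actual system works with real LM distributions inside a probabilistic programming language.

```python
import numpy as np

def smc_constrained(step_logits_fn, constraint_fn, vocab, n_particles=8, steps=5, seed=0):
    """Toy SMC for constrained generation: extend each particle by one
    token, reweight by the constraint score, then resample proportionally."""
    rng = np.random.default_rng(seed)
    particles = [[] for _ in range(n_particles)]
    for _ in range(steps):
        # Propose the next token for each particle from the model distribution.
        for p in particles:
            logits = step_logits_fn(p)
            probs = np.exp(logits - logits.max())
            probs /= probs.sum()
            p.append(vocab[rng.choice(len(vocab), p=probs)])
        # Reweight by constraint satisfaction and resample.
        weights = np.array([constraint_fn(p) for p in particles], dtype=float)
        weights /= weights.sum()
        idx = rng.choice(n_particles, size=n_particles, p=weights)
        particles = [list(particles[i]) for i in idx]
    return particles

vocab = ["a", "b"]
uniform_model = lambda prefix: np.zeros(len(vocab))   # stand-in LM: uniform logits
prefers_a = lambda seq: 1.0 + seq.count("a")          # soft constraint rewarding "a"
particles = smc_constrained(uniform_model, prefers_a, vocab)
```

Resampling concentrates the particle population on constraint-satisfying prefixes while the proposal distribution stays the model's own.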
arXiv Detail & Related papers (2025-04-17T17:49:40Z) - Model Hemorrhage and the Robustness Limits of Large Language Models [119.46442117681147]
Large language models (LLMs) demonstrate strong performance across natural language processing tasks, yet undergo significant performance degradation when modified for deployment. We define this phenomenon as model hemorrhage: performance decline caused by parameter alterations and architectural changes.
arXiv Detail & Related papers (2025-03-31T10:16:03Z) - Distributional Vision-Language Alignment by Cauchy-Schwarz Divergence [83.15764564701706]
We propose a novel framework that performs vision-language alignment by integrating Cauchy-Schwarz divergence with mutual information. We find that the CS divergence seamlessly addresses the alignment-uniformity conflict in InfoNCE and serves a complementary role to it. Experiments on text-to-image generation and cross-modality retrieval tasks demonstrate the effectiveness of our method on vision-language alignment.
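For reference, the Cauchy-Schwarz divergence between densities p and q is D_CS(p, q) = -log((∫pq)² / (∫p² ∫q²)), which is non-negative and zero iff p = q. Below is a minimal kernel-based empirical estimator; the Gaussian kernel and fixed bandwidth are illustrative choices, not the paper's setup.

```python
import numpy as np

def cs_divergence(x, y, sigma=1.0):
    """Empirical Cauchy-Schwarz divergence between two sample sets,
    estimated with a Gaussian kernel (kernel mean embedding inner products)."""
    def gram_mean(a, b):
        # Mean of the Gaussian kernel over all cross pairs.
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2)).mean()
    v_xy = gram_mean(x, y)
    v_xx = gram_mean(x, x)
    v_yy = gram_mean(y, y)
    return np.log(v_xx) + np.log(v_yy) - 2 * np.log(v_xy)

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 4))
y = rng.normal(loc=2.0, size=(100, 4))
d_same = cs_divergence(x, x)   # identical samples: divergence is exactly 0
d_diff = cs_divergence(x, y)   # shifted distribution: divergence is positive
```

Non-negativity follows from the Cauchy-Schwarz inequality applied to the kernel mean embeddings, which is what makes the estimator usable as an alignment loss.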
arXiv Detail & Related papers (2025-02-24T10:29:15Z) - Contextual Subspace Manifold Projection for Structural Refinement of Large Language Model Representations [0.0]
Internal representations within deep neural architectures encode high-dimensional abstractions of linguistic structures. This paper introduces a structured refinement technique that selectively reconfigures token embeddings through controlled subspace constraints. Empirical evaluations demonstrated that the structured intervention reduced anisotropy, leading to improved representation compactness.
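Anisotropy here refers to embeddings collapsing into a narrow cone; a common diagnostic is the mean pairwise cosine similarity. The sketch below measures it and applies mean-centering as a crude stand-in intervention; the shapes and the centering step are illustrative assumptions, not the paper's subspace projection.

```python
import numpy as np

def anisotropy(emb):
    """Mean pairwise cosine similarity of embeddings; values near 1 mean
    the vectors occupy a narrow cone (high anisotropy)."""
    e = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sims = e @ e.T
    n = len(e)
    return (sims.sum() - n) / (n * (n - 1))   # exclude self-similarities

def center(emb):
    """Remove the shared mean direction, a simple isotropy-improving step."""
    return emb - emb.mean(axis=0, keepdims=True)

# Toy "cone": a common direction plus small noise.
rng = np.random.default_rng(0)
cone = np.ones((50, 16)) + 0.1 * rng.normal(size=(50, 16))
before = anisotropy(cone)          # close to 1: highly anisotropic
after = anisotropy(center(cone))   # near 0 after removing the common direction
```

Even this trivial intervention collapses the measured anisotropy, which is why more careful subspace-level controls (as in the paper) are needed to preserve task-relevant structure.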
arXiv Detail & Related papers (2025-02-12T00:00:37Z) - Latent Lexical Projection in Large Language Models: A Novel Approach to Implicit Representation Refinement [0.0]
Latent Lexical Projection (LLP) is introduced to refine lexical representations through a structured transformation into a latent space. LLP integrates an optimized projection mechanism within an existing language model architecture. Evaluations indicate a reduction in perplexity and an increase in BLEU scores, suggesting improvements in predictive accuracy and fluency.
arXiv Detail & Related papers (2025-02-03T23:18:53Z) - Intrinsic Tensor Field Propagation in Large Language Models: A Novel Approach to Contextual Information Flow [0.0]
Intrinsic Tensor Field Propagation (ITFP) targets contextual information flow in transformer models. Experiments conducted on an open-source transformer-based model demonstrate that ITFP provides measurable improvements in contextual retention, dependency resolution, and inference across various linguistic structures.
arXiv Detail & Related papers (2025-01-31T08:32:32Z) - Semantic Layered Embedding Diffusion in Large Language Models for Multi-Contextual Consistency [0.0]
The Semantic Layered Embedding Diffusion (SLED) mechanism redefines the representation of hierarchical semantics within transformer-based architectures. By introducing a multi-layered diffusion process grounded in spectral analysis, it achieves a complex balance between global and local semantic coherence. Experimental results demonstrate significant improvements in perplexity and BLEU scores, emphasizing the mechanism's ability to adapt effectively across diverse domains.
arXiv Detail & Related papers (2025-01-26T05:17:04Z) - Does Representation Matter? Exploring Intermediate Layers in Large Language Models [22.704926222438456]
We investigate the quality of intermediate representations in large language models (LLMs). We find that intermediate layers often yield more informative representations for downstream tasks than the final layers. Our results illuminate the internal mechanics of LLMs and guide strategies for architectural optimization and training.
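One standard way to compare layer informativeness, in the spirit of the finding above, is a linear probe per layer. The sketch below fits least-squares probes on synthetic "layer" activations where only the middle layer encodes the target; the data, shapes, and R² scoring are illustrative assumptions, not the paper's protocol.

```python
import numpy as np

def layer_probe_scores(layer_reps, labels):
    """Fit a least-squares linear probe per layer; return R^2 per layer."""
    scores = []
    for h in layer_reps:
        x = np.hstack([h, np.ones((len(h), 1))])          # add bias column
        coef, *_ = np.linalg.lstsq(x, labels, rcond=None)
        pred = x @ coef
        ss_res = ((labels - pred) ** 2).sum()
        ss_tot = ((labels - labels.mean()) ** 2).sum()
        scores.append(1.0 - ss_res / ss_tot)
    return scores

# Synthetic setup: three "layers"; only the middle one contains the label.
rng = np.random.default_rng(0)
labels = rng.normal(size=200)
signal = np.column_stack([labels, rng.normal(size=(200, 7))])
reps = [rng.normal(size=(200, 8)), signal, rng.normal(size=(200, 8))]
scores = layer_probe_scores(reps, labels)
```

With a real model, `layer_reps` would be each transformer block's hidden states over a probing dataset, and the per-layer scores trace where task information peaks across depth.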
arXiv Detail & Related papers (2024-12-12T18:48:51Z) - Aligning Large Language Models and Geometric Deep Models for Protein Representation [57.59506688299817]
Latent representation alignment is used to map embeddings from different modalities into a shared space, often aligned with the embedding space of large language models (LLMs). Preliminary protein-focused multimodal large language models (MLLMs) have emerged, but they have predominantly relied on approaches lacking a fundamental understanding of optimal alignment practices across representations. In this study, we explore the alignment of multimodal representations between LLMs and Geometric Deep Models (GDMs) in the protein domain. Our work examines alignment factors from both model and protein perspectives, identifying challenges in current alignment methodologies and proposing strategies to improve the alignment process.
arXiv Detail & Related papers (2024-11-08T04:15:08Z) - A Theoretical Analysis of Self-Supervised Learning for Vision Transformers [66.08606211686339]
Masked autoencoders (MAE) and contrastive learning (CL) capture different types of representations. We study the training dynamics of one-layer softmax-based vision transformers (ViTs) on both MAE and CL objectives.
arXiv Detail & Related papers (2024-03-04T17:24:03Z) - Vocabulary-Defined Semantics: Latent Space Clustering for Improving In-Context Learning [32.178931149612644]
In-context learning enables language models to adapt to downstream data or incorporate tasks by few samples as demonstrations within the prompts.
However, the performance of in-context learning can be unstable depending on the quality, format, or order of demonstrations.
We propose a novel approach, "vocabulary-defined semantics."
arXiv Detail & Related papers (2024-01-29T14:29:48Z) - Sparsity-Guided Holistic Explanation for LLMs with Interpretable Inference-Time Intervention [53.896974148579346]
Large Language Models (LLMs) have achieved unprecedented breakthroughs in various natural language processing domains.
The enigmatic "black-box" nature of LLMs remains a significant challenge for interpretability, hampering transparent and accountable applications.
We propose a novel methodology anchored in sparsity-guided techniques, aiming to provide a holistic interpretation of LLMs.
arXiv Detail & Related papers (2023-12-22T19:55:58Z) - The Geometry of Self-supervised Learning Models and its Impact on Transfer Learning [62.601681746034956]
Self-supervised learning (SSL) has emerged as a desirable paradigm in computer vision.
We propose a data-driven geometric strategy to analyze different SSL models using local neighborhoods in the feature space induced by each.
arXiv Detail & Related papers (2022-09-18T18:15:38Z) - Deep Relational Metric Learning [84.95793654872399]
This paper presents a deep relational metric learning framework for image clustering and retrieval.
We learn an ensemble of features that characterizes an image from different aspects to model both interclass and intraclass distributions.
Experiments on the widely-used CUB-200-2011, Cars196, and Stanford Online Products datasets demonstrate that our framework improves existing deep metric learning methods and achieves very competitive results.
arXiv Detail & Related papers (2021-08-23T09:31:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.