Persistent Topological Features in Large Language Models
- URL: http://arxiv.org/abs/2410.11042v3
- Date: Fri, 13 Jun 2025 12:27:46 GMT
- Title: Persistent Topological Features in Large Language Models
- Authors: Yuri Gardinazzi, Karthik Viswanathan, Giada Panerai, Alessio Ansuini, Alberto Cazzaniga, Matteo Biagetti
- Abstract summary: We introduce topological descriptors that measure how topological features, $p$-dimensional holes, persist and evolve throughout the layers. This offers a statistical perspective on how prompts are rearranged and their relative positions changed in the representation space. As a showcase application, we use zigzag persistence to establish a criterion for layer pruning, achieving results comparable to state-of-the-art methods.
- Score: 0.6597195879147556
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding the decision-making processes of large language models is critical given their widespread applications. To achieve this, we aim to connect a formal mathematical framework - zigzag persistence from topological data analysis - with practical and easily applicable algorithms. Zigzag persistence is particularly effective for characterizing data as it dynamically transforms across model layers. Within this framework, we introduce topological descriptors that measure how topological features, $p$-dimensional holes, persist and evolve throughout the layers. Unlike methods that assess each layer individually and then aggregate the results, our approach directly tracks the full evolutionary path of these features. This offers a statistical perspective on how prompts are rearranged and their relative positions changed in the representation space, providing insights into the system's operation as an integrated whole. To demonstrate the expressivity and applicability of our framework, we highlight how sensitive these descriptors are to different models and a variety of datasets. As a showcase application to a downstream task, we use zigzag persistence to establish a criterion for layer pruning, achieving results comparable to state-of-the-art methods while preserving the system-level perspective.
Related papers
- Topology-Aware Modeling for Unsupervised Simulation-to-Reality Point Cloud Recognition [63.55828203989405]
We introduce a novel Topology-Aware Modeling (TAM) framework for Sim2Real UDA on object point clouds. Our approach mitigates the domain gap by leveraging global spatial topology, characterized by low-level, high-frequency 3D structures. We propose an advanced self-training strategy that combines cross-domain contrastive learning with self-training.
arXiv Detail & Related papers (2025-06-26T11:53:59Z) - Reinforcement Learning-Based Dynamic Grouping for Tubular Structure Tracking [14.048453741483092]
We propose a novel framework that casts segment-wise tracking as a Markov Decision Process (MDP). Our method leverages Q-Learning to dynamically explore a graph of segments, computing edge weights on-demand and adaptively expanding the search space. Experimental results on typical tubular structure datasets demonstrate that our method significantly outperforms state-of-the-art point-wise and segment-wise approaches.
arXiv Detail & Related papers (2025-06-21T11:00:17Z) - Automated Manifold Learning for Reduced Order Modeling [1.1126342180866644]
We investigate the use of Geometric Representation Learning for the data-driven discovery of system dynamics from spatial-temporal data. We propose to encode similarity structure in such data in a spatial-temporal proximity graph. We apply a range of classical and deep learning-based manifold learning approaches to learn reduced order dynamics.
arXiv Detail & Related papers (2025-06-02T14:49:55Z) - Place Recognition Meet Multiple Modalities: A Comprehensive Review, Current Challenges and Future Directions [2.4775350526606355]
We review recent advancements in place recognition, emphasizing three methodological paradigms. CNN-based approaches, Transformer-based frameworks, and cross-modal strategies are discussed. We identify current research challenges and outline prospective directions, including domain adaptation, real-time performance, and lifelong learning, to inspire future advancements in this domain.
arXiv Detail & Related papers (2025-05-20T08:16:37Z) - Mesh Mamba: A Unified State Space Model for Saliency Prediction in Non-Textured and Textured Meshes [50.23625950905638]
Mesh saliency enhances the adaptability of 3D vision by identifying and emphasizing regions that naturally attract visual attention.
We introduce Mesh Mamba, a unified saliency prediction model based on a state space model (SSM).
Mesh Mamba effectively analyzes the geometric structure of the mesh while seamlessly incorporating texture features into the topological framework.
arXiv Detail & Related papers (2025-04-02T08:22:25Z) - Unraveling the Localized Latents: Learning Stratified Manifold Structures in LLM Embedding Space with Sparse Mixture-of-Experts [3.9426000822656224]
We conjecture that in large language models, the embeddings live in a local manifold structure with different dimensions depending on the perplexities and domains of the input data.
By incorporating an attention-based soft-gating network, we verify that our model learns specialized sub-manifolds for an ensemble of input data sources.
arXiv Detail & Related papers (2025-02-19T09:33:16Z) - Geometry Matters: Benchmarking Scientific ML Approaches for Flow Prediction around Complex Geometries [23.111935712144277]
Rapid and accurate simulations of fluid dynamics around complicated geometric bodies are critical in a variety of engineering and scientific applications.
While scientific machine learning (SciML) has shown considerable promise, most studies in this field are limited to simple geometries.
This paper addresses this gap by benchmarking diverse SciML models for fluid flow prediction over intricate geometries.
arXiv Detail & Related papers (2024-12-31T00:23:15Z) - Conformable Convolution for Topologically Aware Learning of Complex Anatomical Structures [38.20599800950335]
We introduce Conformable Convolution, a novel convolutional layer designed to explicitly enforce topological consistency. The Topological Posterior Generator (TPG) module identifies key topological features and guides the convolutional layers. We showcase the effectiveness of our framework in the segmentation task, where preserving the interconnectedness of structures is critical.
arXiv Detail & Related papers (2024-12-29T22:41:33Z) - Object Style Diffusion for Generalized Object Detection in Urban Scene [69.04189353993907]
We introduce a novel single-domain object detection generalization method, named GoDiff. By integrating pseudo-target domain data with source domain data, we diversify the training dataset. Experimental results demonstrate that our method not only enhances the generalization ability of existing detectors but also functions as a plug-and-play enhancement for other single-domain generalization methods.
arXiv Detail & Related papers (2024-12-18T13:03:00Z) - Interpetable Target-Feature Aggregation for Multi-Task Learning based on Bias-Variance Analysis [53.38518232934096]
Multi-task learning (MTL) is a powerful machine learning paradigm designed to leverage shared knowledge across tasks to improve generalization and performance.
We propose an MTL approach at the intersection between task clustering and feature transformation based on a two-phase iterative aggregation of targets and features.
In both phases, a key aspect is to preserve the interpretability of the reduced targets and features through the aggregation with the mean, which is motivated by applications to Earth science.
arXiv Detail & Related papers (2024-06-12T08:30:16Z) - Topological Perspectives on Optimal Multimodal Embedding Spaces [0.0]
This paper delves into a comparative analysis between CLIP and its recent counterpart, CLOOB.
Our approach encompasses a comprehensive examination of the modality gap drivers, the clustering structures existing across both high and low dimensions, and the pivotal role that dimension collapse plays in shaping their respective embedding spaces.
arXiv Detail & Related papers (2024-05-29T08:28:23Z) - Topological Parallax: A Geometric Specification for Deep Perception Models [0.778001492222129]
We introduce topological parallax as a theoretical and computational tool that compares a trained model to a reference dataset.
Our examples show that this geometric similarity between dataset and model is essential to trustworthy interpolation and perturbation.
This new concept will add value to the current debate regarding the unclear relationship between overfitting and generalization in applications of deep-learning.
arXiv Detail & Related papers (2023-06-20T18:45:24Z) - Latent Variable Representation for Reinforcement Learning [131.03944557979725]
It remains unclear theoretically and empirically how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of model-based reinforcement learning.
We provide a representation view of the latent variable models for state-action value functions, which allows both a tractable variational learning algorithm and an effective implementation of the optimism/pessimism principle.
In particular, we propose a computationally efficient planning algorithm with UCB exploration by incorporating kernel embeddings of latent variable models.
arXiv Detail & Related papers (2022-12-17T00:26:31Z) - $k$-Means Clustering for Persistent Homology [0.0]
We prove convergence of the $k$-means clustering algorithm on persistence diagram space.
We also establish theoretical properties of the solution to the optimization problem in the Karush--Kuhn--Tucker framework.
arXiv Detail & Related papers (2022-10-18T17:18:51Z) - Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z) - The Geometry of Self-supervised Learning Models and its Impact on Transfer Learning [62.601681746034956]
Self-supervised learning (SSL) has emerged as a desirable paradigm in computer vision.
We propose a data-driven geometric strategy to analyze different SSL models using local neighborhoods in the feature space induced by each.
arXiv Detail & Related papers (2022-09-18T18:15:38Z) - Bending Graphs: Hierarchical Shape Matching using Gated Optimal Transport [80.64516377977183]
Shape matching has been a long-studied problem for the computer graphics and vision community.
We investigate a hierarchical learning design, to which we incorporate local patch-level information and global shape-level structures.
We propose a novel optimal transport solver by recurrently updating features on non-confident nodes to learn globally consistent correspondences between the shapes.
arXiv Detail & Related papers (2022-02-03T11:41:46Z) - Image Synthesis via Semantic Composition [74.68191130898805]
We present a novel approach to synthesize realistic images based on their semantic layouts.
It hypothesizes that objects with similar appearance share similar representations.
Our method establishes dependencies between regions according to their appearance correlation, yielding both spatially variant and associated representations.
arXiv Detail & Related papers (2021-09-15T02:26:07Z) - Learning the Implicit Semantic Representation on Graph-Structured Data [57.670106959061634]
Existing representation learning methods in graph convolutional networks are mainly designed by describing the neighborhood of each node as a perceptual whole.
We propose Semantic Graph Convolutional Networks (SGCN), which explore the implicit semantics by learning latent semantic-paths in graphs.
arXiv Detail & Related papers (2021-01-16T16:18:43Z) - Learning Robust State Abstractions for Hidden-Parameter Block MDPs [55.31018404591743]
We leverage ideas of common structure from the HiP-MDP setting to enable robust state abstractions inspired by Block MDPs.
We derive instantiations of this new framework for both multi-task reinforcement learning (MTRL) and meta-reinforcement learning (Meta-RL) settings.
arXiv Detail & Related papers (2020-07-14T17:25:27Z) - Deep Learning modeling of Limit Order Book: a comparative perspective [0.0]
The present work addresses theoretical and practical questions in the domain of Deep Learning for High Frequency Trading.
State-of-the-art models such as Random models, Logistic Regressions, LSTMs, LSTMs equipped with an Attention mask, CNN-LSTM and Attentions are reviewed and compared on the same tasks.
The underlying dimensions of the modeling techniques are investigated to understand whether these are intrinsic to the Limit Order Book's dynamics.
arXiv Detail & Related papers (2020-07-12T17:06:30Z) - Hierarchical regularization networks for sparsification based learning on noisy datasets [0.0]
The hierarchy follows from approximation spaces identified at successively finer scales.
For promoting model generalization at each scale, we also introduce a novel, projection-based penalty operator across multiple dimensions.
Results show the performance of the approach as a data reduction and modeling strategy on both synthetic and real datasets.
arXiv Detail & Related papers (2020-06-09T18:32:24Z) - Spatial Pyramid Based Graph Reasoning for Semantic Segmentation [67.47159595239798]
We apply graph convolution to the semantic segmentation task and propose an improved Laplacian.
The graph reasoning is directly performed in the original feature space organized as a spatial pyramid.
We achieve comparable performance with advantages in computational and memory overhead.
arXiv Detail & Related papers (2020-03-23T12:28:07Z)