Related papers: The Shape of Reasoning: Topological Analysis of Reasoning Traces in Large Language Models

URL: http://arxiv.org/abs/2510.20665v1
Date: Thu, 23 Oct 2025 15:43:43 GMT
Title: The Shape of Reasoning: Topological Analysis of Reasoning Traces in Large Language Models
Authors: Xue Wen Tan, Nathaniel Tan, Galen Lee, Stanley Kok
Abstract summary: We introduce a topological data analysis framework that captures the geometry of reasoning traces and enables label-efficient assessment. We show that a compact, stable set of topological features reliably indicates trace quality, offering a practical signal for future reinforcement learning algorithms.
Score: 2.846561253333858
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Evaluating the quality of reasoning traces from large language models remains understudied, labor-intensive, and unreliable: current practice relies on expert rubrics, manual annotation, and slow pairwise judgments. Automated efforts are dominated by graph-based proxies that quantify structural connectivity but do not clarify what constitutes high-quality reasoning; such abstractions can be overly simplistic for inherently complex processes. We introduce a topological data analysis (TDA)-based evaluation framework that captures the geometry of reasoning traces and enables label-efficient, automated assessment. In our empirical study, topological features yield substantially higher predictive power for assessing reasoning quality than standard graph metrics, suggesting that effective reasoning is better captured by higher-dimensional geometric structures rather than purely relational graphs. We further show that a compact, stable set of topological features reliably indicates trace quality, offering a practical signal for future reinforcement learning algorithms.
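
As a rough illustration of what a topological feature of a trace could look like, the sketch below computes 0-dimensional persistent homology (the distances at which connected components merge) for a small point cloud. It is not the authors' pipeline: the abstract does not say how traces are embedded or which features are used, so the 2-D points, the Euclidean metric, and the names `h0_persistence` and `total_persistence` are all assumptions made for this example.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Death times of connected components in a Vietoris-Rips filtration.

    Every component is born at scale 0. A component dies at the distance
    where it merges into another one, which is the weight of a minimum
    spanning tree edge (Kruskal's algorithm). The single component that
    never dies is omitted, so n points yield n - 1 death times.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise distances, processed from shortest to longest.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for dist, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # two components merge at this scale
            deaths.append(dist)
    return deaths


def total_persistence(deaths):
    """Sum of H0 lifetimes (birth is 0, so each lifetime equals its death)."""
    return sum(deaths)


# Hypothetical 2-D embeddings of four reasoning steps (assumed data).
trace = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
deaths = h0_persistence(trace)
print(deaths, total_persistence(deaths))
```

In this toy input, three steps sit close together and one lies far away, so the last death time is much larger than the others. Higher-dimensional features such as loops would need a dedicated TDA library and are outside this sketch.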

Related papers

On Multi-Step Theorem Prediction via Non-Parametric Structural Priors [50.16583672681106]
In this work, we explore training-free theorem prediction through the lens of in-context learning (ICL). We propose Theorem Precedence Graphs, which encode temporal dependencies from historical solution traces as directed graphs, and impose explicit topological constraints that effectively prune the search space during inference. Experiments on the FormalGeo7k benchmark show that our method achieves 89.29% accuracy, substantially outperforming ICL baselines and matching state-of-the-art supervised models.
arXiv Detail & Related papers (2026-03-05T06:08:50Z)
Unmasking Reasoning Processes: A Process-aware Benchmark for Evaluating Structural Mathematical Reasoning in LLMs [20.82580343824728]
Recent large language models (LLMs) achieve near-saturation accuracy on many established mathematical reasoning benchmarks. This saturation stems from the dominance of template-based computation and shallow arithmetic decomposition. We introduce ReasoningMath-Plus, a benchmark of 150 carefully curated problems explicitly designed to evaluate structural reasoning.
arXiv Detail & Related papers (2026-01-31T07:09:17Z)
Schoenfeld's Anatomy of Mathematical Reasoning by Language Models [56.656180566692946]
We adopt Schoenfeld's Episode Theory as an inductive, intermediate-scale lens and introduce ThinkARM (Anatomy of Reasoning in Models). ThinkARM explicitly abstracts reasoning traces into functional reasoning steps such as Analysis, Explore, Implement, Verify, etc. We show that episode-level representations make reasoning steps explicit, enabling systematic analysis of how reasoning is structured, stabilized, and altered in modern language models.
arXiv Detail & Related papers (2025-12-23T02:44:25Z)
TopInG: Topologically Interpretable Graph Learning via Persistent Rationale Filtration [10.830399323047265]
We propose TopInG: Topologically Interpretable Graph Learning, a novel framework to identify persistent rationale subgraphs. TopInG employs a rationale filtration learning approach to model an autoregressive generation process of rationale subgraphs. Our approach improves upon state-of-the-art methods on both predictive accuracy and interpretation quality.
arXiv Detail & Related papers (2025-10-06T17:59:44Z)
Demystifying Topological Message-Passing with Relational Structures: A Case Study on Oversquashing in Simplicial Message-Passing [11.759008086355914]
Topological deep learning (TDL) has emerged as a powerful tool for modeling higher-order interactions in relational data. We propose a unifying axiomatic framework that bridges graph and topological message-passing.
arXiv Detail & Related papers (2025-06-06T23:31:36Z)
Causality can systematically address the monsters under the bench(marks) [64.36592889550431]
Benchmarks are plagued by various biases, artifacts, or leakage. Models may behave unreliably due to poorly explored failure modes. Causality offers an ideal framework to systematically address these challenges.
arXiv Detail & Related papers (2025-02-07T17:01:37Z)
Topograph: An efficient Graph-Based Framework for Strictly Topology Preserving Image Segmentation [78.54656076915565]
Topological correctness plays a critical role in many image segmentation tasks. Most networks are trained using pixel-wise loss functions, such as Dice, neglecting topological accuracy. We propose a novel, graph-based framework for topologically accurate image segmentation.
arXiv Detail & Related papers (2024-11-05T16:20:14Z)
Is Smoothness the Key to Robustness? A Comparison of Attention and Convolution Models Using a Novel Metric [0.0]
Existing robustness evaluation approaches often lack theoretical generality or rely heavily on empirical assessments. We propose TopoLip, a metric based on layer-wise analysis that bridges topological data analysis and Lipschitz continuity for robustness evaluation.
arXiv Detail & Related papers (2024-10-23T07:44:14Z)
Enhancing Systematic Decompositional Natural Language Inference Using Informal Logic [51.967603572656266]
We introduce a consistent and theoretically grounded approach to annotating decompositional entailment. We find that our new dataset, RDTE, has a substantially higher internal consistency (+9%) than prior decompositional entailment datasets. We also find that training an RDTE-oriented entailment classifier via knowledge distillation and employing it in an entailment tree reasoning engine significantly improves both accuracy and proof quality.
arXiv Detail & Related papers (2024-02-22T18:55:17Z)
Modeling Hierarchical Reasoning Chains by Linking Discourse Units and Key Phrases for Reading Comprehension [80.99865844249106]
We propose a holistic graph network (HGN) which deals with context at both discourse level and word level, as the basis for logical reasoning. Specifically, node-level and type-level relations, which can be interpreted as bridges in the reasoning process, are modeled by a hierarchical interaction mechanism.
arXiv Detail & Related papers (2023-06-21T07:34:27Z)
On the Robustness of Aspect-based Sentiment Analysis: Rethinking Model, Data, and Training [109.9218185711916]
Aspect-based sentiment analysis (ABSA) aims at automatically inferring the specific sentiment polarities toward certain aspects of products or services behind social media texts or reviews. We propose to enhance the ABSA robustness by systematically rethinking the bottlenecks from all possible angles, including model, data, and training.
arXiv Detail & Related papers (2023-04-19T11:07:43Z)
On the Expressivity of Persistent Homology in Graph Learning [13.608942872770855]
Persistent homology, a technique from computational topology, has recently shown strong empirical performance in the context of graph classification. This paper provides a brief introduction to persistent homology in the context of graphs, as well as a theoretical discussion and empirical analysis of its expressivity for graph learning tasks.
arXiv Detail & Related papers (2023-02-20T08:19:19Z)

This list is automatically generated from the titles and abstracts of the papers on this site.