MCbiF: Measuring Topological Autocorrelation in Multiscale Clusterings via 2-Parameter Persistent Homology
- URL: http://arxiv.org/abs/2510.14710v1
- Date: Thu, 16 Oct 2025 14:11:12 GMT
- Title: MCbiF: Measuring Topological Autocorrelation in Multiscale Clusterings via 2-Parameter Persistent Homology
- Authors: Juni Schindler, Mauricio Barahona
- Abstract summary: We define the Multiscale Clustering Bifiltration (MCbiF) as a filtration of abstract simplicial complexes that encodes cluster intersection patterns across scales. We show that the multiparameter persistent homology (MPH) of the MCbiF yields a finitely presented and block decomposable module. We demonstrate through experiments the use of MCbiF Hilbert functions as topological feature maps for downstream machine learning tasks.
- Score: 1.5813217907813781
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Datasets often possess an intrinsic multiscale structure with meaningful descriptions at different levels of coarseness. Such datasets are naturally described as multi-resolution clusterings, i.e., not necessarily hierarchical sequences of partitions across scales. To analyse and compare such sequences, we use tools from topological data analysis and define the Multiscale Clustering Bifiltration (MCbiF), a 2-parameter filtration of abstract simplicial complexes that encodes cluster intersection patterns across scales. The MCbiF can be interpreted as a higher-order extension of Sankey diagrams and reduces to a dendrogram for hierarchical sequences. We show that the multiparameter persistent homology (MPH) of the MCbiF yields a finitely presented and block decomposable module, and its stable Hilbert functions characterise the topological autocorrelation of the sequence of partitions. In particular, at dimension zero, the MPH captures violations of the refinement order of partitions, whereas at dimension one, the MPH captures higher-order inconsistencies between clusters across scales. We demonstrate through experiments the use of MCbiF Hilbert functions as topological feature maps for downstream machine learning tasks. MCbiF feature maps outperform information-based baseline features on both regression and classification tasks on synthetic sets of non-hierarchical sequences of partitions. We also show an application of MCbiF to real-world data to measure non-hierarchies in wild mice social grouping patterns across time.
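The dimension-zero statement in the abstract can be illustrated with a small sketch. It assumes (this is not spelled out in the abstract) that the complex at a scale interval (s, t) is the union of one solid simplex per cluster over all partitions from scale s to scale t, so that its connected components can be found by merging clusters with union-find. The names `refines`, `count_components`, `hilbert_dim0` and the toy `sequence` are hypothetical and chosen only for illustration.

```python
def refines(fine, coarse):
    """True if every cluster of `fine` lies inside a single cluster of `coarse`."""
    return all(any(block <= other for other in coarse) for block in fine)


def count_components(partitions, n_points):
    """Connected components after merging all clusters of the given partitions."""
    parent = list(range(n_points))

    def find(x):
        # Path-halving union-find lookup.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for partition in partitions:
        for block in partition:
            members = sorted(block)
            for other in members[1:]:
                parent[find(other)] = find(members[0])
    return len({find(x) for x in range(n_points)})


def hilbert_dim0(sequence, n_points):
    """Assumed dimension-zero Hilbert function: components for each interval s <= t."""
    return {
        (s, t): count_components(sequence[s:t + 1], n_points)
        for s in range(len(sequence))
        for t in range(s, len(sequence))
    }


# Toy sequence of partitions of {0, 1, 2, 3}; scales 1 and 2 are not nested.
sequence = [
    [{0}, {1}, {2}, {3}],
    [{0, 1}, {2}, {3}],
    [{0}, {1, 2}, {3}],
    [{0, 1, 2, 3}],
]
H = hilbert_dim0(sequence, n_points=4)
print(refines(sequence[1], sequence[2]))  # False: refinement order is violated
print(H[(1, 1)], H[(2, 2)], H[(1, 2)])    # 3 3 2
```

In this toy example both partitions at scales 1 and 2 have three clusters, yet merging them leaves only two components. Under the stated assumption, a component count below the cluster count of either partition signals that the pair is not ordered by refinement, which is the kind of dimension-zero information the abstract attributes to the MPH.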
Related papers
- MS-ISSM: Objective Quality Assessment of Point Clouds Using Multi-scale Implicit Structural Similarity [65.85858856481131]
The unstructured and irregular nature of point clouds poses a significant challenge for objective quality assessment (PCQA). We propose the Multi-scale Implicit Structural Similarity Measurement (MS-ISSM).
arXiv Detail & Related papers (2026-01-03T14:58:52Z)
- Learning Discrete Bayesian Networks with Hierarchical Dirichlet Shrinkage [52.914168158222765]
We detail a comprehensive Bayesian framework for learning DBNs. We give a novel Markov chain Monte Carlo (MCMC) algorithm utilizing parallel Langevin proposals to generate exact posterior samples. We apply our methodology to uncover prognostic network structure from primary breast cancer samples.
arXiv Detail & Related papers (2025-09-16T17:24:35Z)
- Topology-Driven Clustering: Enhancing Performance with Betti Number Filtration [14.904264782690639]
Clustering complex datasets containing intertwined shapes poses significant challenges. We introduce the concept of Betti sequences to flexibly capture essential features of the topological structures. Our proposed algorithm is adept at clustering complex, intertwined shapes contained in the datasets.
arXiv Detail & Related papers (2025-05-07T11:46:02Z)
- Classification of Firn Data via Topological Features [2.3592914313389253]
We evaluate the performance of topological features for generalizable and robust classification of firn image data. Firn refers to layers of granular snow within glaciers that haven't been compressed into ice.
arXiv Detail & Related papers (2025-04-22T14:33:33Z)
- Hallucination Detection in LLMs with Topological Divergence on Attention Graphs [60.83579255387347]
Hallucination, i.e., generating factually incorrect content, remains a critical challenge for large language models. We introduce TOHA, a TOpology-based HAllucination detector in the RAG setting.
arXiv Detail & Related papers (2025-04-14T10:06:27Z)
- CoHiRF: A Scalable and Interpretable Clustering Framework for High-Dimensional Data [0.30723404270319693]
We propose Consensus Hierarchical Random Feature (CoHiRF), a novel clustering method designed to address challenges effectively. CoHiRF leverages random feature selection to mitigate noise and dimensionality effects, repeatedly applies K-Means clustering in reduced feature spaces, and combines results through a unanimous consensus criterion. CoHiRF is computationally efficient with a running time comparable to K-Means, scalable to massive datasets, and exhibits robust performance against state-of-the-art methods such as SC-SRGF, HDBSCAN, and OPTICS.
arXiv Detail & Related papers (2025-02-01T09:38:44Z)
- Interpetable Target-Feature Aggregation for Multi-Task Learning based on Bias-Variance Analysis [53.38518232934096]
Multi-task learning (MTL) is a powerful machine learning paradigm designed to leverage shared knowledge across tasks to improve generalization and performance.
We propose an MTL approach at the intersection between task clustering and feature transformation based on a two-phase iterative aggregation of targets and features.
In both phases, a key aspect is to preserve the interpretability of the reduced targets and features through the aggregation with the mean, which is motivated by applications to Earth science.
arXiv Detail & Related papers (2024-06-12T08:30:16Z)
- RGM: A Robust Generalizable Matching Model [49.60975442871967]
We propose a deep model for sparse and dense matching, termed RGM (Robust Generalist Matching).
To narrow the gap between synthetic training samples and real-world scenarios, we build a new, large-scale dataset with sparse correspondence ground truth.
We are able to mix up various dense and sparse matching datasets, significantly improving the training diversity.
arXiv Detail & Related papers (2023-10-18T07:30:08Z)
- Analysing Multiscale Clusterings with Persistent Homology [0.8287206589886881]
We introduce the Multiscale Clustering filtration (MCF). MCF encodes arbitrary cluster assignments in a sequence of partitions across scales of increasing coarseness. We show that the zero-dimensional persistent homology of the MCF measures the degree of hierarchy of this sequence.
arXiv Detail & Related papers (2023-05-07T14:10:34Z)
- Enhancing cluster analysis via topological manifold learning [0.3823356975862006]
We show that inferring the topological structure of a dataset before clustering can considerably enhance cluster detection.
We combine the manifold learning method UMAP, for inferring the topological structure, with the density-based clustering method DBSCAN.
arXiv Detail & Related papers (2022-07-01T15:53:39Z)
- Finite-Function-Encoding Quantum States [52.77024349608834]
We introduce finite-function-encoding (FFE) states which encode arbitrary $d$-valued logic functions.
We investigate some of their structural properties.
arXiv Detail & Related papers (2020-12-01T13:53:23Z)