Related papers: Efficient Sparsification of Simplicial Complexes via Local Densities of States

URL: http://arxiv.org/abs/2502.07558v1
Date: Tue, 11 Feb 2025 13:51:42 GMT
Title: Efficient Sparsification of Simplicial Complexes via Local Densities of States
Authors: Anton Savostianov, Michael T. Schaub, Nicola Guglielmi, Francesco Tudisco
Abstract summary: Simplicial complexes (SCs) are generalizations of graph models for relational data that account for higher-order relations between data items. The analysis of many real-world datasets leads to dense SCs with a large number of higher-order interactions. We develop a novel method for a probabilistic sparsification of SCs.
Score: 8.830922974884531
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Simplicial complexes (SCs), a generalization of graph models for relational data that account for higher-order relations between data items, have become a popular abstraction for analyzing complex data using tools from topological data analysis or topological signal processing. However, the analysis of many real-world datasets leads to dense SCs with a large number of higher-order interactions. Unfortunately, analyzing such large SCs often has a prohibitive cost in terms of computation time and memory consumption. The sparsification of such complexes, i.e., the approximation of an original SC with a sparser simplicial complex with only a log-linear number of high-order simplices while maintaining a spectrum close to the original SC, is of broad interest. In this work, we develop a novel method for a probabilistic sparsification of SCs. At its core lies the efficient computation of sparsifying sampling probability through local densities of states as functional descriptors of the spectral information. To avoid pathological structures in the spectrum of the corresponding Hodge Laplacian operators, we suggest a "kernel-ignoring" decomposition for approximating the sampling probability; additionally, we exploit error estimates to show asymptotically prevailing algorithmic complexity of the developed method. The performance of the framework is demonstrated on the family of Vietoris-Rips filtered simplicial complexes.

Related papers

Frequency-domain alignment of heterogeneous, multidimensional separations data through complex orthogonal Procrustes analysis [0.0]
Multidimensional separations data have the capacity to reveal detailed information about complex biological samples. Data analysis has been an ongoing challenge in the area since the peaks that represent chemical factors may drift over the course of several analytical runs. This work offers a very simple solution to the alignment problem through a Procrustes analysis of the frequency-domain representation of synthetic multidimensional separations data.
arXiv Detail & Related papers (2025-02-18T12:14:14Z)
Discovering physical laws with parallel combinatorial tree search [57.05912962368898]
Symbolic regression plays a crucial role in scientific research thanks to its capability of discovering concise and interpretable mathematical expressions from data. Existing algorithms have faced a critical bottleneck of accuracy and efficiency for over a decade. We introduce a parallel combinatorial tree search (PCTS) model to efficiently distill generic mathematical expressions from limited data.
arXiv Detail & Related papers (2024-07-05T10:41:15Z)
Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure. We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) framework, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z)
Generalized Simplicial Attention Neural Networks [22.171364354867723]
We introduce Generalized Simplicial Attention Neural Networks (GSANs). GSANs process data living on simplicial complexes using masked self-attentional layers. These schemes learn how to combine data associated with neighbor simplices of consecutive order in a task-oriented fashion.
arXiv Detail & Related papers (2023-09-05T11:29:25Z)
Learning Unnormalized Statistical Models via Compositional Optimization [73.30514599338407]
Noise-contrastive estimation (NCE) has been proposed by formulating the objective as the logistic loss of the real data and the artificial noise. In this paper, we study a direct approach for optimizing the negative log-likelihood of unnormalized models.
arXiv Detail & Related papers (2023-06-13T01:18:16Z)
Subspace clustering in high-dimensions: Phase transitions & Statistical-to-Computational gap [24.073221004661427]
A simple model to study subspace clustering is the high-dimensional $k$-Gaussian mixture model. We provide an exact characterization of the statistically optimal reconstruction error in this model in the high-dimensional regime with extensive sparsity.
arXiv Detail & Related papers (2022-05-26T17:47:35Z)
Dist2Cycle: A Simplicial Neural Network for Homology Localization [66.15805004725809]
Simplicial complexes can be viewed as high dimensional generalizations of graphs that explicitly encode multi-way ordered relations. We propose a graph convolutional model for learning functions parametrized by the $k$-homological features of simplicial complexes.
arXiv Detail & Related papers (2021-10-28T14:59:41Z)
Information-Theoretic Generalization Bounds for Iterative Semi-Supervised Learning [81.1071978288003]
In particular, we seek to understand the behaviour of the generalization error of iterative SSL algorithms using information-theoretic principles. Our theoretical results suggest that when the class conditional variances are not too large, the upper bound on the generalization error decreases monotonically with the number of iterations, but quickly saturates.
arXiv Detail & Related papers (2021-10-03T05:38:49Z)
Self-paced Principal Component Analysis [17.333976289539457]
We propose a novel method called Self-paced PCA (SPCA) to further reduce the effect of noise and outliers. The complexity of each sample is calculated at the beginning of each iteration in order to integrate samples from simple to more complex into training.
arXiv Detail & Related papers (2021-06-25T20:50:45Z)
Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_{2,p}$-norm regularization. We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z)
Revisiting the Sample Complexity of Sparse Spectrum Approximation of Gaussian Processes [60.479499225746295]
We introduce a new scalable approximation for Gaussian processes with provable guarantees which hold simultaneously over its entire parameter space. Our approximation is obtained from an improved sample complexity analysis for sparse spectrum Gaussian processes (SSGPs).
arXiv Detail & Related papers (2020-11-17T05:41:50Z)
Sparse Generalized Canonical Correlation Analysis: Distributed Alternating Iteration based Approach [18.93565942407577]
Sparse canonical correlation analysis (CCA) is a useful statistical tool to detect latent information with sparse structures. We propose a generalized canonical correlation analysis (GCCA), which could detect the latent relations of multiview data with sparse structures.
arXiv Detail & Related papers (2020-04-23T05:53:48Z)

This list is automatically generated from the titles and abstracts of the papers on this site.