Efficient Sparsification of Simplicial Complexes via Local Densities of States
- URL: http://arxiv.org/abs/2502.07558v1
- Date: Tue, 11 Feb 2025 13:51:42 GMT
- Title: Efficient Sparsification of Simplicial Complexes via Local Densities of States
- Authors: Anton Savostianov, Michael T. Schaub, Nicola Guglielmi, Francesco Tudisco
- Abstract summary: Simplicial complexes (SCs) are generalizations of graph models for relational data that account for higher-order relations between data items. The analysis of many real-world datasets leads to dense SCs with a large number of higher-order interactions. We develop a novel method for the probabilistic sparsification of SCs.
- Score: 8.830922974884531
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Simplicial complexes (SCs), a generalization of graph models for relational data that account for higher-order relations between data items, have become a popular abstraction for analyzing complex data using tools from topological data analysis or topological signal processing. However, the analysis of many real-world datasets leads to dense SCs with a large number of higher-order interactions. Unfortunately, analyzing such large SCs often has a prohibitive cost in terms of computation time and memory consumption. The sparsification of such complexes, i.e., the approximation of an original SC with a sparser simplicial complex with only a log-linear number of high-order simplices while maintaining a spectrum close to the original SC, is of broad interest. In this work, we develop a novel method for the probabilistic sparsification of SCs. At its core lies the efficient computation of the sparsifying sampling probability through local densities of states as functional descriptors of the spectral information. To avoid pathological structures in the spectrum of the corresponding Hodge Laplacian operators, we suggest a "kernel-ignoring" decomposition for approximating the sampling probability; additionally, we exploit error estimates to show asymptotically prevailing algorithmic complexity of the developed method. The performance of the framework is demonstrated on the family of Vietoris-Rips filtered simplicial complexes.
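To make the idea of probabilistic sparsification concrete, here is a minimal sketch in Python. It is not the authors' method: it assumes the sampling probability is proportional to the leverage score (generalized effective resistance) of each triangle with respect to the up-Laplacian $L = B_2 W B_2^\top$, a common choice in spectral sparsification, and it computes those scores exactly with a dense pseudoinverse. That exact computation is the expensive step the abstract proposes to approximate through local densities of states. The function names `boundary_B2`, `sampling_probabilities`, and `sparsify` are hypothetical.

```python
import itertools
import numpy as np

def boundary_B2(n):
    """Edge-by-triangle boundary matrix of the full 2-skeleton on n vertices."""
    edges = list(itertools.combinations(range(n), 2))
    tris = list(itertools.combinations(range(n), 3))
    idx = {e: i for i, e in enumerate(edges)}
    B2 = np.zeros((len(edges), len(tris)))
    for j, (a, b, c) in enumerate(tris):
        # Boundary of [a, b, c] is [b, c] - [a, c] + [a, b].
        B2[idx[(b, c)], j] = 1.0
        B2[idx[(a, c)], j] = -1.0
        B2[idx[(a, b)], j] = 1.0
    return B2

def sampling_probabilities(B2, w):
    """Exact leverage scores w_t * b_t^T L^+ b_t and the normalized probabilities.

    Assumption: leverage-score sampling; the dense pseudoinverse is used only
    for illustration and does not scale to large complexes.
    """
    L = B2 @ np.diag(w) @ B2.T                      # weighted up-Laplacian
    L_pinv = np.linalg.pinv(L, rcond=1e-10, hermitian=True)
    lev = w * np.einsum("ij,ik,kj->j", B2, L_pinv, B2)
    return lev / lev.sum(), lev

def sparsify(B2, w, q, rng):
    """Draw q triangles with replacement and reweight so that E[w_new] = w."""
    p, _ = sampling_probabilities(B2, w)
    counts = rng.multinomial(q, p)                  # how often each triangle is drawn
    return counts * w / (q * p)                     # unsampled triangles get weight 0
```

The leverage scores sum to the rank of $B_2$, so the number of samples `q` needed for a good spectral approximation is tied to that rank rather than to the total number of triangles. The sparsified operator is `B2 @ np.diag(w_new) @ B2.T`, which equals the original up-Laplacian in expectation.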
Related papers
- Text Anomaly Detection with Simplified Isolation Kernel [58.13924648777626]
Two-step approaches combine pre-trained large language model embeddings and anomaly detectors. High-dimensional dense embeddings extracted by large language models pose challenges due to substantial memory requirements and high computation time. We introduce the Simplified Isolation Kernel (SIK), which maps high-dimensional dense embeddings to lower-dimensional sparse representations.
arXiv Detail & Related papers (2025-10-15T06:35:54Z) - Frequency-domain alignment of heterogeneous, multidimensional separations data through complex orthogonal Procrustes analysis [0.0]
Multidimensional separations data have the capacity to reveal detailed information about complex biological samples.
Data analysis has been an ongoing challenge in the area since the peaks that represent chemical factors may drift over the course of several analytical runs.
This work offers a very simple solution to the alignment problem through a Procrustes analysis of the frequency-domain representation of synthetic multidimensional separations data.
arXiv Detail & Related papers (2025-02-18T12:14:14Z) - Parallel Simulation for Log-concave Sampling and Score-based Diffusion Models [55.07411490538404]
We propose a novel parallel sampling method that improves adaptive complexity dependence on dimension $d$. Our approach builds on parallel simulation techniques from scientific computing.
arXiv Detail & Related papers (2024-12-10T11:50:46Z) - Discovering physical laws with parallel combinatorial tree search [57.05912962368898]
Symbolic regression plays a crucial role in scientific research thanks to its capability of discovering concise and interpretable mathematical expressions from data.
Existing algorithms have faced a critical bottleneck of accuracy and efficiency for over a decade.
We introduce a parallel combinatorial tree search (PCTS) model to efficiently distill generic mathematical expressions from limited data.
arXiv Detail & Related papers (2024-07-05T10:41:15Z) - Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) framework, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z) - Weighted Riesz Particles [0.0]
We consider the target distribution as a mapping where the infinite-dimensional space of the parameters consists of a number of deterministic submanifolds.
We study the properties of the point, called Riesz, and embed it into sequential MCMC.
We find that there will be higher acceptance rates with fewer evaluations.
arXiv Detail & Related papers (2023-12-01T14:36:46Z) - Generalized Simplicial Attention Neural Networks [22.171364354867723]
We introduce Generalized Simplicial Attention Neural Networks (GSANs).
GSANs process data living on simplicial complexes using masked self-attentional layers.
These schemes learn how to combine data associated with neighbor simplices of consecutive order in a task-oriented fashion.
arXiv Detail & Related papers (2023-09-05T11:29:25Z) - Learning Unnormalized Statistical Models via Compositional Optimization [73.30514599338407]
Noise-contrastive estimation (NCE) has been proposed by formulating the objective as the logistic loss of the real data and the artificial noise.
In this paper, we study a direct approach for optimizing the negative log-likelihood of unnormalized models.
arXiv Detail & Related papers (2023-06-13T01:18:16Z) - Subspace clustering in high-dimensions: Phase transitions & Statistical-to-Computational gap [24.073221004661427]
A simple model to study subspace clustering is the high-dimensional $k$-Gaussian mixture model.
We provide an exact characterization of the statistically optimal reconstruction error in this model in the high-dimensional regime with extensive sparsity.
arXiv Detail & Related papers (2022-05-26T17:47:35Z) - Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency [111.83670279016599]
We study reinforcement learning for partially observed decision processes (POMDPs) with infinite observation and state spaces.
We make the first attempt at partial observability and function approximation for a class of POMDPs with a linear structure.
arXiv Detail & Related papers (2022-04-20T21:15:38Z) - Dist2Cycle: A Simplicial Neural Network for Homology Localization [66.15805004725809]
Simplicial complexes can be viewed as high dimensional generalizations of graphs that explicitly encode multi-way ordered relations.
We propose a graph convolutional model for learning functions parametrized by the $k$-homological features of simplicial complexes.
arXiv Detail & Related papers (2021-10-28T14:59:41Z) - Information-Theoretic Generalization Bounds for Iterative Semi-Supervised Learning [81.1071978288003]
In particular, we seek to understand the behaviour of the generalization error of iterative SSL algorithms using information-theoretic principles.
Our theoretical results suggest that when the class conditional variances are not too large, the upper bound on the generalization error decreases monotonically with the number of iterations, but quickly saturates.
arXiv Detail & Related papers (2021-10-03T05:38:49Z) - Self-paced Principal Component Analysis [17.333976289539457]
We propose a novel method called Self-paced PCA (SPCA) to further reduce the effect of noise and outliers.
The complexity of each sample is calculated at the beginning of each iteration in order to integrate samples from simple to more complex into training.
arXiv Detail & Related papers (2021-06-25T20:50:45Z) - From observations to complexity of quantum states via unsupervised learning [0.0]
We use unsupervised learning with autoencoder neural networks to detect the local complexity of time-evolved states.
Our approach is an ideal diagnostics tool for data obtained from (noisy) quantum simulators because it requires only practically accessible local observations.
arXiv Detail & Related papers (2021-02-22T19:44:55Z) - Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z) - Revisiting the Sample Complexity of Sparse Spectrum Approximation of Gaussian Processes [60.479499225746295]
We introduce a new scalable approximation for Gaussian processes with provable guarantees which hold simultaneously over its entire parameter space.
Our approximation is obtained from an improved sample complexity analysis for sparse spectrum Gaussian processes (SSGPs).
arXiv Detail & Related papers (2020-11-17T05:41:50Z) - Sinkhorn Natural Gradient for Generative Models [125.89871274202439]
We propose a novel Sinkhorn Natural Gradient (SiNG) algorithm which acts as a steepest descent method on the probability space endowed with the Sinkhorn divergence.
We show that the Sinkhorn information matrix (SIM), a key component of SiNG, has an explicit expression and can be evaluated accurately in complexity that scales logarithmically.
In our experiments, we quantitatively compare SiNG with state-of-the-art SGD-type solvers on generative tasks to demonstrate the efficiency and efficacy of our method.
arXiv Detail & Related papers (2020-11-09T02:51:17Z) - Investigating the Scalability and Biological Plausibility of the Activation Relaxation Algorithm [62.997667081978825]
The Activation Relaxation (AR) algorithm provides a simple and robust approach for approximating the backpropagation of error algorithm.
We show that the algorithm can be further simplified and made more biologically plausible by introducing a learnable set of backwards weights.
We also investigate whether another biologically implausible assumption of the original AR algorithm -- the frozen feedforward pass -- can be relaxed without damaging performance.
arXiv Detail & Related papers (2020-10-13T08:02:38Z) - Sparse Generalized Canonical Correlation Analysis: Distributed Alternating Iteration based Approach [18.93565942407577]
Sparse canonical correlation analysis (CCA) is a useful statistical tool to detect latent information with sparse structures.
We propose a generalized canonical correlation analysis (GCCA), which could detect the latent relations of multiview data with sparse structures.
arXiv Detail & Related papers (2020-04-23T05:53:48Z)