Efficient Cover Construction for Ball Mapper via Accelerated Range Queries
- URL: http://arxiv.org/abs/2601.01405v1
- Date: Sun, 04 Jan 2026 07:04:18 GMT
- Title: Efficient Cover Construction for Ball Mapper via Accelerated Range Queries
- Authors: Jay-Anne Bulauan, John Rick Manzanares,
- Abstract summary: We study strategies for accelerating cover construction in Ball Mapper by improving the efficiency of range queries.<n>We integrate two complementary approaches into the Ball Mapper pipeline: hierarchical geometric pruning using ball tree data structures, and hardware-aware distance computation using Facebook AI Similarity Search.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ball Mapper is an widely used tool in topological data analysis for summarizing the structure of high-dimensional data through metric-based coverings and graph representations. A central computational bottleneck in Ball Mapper is the construction of the underlying cover, which requires repeated range queries to identify data points within a fixed distance of selected landmarks. As data sets grow in size and dimensionality, naive implementations of this step become increasingly inefficient. In this work, we study practical strategies for accelerating cover construction in Ball Mapper by improving the efficiency of range queries. We integrate two complementary approaches into the Ball Mapper pipeline: hierarchical geometric pruning using ball tree data structures, and hardware-aware distance computation using Facebook AI Similarity Search. We describe the underlying algorithms, discuss their trade-offs with respect to metric flexibility and dimensionality, and provide implementation details relevant to large-scale data analysis. Empirical benchmarks demonstrate that both approaches yield substantial speedups over the baseline implementation, with performance gains depending on data set size, dimensionality, and choice of distance function. These results improve the practical scalability of Ball Mapper without modifying its theoretical formulation and provide guidance for the efficient implementation of metric-based exploratory tools in modern data analysis workflows.
Related papers
- Geometric Operator Learning with Optimal Transport [77.16909146519227]
We propose integrating optimal transport (OT) into operator learning for partial differential equations (PDEs) on complex geometries.<n>For 3D simulations focused on surfaces, our OT-based neural operator embeds the surface geometry into a 2D parameterized latent space.<n> Experiments with Reynolds-averaged Navier-Stokes equations (RANS) on the ShapeNet-Car and DrivAerNet-Car datasets show that our method achieves better accuracy and also reduces computational expenses.
arXiv Detail & Related papers (2025-07-26T21:28:25Z) - Graph Vertex Embeddings: Distance, Regularization and Community Detection [0.0]
Graph embeddings have emerged as a powerful tool for representing complex network structures in a low-dimensional space.
We present a family of flexible distance functions that faithfully capture the topological distance between different vertices.
We evaluate the effectiveness of our proposed embedding constructions by performing community detection on a host of benchmark datasets.
arXiv Detail & Related papers (2024-04-09T09:03:53Z) - Piecewise-Linear Manifolds for Deep Metric Learning [8.670873561640903]
Unsupervised deep metric learning focuses on learning a semantic representation space using only unlabeled data.
We propose to model the high-dimensional data manifold using a piecewise-linear approximation, with each low-dimensional linear piece approximating the data manifold in a small neighborhood of a point.
We empirically show that this similarity estimate correlates better with the ground truth than the similarity estimates of current state-of-the-art techniques.
arXiv Detail & Related papers (2024-03-22T06:22:20Z) - Differentiable Mapper For Topological Optimization Of Data
Representation [33.33724208084121]
We build on a recently proposed framework incorporating topology to provide the first filter optimization scheme for Mapper graphs.
We demonstrate the usefulness of our approach by optimizing Mapper graph representations on several datasets.
arXiv Detail & Related papers (2024-02-20T09:33:22Z) - A distribution-guided Mapper algorithm [0.3683202928838613]
We introduce a distribution guided Mapper algorithm named D-Mapper.<n>Our proposed algorithm is a probabilistic model-based approach, which could serve as an alternative to non-prababilistic ones.<n>Our numerical experiments indicate that the D-Mapper outperforms the classical Mapper algorithm in various scenarios.
arXiv Detail & Related papers (2024-01-19T17:07:05Z) - Data-driven abstractions via adaptive refinements and a Kantorovich
metric [extended version] [56.94699829208978]
We introduce an adaptive refinement procedure for smart, and scalable abstraction of dynamical systems.
In order to learn the optimal structure, we define a Kantorovich-inspired metric between Markov chains.
We show that our method yields a much better computational complexity than using classical linear programming techniques.
arXiv Detail & Related papers (2023-03-30T11:26:40Z) - Robust Self-Tuning Data Association for Geo-Referencing Using Lane Markings [44.4879068879732]
This paper presents a complete pipeline for resolving ambiguities during the data association.
Its core is a robust self-tuning data association that adapts the search area depending on the entropy of the measurements.
We evaluate our method on real data from urban and rural scenarios around the city of Karlsruhe in Germany.
arXiv Detail & Related papers (2022-07-28T12:29:39Z) - Shapley-NAS: Discovering Operation Contribution for Neural Architecture
Search [96.20505710087392]
We propose a Shapley value based method to evaluate operation contribution (Shapley-NAS) for neural architecture search.
We show that our method outperforms the state-of-the-art methods by a considerable margin with light search cost.
arXiv Detail & Related papers (2022-06-20T14:41:49Z) - Compactness Score: A Fast Filter Method for Unsupervised Feature
Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named as, Compactness Score (CSUFS) to select desired features.
Our proposed algorithm seems to be more accurate and efficient compared with existing algorithms.
arXiv Detail & Related papers (2022-01-31T13:01:37Z) - A Practical Index Structure Supporting Fr\'echet Proximity Queries Among
Trajectories [1.9335262420787858]
We present a scalable approach for range and $k$ nearest neighbor queries under computationally expensive metrics.
Based on clustering for metric indexes, we obtain a dynamic tree structure whose size is linear in the number of trajectories.
We analyze the efficiency and effectiveness of our methods with extensive experiments on diverse synthetic and real-world data sets.
arXiv Detail & Related papers (2020-05-28T04:12:43Z) - Spatial Pyramid Based Graph Reasoning for Semantic Segmentation [67.47159595239798]
We apply graph convolution into the semantic segmentation task and propose an improved Laplacian.
The graph reasoning is directly performed in the original feature space organized as a spatial pyramid.
We achieve comparable performance with advantages in computational and memory overhead.
arXiv Detail & Related papers (2020-03-23T12:28:07Z) - Augmented Parallel-Pyramid Net for Attention Guided Pose-Estimation [90.28365183660438]
This paper proposes an augmented parallel-pyramid net with attention partial module and differentiable auto-data augmentation.
We define a new pose search space where the sequences of data augmentations are formulated as a trainable and operational CNN component.
Notably, our method achieves the top-1 accuracy on the challenging COCO keypoint benchmark and the state-of-the-art results on the MPII datasets.
arXiv Detail & Related papers (2020-03-17T03:52:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.