DBSCAN in domains with periodic boundary conditions
- URL: http://arxiv.org/abs/2501.16894v1
- Date: Tue, 28 Jan 2025 12:26:07 GMT
- Title: DBSCAN in domains with periodic boundary conditions
- Authors: Xander M. de Wit, Alessandro Gabbana,
- Abstract summary: We present a method for applying a clustering algorithm to data embedded in a periodic domain based on the DBSCAN algorithm.
We demonstrate the workings of the proposed method using synthetic data in one, two and three dimensions and also apply it to a real-world example involving the clustering of bubbles in a turbulent flow.
- Score: 44.99833362998488
- License:
- Abstract: Many scientific problems involve data that is embedded in a space with periodic boundary conditions. This can for instance be related to an inherent cyclic or rotational symmetry in the data or a spatially extended periodicity. When analyzing such data, well-tailored methods are needed to obtain efficient approaches that obey the periodic boundary conditions of the problem. In this work, we present a method for applying a clustering algorithm to data embedded in a periodic domain based on the DBSCAN algorithm, a widely used unsupervised machine learning method that identifies clusters in data. The proposed method internally leverages the conventional DBSCAN algorithm for domains with open boundaries, such that it remains compatible with all optimized implementations for neighborhood searches in open domains. In this way, it retains the same optimized runtime complexity of $O(N\log N)$. We demonstrate the workings of the proposed method using synthetic data in one, two and three dimensions and also apply it to a real-world example involving the clustering of bubbles in a turbulent flow. The proposed approach is implemented in a ready-to-use Python package that we make publicly available.
Related papers
- Boundary Detection Algorithm Inspired by Locally Linear Embedding [8.259071011958254]
We propose a method for detecting boundary points inspired by the widely used locally linear embedding algorithm.
We implement this method using two nearest neighborhood search schemes: the $epsilon$-radius ball scheme and the $K$-nearest neighbor scheme.
arXiv Detail & Related papers (2024-06-26T16:05:57Z) - Faithful Density-Peaks Clustering via Matrix Computations on MPI Parallelization System [7.594123537718585]
Density peaks clustering (DP) has the ability of detecting clusters of arbitrary shape and clustering non-Euclidean space data.
We present a faithful and parallel DP method that makes use of two types of vector-like distance matrices and an inverse leading-node-finding policy.
Our method is capable of clustering non-Euclidean data such as in community detection, while outperforming the state-of-the-art counterpart methods in accuracy when clustering large Euclidean data.
arXiv Detail & Related papers (2024-06-18T06:05:45Z) - Efficient Trajectory Inference in Wasserstein Space Using Consecutive Averaging [3.8623569699070353]
Trajectory inference deals with the challenge of reconstructing continuous processes from such observations.
We propose methods for B-spline approximation of point clouds through consecutive averaging that is instrinsic to the Wasserstein space.
We rigorously evaluate our method by providing convergence guarantees and testing it on simulated cell data.
arXiv Detail & Related papers (2024-05-30T04:19:20Z) - AcceleratedLiNGAM: Learning Causal DAGs at the speed of GPUs [57.12929098407975]
We show that by efficiently parallelizing existing causal discovery methods, we can scale them to thousands of dimensions.
Specifically, we focus on the causal ordering subprocedure in DirectLiNGAM and implement GPU kernels to accelerate it.
This allows us to apply DirectLiNGAM to causal inference on large-scale gene expression data with genetic interventions yielding competitive results.
arXiv Detail & Related papers (2024-03-06T15:06:11Z) - Circular Clustering with Polar Coordinate Reconstruction [6.598049778463762]
Traditional clustering algorithms are often inadequate due to their limited ability to distinguish differences in the periodic component.
We propose a new analysis framework that utilizes projections onto a cylindrical coordinate system to better represent objects in a polar coordinate system.
Our approach is generally applicable and adaptable and can be incorporated into most state-of-the-art clustering algorithms.
arXiv Detail & Related papers (2023-09-15T20:56:01Z) - Approximating a RUM from Distributions on k-Slates [88.32814292632675]
We find a generalization-time algorithm that finds the RUM that best approximates the given distribution on average.
Our theoretical result can also be made practical: we obtain a that is effective and scales to real-world datasets.
arXiv Detail & Related papers (2023-05-22T17:43:34Z) - Research on Efficient Fuzzy Clustering Method Based on Local Fuzzy
Granular balls [67.33923111887933]
In this paper, the data is fuzzy iterated using granular-balls, and the membership degree of data only considers the two granular-balls where it is located.
The formed fuzzy granular-balls set can use more processing methods in the face of different data scenarios.
arXiv Detail & Related papers (2023-03-07T01:52:55Z) - Scalable Differentially Private Clustering via Hierarchically Separated
Trees [82.69664595378869]
We show that our method computes a solution with cost at most $O(d3/2log n)cdot OPT + O(k d2 log2 n / epsilon2)$, where $epsilon$ is the privacy guarantee.
Although the worst-case guarantee is worse than that of state of the art private clustering methods, the algorithm we propose is practical.
arXiv Detail & Related papers (2022-06-17T09:24:41Z) - Combining Pseudo-Point and State Space Approximations for Sum-Separable
Gaussian Processes [48.64129867897491]
We show that there is a simple and elegant way to combine pseudo-point methods with the state space GP approximation framework to get the best of both worlds.
We demonstrate that the combined approach is more scalable and applicable to a greater range of epidemiology--temporal problems than either method on its own.
arXiv Detail & Related papers (2021-06-18T16:30:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.