Forest Fire Clustering: Cluster-oriented Label Propagation Clustering
and Monte Carlo Verification Inspired by Forest Fire Dynamics
- URL: http://arxiv.org/abs/2103.11802v1
- Date: Mon, 22 Mar 2021 13:02:37 GMT
- Title: Forest Fire Clustering: Cluster-oriented Label Propagation Clustering
and Monte Carlo Verification Inspired by Forest Fire Dynamics
- Authors: Zhanlin Chen, Philip Tuckman, Jing Zhang, Mark Gerstein
- Abstract summary: We introduce a novel method that can not only find robust clusters but also provide a confidence score for the label of each data point.
Specifically, we reformulated label-propagation clustering to model it after forest fire dynamics.
- Score: 4.645676097881571
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Clustering methods group data points together and assign them group-level
labels. However, it has been difficult to evaluate the confidence of the
clustering results. Here, we introduce a novel method that can not only find
robust clusters but also provide a confidence score for the label of each data
point. Specifically, we reformulated label-propagation clustering to model it
after forest fire dynamics. The method has only one parameter - a fire
temperature term describing how easily one label propagates from one node to
the next. By iteratively starting label propagations across a graph, we
can discover the number of clusters in a dataset with minimal prior
assumptions. Further, we can validate our predictions and uncover the posterior
probability distribution of the labels using Monte Carlo simulations. Lastly,
our iterative method is inductive and does not need to be retrained with the
arrival of new data. Here, we describe the method and provide a summary of how
it performs against common clustering benchmarks.
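To make the described procedure concrete, below is a minimal sketch of forest-fire-style label propagation with a Monte Carlo confidence estimate. This is not the authors' implementation: the kNN affinity graph, the ignition rule (a node catches fire when the fraction of its already-burning neighbors exceeds 1 / fire_temp), and the co-assignment-based confidence score are illustrative assumptions, and the function names `forest_fire_cluster` and `monte_carlo_confidence` are hypothetical.

```python
# A minimal sketch of forest-fire-style label propagation with Monte Carlo
# confidence scores. Illustrative only: the kNN graph, ignition rule, and
# confidence definition are assumptions, not the paper's implementation.
import numpy as np
from sklearn.neighbors import kneighbors_graph


def forest_fire_cluster(X, fire_temp=2.0, n_neighbors=10, rng=None):
    """Label every row of X by repeatedly igniting fires on a kNN graph.

    fire_temp is the single tuning knob: a node catches fire when the
    fraction of its graph neighbors that are already burning reaches
    1 / fire_temp, so hotter fires spread more easily.
    """
    rng = np.random.default_rng(rng)
    # Symmetrized binary kNN affinity graph (a kernel could be used instead).
    A = kneighbors_graph(X, n_neighbors=n_neighbors, mode="connectivity")
    A = ((A + A.T) > 0).astype(float).toarray()

    n = A.shape[0]
    labels = np.full(n, -1)          # -1 means "not yet burned"
    threshold = 1.0 / fire_temp
    current_label = 0

    while (labels == -1).any():
        # Ignite a new fire (a new cluster) at a random unlabeled node.
        seed = rng.choice(np.flatnonzero(labels == -1))
        labels[seed] = current_label
        frontier = [seed]
        while frontier:
            burning = labels == current_label
            # Unlabeled nodes adjacent to the current frontier.
            candidates = np.flatnonzero((labels == -1) & (A[frontier].sum(0) > 0))
            frontier = []
            for j in candidates:
                heat = A[j, burning].sum() / A[j].sum()
                if heat >= threshold:
                    labels[j] = current_label
                    frontier.append(j)
        current_label += 1           # the next fire carries a fresh label
    return labels


def monte_carlo_confidence(X, labels, n_trials=30, **kwargs):
    """Per-point confidence: how stable each point's co-cluster membership is
    when the fires are re-seeded at random (a crude stand-in for the paper's
    Monte Carlo verification of the label posteriors)."""
    labels = np.asarray(labels)
    ref_co = labels[:, None] == labels[None, :]
    agree = np.zeros(len(labels))
    for t in range(n_trials):
        trial = forest_fire_cluster(X, rng=t, **kwargs)
        co = trial[:, None] == trial[None, :]
        agree += (co == ref_co).mean(axis=1)
    return agree / n_trials
```

In this sketch, a score near 1 means a point keeps the same co-cluster companions across re-seeded runs, while lower scores flag boundary points whose labels the verification step would treat as uncertain; the inductive property noted in the abstract is not shown here.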
Related papers
- Graph Contrastive Learning via Cluster-refined Negative Sampling for Semi-supervised Text Classification [22.476289610168056]
Graph contrastive learning (GCL) has been widely applied to text classification tasks.
Existing GCL-based text classification methods often suffer from negative sampling bias.
We propose an innovative GCL-based method: graph contrastive learning via cluster-refined negative sampling.
arXiv Detail & Related papers (2024-10-18T16:03:49Z)
- Generalized Category Discovery with Clustering Assignment Consistency [56.92546133591019]
Generalized category discovery (GCD) is a recently proposed open-world task.
We propose a co-training-based framework that encourages clustering consistency.
Our method achieves state-of-the-art performance on three generic benchmarks and three fine-grained visual recognition datasets.
arXiv Detail & Related papers (2023-10-30T00:32:47Z)
- Local Graph Clustering with Noisy Labels [8.142265733890918]
We propose a study of local graph clustering using noisy node labels as a proxy for additional node information.
In this setting, nodes receive initial binary labels based on cluster affiliation: 1 if they belong to the target cluster and 0 otherwise.
We show that reliable node labels can be obtained with just a few samples from an attributed graph.
arXiv Detail & Related papers (2023-10-12T04:37:15Z)
- Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
We propose a new deep graph clustering method termed Reinforcement Graph Clustering.
In our proposed method, cluster number determination and unsupervised representation learning are unified within a single framework.
To provide feedback actions, a clustering-oriented reward function is proposed to enhance cohesion within clusters and separation between different clusters.
arXiv Detail & Related papers (2023-08-13T18:12:28Z)
- Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- Actively Supervised Clustering for Open Relation Extraction [42.114747195195655]
We present a novel setting, named actively supervised clustering for OpenRE.
The key to the setting is selecting which instances to label.
We propose a new strategy that can dynamically discover clusters of unknown relations.
arXiv Detail & Related papers (2023-06-08T06:55:02Z)
- Hard Regularization to Prevent Deep Online Clustering Collapse without Data Augmentation [65.268245109828]
Online deep clustering refers to the joint use of a feature extraction network and a clustering model to assign cluster labels to each new data point or batch as it is processed.
While faster and more versatile than offline methods, online clustering can easily reach the collapsed solution where the encoder maps all inputs to the same point and all are put into a single cluster.
We propose a method that does not require data augmentation and that, unlike existing methods, regularizes the hard assignments.
arXiv Detail & Related papers (2023-03-29T08:23:26Z)
- Out-of-Distribution Detection without Class Labels [29.606812876314386]
Anomaly detection methods identify samples that deviate from the normal behavior of the dataset.
Current methods struggle when faced with training data consisting of multiple classes but no labels.
We first cluster images using self-supervised methods and obtain a cluster label for every image.
We finetune pretrained features on the task of classifying images by their cluster labels.
arXiv Detail & Related papers (2021-12-14T18:58:32Z)
- Self-supervised Contrastive Attributed Graph Clustering [110.52694943592974]
We propose a novel attributed graph clustering network, namely Self-supervised Contrastive Attributed Graph Clustering (SCAGC).
In SCAGC, a self-supervised contrastive loss is designed for node representation learning by leveraging inaccurate clustering labels.
For out-of-sample (OOS) nodes, SCAGC can directly calculate their clustering labels.
arXiv Detail & Related papers (2021-10-15T03:25:28Z)
- Predictive K-means with local models [0.028675177318965035]
Predictive clustering seeks to obtain the best of both worlds.
We present two new algorithms using this technique and show on a variety of data sets that they are competitive in prediction performance.
arXiv Detail & Related papers (2020-12-16T10:49:36Z)
- Structured Graph Learning for Clustering and Semi-supervised Classification [74.35376212789132]
We propose a graph learning framework to preserve both the local and global structure of data.
Our method uses the self-expressiveness of samples to capture the global structure and an adaptive neighbor approach to respect the local structure.
Our model is equivalent to a combination of kernel k-means and k-means methods under certain conditions.
arXiv Detail & Related papers (2020-08-31T08:41:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.