Coarse-Grain Cluster Analysis of Tensors with Application to Climate
Biome Identification
- URL: http://arxiv.org/abs/2001.07827v2
- Date: Fri, 22 May 2020 20:49:14 GMT
- Authors: Derek DeSantis, Phillip J. Wolfram, Katrina Bennett, Boian Alexandrov
- Abstract summary: We use the discrete wavelet transform to analyze the effects of coarse-graining on clustering tensor data.
We are particularly interested in understanding how scale affects clustering of the Earth's climate system.
- Score: 0.27998963147546146
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A tensor provides a concise way to codify the interdependence of complex
data. Treating a tensor as a d-way array, each entry records the interaction
between the different indices. Clustering provides a way to parse the
complexity of the data into more readily understandable information. Clustering
methods are heavily dependent on the algorithm of choice, as well as the chosen
hyperparameters of the algorithm. However, their sensitivity to data scales is
largely unknown.
In this work, we apply the discrete wavelet transform to analyze the effects
of coarse-graining on clustering tensor data. We are particularly interested in
understanding how scale affects clustering of the Earth's climate system. The
discrete wavelet transform allows classification of the Earth's climate across
a multitude of spatial-temporal scales. The discrete wavelet transform is used
to produce an ensemble of classification estimates, as opposed to a single
classification. Information-theoretic approaches are used to identify important
scale lengths in clustering the L15 Climate Dataset. We also discover a
sub-collection of the ensemble that spans the majority of the variance
observed, allowing for efficient consensus clustering techniques that can be
used to identify climate biomes.
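The workflow described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: it assumes a Haar wavelet (keeping only the approximation coefficients), a plain k-means clusterer, and normalized mutual information as the information-theoretic comparison, none of which the abstract specifies. The function names (haar_coarse_grain, kmeans, cluster_at_scale, normalized_mutual_info) are hypothetical, and a small synthetic (lat, lon, time) tensor stands in for the climate data.

```python
import numpy as np

def haar_coarse_grain(x, axis, levels):
    """Keep Haar approximation coefficients along `axis`, `levels` times.
    Assumes the axis length is divisible by 2**levels."""
    x = np.asarray(x, dtype=float)
    for _ in range(levels):
        x = np.moveaxis(x, axis, 0)
        x = (x[0::2] + x[1::2]) / np.sqrt(2.0)  # pairwise Haar low-pass step
        x = np.moveaxis(x, 0, axis)
    return x

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns one integer label per row of `points`."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return labels

def cluster_at_scale(tensor, s, t, k):
    """Coarse-grain space by `s` levels and time by `t` levels, cluster the
    coarse grid cells on their time series, then upsample the label map
    back to the original spatial grid."""
    coarse = haar_coarse_grain(tensor, 0, s)
    coarse = haar_coarse_grain(coarse, 1, s)
    coarse = haar_coarse_grain(coarse, 2, t)
    ny, nx, nt = coarse.shape
    labels = kmeans(coarse.reshape(ny * nx, nt), k).reshape(ny, nx)
    factor = 2 ** s
    return np.repeat(np.repeat(labels, factor, axis=0), factor, axis=1)

def normalized_mutual_info(a, b):
    """NMI of two labelings (invariant to label permutation).
    Returns 0.0 if either labeling is constant."""
    _, ai = np.unique(np.ravel(a), return_inverse=True)
    _, bi = np.unique(np.ravel(b), return_inverse=True)
    joint = np.zeros((ai.max() + 1, bi.max() + 1))
    np.add.at(joint, (ai, bi), 1.0)  # contingency table of label pairs
    joint /= joint.sum()
    pa, pb = joint.sum(axis=1), joint.sum(axis=0)
    nz = joint > 0
    mi = (joint[nz] * np.log(joint[nz] / np.outer(pa, pb)[nz])).sum()
    ha = -(pa[pa > 0] * np.log(pa[pa > 0])).sum()
    hb = -(pb[pb > 0] * np.log(pb[pb > 0])).sum()
    denom = np.sqrt(ha * hb)
    return mi / denom if denom > 0 else 0.0

# Synthetic stand-in data: two spatial regions with different seasonal signals.
rng = np.random.default_rng(0)
ny, nx, nt = 16, 16, 32
time = np.linspace(0.0, 4.0 * np.pi, nt)
tensor = rng.normal(scale=0.3, size=(ny, nx, nt))
tensor[:, : nx // 2, :] += np.sin(time)
tensor[:, nx // 2 :, :] += 2.0 + np.cos(time)

# One classification per (spatial level, temporal level) pair: the ensemble.
scales = [(s, t) for s in range(3) for t in range(3)]
ensemble = {st: cluster_at_scale(tensor, *st, k=2) for st in scales}

# Pairwise agreement between ensemble members across scales.
nmi = np.array([[normalized_mutual_info(ensemble[a], ensemble[b])
                 for b in scales] for a in scales])
print(np.round(nmi, 2))
```

Rows or columns of the agreement matrix that differ sharply from the rest point to scales at which the clustering changes character, which is one simple way to look for important scale lengths. Upsampling the coarse label maps with np.repeat is a convenience so that members clustered at different spatial resolutions can be compared cell by cell.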
Related papers
- FLASC: A Flare-Sensitive Clustering Algorithm [0.0]
We present FLASC, an algorithm that detects branches within clusters to identify subpopulations.
Two variants of the algorithm are presented, which trade computational cost for noise robustness.
We show that both variants scale similarly to HDBSCAN* in terms of computational cost and provide stable outputs.
arXiv Detail & Related papers (2023-11-27T14:55:16Z)
- Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation.
Specifically, we construct distance matrix between data points by Butterworth filter.
To well exploit the complementary information embedded in different views, we leverage the tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z)
- Hard Regularization to Prevent Deep Online Clustering Collapse without Data Augmentation [65.268245109828]
Online deep clustering refers to the joint use of a feature extraction network and a clustering model to assign cluster labels to each new data point or batch as it is processed.
While faster and more versatile than offline methods, online clustering can easily reach the collapsed solution where the encoder maps all inputs to the same point and all are put into a single cluster.
We propose a method that does not require data augmentation, and that, differently from existing methods, regularizes the hard assignments.
arXiv Detail & Related papers (2023-03-29T08:23:26Z)
- An adaptive granularity clustering method based on hyper-ball [11.35322380857363]
Our method is based on the idea that the data with similar distribution form a hyper-ball and the adjacent hyper-balls form a cluster.
Based on the cognitive law of "large scale first", this method can identify clusters without considering shape in a simple and non-parametric way.
arXiv Detail & Related papers (2022-05-29T07:44:09Z)
- Anomaly Clustering: Grouping Images into Coherent Clusters of Anomaly Types [60.45942774425782]
We introduce anomaly clustering, whose goal is to group data into coherent clusters of anomaly types.
This is different from anomaly detection, whose goal is to divide anomalies from normal data.
We present a simple yet effective clustering framework using a patch-based pretrained deep embeddings and off-the-shelf clustering methods.
arXiv Detail & Related papers (2021-12-21T23:11:33Z)
- Unsupervised classification of simulated magnetospheric regions [0.0]
In magnetospheric missions, burst mode data sampling should be triggered in the presence of processes of scientific or operational interest.
We present an unsupervised classification method for magnetospheric regions, that could constitute the first-step of a multi-step method for the automatic identification of magnetospheric processes of interest.
arXiv Detail & Related papers (2021-09-10T14:57:32Z)
- Mitigating Generation Shifts for Generalized Zero-Shot Learning [52.98182124310114]
Generalized Zero-Shot Learning (GZSL) is the task of leveraging semantic information (e.g., attributes) to recognize the seen and unseen samples, where unseen classes are not observable during training.
We propose a novel Generation Shifts Mitigating Flow framework for learning unseen data synthesis efficiently and effectively.
Experimental results demonstrate that GSMFlow achieves state-of-the-art recognition performance in both conventional and generalized zero-shot settings.
arXiv Detail & Related papers (2021-07-07T11:43:59Z)
- Swarm Intelligence for Self-Organized Clustering [6.85316573653194]
A swarm system called Databionic swarm (DBS) is introduced which is able to adapt itself to structures of high-dimensional data.
By exploiting the interrelations of swarm intelligence, self-organization and emergence, DBS serves as an alternative approach to the optimization of a global objective function in the task of clustering.
arXiv Detail & Related papers (2021-06-10T06:21:48Z)
- Fuzzy clustering algorithms with distance metric learning and entropy regularization [0.0]
This paper proposes fuzzy clustering algorithms based on Euclidean, City-block and Mahalanobis distances and entropy regularization.
Several experiments on synthetic and real datasets, including its application to noisy image texture segmentation, demonstrate the usefulness of these adaptive clustering methods.
arXiv Detail & Related papers (2021-02-18T18:19:04Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- Progressive Cluster Purification for Unsupervised Feature Learning [48.87365358296371]
In unsupervised feature learning, sample specificity based methods ignore the inter-class information.
We propose a novel clustering based method, which excludes class inconsistent samples during progressive cluster formation.
Our approach, referred to as Progressive Cluster Purification (PCP), implements progressive clustering by gradually reducing the number of clusters during training.
arXiv Detail & Related papers (2020-07-06T08:11:03Z)