Clustering with Fast, Automated and Reproducible assessment applied to
longitudinal neural tracking
- URL: http://arxiv.org/abs/2003.08533v1
- Date: Thu, 19 Mar 2020 01:33:00 GMT
- Title: Clustering with Fast, Automated and Reproducible assessment applied to
longitudinal neural tracking
- Authors: Hanlin Zhu, Xue Li, Liuyang Sun, Fei He, Zhengtuo Zhao, Lan Luan, Ngoc
Mai Tran and Chong Xie
- Abstract summary: C-FAR is a novel method for Fast, Automated and Reproducible assessment of multiple hierarchical clustering algorithms simultaneously.
Our algorithm takes any number of hierarchical clustering trees as input, then strategically queries pairs for human feedback, and outputs an optimal clustering among those nominated by these trees.
Our flagship application is the cluster aggregation step in spike-sorting, the task of assigning waveforms (spikes) in recordings to neurons.
- Score: 3.817161834189992
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Across many areas, from neural tracking to database entity resolution, manual
assessment of clusters by human experts presents a bottleneck in rapid
development of scalable and specialized clustering methods. To solve this
problem we develop C-FAR, a novel method for Fast, Automated and Reproducible
assessment of multiple hierarchical clustering algorithms simultaneously. Our
algorithm takes any number of hierarchical clustering trees as input, then
strategically queries pairs for human feedback, and outputs an optimal
clustering among those nominated by these trees. While it is applicable to
large datasets in any domain that utilizes pairwise comparisons for assessment,
our flagship application is the cluster aggregation step in spike-sorting, the
task of assigning waveforms (spikes) in recordings to neurons. On simulated
data of 96 neurons under adverse conditions, including drifting and 25%
blackout, our algorithm produces near-perfect tracking relative to the ground
truth. Our runtime scales linearly in the number of input trees, making it a
competitive computational tool. These results indicate that C-FAR is highly
suitable as a model selection and assessment tool in clustering tasks.
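The selection idea described in the abstract (several hierarchical trees in, pairwise human feedback, one clustering out) can be illustrated with a minimal sketch. This is not the C-FAR algorithm itself: the abstract does not specify the query strategy or the optimality criterion, so the sketch assumes uniform random pair sampling and a simple agreement count. The names `cuts_from_merges`, `select_clustering`, the `oracle` callback, and the merge-list input format are all hypothetical.

```python
import itertools
import random


def cuts_from_merges(n, merges):
    """Yield every flat clustering along one hierarchical tree.

    `merges` is an ordered list of (i, j) point-index pairs whose clusters
    are merged at each step (a hypothetical input format for this sketch).
    """
    labels = list(range(n))  # start with every point in its own cluster
    yield tuple(labels)
    for a, b in merges:
        la, lb = labels[a], labels[b]
        if la != lb:
            # relabel the cluster of b so that it joins the cluster of a
            labels = [la if lab == lb else lab for lab in labels]
        yield tuple(labels)


def select_clustering(n, trees, oracle, n_queries, seed=0):
    """Pick the tree cut that agrees most with pairwise feedback.

    `oracle(i, j)` stands in for the human expert and returns True when
    points i and j belong together. Pairs are sampled uniformly at random,
    which is an assumption; the paper queries pairs strategically.
    """
    rng = random.Random(seed)
    all_pairs = list(itertools.combinations(range(n), 2))
    queried = rng.sample(all_pairs, min(n_queries, len(all_pairs)))
    answers = {pair: oracle(*pair) for pair in queried}

    best, best_score = None, -1
    for merges in trees:  # one pass per input tree
        for labels in cuts_from_merges(n, merges):
            score = sum(
                (labels[i] == labels[j]) == same
                for (i, j), same in answers.items()
            )
            if score > best_score:
                best, best_score = labels, score
    return best, best_score
```

Each queried pair is answered once and then reused to score every candidate cut from every tree, so the expert's effort does not grow with the number of trees being compared.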
Related papers
- Minimally Supervised Learning using Topological Projections in
Self-Organizing Maps [55.31182147885694]
We introduce a semi-supervised learning approach based on topological projections in self-organizing maps (SOMs).
Our proposed method first trains SOMs on unlabeled data, then assigns a minimal number of available labeled data points to key best matching units (BMUs).
Our results indicate that the proposed minimally supervised model significantly outperforms traditional regression techniques.
arXiv Detail & Related papers (2024-01-12T22:51:48Z)
- A Machine Learning-Based Framework for Clustering Residential
Electricity Load Profiles to Enhance Demand Response Programs [0.0]
We present a novel machine learning based framework in order to achieve optimal load profiling through a real case study.
arXiv Detail & Related papers (2023-10-31T11:23:26Z)
- Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs a local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships.
A second strategy leverages a novel Re-Ranking technique, which has a lower upper bound on time complexity and reduces the memory complexity from O(n^2) to O(kn) with k ≪ n.
arXiv Detail & Related papers (2023-07-26T16:19:19Z)
- Hybridization of K-means with improved firefly algorithm for automatic
clustering in high dimension [0.0]
We have implemented the Silhouette and Elbow methods with PCA to find an optimal number of clusters.
In the Firefly algorithm, the entire population is automatically subdivided into sub-populations that decrease the convergence rate speed and trapping to local minima.
Our study proposed an enhanced firefly, i.e., a hybridized K-means with an ODFA model for automatic clustering.
arXiv Detail & Related papers (2023-02-09T18:43:10Z)
- Fast conformational clustering of extensive molecular dynamics
simulation data [19.444636864515726]
We present an unsupervised data processing workflow that is specifically designed to obtain a fast conformational clustering of long trajectories.
We combine two dimensionality reduction algorithms (cc_analysis and encodermap) with a density-based spatial clustering algorithm (HDBSCAN).
With the help of four test systems we illustrate the capability and performance of this clustering workflow.
arXiv Detail & Related papers (2023-01-11T14:36:43Z)
- Learning Hierarchical Graph Neural Networks for Image Clustering [81.5841862489509]
We propose a hierarchical graph neural network (GNN) model that learns how to cluster a set of images into an unknown number of identities.
Our hierarchical GNN uses a novel approach to merge connected components predicted at each level of the hierarchy to form a new graph at the next level.
arXiv Detail & Related papers (2021-07-03T01:28:42Z)
- Variational Auto Encoder Gradient Clustering [0.0]
Clustering using deep neural network models has been extensively studied in recent years.
This article investigates how probability function gradient ascent can be used to process data in order to achieve better clustering.
We propose a simple yet effective method for investigating a suitable number of clusters for data, based on the DBSCAN clustering algorithm.
arXiv Detail & Related papers (2021-05-11T08:00:36Z)
- DAC: Deep Autoencoder-based Clustering, a General Deep Learning
Framework of Representation Learning [0.0]
We propose DAC, Deep Autoencoder-based Clustering, a data-driven framework to learn clustering representations using deep neural networks.
Experiment results show that our approach could effectively boost performance of the KMeans clustering algorithm on a variety of datasets.
arXiv Detail & Related papers (2021-02-15T11:31:00Z)
- (k, l)-Medians Clustering of Trajectories Using Continuous Dynamic Time
Warping [57.316437798033974]
In this work we consider the problem of center-based clustering of trajectories.
We propose the use of a continuous version of DTW as a distance measure, which we call continuous dynamic time warping (CDTW).
We show a practical way to compute a center from a set of trajectories and subsequently iteratively improve it.
arXiv Detail & Related papers (2020-12-01T13:17:27Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- DC-NAS: Divide-and-Conquer Neural Architecture Search [108.57785531758076]
We present a divide-and-conquer (DC) approach to effectively and efficiently search deep neural architectures.
We achieve a 75.1% top-1 accuracy on the ImageNet dataset, which is higher than that of state-of-the-art methods using the same search space.
arXiv Detail & Related papers (2020-05-29T09:02:16Z)