A new heuristic algorithm for fast k-segmentation
- URL: http://arxiv.org/abs/2009.05148v1
- Date: Wed, 2 Sep 2020 04:50:17 GMT
- Title: A new heuristic algorithm for fast k-segmentation
- Authors: Sabarish Vadarevu and Vijay Karamcheti
- Abstract summary: Exact and approximate methods for $k$-segmentation exist in the literature.
A novel algorithm is proposed in this paper to improve upon existing methods.
It is empirically found to provide accuracies competitive with exact methods at a fraction of the computational expense.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The $k$-segmentation of a video stream is used to partition it into $k$
piecewise-linear segments, so that each linear segment has a meaningful
interpretation. Such segmentation may be used to summarize large videos using a
small set of images, to identify anomalies within segments and change points
between segments, and to select critical subsets for training machine learning
models. Exact and approximate methods for $k$-segmentation exist in the
literature. Each of these algorithms occupies a different spot in the
trade-off between computational complexity and accuracy. A novel heuristic
algorithm is proposed in this paper to improve upon existing methods. It is
empirically found to provide accuracies competitive with exact methods at a
fraction of the computational expense.
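For concreteness, the objective being minimized can be written out. The abstract does not give it explicitly, so the notation below ($x_i$ for the features of frame $i$, $t_i$ for its timestamp, $b_j$ for segment breakpoints) is our assumption of the standard $L_2$ formulation:

```latex
% L2 k-segmentation: choose breakpoints 0 = b_0 < b_1 < ... < b_k = n so that
% each segment [b_{j-1}, b_j) is well approximated by its own best-fit line.
\min_{0 = b_0 < b_1 < \cdots < b_k = n} \;
\sum_{j=1}^{k} \min_{a_j,\, c_j}
\sum_{i = b_{j-1}}^{b_j - 1} \bigl\| x_i - (a_j t_i + c_j) \bigr\|_2^2
```

Exact methods optimize this objective globally (typically by dynamic programming); the heuristic proposed here trades that guarantee for speed.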
The new algorithm is inspired by Lloyd's algorithm for K-Means and the Lloyd-Max
algorithm for scalar quantization, and is called the LM algorithm for
convenience. It works by iteratively minimizing a cost function from any given
initialisation; the commonly used $L_2$ cost is chosen in this paper. While the
greedy minimization makes the algorithm sensitive to initialisation, the
ability to converge from any initial guess to a local optimum allows the
algorithm to be integrated into other existing algorithms. Three variants of
the algorithm are tested on a large number of synthetic datasets: a standalone
LM implementation and two that combine LM with existing algorithms. One of the
latter two -- LM-enhanced-Bottom-Up segmentation -- is
found to have the best accuracy and the lowest computational complexity among
all algorithms. This variant of LM can provide $k$-segmentations over data sets
with up to a million image frames within several seconds.
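As a concrete illustration of this Lloyd-style iteration, below is a minimal sketch for a scalar signal. It is not the authors' implementation: the equal-length initialisation, the one-sample boundary moves, and all names are our assumptions; the paper only specifies iterative minimization of the $L_2$ cost from an arbitrary initial guess.

```python
import numpy as np

def segment_cost(y, t, lo, hi):
    """L2 cost of the best-fit line over samples lo..hi-1 (hi exclusive)."""
    if hi - lo < 2:
        return 0.0  # a line fits <2 points exactly
    A = np.column_stack([t[lo:hi], np.ones(hi - lo)])
    coef, *_ = np.linalg.lstsq(A, y[lo:hi], rcond=None)
    r = y[lo:hi] - A @ coef
    return float(r @ r)

def lm_k_segmentation(y, k, max_iter=100):
    """Illustrative LM-style refinement: greedily shift each interior
    breakpoint by one sample whenever that lowers the total L2 cost,
    until no move helps (a local optimum)."""
    n = len(y)
    t = np.arange(n, dtype=float)
    bounds = np.linspace(0, n, k + 1).astype(int)  # naive initial guess
    for _ in range(max_iter):
        moved = False
        for j in range(1, k):  # interior breakpoints only
            lo, b, hi = bounds[j - 1], bounds[j], bounds[j + 1]
            best_b = b
            best_c = segment_cost(y, t, lo, b) + segment_cost(y, t, b, hi)
            for cand in (b - 1, b + 1):
                if lo + 1 <= cand <= hi - 1:  # keep segments non-empty
                    c = (segment_cost(y, t, lo, cand)
                         + segment_cost(y, t, cand, hi))
                    if c < best_c:
                        best_b, best_c = cand, c
            if best_b != b:
                bounds[j] = best_b
                moved = True
        if not moved:  # converged to a local optimum
            break
    return bounds
```

For video, the scalar samples would become per-frame feature vectors with a vector-norm residual, and seeding `bounds` with breakpoints produced by Bottom-Up segmentation instead of the equal-length guess would be the natural way to realize the LM-enhanced-Bottom-Up variant described above.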
Related papers
- GreedyML: A Parallel Algorithm for Maximizing Submodular Functions [2.9998889086656586]
We describe a parallel approximation algorithm for maximizing monotone submodular functions on distributed memory multiprocessors.
Our work is motivated by the need to solve submodular optimization problems on massive data sets, for practical applications in areas such as data summarization, machine learning, and graph sparsification.
arXiv Detail & Related papers (2024-03-15T14:19:09Z)
- Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z)
- A One-Sample Decentralized Proximal Algorithm for Non-Convex Stochastic Composite Optimization [10.762749887051546]
We propose two-time-scale algorithms: Prox-DASA and Prox-DASA-GT.
Unlike prior work, our algorithms achieve comparable complexity without requiring large batch sizes, more complex per-iteration operations, or stronger assumptions.
arXiv Detail & Related papers (2023-02-20T05:16:18Z)
- Linearized Wasserstein dimensionality reduction with approximation guarantees [65.16758672591365]
LOT Wassmap is a computationally feasible algorithm to uncover low-dimensional structures in the Wasserstein space.
We show that LOT Wassmap attains correct embeddings and that the quality improves with increased sample size.
We also show how LOT Wassmap significantly reduces the computational cost when compared to algorithms that depend on pairwise distance computations.
arXiv Detail & Related papers (2023-02-14T22:12:16Z)
- Private estimation algorithms for stochastic block models and mixture models [63.07482515700984]
General tools for designing efficient private estimation algorithms.
First efficient $(\epsilon, \delta)$-differentially private algorithm for both weak recovery and exact recovery.
arXiv Detail & Related papers (2023-01-11T09:12:28Z)
- Practical, Provably-Correct Interactive Learning in the Realizable Setting: The Power of True Believers [12.09273192079783]
We consider interactive learning in the realizable setting and develop a general framework to handle problems ranging from best arm identification to active classification.
We design novel computationally efficient algorithms for the realizable setting that match the minimax lower bound up to logarithmic factors.
arXiv Detail & Related papers (2021-11-09T02:33:36Z)
- Provably Faster Algorithms for Bilevel Optimization [54.83583213812667]
Bilevel optimization has been widely applied in many important machine learning applications.
We propose two new algorithms for bilevel optimization.
We show that both algorithms achieve a complexity of $\mathcal{O}(\epsilon^{-1.5})$, which outperforms all existing algorithms by an order of magnitude.
arXiv Detail & Related papers (2021-06-08T21:05:30Z)
- Fast Parallel Algorithms for Euclidean Minimum Spanning Tree and Hierarchical Spatial Clustering [6.4805900740861]
We introduce a new notion of well-separation to reduce the work and space of our algorithm for HDBSCAN$*$.
We show that our algorithms are theoretically efficient: they have work (number of operations) matching their sequential counterparts, and polylogarithmic depth (parallel time).
Our experiments on large real-world and synthetic data sets using a 48-core machine show that our fastest algorithms outperform the best serial algorithms for the problems by 11.13--55.89x, and existing parallel algorithms by at least an order of magnitude.
arXiv Detail & Related papers (2021-04-02T16:05:00Z)
- Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem, which estimates integer signals from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal heuristic for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
- A Systematic Characterization of Sampling Algorithms for Open-ended Language Generation [71.31905141672529]
We study the widely adopted ancestral sampling algorithms for auto-regressive language models.
We identify three key properties that are shared among them: entropy reduction, order preservation, and slope preservation.
We find that the set of sampling algorithms that satisfies these properties performs on par with the existing sampling algorithms.
arXiv Detail & Related papers (2020-09-15T17:28:42Z)
- Learning Sparse Classifiers: Continuous and Mixed Integer Optimization Perspectives [10.291482850329892]
Mixed integer programming (MIP) can be used to solve (to optimality) $\ell_0$-regularized regression problems.
We propose two classes of scalable algorithms: an exact algorithm that can handle $p \approx 50{,}000$ features in a few minutes, and approximate algorithms that can address instances with $p \approx 10^6$.
In addition, we present new estimation error bounds for $\ell_0$-regularized estimators.
arXiv Detail & Related papers (2020-01-17T18:47:02Z)