EKM: An exact, polynomial-time algorithm for the $K$-medoids problem
- URL: http://arxiv.org/abs/2405.12237v1
- Date: Thu, 16 May 2024 15:11:37 GMT
- Title: EKM: An exact, polynomial-time algorithm for the $K$-medoids problem
- Authors: Xi He, Max A. Little,
- Abstract summary: We present EKM: a novel algorithm for solving this problem exactly with worst-case $Oleft(NK+1right)$ complexity.
EKM is developed according to recent advances in transformational programming and generation, using formal program steps.
We show that the wall-clock run time of our algorithm matches the worst-case time complexity analysis on synthetic datasets.
- Score: 1.9405875431318445
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The $K$-medoids problem is a challenging combinatorial clustering task, widely used in data analysis applications. While numerous algorithms have been proposed to solve this problem, none of these are able to obtain an exact (globally optimal) solution for the problem in polynomial time. In this paper, we present EKM: a novel algorithm for solving this problem exactly with worst-case $O\left(N^{K+1}\right)$ time complexity. EKM is developed according to recent advances in transformational programming and combinatorial generation, using formal program derivation steps. The derived algorithm is provably correct by construction. We demonstrate the effectiveness of our algorithm by comparing it against various approximate methods on numerous real-world datasets. We show that the wall-clock run time of our algorithm matches the worst-case time complexity analysis on synthetic datasets, clearly outperforming the exponential time complexity of benchmark branch-and-bound based MIP solvers. To our knowledge, this is the first, rigorously-proven polynomial time, practical algorithm for this ubiquitous problem.
Related papers
- Sum-of-Squares inspired Quantum Metaheuristic for Polynomial Optimization with the Hadamard Test and Approximate Amplitude Constraints [76.53316706600717]
Recently proposed quantum algorithm arXiv:2206.14999 is based on semidefinite programming (SDP)
We generalize the SDP-inspired quantum algorithm to sum-of-squares.
Our results show that our algorithm is suitable for large problems and approximate the best known classicals.
arXiv Detail & Related papers (2024-08-14T19:04:13Z) - Moderate Exponential-time Quantum Dynamic Programming Across the Subsets for Scheduling Problems [0.20971479389679337]
Combination of Quantum Minimum Finding and dynamic programming has proved particularly efficient in improving the complexity of NP-hard problems.
In this paper, we provide a bounded-error hybrid algorithm that achieves such an improvement for a broad class of NP-hard single-machine scheduling problems.
Our algorithm reduces the exponential-part complexity compared to the best-known classical algorithm, sometimes at the cost of an additional pseudo-polynomial factor.
arXiv Detail & Related papers (2024-08-11T10:28:49Z) - Combining K-means type algorithms with Hill Climbing for Joint
Stratification and Sample Allocation Designs [0.0]
This is a sample problem in which we search for the optimal stratification from the set of all possible stratifications of basic strata.
evaluating the cost of each solution is expensive.
We compare the above multi-stage combination of algorithms with three recent algorithms and report the solution costs, evaluation times and training times.
arXiv Detail & Related papers (2021-08-18T08:41:58Z) - Power of human-algorithm collaboration in solving combinatorial
optimization problems [0.0]
We show that a class of optimization problems can be solved efficiently in expectation up to a multiplicative factor $epsilon$ where $epsilon$ is arbitrary constant.
While our proposed methods are merely theoretical, they cast new light on how to approach solving these problems that have been usually considered intractable.
arXiv Detail & Related papers (2021-07-25T11:21:59Z) - Dynamic programming by polymorphic semiring algebraic shortcut fusion [1.9405875431318445]
Dynamic programming (DP) is an algorithmic design paradigm for the efficient, exact solution of intractable, problems.
This paper presents a rigorous algebraic formalism for systematically deriving DP algorithms, based on semiring.
arXiv Detail & Related papers (2021-07-05T00:51:02Z) - A Two-stage Framework and Reinforcement Learning-based Optimization
Algorithms for Complex Scheduling Problems [54.61091936472494]
We develop a two-stage framework, in which reinforcement learning (RL) and traditional operations research (OR) algorithms are combined together.
The scheduling problem is solved in two stages, including a finite Markov decision process (MDP) and a mixed-integer programming process, respectively.
Results show that the proposed algorithms could stably and efficiently obtain satisfactory scheduling schemes for agile Earth observation satellite scheduling problems.
arXiv Detail & Related papers (2021-03-10T03:16:12Z) - Linear Bandit Algorithms with Sublinear Time Complexity [67.21046514005029]
We propose to accelerate existing linear bandit algorithms to achieve per-step time complexity sublinear in the number of arms $K$.
We show that our proposed algorithms can achieve $O(K1-alpha(T))$ per-step complexity for some $alpha(T) > 0$ and $widetilde O(stT)$ regret, where $T$ is the time horizon.
arXiv Detail & Related papers (2021-03-03T22:42:15Z) - Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem which estimates signals integer from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal estimation for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z) - Accelerated Message Passing for Entropy-Regularized MAP Inference [89.15658822319928]
Maximum a posteriori (MAP) inference in discrete-valued random fields is a fundamental problem in machine learning.
Due to the difficulty of this problem, linear programming (LP) relaxations are commonly used to derive specialized message passing algorithms.
We present randomized methods for accelerating these algorithms by leveraging techniques that underlie classical accelerated gradient.
arXiv Detail & Related papers (2020-07-01T18:43:32Z) - Learning to Accelerate Heuristic Searching for Large-Scale Maximum
Weighted b-Matching Problems in Online Advertising [51.97494906131859]
Bipartite b-matching is fundamental in algorithm design, and has been widely applied into economic markets, labor markets, etc.
Existing exact and approximate algorithms usually fail in such settings due to either requiring intolerable running time or too much computation resource.
We propose textttNeuSearcher which leverages the knowledge learned from previously instances to solve new problem instances.
arXiv Detail & Related papers (2020-05-09T02:48:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.