Explainable Clustering Beyond Worst-Case Guarantees
- URL: http://arxiv.org/abs/2411.01576v2
- Date: Thu, 07 Aug 2025 14:56:46 GMT
- Title: Explainable Clustering Beyond Worst-Case Guarantees
- Authors: Maximilian Fleissner, Maedeh Zarvandi, Debarghya Ghoshdastidar,
- Abstract summary: We study the explainable clustering problem first posed by Moshkovitz, Dasgupta, Rashtchian, and Frost (ICML 2020). The goal of explainable clustering is to fit an axis-aligned decision tree with $K$ leaves and minimal clustering cost (where every leaf is a cluster).
- Score: 5.65604054654671
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the explainable clustering problem first posed by Moshkovitz, Dasgupta, Rashtchian, and Frost (ICML 2020). The goal of explainable clustering is to fit an axis-aligned decision tree with $K$ leaves and minimal clustering cost (where every leaf is a cluster). The fundamental theoretical question in this line of work is the \textit{price of explainability}, defined as the ratio between the clustering cost of the tree and the optimal cost. Numerous papers have provided worst-case guarantees on this quantity. For $K$-medians, it has recently been shown that the worst-case price of explainability is $\Theta(\log K)$. While this settles the matter from a data-agnostic point of view, two important questions remain unanswered: Are tighter guarantees possible for well-clustered data? And can we trust decision trees to recover underlying cluster structures? In this paper, we place ourselves in a statistical setting of mixture models to answer both questions. We prove that better guarantees are indeed feasible for well-clustered data. Our algorithm takes as input a mixture model and constructs a tree in data-independent time. We then extend our analysis to kernel clustering, deriving new guarantees that significantly improve over existing worst-case bounds.
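For intuition, here is a hedged sketch (not the algorithm analyzed in the paper) of the quantity being bounded: fit an axis-aligned decision tree with $K$ leaves that imitates a reference clustering, then report the ratio of the tree-induced clustering cost to the reference cost. The sketch uses the $k$-means objective and a plain CART surrogate purely for illustration; the paper's guarantees concern $K$-medians, kernel clustering, and mixture models.
```python
# Hedged sketch (not the paper's algorithm): estimate the "price of
# explainability" as the ratio between the cost of a tree-induced clustering
# and the cost of an unrestricted reference clustering.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.tree import DecisionTreeClassifier

K = 5
X, _ = make_blobs(n_samples=2000, centers=K, n_features=4, random_state=0)

# Reference (unrestricted) clustering and its cost.
km = KMeans(n_clusters=K, n_init=10, random_state=0).fit(X)
ref_cost = km.inertia_  # sum of squared distances to the nearest center

# Explainable surrogate: an axis-aligned tree with exactly K leaves that
# imitates the reference labels; every leaf becomes one cluster.
tree = DecisionTreeClassifier(max_leaf_nodes=K, random_state=0).fit(X, km.labels_)
leaf_ids = tree.apply(X)

# Cost of the tree-induced clustering, with each leaf centered at its mean.
tree_cost = sum(
    ((X[leaf_ids == leaf] - X[leaf_ids == leaf].mean(axis=0)) ** 2).sum()
    for leaf in np.unique(leaf_ids)
)

print("empirical price of explainability:", tree_cost / ref_cost)
```
On well-separated mixtures the printed ratio tends to stay close to 1, the data-dependent regime the paper's guarantees address; the worst-case $\Theta(\log K)$ bound for $K$-medians concerns adversarially chosen inputs.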
Related papers
- Near-Optimal Clustering in Mixture of Markov Chains [74.3828414695655]
We study the problem of clustering $T$ trajectories of length $H$, each generated by one of $K$ unknown ergodic Markov chains over a finite state space of size $S$. We derive an instance-dependent, high-probability lower bound on the clustering error rate, governed by the weighted KL divergence between the transition kernels of the chains. We then present a novel two-stage clustering algorithm.
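As a rough illustration of the divergence quantity mentioned above, the sketch below computes state-wise KL divergences between two transition kernels, weighted by the stationary distribution of the first chain; the exact instance-dependent weighting used in the paper may differ.
```python
# Hedged sketch: one natural reading of a "weighted KL divergence between
# transition kernels" is the state-wise KL divergence weighted by the
# stationary distribution of the first chain. Assumes strictly positive
# transition matrices.
import numpy as np

def stationary_distribution(P):
    """Probability vector pi with pi P = pi (left eigenvector for eigenvalue 1)."""
    vals, vecs = np.linalg.eig(P.T)
    pi = np.abs(np.real(vecs[:, np.argmin(np.abs(vals - 1.0))]))
    return pi / pi.sum()

def weighted_kl(P, Q):
    """sum_s pi_P(s) * KL(P(.|s) || Q(.|s)) for transition matrices P, Q."""
    pi = stationary_distribution(P)
    kl_rows = np.sum(P * np.log(P / Q), axis=1)  # KL of each row distribution
    return float(pi @ kl_rows)

P = np.array([[0.9, 0.1], [0.2, 0.8]])
Q = np.array([[0.6, 0.4], [0.5, 0.5]])
print(weighted_kl(P, Q))
```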
arXiv Detail & Related papers (2025-06-02T05:10:40Z)
- Generalization Performance of Ensemble Clustering: From Theory to Algorithm [57.176040163699554]
This paper focuses on generalization error, excess risk and consistency in ensemble clustering. By assigning varying weights to finite clusterings, we minimize the error between the empirical average clusterings and their expectation. We instantiate our theory to develop a new ensemble clustering algorithm.
arXiv Detail & Related papers (2025-06-01T09:34:52Z)
- IsoSEL: Isometric Structural Entropy Learning for Deep Graph Clustering in Hyperbolic Space [57.036143666293334]
Graph clustering is a longstanding topic in machine learning.
In this paper, we study a challenging yet practical problem: deep graph clustering without a predefined cluster number $K$, while accounting for the cluster imbalance found in real data.
We present a novel IsoSEL framework for deep graph clustering, where we design a hyperbolic neural network to learn a partitioning tree in the Lorentz model of hyperbolic space.
arXiv Detail & Related papers (2025-04-14T08:21:41Z)
- Learning Decision Trees as Amortized Structure Inference [59.65621207449269]
We propose a hybrid amortized structure inference approach to learn predictive decision tree ensembles given data.
We show that our approach, DT-GFN, outperforms state-of-the-art decision tree and deep learning methods on standard classification benchmarks.
arXiv Detail & Related papers (2025-03-10T07:05:07Z)
- Identifying General Mechanism Shifts in Linear Causal Representations [58.6238439611389]
We consider the linear causal representation learning setting where we observe a linear mixing of $d$ unknown latent factors.
Recent work has shown that it is possible to recover the latent factors as well as the underlying structural causal model over them.
We provide a surprising identifiability result that it is indeed possible, under some very mild standard assumptions, to identify the set of shifted nodes.
arXiv Detail & Related papers (2024-10-31T15:56:50Z)
- Estimating Causal Effects from Learned Causal Networks [56.14597641617531]
We propose an alternative paradigm for answering causal-effect queries over discrete observable variables.
We learn the causal Bayesian network and its confounding latent variables directly from the observational data.
We show that this model completion learning approach can be more effective than estimand approaches.
arXiv Detail & Related papers (2024-08-26T08:39:09Z)
- A Unified Approach to Extract Interpretable Rules from Tree Ensembles via Integer Programming [2.1408617023874443]
Tree ensembles are very popular machine learning models, known for their effectiveness in supervised classification and regression tasks.
Our work aims to extract an optimized list of rules from a trained tree ensemble, providing the user with a condensed, interpretable model that retains most of the predictive power of the full model.
Our extensive computational experiments offer statistically significant evidence that our method is competitive with other rule extraction methods in terms of predictive performance and fidelity towards the tree ensemble.
arXiv Detail & Related papers (2024-06-30T22:33:47Z)
- Learning accurate and interpretable decision trees [27.203303726977616]
We develop approaches to design decision tree learning algorithms given repeated access to data from the same domain.
We study the sample complexity of tuning prior parameters in Bayesian decision tree learning, and extend our results to decision tree regression.
We also study the interpretability of the learned decision trees and introduce a data-driven approach for optimizing the explainability versus accuracy trade-off using decision trees.
arXiv Detail & Related papers (2024-05-24T20:10:10Z)
- Forecasting with Hyper-Trees [50.72190208487953]
Hyper-Trees are designed to learn the parameters of time series models.
By relating the parameters of a target time series model to features, Hyper-Trees also address the issue of parameter non-stationarity.
In this novel approach, the trees first generate informative representations from the input features, which a shallow network then maps to the target model parameters.
arXiv Detail & Related papers (2024-05-13T15:22:15Z)
- Unboxing Tree Ensembles for interpretability: a hierarchical visualization tool and a multivariate optimal re-built tree [0.34530027457862006]
We develop an interpretable representation of a tree-ensemble model that can provide valuable insights into its behavior.
The proposed model is effective in yielding a shallow interpretable tree approximating the tree-ensemble decision function.
arXiv Detail & Related papers (2023-02-15T10:43:31Z)
- Interpretations Steered Network Pruning via Amortized Inferred Saliency Maps [85.49020931411825]
Compression of Convolutional Neural Networks (CNNs) is crucial for deploying these models on edge devices with limited resources.
We propose to address the channel pruning problem from a novel perspective by leveraging the interpretations of a model to steer the pruning process.
We tackle this challenge by introducing a selector model that predicts real-time smooth saliency masks for pruned models.
arXiv Detail & Related papers (2022-09-07T01:12:11Z)
- How to Find a Good Explanation for Clustering? [7.951746797489421]
Moshkovitz, Dasgupta, Rashtchian, and Frost [ICML 2020] proposed an elegant model of explainable $k$-means and $k$-median clustering.
We study two natural algorithmic questions about explainable clustering.
Our rigorous algorithmic analysis sheds some light on the influence of parameters like the input size, dimension of the data, the number of outliers, the number of clusters, and the approximation ratio, on the computational complexity of explainable clustering.
arXiv Detail & Related papers (2021-12-13T11:48:38Z)
- Active-LATHE: An Active Learning Algorithm for Boosting the Error Exponent for Learning Homogeneous Ising Trees [75.93186954061943]
We design and analyze an algorithm that boosts the error exponent by at least 40% when $\rho$ is at least $0.8$.
Our analysis hinges on judiciously exploiting the minute but detectable statistical variation of the samples to allocate more data to parts of the graph.
arXiv Detail & Related papers (2021-10-27T10:45:21Z)
- A cautionary tale on fitting decision trees to data from additive models: generalization lower bounds [9.546094657606178]
We study the generalization performance of decision trees with respect to different generative regression models.
This allows us to elicit their inductive bias, that is, the assumptions the algorithms make (or do not make) to generalize to new data.
We prove a sharp squared error generalization lower bound for a large class of decision tree algorithms fitted to sparse additive models.
arXiv Detail & Related papers (2021-10-18T21:22:40Z)
- Near-Optimal Explainable $k$-Means for All Dimensions [13.673697350508965]
We show an efficient algorithm that finds an explainable clustering whose $k$-means cost is at most $k^{1-2/d}\,\mathrm{poly}(d\log k)$ times the optimal cost.
We obtain an improved bound of $k^{1-2/d}\,\mathrm{polylog}(k)$, which we show is optimal for every choice of $k, d \ge 2$ up to a poly-logarithmic factor in $k$.
arXiv Detail & Related papers (2021-06-29T16:59:03Z)
- Exact and Approximate Hierarchical Clustering Using A* [51.187990314731344]
We introduce a new approach based on A* search for clustering.
We overcome the prohibitively large search space by combining A* with a novel trellis data structure.
We empirically demonstrate that our method achieves substantially higher quality results than baselines for a particle physics use case and other clustering benchmarks.
arXiv Detail & Related papers (2021-04-14T18:15:27Z)
- Dive into Decision Trees and Forests: A Theoretical Demonstration [0.0]
Decision trees use the strategy of "divide-and-conquer" to divide a complex problem on the dependency between input features and labels into smaller ones.
Recent advances have greatly improved their performance in computational advertising, recommender systems, information retrieval, etc.
arXiv Detail & Related papers (2021-01-20T16:47:59Z)
- On the price of explainability for some clustering problems [1.52292571922932]
We provide upper and lower bounds for a natural model where explainability is achieved via decision trees.
For the $k$-means and $k$-medians problems our upper bounds improve those obtained by [Moshkovitz et al., ICML 2020] for low dimensions.
Another contribution is a simple and efficient algorithm for building explainable clusterings for the $k$-means problem.
arXiv Detail & Related papers (2021-01-05T15:08:25Z)
- Growing Deep Forests Efficiently with Soft Routing and Learned Connectivity [79.83903179393164]
This paper further extends the deep forest idea in several important aspects.
We employ a probabilistic tree whose nodes make probabilistic routing decisions, a.k.a. soft routing, rather than hard binary decisions.
Experiments on the MNIST dataset demonstrate that our empowered deep forests can achieve performance better than or comparable to [1] and [3].
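A minimal sketch of the soft-routing idea follows; the depth, gating function, learned connectivity, and training procedure in the paper differ. Each internal node routes a sample left with probability $\sigma(w^\top x)$, so the prediction is a probability-weighted average over all leaves.
```python
# Hedged sketch of soft routing in a depth-2 probabilistic tree (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
d = 8                                     # input dimension
W = rng.normal(size=(3, d))               # one routing hyperplane per internal node
leaf_values = rng.normal(size=4)          # scalar prediction stored at each leaf

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_tree_predict(x):
    p = sigmoid(W @ x)                    # p[i] = P(route left at internal node i)
    # Reach probabilities for the four leaves (LL, LR, RL, RR).
    leaf_probs = np.array([
        p[0] * p[1], p[0] * (1.0 - p[1]),
        (1.0 - p[0]) * p[2], (1.0 - p[0]) * (1.0 - p[2]),
    ])
    return float(leaf_probs @ leaf_values)  # probability-weighted average of leaves

print(soft_tree_predict(rng.normal(size=d)))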
arXiv Detail & Related papers (2020-12-29T18:05:05Z)
- Handling Missing Data in Decision Trees: A Probabilistic Approach [41.259097100704324]
We tackle the problem of handling missing data in decision trees by taking a probabilistic approach.
We use tractable density estimators to compute the "expected prediction" of our models.
At learning time, we fine-tune parameters of already learned trees by minimizing their "expected prediction loss".
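For intuition, a minimal sketch of an "expected prediction" under a missing feature is given below. Unlike the paper, which uses tractable density estimators to compute this expectation exactly, the sketch only approximates it by Monte-Carlo imputation from the empirical marginal; the function and variable names are illustrative, not from the paper's code.
```python
# Hedged sketch: approximate E[f(x)] over a missing feature by Monte-Carlo
# imputation from the empirical marginal (the paper computes it exactly
# with tractable density estimators).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=3, random_state=0)
tree = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X, y)

def expected_prediction(tree, x, missing_idx, X_train, n_samples=200, seed=0):
    """Approximate E[f(x)] when x[missing_idx] is unobserved."""
    rng = np.random.default_rng(seed)
    draws = rng.choice(X_train[:, missing_idx], size=n_samples)  # empirical marginal
    X_filled = np.tile(x, (n_samples, 1))
    X_filled[:, missing_idx] = draws
    return float(tree.predict(X_filled).mean())

x = X[0].copy()
print("fully observed prediction:", tree.predict(x.reshape(1, -1))[0])
print("expected prediction, feature 1 missing:",
      expected_prediction(tree, x, missing_idx=1, X_train=X))
```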
arXiv Detail & Related papers (2020-06-29T19:54:54Z)
- ExKMC: Expanding Explainable $k$-Means Clustering [19.702066333512548]
We study algorithms for $k$-means clustering, focusing on a trade-off between explainability and accuracy.
We develop a new explainable $k$-means clustering algorithm, ExKMC, that takes an additional parameter $k' \geq k$ and outputs a decision tree with $k'$ leaves.
ExKMC produces a low cost clustering, outperforming both standard decision tree methods and other algorithms for explainable clustering.
arXiv Detail & Related papers (2020-06-03T17:14:55Z)
- Computationally efficient sparse clustering [67.95910835079825]
We provide a finite sample analysis of a new clustering algorithm based on PCA.
We show that it achieves the minimax optimal misclustering rate in the regime $\|\theta\| \to \infty$.
arXiv Detail & Related papers (2020-05-21T17:51:30Z)
- Linguistically Driven Graph Capsule Network for Visual Question Reasoning [153.76012414126643]
We propose a hierarchical compositional reasoning model called the "Linguistically driven Graph Capsule Network".
The compositional process is guided by the linguistic parse tree. Specifically, we bind each capsule in the lowest layer to bridge the linguistic embedding of a single word in the original question with visual evidence.
Experiments on the CLEVR dataset, CLEVR compositional generation test, and FigureQA dataset demonstrate the effectiveness and composition generalization ability of our end-to-end model.
arXiv Detail & Related papers (2020-03-23T03:34:25Z)
- Explainable $k$-Means and $k$-Medians Clustering [25.513261099927163]
We consider using a small decision tree to partition a data set into clusters, so that clusters can be characterized in a straightforward manner.
We show that popular top-down decision tree algorithms may lead to clusterings with arbitrarily large cost.
We design an efficient algorithm that produces explainable clusters using a tree with $k$ leaves.
arXiv Detail & Related papers (2020-02-28T04:21:53Z)