Certifiably Polynomial Algorithm for Best Group Subset Selection
- URL: http://arxiv.org/abs/2104.12576v1
- Date: Fri, 23 Apr 2021 03:05:11 GMT
- Title: Certifiably Polynomial Algorithm for Best Group Subset Selection
- Authors: Yanhang Zhang, Junxian Zhu, Jin Zhu, Xueqin Wang
- Abstract summary: Best group subset selection aims to choose a small part of non-overlapping groups to achieve the best interpretability on the response variable.
We propose a group-splicing algorithm that iteratively detects effective groups and excludes the helpless ones.
We demonstrate the efficiency and accuracy of our proposal by comparing state-of-the-art algorithms on both synthetic and real-world datasets.
- Score: 0.9667631210393929
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Best group subset selection aims to choose a small part of non-overlapping
groups to achieve the best interpretability on the response variable. It is
practically attractive for group variable selection; however, due to the
computational intractability in high dimensionality setting, it doesn't catch
enough attention. To fill the blank of efficient algorithms for best group
subset selection, in this paper, we propose a group-splicing algorithm that
iteratively detects effective groups and excludes the helpless ones. Moreover,
coupled with a novel Bayesian group information criterion, an adaptive
algorithm is developed to determine the true group subset size. It is
certifiable that our algorithms enable identifying the optimal group subset in
polynomial time under mild conditions. We demonstrate the efficiency and
accuracy of our proposal by comparing state-of-the-art algorithms on both
synthetic and real-world datasets.
Related papers
- Training Greedy Policy for Proposal Batch Selection in Expensive Multi-Objective Combinatorial Optimization [52.80408805368928]
We introduce a novel greedy-style subset selection algorithm for batch acquisition.
Our experiments on the red fluorescent proteins show that our proposed method achieves the baseline performance in 1.69x fewer queries.
arXiv Detail & Related papers (2024-06-21T05:57:08Z) - Concomitant Group Testing [49.50984893039441]
We introduce a variation of the group testing problem capturing the idea that a positive test requires a combination of multiple types'' of item.
The goal is to reliably identify all of the semi-defective sets using as few tests as possible.
Our algorithms are distinguished by (i) whether they are deterministic (zero-error) or randomized (small-error), and (ii) whether they are non-adaptive, fully adaptive, or have limited adaptivity.
arXiv Detail & Related papers (2023-09-08T09:11:12Z) - Max-Min Diversification with Fairness Constraints: Exact and
Approximation Algorithms [17.57585822765145]
We propose an exact algorithm that is suitable for small datasets as well as a $frac1-varepsilon integer integer5$-approximation algorithm for any $varepsilon in (0, 1)$ that scales to large datasets.
Experiments on real-world datasets demonstrate the superior performance of our proposed algorithms over existing ones.
arXiv Detail & Related papers (2023-01-05T13:02:35Z) - Exclusive Group Lasso for Structured Variable Selection [10.86544864007391]
A structured variable selection problem is considered.
A composite norm can be properly designed to promote such exclusive group sparsity patterns.
An active set algorithm is proposed that builds the solution by including structure atoms into the estimated support.
arXiv Detail & Related papers (2021-08-23T16:55:13Z) - Clustering-Based Subset Selection in Evolutionary Multiobjective
Optimization [11.110675371854988]
Subset selection is an important component in evolutionary multiobjective optimization (EMO) algorithms.
Clustering-based methods have not been evaluated in the context of subset selection from solution sets obtained by EMO algorithms.
arXiv Detail & Related papers (2021-08-19T02:56:41Z) - Local policy search with Bayesian optimization [73.0364959221845]
Reinforcement learning aims to find an optimal policy by interaction with an environment.
Policy gradients for local search are often obtained from random perturbations.
We develop an algorithm utilizing a probabilistic model of the objective function and its gradient.
arXiv Detail & Related papers (2021-06-22T16:07:02Z) - Fast Greedy Subset Selection from Large Candidate Solution Sets in
Evolutionary Multi-objective Optimization [11.110675371854988]
We discuss the efficiency of greedy subset selection for the hypervolume, IGD and IGD+ indicators.
Our idea is to use the submodular property, which is known for the hypervolume indicator, to improve their efficiency.
arXiv Detail & Related papers (2021-02-01T16:14:15Z) - Multi-View Spectral Clustering with High-Order Optimal Neighborhood
Laplacian Matrix [57.11971786407279]
Multi-view spectral clustering can effectively reveal the intrinsic cluster structure among data.
This paper proposes a multi-view spectral clustering algorithm that learns a high-order optimal neighborhood Laplacian matrix.
Our proposed algorithm generates the optimal Laplacian matrix by searching the neighborhood of the linear combination of both the first-order and high-order base.
arXiv Detail & Related papers (2020-08-31T12:28:40Z) - Extreme Algorithm Selection With Dyadic Feature Representation [78.13985819417974]
We propose the setting of extreme algorithm selection (XAS) where we consider fixed sets of thousands of candidate algorithms.
We assess the applicability of state-of-the-art AS techniques to the XAS setting and propose approaches leveraging a dyadic feature representation.
arXiv Detail & Related papers (2020-01-29T09:40:58Z) - Optimal Clustering from Noisy Binary Feedback [75.17453757892152]
We study the problem of clustering a set of items from binary user feedback.
We devise an algorithm with a minimal cluster recovery error rate.
For adaptive selection, we develop an algorithm inspired by the derivation of the information-theoretical error lower bounds.
arXiv Detail & Related papers (2019-10-14T09:18:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.