Related papers: Certifiably Polynomial Algorithm for Best Group Subset Selection

Certifiably Polynomial Algorithm for Best Group Subset Selection

URL: http://arxiv.org/abs/2104.12576v1
Date: Fri, 23 Apr 2021 03:05:11 GMT
Title: Certifiably Polynomial Algorithm for Best Group Subset Selection
Authors: Yanhang Zhang, Junxian Zhu, Jin Zhu, Xueqin Wang
Abstract summary: Best group subset selection aims to choose a small part of non-overlapping groups to achieve the best interpretability on the response variable. We propose a group-splicing algorithm that iteratively detects effective groups and excludes the helpless ones. We demonstrate the efficiency and accuracy of our proposal by comparing state-of-the-art algorithms on both synthetic and real-world datasets.
Score: 0.9667631210393929
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Best group subset selection aims to choose a small part of non-overlapping groups to achieve the best interpretability on the response variable. It is practically attractive for group variable selection; however, due to the computational intractability in high dimensionality setting, it doesn't catch enough attention. To fill the blank of efficient algorithms for best group subset selection, in this paper, we propose a group-splicing algorithm that iteratively detects effective groups and excludes the helpless ones. Moreover, coupled with a novel Bayesian group information criterion, an adaptive algorithm is developed to determine the true group subset size. It is certifiable that our algorithms enable identifying the optimal group subset in polynomial time under mild conditions. We demonstrate the efficiency and accuracy of our proposal by comparing state-of-the-art algorithms on both synthetic and real-world datasets.

Related papers

Training Greedy Policy for Proposal Batch Selection in Expensive Multi-Objective Combinatorial Optimization [52.80408805368928]
We introduce a novel greedy-style subset selection algorithm for batch acquisition. Our experiments on the red fluorescent proteins show that our proposed method achieves the baseline performance in 1.69x fewer queries.
arXiv Detail & Related papers (2024-06-21T05:57:08Z)
Concomitant Group Testing [49.50984893039441]
We introduce a variation of the group testing problem capturing the idea that a positive test requires a combination of multiple types'' of item. The goal is to reliably identify all of the semi-defective sets using as few tests as possible. Our algorithms are distinguished by (i) whether they are deterministic (zero-error) or randomized (small-error), and (ii) whether they are non-adaptive, fully adaptive, or have limited adaptivity.
arXiv Detail & Related papers (2023-09-08T09:11:12Z)
Max-Min Diversification with Fairness Constraints: Exact and Approximation Algorithms [17.57585822765145]
We propose an exact algorithm that is suitable for small datasets as well as a $frac1-varepsilon integer integer5$-approximation algorithm for any $varepsilon in (0, 1)$ that scales to large datasets. Experiments on real-world datasets demonstrate the superior performance of our proposed algorithms over existing ones.
arXiv Detail & Related papers (2023-01-05T13:02:35Z)
Exclusive Group Lasso for Structured Variable Selection [10.86544864007391]
A structured variable selection problem is considered. A composite norm can be properly designed to promote such exclusive group sparsity patterns. An active set algorithm is proposed that builds the solution by including structure atoms into the estimated support.
arXiv Detail & Related papers (2021-08-23T16:55:13Z)
Clustering-Based Subset Selection in Evolutionary Multiobjective Optimization [11.110675371854988]
Subset selection is an important component in evolutionary multiobjective optimization (EMO) algorithms. Clustering-based methods have not been evaluated in the context of subset selection from solution sets obtained by EMO algorithms.
arXiv Detail & Related papers (2021-08-19T02:56:41Z)
Local policy search with Bayesian optimization [73.0364959221845]
Reinforcement learning aims to find an optimal policy by interaction with an environment. Policy gradients for local search are often obtained from random perturbations. We develop an algorithm utilizing a probabilistic model of the objective function and its gradient.
arXiv Detail & Related papers (2021-06-22T16:07:02Z)
Fast Greedy Subset Selection from Large Candidate Solution Sets in Evolutionary Multi-objective Optimization [11.110675371854988]
We discuss the efficiency of greedy subset selection for the hypervolume, IGD and IGD+ indicators. Our idea is to use the submodular property, which is known for the hypervolume indicator, to improve their efficiency.
arXiv Detail & Related papers (2021-02-01T16:14:15Z)
Multi-View Spectral Clustering with High-Order Optimal Neighborhood Laplacian Matrix [57.11971786407279]
Multi-view spectral clustering can effectively reveal the intrinsic cluster structure among data. This paper proposes a multi-view spectral clustering algorithm that learns a high-order optimal neighborhood Laplacian matrix. Our proposed algorithm generates the optimal Laplacian matrix by searching the neighborhood of the linear combination of both the first-order and high-order base.
arXiv Detail & Related papers (2020-08-31T12:28:40Z)
Extreme Algorithm Selection With Dyadic Feature Representation [78.13985819417974]
We propose the setting of extreme algorithm selection (XAS) where we consider fixed sets of thousands of candidate algorithms. We assess the applicability of state-of-the-art AS techniques to the XAS setting and propose approaches leveraging a dyadic feature representation.
arXiv Detail & Related papers (2020-01-29T09:40:58Z)
Optimal Clustering from Noisy Binary Feedback [75.17453757892152]
We study the problem of clustering a set of items from binary user feedback. We devise an algorithm with a minimal cluster recovery error rate. For adaptive selection, we develop an algorithm inspired by the derivation of the information-theoretical error lower bounds.
arXiv Detail & Related papers (2019-10-14T09:18:26Z)

This list is automatically generated from the titles and abstracts of the papers in this site.