Understanding Best Subset Selection: A Tale of Two C(omplex)ities
- URL: http://arxiv.org/abs/2301.06259v2
- Date: Mon, 17 Jul 2023 17:38:45 GMT
- Title: Understanding Best Subset Selection: A Tale of Two C(omplex)ities
- Authors: Saptarshi Roy, Ambuj Tewari, Ziwei Zhu
- Abstract summary: We study the variable selection properties of best subset selection in the high-dimensional sparse linear regression setup.
Apart from the identifiability margin, two complexity measures play a fundamental role in characterizing the margin condition for model consistency.
- Score: 25.665534614984647
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: For decades, best subset selection (BSS) has eluded statisticians mainly due
to its computational bottleneck. In recent years, however, modern computational
breakthroughs have rekindled theoretical interest in BSS and have led to new
findings. Recently, \cite{guo2020best} showed that the model selection
performance of BSS is governed by a margin quantity that is robust to the
design dependence, unlike modern methods such as LASSO, SCAD, MCP, etc.
Motivated by their theoretical results, in this paper we study the
variable selection properties of best subset selection in the high-dimensional
sparse linear regression setup. We show that, apart from the identifiability
margin, the following two complexity measures play a fundamental role in
characterizing the margin condition for model consistency: (a) complexity of
\emph{residualized features}, (b) complexity of \emph{spurious projections}. In
particular, we establish a simple margin condition that depends only on the
identifiability margin and the dominating one of the two complexity measures.
Furthermore, we show that a margin condition depending on a similar margin
quantity and similar complexity measures is also necessary for model consistency of
BSS. For a broader understanding, we also consider some simple illustrative
examples that demonstrate how the complexity measures vary under different
correlation structures, refining our theoretical understanding of the model
selection performance of BSS.
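Since the abstract centers on exhaustively searching candidate supports, a minimal sketch of best subset selection for sparse linear regression may help fix ideas. This is a generic illustration rather than the authors' procedure: it assumes a known sparsity level k and scores each candidate support by its ordinary-least-squares residual sum of squares.

```python
# Minimal sketch of best subset selection (BSS) for sparse linear regression.
# Generic illustration, not the paper's implementation: assumes a fixed
# sparsity level k and scores each support by its least-squares RSS. The
# exhaustive search over C(p, k) supports is the "computational bottleneck"
# mentioned in the abstract.
from itertools import combinations
import numpy as np

def best_subset_selection(X, y, k):
    """Return the size-k support minimizing the residual sum of squares."""
    n, p = X.shape
    best_support, best_rss = None, np.inf
    for support in combinations(range(p), k):
        X_S = X[:, support]
        # Ordinary least squares restricted to the candidate support S.
        beta_S, *_ = np.linalg.lstsq(X_S, y, rcond=None)
        rss = np.sum((y - X_S @ beta_S) ** 2)
        if rss < best_rss:
            best_support, best_rss = support, rss
    return best_support, best_rss

# Toy usage: a sparse linear model with Gaussian features.
rng = np.random.default_rng(0)
n, p, k = 100, 10, 3
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:k] = [2.0, -1.5, 1.0]
y = X @ beta + 0.5 * rng.standard_normal(n)
print(best_subset_selection(X, y, k))  # expected support: (0, 1, 2)
```

Whether this exhaustive search recovers the true support in the presence of noise and correlated designs is exactly what the identifiability margin and the two complexity measures described in the abstract are meant to quantify.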
Related papers
- Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) frameworks, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z) - Low-Rank Approximation of Structural Redundancy for Self-Supervised Learning [2.3072402651280517]
We study the data-generating mechanism for reconstructive SSL to shed light on its effectiveness.
With infinitely many labeled samples, we provide a necessary and sufficient condition for perfect linear approximation.
Motivated by the condition, we propose to approximate the redundant component by a low-rank factorization.
arXiv Detail & Related papers (2024-02-10T04:45:27Z) - Sample Complexity Characterization for Linear Contextual MDPs [67.79455646673762]
Contextual Markov decision processes (CMDPs) describe a class of reinforcement learning problems in which the transition kernels and reward functions can change over time with different MDPs indexed by a context variable.
CMDPs serve as an important framework to model many real-world applications with time-varying environments.
We study CMDPs under two linear function approximation models: Model I with context-varying representations and common linear weights for all contexts; and Model II with common representations for all contexts and context-varying linear weights.
arXiv Detail & Related papers (2024-02-05T03:25:04Z) - Consistency Models for Scalable and Fast Simulation-Based Inference [9.27488642055461]
We present consistency models for posterior estimation (CMPE), a new conditional sampler for simulation-based inference (SBI).
CMPE essentially distills a continuous probability flow and enables rapid few-shot inference with an unconstrained architecture.
Our empirical evaluation demonstrates that CMPE not only outperforms current state-of-the-art algorithms on hard low-dimensional benchmarks, but also achieves competitive performance with much faster sampling speed.
arXiv Detail & Related papers (2023-12-09T02:14:12Z) - On the Benefits of Leveraging Structural Information in Planning Over
the Learned Model [3.3512508970931236]
We investigate the benefits of leveraging structural information about the system in terms of reducing sample complexity.
Our analysis shows that there can be a significant saving in sample complexity by leveraging structural information about the model.
arXiv Detail & Related papers (2023-03-15T18:18:01Z) - GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP,
and Beyond [101.5329678997916]
We study sample efficient reinforcement learning (RL) under the general framework of interactive decision making.
We propose a novel complexity measure, generalized eluder coefficient (GEC), which characterizes the fundamental tradeoff between exploration and exploitation.
We show that RL problems with low GEC form a remarkably rich class, which subsumes low Bellman eluder dimension problems, bilinear class, low witness rank problems, PO-bilinear class, and generalized regular PSR.
arXiv Detail & Related papers (2022-11-03T16:42:40Z) - A General Framework for Sample-Efficient Function Approximation in
Reinforcement Learning [132.45959478064736]
We propose a general framework that unifies model-based and model-free reinforcement learning.
We propose a novel estimation function with decomposable structural properties for optimization-based exploration.
Under our framework, a new sample-efficient algorithm, OPtimization-based ExploRation with Approximation (OPERA), is proposed.
arXiv Detail & Related papers (2022-09-30T17:59:16Z) - Optimistic MLE -- A Generic Model-based Algorithm for Partially
Observable Sequential Decision Making [48.87943416098096]
This paper introduces a simple and efficient learning algorithm for general sequential decision making.
We prove that OMLE learns near-optimal policies of an enormously rich class of sequential decision making problems.
arXiv Detail & Related papers (2022-09-29T17:56:25Z) - Distributionally robust risk evaluation with a causality constraint and structural information [0.0]
We approximate test functions by neural networks and prove the sample complexity with Rademacher complexity.
Our framework outperforms the classic counterparts in the distributionally robust portfolio selection problem.
arXiv Detail & Related papers (2022-03-20T14:48:37Z) - Provable Reinforcement Learning with a Short-Term Memory [68.00677878812908]
We study a new subclass of POMDPs, whose latent states can be decoded by the most recent history of a short length $m$.
In particular, in the rich-observation setting, we develop new algorithms using a novel "moment matching" approach with a sample complexity that scales exponentially with the short length $m$ rather than the problem horizon.
Our results show that a short-term memory suffices for reinforcement learning in these environments.
arXiv Detail & Related papers (2022-02-08T16:39:57Z) - Complexity Measures for Multi-objective Symbolic Regression [2.4087148947930634]
Multi-objective symbolic regression has the advantage that while the accuracy of the learned models is maximized, the complexity is automatically adapted.
We study which complexity measures are most appropriately used in symbolic regression when performing multi-objective optimization with NSGA-II.
arXiv Detail & Related papers (2021-09-01T08:22:41Z)
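The last related entry relies on NSGA-II-style multi-objective optimization, in which candidate models are compared by Pareto dominance over an accuracy objective and a complexity objective. As a hedged illustration of that comparison only (not the paper's implementation, and with hypothetical objective values), a minimal dominance check and non-dominated filter might look like this:

```python
# Minimal sketch of the Pareto-dominance comparison underlying NSGA-II-style
# multi-objective symbolic regression. Hypothetical illustration only: each
# candidate model is summarized by (error, complexity), both to be minimized.

def dominates(a, b):
    """True if objective vector `a` Pareto-dominates `b` (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """Return the non-dominated (error, complexity) pairs."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates if other != c)]

# Toy usage: four models scored by (test error, expression size).
models = [(0.10, 25), (0.12, 8), (0.10, 30), (0.30, 5)]
print(pareto_front(models))  # [(0.10, 25), (0.12, 8), (0.30, 5)]
```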