Understanding Best Subset Selection: A Tale of Two C(omplex)ities
- URL: http://arxiv.org/abs/2301.06259v3
- Date: Fri, 11 Apr 2025 23:51:11 GMT
- Title: Understanding Best Subset Selection: A Tale of Two C(omplex)ities
- Authors: Saptarshi Roy, Ambuj Tewari, Ziwei Zhu
- Abstract summary: We consider the problem of best subset selection (BSS) under a high-dimensional sparse linear regression model. In particular, we establish both necessary and sufficient margin conditions depending on the identifiability margin and the two complexity measures.
- Score: 18.83617956033111
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the problem of best subset selection (BSS) under a high-dimensional sparse linear regression model. Recently, Guo et al. (2020) showed that the model selection performance of BSS depends on a certain identifiability margin, a measure that captures the model discriminative power of BSS under a general correlation structure and that is robust to the design dependence, unlike computational surrogates such as LASSO, SCAD, MCP, etc. Expanding on this, we further broaden the theoretical understanding of best subset selection and show that the complexities of the residualized signals (the portions of the signals orthogonal to the true active features) and of the spurious projections (the projection operators associated with the irrelevant features) also play fundamental roles in characterizing the margin condition for model consistency of BSS. In particular, we establish both necessary and sufficient margin conditions that depend only on the identifiability margin and these two complexity measures. We also partially extend our sufficiency result to the case of high-dimensional sparse generalized linear models (GLMs).
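For concreteness, the BSS estimator itself can be sketched as an exhaustive search over all size-k feature subsets, keeping the subset whose least-squares fit has the smallest residual sum of squares. This is a minimal illustration of the estimator studied in the abstract, not of the paper's margin analysis; the function name and synthetic data below are hypothetical.

```python
import itertools
import numpy as np

def best_subset_selection(X, y, k):
    """Exhaustive best subset selection: among all size-k subsets of the
    p columns of X, return the subset minimizing the residual sum of
    squares of the least-squares fit restricted to that subset."""
    n, p = X.shape
    best_rss, best_subset = np.inf, None
    for subset in itertools.combinations(range(p), k):
        Xs = X[:, subset]
        beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        rss = float(np.sum((y - Xs @ beta) ** 2))
        if rss < best_rss:
            best_rss, best_subset = rss, subset
    return best_subset, best_rss

# Sparse linear model y = X beta + noise, with true support {1, 4}.
rng = np.random.default_rng(0)
n, p, k = 100, 8, 2
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[[1, 4]] = [3.0, -2.0]
y = X @ beta + 0.1 * rng.standard_normal(n)
subset, rss = best_subset_selection(X, y, k)
```

The combinatorial loop makes the NP-hardness of BSS visible: the number of candidate subsets grows as C(p, k), which is why the surrogates mentioned above (LASSO, SCAD, MCP) are used in practice.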
Related papers
- Generalization Error Analysis for Selective State-Space Models Through the Lens of Attention [2.8998926117101367]
State-space models (SSMs) are a new class of foundation models that have emerged as a compelling alternative to Transformers.
This paper provides a detailed theoretical analysis of selective SSMs, the core components of the Mamba and Mamba-2 architectures.
arXiv Detail & Related papers (2025-02-03T16:05:31Z) - Optimizing Sequential Recommendation Models with Scaling Laws and Approximate Entropy [104.48511402784763]
Performance Law for SR models aims to theoretically investigate and model the relationship between model performance and data quality.
We propose Approximate Entropy (ApEn) to assess data quality, presenting a more nuanced approach compared to traditional data quantity metrics.
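As a rough illustration, classical Approximate Entropy for a scalar series can be computed as below; this is the textbook ApEn(m, r) definition, not necessarily the exact variant the paper uses for data quality.

```python
import numpy as np

def approximate_entropy(series, m=2, r=0.2):
    """Classical ApEn(m, r): a regularity statistic for a 1-D series.
    Lower values indicate a more regular (more predictable) series."""
    x = np.asarray(series, dtype=float)
    n = len(x)

    def phi(mm):
        # All overlapping windows of length mm.
        windows = np.array([x[i:i + mm] for i in range(n - mm + 1)])
        # For each window, the fraction of windows within Chebyshev
        # distance r of it (each window always matches itself).
        counts = [np.mean(np.max(np.abs(windows - w), axis=1) <= r)
                  for w in windows]
        return float(np.mean(np.log(counts)))

    return phi(m) - phi(m + 1)
```

A perfectly regular (constant) series yields ApEn of zero, while an i.i.d. noise series yields a positive value, matching the intuition that lower entropy means higher-quality, more structured data.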
arXiv Detail & Related papers (2024-11-30T10:56:30Z) - Sample Complexity of Offline Distributionally Robust Linear Markov Decision Processes [37.15580574143281]
This paper considers the sample complexity of offline reinforcement learning (RL) for distributionally robust linear Markov decision processes (MDPs) with an uncertainty set characterized by the total variation distance.
We develop a pessimistic model-based algorithm and establish its sample complexity bound under minimal data coverage assumptions.
arXiv Detail & Related papers (2024-03-19T17:48:42Z) - Latent Semantic Consensus For Deterministic Geometric Model Fitting [109.44565542031384]
We propose an effective method called Latent Semantic Consensus (LSC).
LSC formulates the model fitting problem into two latent semantic spaces based on data points and model hypotheses.
LSC is able to provide consistent and reliable solutions within only a few milliseconds for general multi-structural model fitting.
arXiv Detail & Related papers (2024-03-11T05:35:38Z) - Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) frameworks, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z) - Synergistic eigenanalysis of covariance and Hessian matrices for enhanced binary classification [72.77513633290056]
We present a novel approach that combines the eigenanalysis of a covariance matrix evaluated on a training set with a Hessian matrix evaluated on a deep learning model.
Our method captures intricate patterns and relationships, enhancing classification performance.
arXiv Detail & Related papers (2024-02-14T16:10:42Z) - Low-Rank Approximation of Structural Redundancy for Self-Supervised Learning [2.3072402651280517]
We study the data-generating mechanism for reconstructive SSL to shed light on its effectiveness.
With an infinite amount of labeled samples, we provide a sufficient and necessary condition for perfect linear approximation.
Motivated by the condition, we propose to approximate the redundant component by a low-rank factorization.
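The low-rank factorization step can be illustrated with a generic truncated-SVD approximation; this is a standard Eckart-Young sketch under assumed notation, not the paper's specific estimator.

```python
import numpy as np

def low_rank_approx(A, r):
    """Best rank-r approximation of A in Frobenius norm (Eckart-Young),
    obtained by keeping the top r singular triplets of the SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Scale the first r left singular vectors by the singular values,
    # then recombine with the first r right singular vectors.
    return (U[:, :r] * s[:r]) @ Vt[:r]

# A rank-3 matrix is reproduced exactly by its rank-3 approximation.
rng = np.random.default_rng(1)
B = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 15))
B3 = low_rank_approx(B, 3)
B2 = low_rank_approx(B, 2)
```

Truncating at rank r discards exactly the smallest singular values, which is why a low-rank factor is a natural stand-in for a structurally redundant component.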
arXiv Detail & Related papers (2024-02-10T04:45:27Z) - Sample Complexity Characterization for Linear Contextual MDPs [67.79455646673762]
Contextual Markov decision processes (CMDPs) describe a class of reinforcement learning problems in which the transition kernels and reward functions can change over time with different MDPs indexed by a context variable.
CMDPs serve as an important framework to model many real-world applications with time-varying environments.
We study CMDPs under two linear function approximation models: Model I with context-varying representations and common linear weights for all contexts; and Model II with common representations for all contexts and context-varying linear weights.
arXiv Detail & Related papers (2024-02-05T03:25:04Z) - Consistency Models for Scalable and Fast Simulation-Based Inference [9.27488642055461]
We present consistency models for posterior estimation (CMPE), a new conditional sampler for simulation-based inference (SBI).
CMPE essentially distills a continuous probability flow and enables rapid few-shot inference with an unconstrained architecture.
Our empirical evaluation demonstrates that CMPE not only outperforms current state-of-the-art algorithms on hard low-dimensional benchmarks, but also achieves competitive performance with much faster sampling speed.
arXiv Detail & Related papers (2023-12-09T02:14:12Z) - Latent Space Perspicacity and Interpretation Enhancement (LS-PIE) Framework [0.0]
This paper proposes a general framework to enhance latent space representations for improving interpretability of linear latent spaces.
Although the concepts in this paper are language agnostic, the framework is written in Python.
Several innovative enhancements are incorporated, including latent ranking (LR), latent scaling (LS), latent clustering (LC), and latent condensing (LCON).
arXiv Detail & Related papers (2023-07-11T03:56:04Z) - On the Benefits of Leveraging Structural Information in Planning Over the Learned Model [3.3512508970931236]
We investigate the benefits of leveraging structural information about the system in terms of reducing sample complexity.
Our analysis shows that there can be a significant saving in sample complexity by leveraging structural information about the model.
arXiv Detail & Related papers (2023-03-15T18:18:01Z) - GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP, and Beyond [101.5329678997916]
We study sample efficient reinforcement learning (RL) under the general framework of interactive decision making.
We propose a novel complexity measure, generalized eluder coefficient (GEC), which characterizes the fundamental tradeoff between exploration and exploitation.
We show that RL problems with low GEC form a remarkably rich class, which subsumes low Bellman eluder dimension problems, bilinear class, low witness rank problems, PO-bilinear class, and generalized regular PSR.
arXiv Detail & Related papers (2022-11-03T16:42:40Z) - A General Framework for Sample-Efficient Function Approximation in Reinforcement Learning [132.45959478064736]
We propose a general framework that unifies model-based and model-free reinforcement learning.
We propose a novel estimation function with decomposable structural properties for optimization-based exploration.
Under our framework, a new sample-efficient algorithm namely OPtimization-based ExploRation with Approximation (OPERA) is proposed.
arXiv Detail & Related papers (2022-09-30T17:59:16Z) - Optimistic MLE -- A Generic Model-based Algorithm for Partially Observable Sequential Decision Making [48.87943416098096]
This paper introduces a simple and efficient learning algorithm for general sequential decision making.
We prove that OMLE learns near-optimal policies of an enormously rich class of sequential decision making problems.
arXiv Detail & Related papers (2022-09-29T17:56:25Z) - A Differential Evolution-Enhanced Latent Factor Analysis Model for High-dimensional and Sparse Data [11.164847043777703]
This paper proposes a Sequential-Group-Differential-Evolution (SGDE) algorithm to refine the latent factors optimized by a PLFA model.
As demonstrated by the experiments on four HiDS matrices, an SGDE-PLFA model outperforms the state-of-the-art models.
arXiv Detail & Related papers (2022-04-02T13:41:19Z) - Distributionally robust risk evaluation with a causality constraint and structural information [0.0]
We approximate test functions by neural networks and prove sample-complexity bounds via Rademacher complexity.
Our framework outperforms the classic counterparts in the distributionally robust portfolio selection problem.
arXiv Detail & Related papers (2022-03-20T14:48:37Z) - Provable Reinforcement Learning with a Short-Term Memory [68.00677878812908]
We study a new subclass of POMDPs, whose latent states can be decoded by the most recent history of a short length $m$.
In particular, in the rich-observation setting, we develop new algorithms using a novel "moment matching" approach with a sample complexity that scales exponentially only in the short length $m$.
Our results show that a short-term memory suffices for reinforcement learning in these environments.
arXiv Detail & Related papers (2022-02-08T16:39:57Z) - Complexity Measures for Multi-objective Symbolic Regression [2.4087148947930634]
Multi-objective symbolic regression has the advantage that while the accuracy of the learned models is maximized, the complexity is automatically adapted.
We study which complexity measures are most appropriately used in symbolic regression when performing multi-objective optimization with NSGA-II.
arXiv Detail & Related papers (2021-09-01T08:22:41Z) - Better Feature Integration for Named Entity Recognition [30.676768644145]
We propose a simple and robust solution to incorporate both types of features with our Synergized-LSTM (Syn-LSTM).
The results demonstrate that the proposed model achieves better performance than previous approaches while requiring fewer parameters.
arXiv Detail & Related papers (2021-04-12T09:55:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.