Mallows-type model averaging: Non-asymptotic analysis and all-subset combination
- URL: http://arxiv.org/abs/2505.02637v1
- Date: Mon, 05 May 2025 13:30:17 GMT
- Title: Mallows-type model averaging: Non-asymptotic analysis and all-subset combination
- Authors: Jingfu Peng
- Abstract summary: We show that there exists a fundamental limit to achieving the optimal all-subset MA risk. We propose a novel Mallows-type MA procedure based on a dimension-adaptive $C_p$ criterion.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model averaging (MA) and ensembling play a crucial role in statistical and machine learning practice. When multiple candidate models are considered, MA techniques can be used to weight and combine them, often resulting in improved predictive accuracy and better estimation stability compared to model selection (MS) methods. In this paper, we address two challenges in combining least squares estimators from both theoretical and practical perspectives. We first establish several oracle inequalities for least squares MA via minimizing a Mallows' $C_p$ criterion under an arbitrary candidate model set. Compared to existing studies, these oracle inequalities yield faster excess risk rates and directly imply the asymptotic optimality of the resulting MA estimators under milder conditions. Moreover, we consider candidate model construction and investigate the problem of optimal all-subset combination for least squares estimators, which is an important yet rarely discussed topic in the existing literature. We show that there exists a fundamental limit to achieving the optimal all-subset MA risk. To attain this limit, we propose a novel Mallows-type MA procedure based on a dimension-adaptive $C_p$ criterion. The implicit ensembling effects of several MS procedures are also revealed and discussed. We conduct several numerical experiments to support our theoretical findings and demonstrate the effectiveness of the proposed Mallows-type MA estimator.
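For intuition, the Mallows criterion at the heart of the paper can be minimized as a small quadratic program over the weight simplex. Below is a minimal sketch of classical Mallows model averaging for nested least-squares candidates, assuming a known error variance; the data, candidate set, and variable names are illustrative, and this is not the paper's dimension-adaptive $C_p$ procedure.

```python
# Minimal sketch of Mallows model averaging (MMA) for nested least-squares
# candidates, assuming sigma^2 is known; all data and names are illustrative.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, p = 200, 8
X = rng.normal(size=(n, p))
beta = 1.0 / (1 + np.arange(p))          # decaying true coefficients
sigma2 = 1.0
y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)

# Candidate models: nested subsets using the first k columns, k = 1..p.
fits, dims = [], []
for k in range(1, p + 1):
    Xk = X[:, :k]
    coef, *_ = np.linalg.lstsq(Xk, y, rcond=None)
    fits.append(Xk @ coef)
    dims.append(k)
F = np.column_stack(fits)                # n x M matrix of fitted values
dims = np.array(dims, dtype=float)

# Mallows' C_p criterion: ||y - F w||^2 + 2 sigma^2 * sum_m w_m k_m,
# minimized over the probability simplex (a quadratic program).
def cp(w):
    return np.sum((y - F @ w) ** 2) + 2.0 * sigma2 * dims @ w

M = F.shape[1]
cons = ({"type": "eq", "fun": lambda w: np.sum(w) - 1.0},)
res = minimize(cp, np.full(M, 1.0 / M), bounds=[(0.0, 1.0)] * M,
               constraints=cons, method="SLSQP")
print("MMA weights:", np.round(res.x, 3))
```

In practice the unknown $\sigma^2$ is typically replaced by an estimate from the largest candidate model before minimizing the criterion.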
Related papers
- Achieving $\widetilde{\mathcal{O}}(\sqrt{T})$ Regret in Average-Reward POMDPs with Known Observation Models [56.92178753201331]
We tackle average-reward infinite-horizon POMDPs with an unknown transition model. We present a novel and simple estimator that overcomes this barrier.
arXiv Detail & Related papers (2025-01-30T22:29:41Z)
- MMD-OPT: Maximum Mean Discrepancy Based Sample Efficient Collision Risk Minimization for Autonomous Driving [2.361853878117761]
MMD-OPT is a sample-efficient approach for minimizing the risk of collision under arbitrary prediction distributions. We show how these two concepts can be used to define a sample-efficient surrogate for the collision risk estimate.
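As a rough illustration of the core quantity, the empirical maximum mean discrepancy between two samples can be computed in a few lines; the RBF kernel, bandwidth, and toy samples below are assumptions for illustration, not MMD-OPT's actual surrogate or optimizer.

```python
# Minimal sketch of a (biased) empirical MMD^2 estimate with an RBF kernel;
# MMD-OPT's actual surrogate and optimizer are not reproduced here.
import numpy as np

def mmd2_rbf(X, Y, gamma=1.0):
    """Biased empirical MMD^2 between samples X and Y under an RBF kernel."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()

rng = np.random.default_rng(1)
X = rng.normal(0.0, 1.0, size=(100, 2))   # e.g., sampled predicted states
Y = rng.normal(0.5, 1.0, size=(100, 2))   # e.g., a hypothetical reference set
print("MMD^2 estimate:", mmd2_rbf(X, Y))
```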
arXiv Detail & Related papers (2024-12-12T09:57:10Z)
- Ensemble Prediction via Covariate-dependent Stacking [0.0]
This study proposes a novel approach to ensemble prediction, called "covariate-dependent stacking" (CDST).
Unlike traditional stacking methods, CDST allows model weights to vary flexibly as a function of covariates, thereby enhancing predictive performance in complex scenarios.
Our findings suggest that CDST is especially valuable for, but not limited to, spatio-temporal prediction problems, offering a powerful tool for researchers and practitioners in various data analysis fields.
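A minimal sketch of the underlying idea, under strong simplifying assumptions: combination weights vary with a scalar covariate through a logistic link fitted by least squares. The base predictions and parameterization are hypothetical and far simpler than the CDST estimator itself.

```python
# Minimal sketch of covariate-dependent stacking: the weight on each base
# model is a function of a covariate, fit on held-out predictions.
# Illustration of the idea only, not the CDST estimator.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
n = 300
x = rng.uniform(-2, 2, size=n)
y = np.where(x < 0, np.sin(3 * x), 0.5 * x) + 0.1 * rng.normal(size=n)

# Two simple base predictions (stand-ins for fitted candidate models).
P = np.column_stack([np.sin(3 * x), 0.5 * x])     # n x 2

def loss(theta):
    # Weight on model 1 varies with x through a logistic link.
    w1 = 1.0 / (1.0 + np.exp(-(theta[0] + theta[1] * x)))
    pred = w1 * P[:, 0] + (1.0 - w1) * P[:, 1]
    return np.mean((y - pred) ** 2)

res = minimize(loss, np.zeros(2), method="BFGS")
print("weight-function parameters:", np.round(res.x, 2))
```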
arXiv Detail & Related papers (2024-08-19T07:31:31Z)
- Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL [57.745700271150454]
We study the sample complexity of reinforcement learning in Mean-Field Games (MFGs) with model-based function approximation.
We introduce the Partial Model-Based Eluder Dimension (P-MBED), a more effective notion to characterize the model class complexity.
arXiv Detail & Related papers (2024-02-08T14:54:47Z)
- Optimal Multi-Distribution Learning [88.3008613028333]
Multi-distribution learning seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions.
We propose a novel algorithm that yields an $\varepsilon$-optimal randomized hypothesis with a sample complexity on the order of $(d+k)/\varepsilon^2$.
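The learning objective (as opposed to the paper's sample-complexity analysis) can be illustrated with a toy worst-case risk minimization: a shared linear model is fit to minimize a softmax-smoothed maximum of the $k$ per-distribution risks. All data and the smoothing parameter are illustrative assumptions.

```python
# Minimal sketch of multi-distribution learning: fit a shared linear model
# by minimizing the worst-case squared loss across k groups (smoothed max
# for the optimizer). Illustrative only.
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

rng = np.random.default_rng(4)
k, n, d = 3, 100, 4
Xs = [rng.normal(size=(n, d)) for _ in range(k)]
betas = [rng.normal(size=d) for _ in range(k)]
ys = [X @ b + 0.1 * rng.normal(size=n) for X, b in zip(Xs, betas)]

def worst_case(theta, tau=20.0):
    risks = np.array([np.mean((y - X @ theta) ** 2) for X, y in zip(Xs, ys)])
    return logsumexp(tau * risks) / tau     # smooth max over the k risks

res = minimize(worst_case, np.zeros(d), method="L-BFGS-B")
print("shared model:", np.round(res.x, 2))
```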
arXiv Detail & Related papers (2023-12-08T16:06:29Z)
- Sample-Efficient Multi-Agent RL: An Optimization Perspective [103.35353196535544]
We study multi-agent reinforcement learning (MARL) for general-sum Markov Games (MGs) under general function approximation.
We introduce a novel complexity measure called the Multi-Agent Decoupling Coefficient (MADC) for general-sum MGs.
We show that our algorithm achieves sublinear regret comparable to existing works.
arXiv Detail & Related papers (2023-10-10T01:39:04Z)
- Model Agnostic Sample Reweighting for Out-of-Distribution Learning [38.843552982739354]
We propose a principled method, Model Agnostic samPLe rEweighting (MAPLE), to effectively address the OOD problem.
Our key idea is to find an effective reweighting of the training samples so that the standard empirical risk minimization training of a large model leads to superior OOD generalization performance.
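A minimal sketch of the weighted-ERM step this idea plugs into: given per-sample weights, standard training simply passes them to the loss. The weights below are hypothetical hand-set values; MAPLE's actual procedure for learning them is not reproduced here.

```python
# Minimal sketch of weighted empirical risk minimization with sample
# weights; the weights here are hand-set for illustration, whereas MAPLE
# learns them. Data and group structure are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(400, 5))
group = rng.random(400) < 0.1                 # hypothetical minority group
y = (X[:, 0] + np.where(group, -2.0, 2.0) * X[:, 1] > 0).astype(int)

weights = np.where(group, 10.0, 1.0)          # upweight the minority pattern
clf = LogisticRegression().fit(X, y, sample_weight=weights)
print("coefficients:", np.round(clf.coef_, 2))
```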
arXiv Detail & Related papers (2023-01-24T05:11:03Z)
- Adaptive Estimation of Graphical Models under Total Positivity [13.47131471222723]
We consider the problem of estimating (diagonally dominant) M-matrices as precision matrices in Gaussian graphical models.
We propose an adaptive multiple-stage estimation method that refines the estimate.
We develop a unified framework based on the gradient projection method to solve the regularized problem.
arXiv Detail & Related papers (2022-10-27T14:21:27Z)
- Mitigating multiple descents: A model-agnostic framework for risk monotonization [84.6382406922369]
We develop a general framework for risk monotonization based on cross-validation.
We propose two data-driven methodologies, namely zero- and one-step, that are akin to bagging and boosting.
arXiv Detail & Related papers (2022-05-25T17:41:40Z)
- Learning with Multiclass AUC: Theory and Algorithms [141.63211412386283]
Area under the ROC curve (AUC) is a well-known ranking metric for problems such as imbalanced learning and recommender systems.
In this paper, we make an early attempt at learning multiclass scoring functions by directly optimizing multiclass AUC metrics.
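For reference, the metric being optimized can be evaluated with scikit-learn's one-vs-one multiclass AUC; the sketch below scores a plain logistic regression and does not implement the paper's AUC-optimizing algorithms.

```python
# Minimal sketch of evaluating a multiclass scorer with one-vs-one AUC via
# scikit-learn; illustrative data, not the paper's learning algorithms.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=600, n_classes=3, n_informative=6,
                           random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
proba = LogisticRegression(max_iter=1000).fit(Xtr, ytr).predict_proba(Xte)
print("OvO macro AUC:", roc_auc_score(yte, proba, multi_class="ovo"))
```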
arXiv Detail & Related papers (2021-07-28T05:18:10Z)
- Exact and Approximation Algorithms for Sparse PCA [1.7640556247739623]
This paper proposes two exact mixed-integer SDPs (MISDPs).
We then analyze the theoretical optimality gaps of their continuous relaxation values and prove that they are stronger than that of the state-of-the-art one.
Since off-the-shelf solvers generally have difficulty solving MISDPs, we approximate SPCA to arbitrary accuracy by a mixed-integer linear program (MILP) of size similar to the MISDPs.
arXiv Detail & Related papers (2020-08-28T02:07:08Z)