Optimally Weighted Ensembles of Regression Models: Exact Weight
Optimization and Applications
- URL: http://arxiv.org/abs/2206.11263v1
- Date: Wed, 22 Jun 2022 09:11:14 GMT
- Title: Optimally Weighted Ensembles of Regression Models: Exact Weight
Optimization and Applications
- Authors: Patrick Echtenbruck, Martina Echtenbruck, Joost Batenburg, Thomas
Bäck, Boris Naujoks, Michael Emmerich
- Abstract summary: We show that combining different regression models can yield better results than selecting a single ('best') regression model.
We outline an efficient method that obtains an optimally weighted linear combination from a heterogeneous set of regression models.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated model selection is often proposed to users to choose which machine
learning model (or method) to apply to a given regression task. In this paper,
we show that combining different regression models can yield better results
than selecting a single ('best') regression model, and outline an efficient
method that obtains an optimally weighted convex linear combination from a
heterogeneous set of regression models. More specifically, in this paper, a
heuristic weight optimization, used in a preceding conference paper, is
replaced by an exact optimization algorithm using convex quadratic programming.
We prove convexity of the quadratic programming formulation for the
straightforward formulation and for a formulation with weighted data points.
The novel weight optimization is not only (more) exact but also more efficient.
The methods we develop in this paper are implemented and made available as
open source on GitHub. They can be executed on commonly available hardware and
offer a transparent and easy-to-interpret interface. The results indicate that
the approach outperforms model selection methods on a range of data sets,
including data sets with mixed variable types from drug discovery applications.
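The weight problem described in the abstract can be sketched as follows. Reading "convex linear combination" as non-negative weights that sum to one, the task is to minimize sum_i d_i (y_i - sum_j w_j p_ij)^2 over the weights w, where p_ij is the prediction of model j on data point i and d_i is an optional data-point weight. The objective is a convex quadratic because its Hessian, 2 P^T D P, is positive semidefinite whenever all d_i are non-negative. The sketch below is not the authors' implementation: it swaps the exact quadratic programming solver for plain projected gradient descent, which is iterative and only approximates the optimum. The names `fit_ensemble_weights` and `project_to_simplex`, the arrays `P` and `y`, and the synthetic data are all hypothetical.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                      # entries in descending order
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index that stays positive
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def fit_ensemble_weights(P, y, sample_weights=None, n_iter=5000):
    """Approximate convex ensemble weights by projected gradient descent.

    P: (n, m) array, column j holds the predictions of model j.
    y: (n,) array of targets.
    sample_weights: optional (n,) array of non-negative data-point weights.
    """
    P = np.asarray(P, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = P.shape
    d = np.ones(n) if sample_weights is None else np.asarray(sample_weights, dtype=float)

    # Objective: f(w) = w^T G w - 2 c^T w + const, with G = P^T D P (PSD).
    G = P.T @ (d[:, None] * P)
    c = P.T @ (d * y)

    # Start from the best single model, so the descent can only improve on it.
    w = np.zeros(m)
    w[np.argmin(np.diag(G) - 2.0 * c)] = 1.0

    lam = np.linalg.eigvalsh(G)[-1]           # largest eigenvalue of G
    if lam <= 0.0:
        return w
    for _ in range(n_iter):
        # Gradient is 2 (G w - c); the step size 1 / (2 lam) gives this update.
        w = project_to_simplex(w - (G @ w - c) / lam)
    return w


# Synthetic example (assumed data): two noisy models and one poor model.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2.0 * np.pi * x)
P = np.column_stack([
    y + 0.3 * rng.normal(size=x.size),
    y + 0.3 * rng.normal(size=x.size),
    x,
])

weights = fit_ensemble_weights(P, y)
single_mse = ((P - y[:, None]) ** 2).mean(axis=0)
ensemble_mse = ((P @ weights - y) ** 2).mean()
print("weights:", weights)
print("single-model MSE:", single_mse)
print("ensemble MSE:", ensemble_mse)
```

Because every single model corresponds to a corner of the feasible set, the optimally weighted ensemble can never have a higher training error than the best single model. In practice the weights would presumably be fitted on held-out or cross-validated predictions rather than on training predictions, and a dedicated QP solver would replace the iterative loop.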
Related papers
- Efficient Optimization Algorithms for Linear Adversarial Training [9.933836677441684]
Adversarial training can be used to learn models that are robust against perturbations.
We propose tailored optimization algorithms for the adversarial training of linear models.
arXiv Detail & Related papers (2024-10-16T15:41:08Z)
- Adaptive Optimization for Prediction with Missing Data [6.800113478497425]
We show that some adaptive linear regression models are equivalent to learning an imputation rule and a downstream linear regression model simultaneously.
In settings where data is strongly not missing at random, our methods achieve a 2-10% improvement in out-of-sample accuracy.
arXiv Detail & Related papers (2024-02-02T16:35:51Z)
- Functional Graphical Models: Structure Enables Offline Data-Driven Optimization [111.28605744661638]
We show how structure can enable sample-efficient data-driven optimization.
We also present a data-driven optimization algorithm that infers the FGM structure itself.
arXiv Detail & Related papers (2024-01-08T22:33:14Z)
- A Consistent and Scalable Algorithm for Best Subset Selection in Single Index Models [1.3236116985407258]
Best subset selection in high-dimensional models is known to be computationally intractable.
We propose the first provably scalable algorithm for best subset selection in high-dimensional SIMs.
Our algorithm enjoys subset selection consistency and has the oracle property with high probability.
arXiv Detail & Related papers (2023-09-12T13:48:06Z)
- Improved Distribution Matching for Dataset Condensation [91.55972945798531]
We propose a novel dataset condensation method based on distribution matching.
Our simple yet effective method outperforms most previous optimization-oriented methods with far fewer computational resources.
arXiv Detail & Related papers (2023-07-19T04:07:33Z)
- A model-free feature selection technique of feature screening and random forest based recursive feature elimination [0.0]
We propose a model-free feature selection method for ultra-high dimensional data with mass features.
We show that the proposed method is selection consistent and $L$ consistent under weak regularity conditions.
arXiv Detail & Related papers (2023-02-15T03:39:16Z)
- Sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are made through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z)
- HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
- Test Set Sizing Via Random Matrix Theory [91.3755431537592]
This paper uses techniques from Random Matrix Theory to find the ideal training-testing data split for a simple linear regression.
It defines "ideal" as satisfying the integrity metric, i.e. the empirical model error is the actual measurement noise.
This paper is the first to solve for the training and test size for any model in a way that is truly optimal.
arXiv Detail & Related papers (2021-12-11T13:18:33Z)
- Personalizing Performance Regression Models to Black-Box Optimization Problems [0.755972004983746]
In this work, we propose a personalized regression approach for numerical optimization problems.
We also investigate the impact of selecting not a single regression model per problem, but personalized ensembles.
We test our approach on predicting the performance of numerical optimizations on the BBOB benchmark collection.
arXiv Detail & Related papers (2021-04-22T11:47:47Z)
- MINA: Convex Mixed-Integer Programming for Non-Rigid Shape Alignment [77.38594866794429]
A convex mixed-integer programming formulation for non-rigid shape matching.
We propose a novel shape deformation model based on an efficient low-dimensional discrete model.
arXiv Detail & Related papers (2020-02-28T09:54:06Z)