Quantum-Assisted Feature Selection for Vehicle Price Prediction Modeling
- URL: http://arxiv.org/abs/2104.04049v1
- Date: Thu, 8 Apr 2021 20:48:44 GMT
- Title: Quantum-Assisted Feature Selection for Vehicle Price Prediction Modeling
- Authors: David Von Dollen, Florian Neukart, Daniel Weimer, Thomas Bäck
- Abstract summary: We study metrics for encoding the combinatorial search as a binary quadratic model, such as the Generalized Mean Information Coefficient and the Pearson Correlation Coefficient.
We achieve accuracy scores of 0.9 for finding optimal subsets on synthetic data using a new metric that we define.
Our findings show that by leveraging quantum-assisted routines we find solutions that increase the quality of predictive model output.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Within machine learning model evaluation regimes, feature selection is a
technique to reduce model complexity and improve model performance with regard
to generalization, model fit, and accuracy of prediction. However, the search
over the space of features to find the subset of $k$ optimal features is a
known NP-hard problem. In this work, we study metrics for encoding the
combinatorial search as a binary quadratic model, such as Generalized Mean
Information Coefficient and Pearson Correlation Coefficient in application to
the underlying regression problem of price prediction. We investigate the
trade-offs, in run time and model performance, of using quantum-assisted vs.
classical subroutines for the combinatorial search, with minimum redundancy
maximal relevancy as the heuristic for our approach. We
achieve accuracy scores of 0.9 (in the range of [0,1]) for finding optimal
subsets on synthetic data using a new metric that we define. We test and
cross-validate predictive models on a real-world problem of price prediction,
and show a performance improvement of mean absolute error scores for our
quantum-assisted method $(1471.02 \pm{135.6})$, vs. similar methodologies such
as recursive feature elimination $(1678.3 \pm{143.7})$. Our findings show that
by leveraging quantum-assisted routines we find solutions that increase the
quality of predictive model output while reducing the input dimensionality to
the learning algorithm on synthetic and real-world data.
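
The abstract describes encoding feature selection as a binary quadratic model under a minimum-redundancy, maximal-relevancy heuristic. The following is a minimal sketch of that idea, not the authors' formulation: it assumes absolute Pearson correlation for both relevance and redundancy, a hypothetical trade-off weight `alpha`, made-up synthetic data, and an exhaustive classical search standing in for the quantum-assisted subroutine. The function names `build_mrmr_qubo` and `best_subset` are illustrative only.

```python
import itertools
import numpy as np


def build_mrmr_qubo(X, y, alpha=0.5):
    """Build an upper-triangular Q so that x^T Q x scores a binary mask x.

    Diagonal entries reward relevance (|Pearson r| of feature vs. target);
    off-diagonal entries penalize redundancy (|Pearson r| of feature pairs).
    `alpha` is a hypothetical trade-off weight, not a value from the paper.
    """
    n = X.shape[1]
    # Correlation matrix of all features plus the target as the last column.
    corr = np.corrcoef(np.column_stack([X, y]), rowvar=False)
    relevance = np.abs(corr[:n, n])
    redundancy = np.abs(corr[:n, :n])
    Q = alpha * np.triu(redundancy, k=1)  # each pair counted once
    Q[np.diag_indices(n)] = -(1.0 - alpha) * relevance
    return Q


def best_subset(Q, k):
    """Exhaustively minimize x^T Q x over binary masks with exactly k ones."""
    n = Q.shape[0]
    best, best_energy = None, np.inf
    for subset in itertools.combinations(range(n), k):
        x = np.zeros(n)
        x[list(subset)] = 1.0
        energy = float(x @ Q @ x)
        if energy < best_energy:
            best, best_energy = subset, energy
    return best, best_energy


# Synthetic data, assumed for illustration: three informative features,
# a near-copy of feature 0 (redundant), and one pure-noise feature.
rng = np.random.default_rng(0)
base = rng.normal(size=(2000, 3))
near_copy = base[:, 0] + 0.1 * rng.normal(size=2000)
noise = rng.normal(size=2000)
X = np.column_stack([base, near_copy, noise])
y = base.sum(axis=1) + 0.1 * rng.normal(size=2000)

Q = build_mrmr_qubo(X, y)
subset, energy = best_subset(Q, k=3)
print("selected feature indices:", subset)
```

The exhaustive loop is only feasible for a handful of features. On annealing hardware or a classical heuristic sampler, the cardinality constraint would typically be folded into Q as a penalty term rather than enforced by enumeration.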
Related papers
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z) - Comparative study of regression vs pairwise models for surrogate-based heuristic optimisation [1.2535250082638645]
This paper addresses the formulation of surrogate problems as both regression models that approximate fitness (surface surrogate models) and a novel way to connect classification models (pairwise surrogate models).
The performance of the overall search, when using online machine learning-based surrogate models, depends not only on the accuracy of the predictive model but also on the kind of bias towards positive or negative cases.
arXiv Detail & Related papers (2024-10-04T13:19:06Z) - Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important in forecasting nonstationary processes or processes with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z) - Efficient Model-Free Exploration in Low-Rank MDPs [76.87340323826945]
Low-Rank Markov Decision Processes offer a simple, yet expressive framework for RL with function approximation.
Existing algorithms are either (1) computationally intractable, or (2) reliant upon restrictive statistical assumptions.
We propose the first provably sample-efficient algorithm for exploration in Low-Rank MDPs.
arXiv Detail & Related papers (2023-07-08T15:41:48Z) - Adaptive Sparse Gaussian Process [0.0]
We propose the first adaptive sparse Gaussian Process (GP) able to address all these issues.
We first reformulate a variational sparse GP algorithm to make it adaptive through a forgetting factor.
We then propose updating a single inducing point of the sparse GP model together with the remaining model parameters every time a new sample arrives.
arXiv Detail & Related papers (2023-02-20T21:34:36Z) - RF+clust for Leave-One-Problem-Out Performance Prediction [0.9281671380673306]
We study leave-one-problem-out (LOPO) performance prediction.
We analyze whether standard random forest (RF) model predictions can be improved by calibrating them with a weighted average of performance values.
arXiv Detail & Related papers (2023-01-23T16:14:59Z) - Bilevel Optimization for Feature Selection in the Data-Driven Newsvendor Problem [8.281391209717105]
We study the feature-based newsvendor problem, in which a decision-maker has access to historical data.
In this setting, we investigate feature selection, aiming to derive sparse, explainable models with improved out-of-sample performance.
We present a mixed integer linear program reformulation for the bilevel program, which can be solved to optimality with standard optimization solvers.
arXiv Detail & Related papers (2022-09-12T08:52:26Z) - HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z) - A bandit-learning approach to multifidelity approximation [7.960229223744695]
Multifidelity approximation is an important technique in scientific computation and simulation.
We introduce a bandit-learning approach for leveraging data of varying fidelities to achieve precise estimates.
arXiv Detail & Related papers (2021-03-29T05:29:35Z) - Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short python expressions which evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z) - Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.