Related papers: Quantum-Assisted Feature Selection for Vehicle Price Prediction Modeling

URL: http://arxiv.org/abs/2104.04049v1
Date: Thu, 8 Apr 2021 20:48:44 GMT
Title: Quantum-Assisted Feature Selection for Vehicle Price Prediction Modeling
Authors: David Von Dollen, Florian Neukart, Daniel Weimer, Thomas Bäck
Abstract summary: We study metrics for encoding the combinatorial search as a binary quadratic model, such as Generalized Mean Information Coefficient and Pearson Correlation Coefficient. We achieve accuracy scores of 0.9 for finding optimal subsets on synthetic data using a new metric that we define. Our findings show that by leveraging quantum-assisted routines we find solutions that increase the quality of predictive model output.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Within machine learning model evaluation regimes, feature selection is a technique to reduce model complexity and improve model performance with regard to generalization, model fit, and accuracy of prediction. However, the search over the space of features to find the subset of $k$ optimal features is a known NP-hard problem. In this work, we study metrics for encoding the combinatorial search as a binary quadratic model, such as Generalized Mean Information Coefficient and Pearson Correlation Coefficient, in application to the underlying regression problem of price prediction. We investigate the trade-offs, in run time and model performance, of leveraging quantum-assisted vs. classical subroutines for the combinatorial search, using minimum redundancy maximal relevancy as the heuristic for our approach. We achieve accuracy scores of 0.9 (in the range of [0,1]) for finding optimal subsets on synthetic data using a new metric that we define. We test and cross-validate predictive models on a real-world problem of price prediction, and show a performance improvement of mean absolute error scores for our quantum-assisted method $(1471.02 \pm 135.6)$ vs. similar methodologies such as recursive feature elimination $(1678.3 \pm 143.7)$. Our findings show that by leveraging quantum-assisted routines we find solutions that increase the quality of predictive model output while reducing the input dimensionality to the learning algorithm on synthetic and real-world data.
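
Illustrative sketch: the abstract describes encoding feature selection as a binary quadratic model using a minimum redundancy, maximal relevancy heuristic with a metric such as the Pearson correlation coefficient. The Python below is one plausible reading of that idea, not the authors' formulation. The function names (build_mrmr_qubo, best_subset), the weighting parameter alpha, the synthetic data, and the brute-force search that stands in for a quantum-assisted or classical sampler are all assumptions made for illustration.

```python
import itertools
import numpy as np


def build_mrmr_qubo(X, y, alpha=0.5):
    """Build an upper-triangular QUBO matrix for mRMR-style selection.

    Diagonal entries reward relevance (|Pearson r| between feature and target);
    off-diagonal entries penalize redundancy (|Pearson r| between features).
    alpha is a hypothetical trade-off weight, not a value from the paper.
    """
    n = X.shape[1]
    corr = np.abs(np.corrcoef(np.column_stack([X, y]), rowvar=False))
    relevance = corr[:n, n]          # feature-target correlations
    redundancy = corr[:n, :n]        # feature-feature correlations
    Q = (1.0 - alpha) * np.triu(redundancy, k=1)
    Q[np.diag_indices(n)] = -alpha * relevance
    return Q


def best_subset(Q, k):
    """Exhaustively find the k-feature mask x minimizing x^T Q x.

    Enumeration enforces the cardinality constraint directly; it is only
    feasible for small n and stands in for an annealer or heuristic solver.
    """
    n = Q.shape[0]
    best_energy, best = np.inf, None
    for subset in itertools.combinations(range(n), k):
        x = np.zeros(n)
        x[list(subset)] = 1.0
        energy = float(x @ Q @ x)
        if energy < best_energy:
            best_energy, best = energy, subset
    return best, best_energy


# Synthetic data (assumed): features 0 and 1 are near-duplicates,
# feature 2 is independently informative, feature 3 is pure noise.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
X = np.column_stack([a, a + 0.1 * rng.normal(size=500), b, rng.normal(size=500)])
y = a + b + 0.1 * rng.normal(size=500)

Q = build_mrmr_qubo(X, y)
subset, energy = best_subset(Q, k=2)
print("selected features:", subset, "energy:", round(energy, 3))
```

With this toy data the redundancy penalty discourages picking both near-duplicate columns, so the minimizer pairs one of them with the independent informative feature. For realistic feature counts the enumeration would be replaced by a sampler over the same matrix.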

Related papers

Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems. Such problems are encountered in medicine, physics, and machine learning. We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z)
Comparative study of regression vs pairwise models for surrogate-based heuristic optimisation [1.2535250082638645]
This paper addresses the formulation of surrogate problems as both regression models that approximate fitness (surface surrogate models) and a novel way to connect classification models (pairwise surrogate models). The performance of the overall search, when using online machine learning-based surrogate models, depends not only on the accuracy of the predictive model but also on the kind of bias towards positive or negative cases.
arXiv Detail & Related papers (2024-10-04T13:19:06Z)
Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important in forecasting nonstationary processes or with a complex mixture of distributions. A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems. It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z)
Efficient Model-Free Exploration in Low-Rank MDPs [76.87340323826945]
Low-Rank Markov Decision Processes offer a simple, yet expressive framework for RL with function approximation. Existing algorithms are either (1) computationally intractable, or (2) reliant upon restrictive statistical assumptions. We propose the first provably sample-efficient algorithm for exploration in Low-Rank MDPs.
arXiv Detail & Related papers (2023-07-08T15:41:48Z)
A Data-Driven State Aggregation Approach for Dynamic Discrete Choice Models [7.7347261505610865]
We present a novel algorithm that provides a data-driven method for selecting and aggregating states. The proposed two-stage approach mitigates the curse of dimensionality by reducing the problem dimension. We demonstrate the empirical performance of the algorithm in two classic dynamic discrete choice estimation applications.
arXiv Detail & Related papers (2023-04-11T01:07:24Z)
Adaptive Sparse Gaussian Process [0.0]
We propose the first adaptive sparse Gaussian Process (GP) able to address all these issues. We first reformulate a variational sparse GP algorithm to make it adaptive through a forgetting factor. We then propose updating a single inducing point of the sparse GP model together with the remaining model parameters every time a new sample arrives.
arXiv Detail & Related papers (2023-02-20T21:34:36Z)
RF+clust for Leave-One-Problem-Out Performance Prediction [0.9281671380673306]
We study leave-one-problem-out (LOPO) performance prediction. We analyze whether standard random forest (RF) model predictions can be improved by calibrating them with a weighted average of performance values.
arXiv Detail & Related papers (2023-01-23T16:14:59Z)
Bilevel Optimization for Feature Selection in the Data-Driven Newsvendor Problem [8.281391209717105]
We study the feature-based newsvendor problem, in which a decision-maker has access to historical data. In this setting, we investigate feature selection, aiming to derive sparse, explainable models with improved out-of-sample performance. We present a mixed integer linear program reformulation for the bilevel program, which can be solved to optimality with standard optimization solvers.
arXiv Detail & Related papers (2022-09-12T08:52:26Z)
HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models. We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
A bandit-learning approach to multifidelity approximation [7.960229223744695]
Multifidelity approximation is an important technique in scientific computation and simulation. We introduce a bandit-learning approach for leveraging data of varying fidelities to achieve precise estimates.
arXiv Detail & Related papers (2021-03-29T05:29:35Z)
Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward. We test our methodology on two tasks: generating molecules with user-defined properties and identifying short Python expressions which evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z)
Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers. We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model. Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)

This list is automatically generated from the titles and abstracts of the papers on this site.