Consensual Aggregation on Random Projected High-dimensional Features for
Regression
- URL: http://arxiv.org/abs/2204.02606v1
- Date: Wed, 6 Apr 2022 06:35:47 GMT
- Title: Consensual Aggregation on Random Projected High-dimensional Features for
Regression
- Authors: Sothea Has (LPSM, UPMC)
- Abstract summary: We present a study of a kernel-based consensual aggregation on randomly projected high-dimensional features of predictions for regression.
We numerically illustrate that the aggregation scheme upholds its performance on very large and highly correlated features.
The efficiency of the proposed method is illustrated through several experiments evaluated on different types of synthetic and real datasets.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present a study of a kernel-based consensual aggregation on
randomly projected high-dimensional features of predictions for regression. The
aggregation scheme is composed of two steps: the high-dimensional features of
predictions, given by a large number of regression estimators, are randomly
projected into a smaller subspace using Johnson-Lindenstrauss Lemma in the
first step, and a kernel-based consensual aggregation is implemented on the
projected features in the second step. We theoretically show that the
performance of the aggregation scheme is close to the performance of the
aggregation implemented on the original high-dimensional features, with high
probability. Moreover, we numerically illustrate that the aggregation scheme
upholds its performance on very large and highly correlated features of
predictions given by different types of machines. The aggregation scheme allows
us to flexibly merge a large number of redundant machines, plainly constructed
without model selection or cross-validation. The efficiency of the proposed
method is illustrated through several experiments evaluated on different types
of synthetic and real datasets.
Related papers
- Unifying Feature and Cost Aggregation with Transformers for Semantic and Visual Correspondence [51.54175067684008]
This paper introduces a Transformer-based integrative feature and cost aggregation network designed for dense matching tasks.
We first show that feature aggregation and cost aggregation exhibit distinct characteristics and reveal the potential for substantial benefits stemming from the judicious use of both aggregation processes.
Our framework is evaluated on standard benchmarks for semantic matching, and also applied to geometric matching, where we show that our approach achieves significant improvements compared to existing methods.
arXiv Detail & Related papers (2024-03-17T07:02:55Z) - RGM: A Robust Generalizable Matching Model [49.60975442871967]
We propose a deep model for sparse and dense matching, termed RGM (Robust Generalist Matching)
To narrow the gap between synthetic training samples and real-world scenarios, we build a new, large-scale dataset with sparse correspondence ground truth.
We are able to mix up various dense and sparse matching datasets, significantly improving the training diversity.
arXiv Detail & Related papers (2023-10-18T07:30:08Z) - Structured Radial Basis Function Network: Modelling Diversity for
Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important in forecasting nonstationary processes or with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z) - Sparse high-dimensional linear regression with a partitioned empirical
Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are used through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z) - Parallel integrative learning for large-scale multi-response regression
with incomplete outcomes [1.7403133838762448]
In the era of big data, the coexistence of incomplete outcomes, large number of responses, and high dimensionality in predictors poses unprecedented challenges in estimation, prediction, and computation.
We propose a scalable and computationally efficient procedure, called PEER, for large-scale multi-response regression with incomplete outcomes.
Under some mild regularity conditions, we show that PEER enjoys nice sampling properties including consistency in estimation, prediction, and variable selection.
arXiv Detail & Related papers (2021-04-11T19:01:24Z) - A Forward Backward Greedy approach for Sparse Multiscale Learning [0.0]
We propose a feature driven Reproducing Kernel Hilbert space (RKHS) for which the associated kernel has a weighted multiscale structure.
For generating approximations in this space, we provide a practical forward-backward algorithm that is shown to greedily construct a set of basis functions having a multiscale structure.
We analyze the performance of the approach on a variety of simulation and real data sets.
arXiv Detail & Related papers (2021-02-14T04:22:52Z) - Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose an FMR model that finds sample clusters and jointly models multiple incomplete mixed-type targets simultaneously.
We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework.
The results show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-10-12T03:27:07Z) - Two-step penalised logistic regression for multi-omic data with an
application to cardiometabolic syndrome [62.997667081978825]
We implement a two-step approach to multi-omic logistic regression in which variable selection is performed on each layer separately.
Our approach should be preferred if the goal is to select as many relevant predictors as possible.
Our proposed approach allows us to identify features that characterise cardiometabolic syndrome at the molecular level.
arXiv Detail & Related papers (2020-08-01T10:36:27Z) - Hierarchical regularization networks for sparsification based learning
on noisy datasets [0.0]
hierarchy follows from approximation spaces identified at successively finer scales.
For promoting model generalization at each scale, we also introduce a novel, projection based penalty operator across multiple dimension.
Results show the performance of the approach as a data reduction and modeling strategy on both synthetic and real datasets.
arXiv Detail & Related papers (2020-06-09T18:32:24Z) - A Hybrid Two-layer Feature Selection Method Using GeneticAlgorithm and
Elastic Net [6.85316573653194]
This paper presents a new hybrid two-layer feature selection approach that combines a wrapper and an embedded method.
The Genetic Algorithm(GA) has been adopted as a wrapper to search for the optimal subset of predictors.
A second layer is added to the proposed method to eliminate any remaining redundant/irrelevant predictors to improve the prediction accuracy.
arXiv Detail & Related papers (2020-01-30T05:01:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.