Cross-validation for change-point regression: pitfalls and solutions
- URL: http://arxiv.org/abs/2112.03220v3
- Date: Mon, 12 Feb 2024 13:44:32 GMT
- Title: Cross-validation for change-point regression: pitfalls and solutions
- Authors: Florian Pein and Rajen D. Shah
- Abstract summary: We show that the problems of cross-validation with squared error loss are more severe and can lead to systematic under- or over-estimation of the number of change-points.
We propose two simple approaches to remedy these issues, the first involving the use of absolute error rather than squared error loss.
We show these conditions are satisfied for least squares estimation using new results on its performance when supplied with the incorrect number of change-points.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cross-validation is the standard approach for tuning parameter selection in
many non-parametric regression problems. However, its use is less common in
change-point regression, perhaps as its prediction error-based criterion may
appear to permit small spurious changes and hence be less well-suited to
estimation of the number and location of change-points. We show that in fact
the problems of cross-validation with squared error loss are more severe and
can lead to systematic under- or over-estimation of the number of
change-points, and highly suboptimal estimation of the mean function in simple
settings where changes are easily detectable. We propose two simple approaches
to remedy these issues, the first involving the use of absolute error rather
than squared error loss, and the second involving modifying the holdout sets
used. For the latter, we provide conditions that permit consistent estimation
of the number of change-points for a general change-point estimation procedure.
We show these conditions are satisfied for least squares estimation using new
results on its performance when supplied with the incorrect number of
change-points. Numerical experiments show that our new approaches are
competitive with common change-point methods using classical tuning parameter
choices when error distributions are well-specified, but can substantially
outperform these in misspecified models. An implementation of our methodology
is available in the R package crossvalidationCP on CRAN.
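The abstract's first remedy, replacing squared error with absolute error in the holdout criterion, can be illustrated with a short sketch. This is a toy, not the crossvalidationCP implementation: `fit_segments` is a standard least squares dynamic program for a piecewise-constant mean, and the interleaved even/odd split with a choice of absolute or squared holdout loss is a simplified stand-in for the paper's procedure.

```python
import numpy as np

def fit_segments(y, n_cp):
    """Least squares piecewise-constant fit with n_cp change-points,
    found by dynamic programming over segment boundaries."""
    n = len(y)
    csum = np.concatenate([[0.0], np.cumsum(y)])
    csum2 = np.concatenate([[0.0], np.cumsum(y * y)])

    def seg_cost(i, j):
        # residual sum of squares of a constant fit to y[i:j]
        s = csum[j] - csum[i]
        return (csum2[j] - csum2[i]) - s * s / (j - i)

    k_max = n_cp + 1  # number of segments
    dp = np.full((k_max + 1, n + 1), np.inf)
    back = np.zeros((k_max + 1, n + 1), dtype=int)
    dp[0, 0] = 0.0
    for k in range(1, k_max + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = dp[k - 1, i] + seg_cost(i, j)
                if c < dp[k, j]:
                    dp[k, j], back[k, j] = c, i
    # backtrack the segment boundaries, then fill in segment means
    bounds, j = [n], n
    for k in range(k_max, 0, -1):
        j = back[k, j]
        bounds.append(j)
    bounds = bounds[::-1]
    fit = np.empty(n)
    for i, j in zip(bounds[:-1], bounds[1:]):
        fit[i:j] = (csum[j] - csum[i]) / (j - i)
    return fit

def cv_choose_n_cp(y, k_max=5, loss="absolute"):
    """2-fold interleaved CV: fit on even-indexed points, score each
    odd-indexed point against the fit at its even-indexed neighbour.
    Returns the number of change-points minimizing the holdout loss."""
    train, test = y[::2], y[1::2]
    errs = []
    for n_cp in range(k_max + 1):
        pred = fit_segments(train, n_cp)[: len(test)]
        resid = test - pred
        errs.append(np.mean(np.abs(resid)) if loss == "absolute"
                    else np.mean(resid ** 2))
    return int(np.argmin(errs))
```

On noiseless data with one clear jump, both losses recover one change-point; the paper's point is that on noisy data with heavy-tailed or misspecified errors the squared error criterion can systematically over- or under-estimate the count, while the absolute error criterion is more robust.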
Related papers
- Pathwise Gradient Variance Reduction with Control Variates in Variational Inference [2.1638817206926855]
Variational inference in Bayesian deep learning often involves computing the gradient of an expectation that lacks a closed-form solution.
In these cases, pathwise and score-function gradient estimators are the most common approaches.
Recent research suggests that even pathwise gradient estimators could benefit from variance reduction.
arXiv Detail & Related papers (2024-10-08T07:28:46Z)
- Conformal Prediction via Regression-as-Classification [15.746085775084238]
We convert regression to a classification problem and then use CP for classification to obtain CP sets for regression.
Empirical results on many benchmarks show that this simple approach gives surprisingly good results on many practical problems.
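The regression-as-classification idea above can be sketched in a few lines. The choices here are illustrative, not the paper's exact construction: quantile bins for the targets, a k-nearest-neighbour frequency estimate standing in for any probabilistic classifier, and the split-conformal nonconformity score 1 minus the probability of the true bin.

```python
import numpy as np

def conformal_bins(x_train, y_train, x_cal, y_cal, x_new,
                   n_bins=10, alpha=0.1, k=15):
    """Split conformal prediction with regression targets discretized
    into quantile bins; bin probabilities come from a k-nearest-neighbour
    frequency estimate over the 1-D feature."""
    edges = np.quantile(y_train, np.linspace(0, 1, n_bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf  # cover the whole real line

    def bin_of(y):
        return np.clip(np.searchsorted(edges, y, side="right") - 1,
                       0, n_bins - 1)

    def probs(x0):
        # empirical bin frequencies among the k nearest training points
        idx = np.argsort(np.abs(x_train - x0))[:k]
        return np.bincount(bin_of(y_train[idx]), minlength=n_bins) / k

    # nonconformity score: 1 - probability assigned to the true bin
    scores = np.array([1.0 - probs(x)[b]
                       for x, b in zip(x_cal, bin_of(y_cal))])
    n = len(scores)
    q = np.quantile(scores, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))
    # prediction set: all bins whose score would not exceed the threshold
    keep = 1.0 - probs(x_new) <= q
    return [(edges[b], edges[b + 1]) for b in range(n_bins) if keep[b]]
```

The returned set is a union of bin intervals rather than a single interval, which is what lets the classification view adapt to multimodal or heteroscedastic targets.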
arXiv Detail & Related papers (2024-04-12T00:21:30Z)
- Stability-Adjusted Cross-Validation for Sparse Linear Regression [5.156484100374059]
Cross-validation techniques like k-fold cross-validation substantially increase the computational cost of sparse regression.
We propose selecting hyperparameters that minimize a weighted sum of a cross-validation metric and a model's output stability.
Our confidence adjustment procedure reduces test set error by 2%, on average, on 13 real-world datasets.
arXiv Detail & Related papers (2023-06-26T17:02:45Z)
- E-detectors: a nonparametric framework for sequential change detection [86.15115654324488]
We develop a fundamentally new and general framework for sequential change detection.
Our procedures come with clean, nonasymptotic bounds on the average run length.
We show how to design their mixtures in order to achieve both statistical and computational efficiency.
arXiv Detail & Related papers (2022-03-07T17:25:02Z)
- Near-optimal inference in adaptive linear regression [60.08422051718195]
Even simple methods like least squares can exhibit non-normal behavior when data is collected in an adaptive manner.
We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation.
We demonstrate the usefulness of our theory via applications to multi-armed bandit, autoregressive time series estimation, and active learning with exploration.
arXiv Detail & Related papers (2021-07-05T21:05:11Z)
- Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
- Calibrated Adaptive Probabilistic ODE Solvers [31.442275669185626]
We introduce, discuss, and assess several probabilistically motivated ways to calibrate the uncertainty estimate.
We demonstrate the efficiency of the methodology by benchmarking against the classic, widely used Dormand-Prince 4/5 Runge-Kutta method.
arXiv Detail & Related papers (2020-12-15T10:48:55Z)
- Making Affine Correspondences Work in Camera Geometry Computation [62.7633180470428]
Local features provide region-to-region rather than point-to-point correspondences.
We propose guidelines for effective use of region-to-region matches in the course of a full model estimation pipeline.
Experiments show that affine solvers can achieve accuracy comparable to point-based solvers at faster run-times.
arXiv Detail & Related papers (2020-07-20T12:07:48Z)
- Calibration of Neural Networks using Splines [51.42640515410253]
Measuring calibration error amounts to comparing two empirical distributions.
We introduce a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test.
Our method consistently outperforms existing methods on KS error as well as other commonly used calibration measures.
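The binning-free KS idea can be illustrated with a short function. This is a sketch in the spirit of the KS measure, comparing the cumulative sum of predicted confidence against the cumulative sum of actual correctness after sorting by confidence; it is not the paper's full recalibration pipeline.

```python
import numpy as np

def ks_calibration_error(conf, correct):
    """Binning-free calibration error: sort predictions by confidence
    and take the maximum gap between the cumulative predicted
    confidence and the cumulative observed accuracy."""
    order = np.argsort(conf)
    n = len(conf)
    cum_conf = np.cumsum(conf[order]) / n
    cum_acc = np.cumsum(correct[order].astype(float)) / n
    return float(np.max(np.abs(cum_conf - cum_acc)))
```

For a model that always reports 90% confidence but is right half the time, the gap grows to 0.4 at the end of the sweep; a perfectly calibrated model gives 0, with no binning hyperparameter involved.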
arXiv Detail & Related papers (2020-06-23T07:18:05Z)
- Online detection of local abrupt changes in high-dimensional Gaussian graphical models [13.554038901140949]
The problem of identifying change points in high-dimensional Gaussian graphical models (GGMs) in an online fashion is of interest, due to new applications in biology, economics and social sciences.
We develop a novel test to address this problem that is based on the $\ell_\infty$ norm of the normalized covariance matrix of an appropriately selected portion of incoming data.
arXiv Detail & Related papers (2020-03-16T00:41:34Z)
- Optimal Change-Point Detection with Training Sequences in the Large and Moderate Deviations Regimes [72.68201611113673]
This paper investigates a novel offline change-point detection problem from an information-theoretic perspective.
We assume that the underlying pre- and post-change distributions are unknown and can only be learned from the available training sequences.
arXiv Detail & Related papers (2020-03-13T23:39:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.