Post-Estimation Smoothing: A Simple Baseline for Learning with Side Information
- URL: http://arxiv.org/abs/2003.05955v1
- Date: Thu, 12 Mar 2020 18:04:20 GMT
- Title: Post-Estimation Smoothing: A Simple Baseline for Learning with Side Information
- Authors: Esther Rolf, Michael I. Jordan, Benjamin Recht
- Abstract summary: We propose a post-estimation smoothing operator as a fast and effective method for incorporating structural index data into prediction.
Because the smoothing step is separate from the original predictor, it applies to a broad class of machine learning tasks.
Our experiments on large scale spatial and temporal datasets highlight the speed and accuracy of post-estimation smoothing in practice.
- Score: 102.18616819054368
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Observational data are often accompanied by natural structural indices, such
as time stamps or geographic locations, which are meaningful to prediction
tasks but are often discarded. We leverage semantically meaningful indexing
data while ensuring robustness to potentially uninformative or misleading
indices. We propose a post-estimation smoothing operator as a fast and
effective method for incorporating structural index data into prediction.
Because the smoothing step is separate from the original predictor, it applies
to a broad class of machine learning tasks, with no need to retrain models. Our
theoretical analysis details simple conditions under which post-estimation
smoothing will improve accuracy over that of the original predictor. Our
experiments on large scale spatial and temporal datasets highlight the speed
and accuracy of post-estimation smoothing in practice. Together, these results
illuminate a novel way to consider and incorporate the natural structure of
index variables in machine learning.
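As a concrete illustration, the sketch below applies a Gaussian-kernel smoother to the outputs of an already-fitted model, indexed by structural coordinates. The kernel choice, bandwidth, and function names are illustrative assumptions, not the paper's exact operator.

```python
import numpy as np

def smooth_predictions(preds, coords, bandwidth=1.0):
    """Kernel-weighted averaging of predictions over index coordinates.

    preds:  (n,) predictions from an already-fitted model.
    coords: (n, d) structural indices, e.g., latitude/longitude or time.
    """
    # Pairwise squared distances between all index coordinates.
    diffs = coords[:, None, :] - coords[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=-1)
    # Gaussian kernel weights, normalized so each row sums to 1.
    weights = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    weights /= weights.sum(axis=1, keepdims=True)
    # Each smoothed prediction is a weighted average over its neighbors.
    return weights @ preds

# Usage: post-process the outputs of any base model, with no retraining.
rng = np.random.default_rng(0)
coords = rng.uniform(size=(200, 2))      # e.g., geographic locations
base_preds = rng.normal(size=200)        # stand-in for a model's outputs
smoothed = smooth_predictions(base_preds, coords, bandwidth=0.1)
```

Because the smoother touches only the predictions, it composes with any base model; an uninformative index simply yields near-uniform weights.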
Related papers
- Drift-Resilient TabPFN: In-Context Learning Temporal Distribution Shifts on Tabular Data [39.40116554523575]
We present Drift-Resilient TabPFN, a fresh approach based on In-Context Learning with a Prior-Data Fitted Network.
It learns to approximate Bayesian inference on synthetic datasets drawn from a prior.
It improves accuracy from 0.688 to 0.744 and ROC AUC from 0.786 to 0.832 while also achieving stronger calibration.
arXiv Detail & Related papers (2024-11-15T23:49:23Z)
- Temporal Smoothness Regularisers for Neural Link Predictors [8.975480841443272]
We show that a simple method like TNTComplEx can produce significantly more accurate results than state-of-the-art methods.
We also evaluate the impact of a wide range of temporal smoothing regularisers on two state-of-the-art temporal link prediction models.
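A common form such a regulariser takes is an Lp penalty on differences between consecutive timestamp embeddings, added to the link-prediction loss; the sketch below assumes that form (the norm, weighting, and names are illustrative).

```python
import numpy as np

def temporal_smoothness_penalty(time_embeddings, p=2):
    """Sum of Lp norms of differences between consecutive timestamp
    embeddings, a typical regulariser for temporal KG embedding models
    such as TNTComplEx.

    time_embeddings: (T, d) array with one embedding per timestamp.
    """
    diffs = time_embeddings[1:] - time_embeddings[:-1]   # (T-1, d)
    return (np.abs(diffs) ** p).sum()

# Added to the link-prediction loss with a tunable weight:
#   total_loss = link_prediction_loss + lam * temporal_smoothness_penalty(E)
```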
arXiv Detail & Related papers (2023-09-16T16:52:49Z)
- On Measuring the Intrinsic Few-Shot Hardness of Datasets [49.37562545777455]
We show that few-shot hardness may be intrinsic to datasets, for a given pre-trained model.
We propose a simple and lightweight metric called "Spread" that captures the intuition that few-shot learning is made possible by exploiting feature-space invariances between training and test samples.
Our metric better accounts for few-shot hardness compared to existing notions of hardness, and is 8-100x faster to compute.
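The summary does not spell out Spread's formula; purely as an illustrative assumption, the sketch below scores how spread out a task's few-shot examples are in a pre-trained embedding space via mean pairwise cosine distance.

```python
import numpy as np

def spread_like_metric(embeddings):
    """Illustrative stand-in for a 'Spread'-style hardness score: the
    mean pairwise cosine distance between few-shot example embeddings.
    NOTE: assumed form for illustration; the paper defines its own metric.
    """
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    cos = z @ z.T                          # pairwise cosine similarities
    off_diag = cos[~np.eye(len(z), dtype=bool)]
    return 1.0 - off_diag.mean()           # higher = more spread out
```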
arXiv Detail & Related papers (2022-11-16T18:53:52Z)
- TACTiS: Transformer-Attentional Copulas for Time Series [76.71406465526454]
The estimation of time-varying quantities is a fundamental component of decision making in fields such as healthcare and finance.
We propose a versatile method that estimates joint distributions using an attention-based decoder.
We show that our model produces state-of-the-art predictions on several real-world datasets.
arXiv Detail & Related papers (2022-02-07T21:37:29Z)
- Understanding Memorization from the Perspective of Optimization via Efficient Influence Estimation [54.899751055620904]
We study the phenomenon of memorization with turn-over dropout, an efficient method for estimating influence and memorization, on data with true labels (real data) and data with random labels (random data).
Our main findings are: (i) for both real and random data, the network optimizes easy examples (e.g., real data) and difficult examples (e.g., random data) simultaneously, with easy ones learned at a higher speed; (ii) for real data, a correctly labeled difficult example in the training dataset is more informative than an easy one.
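A minimal sketch of the turn-over dropout bookkeeping, assuming a fixed per-example mask derived from the example's id (the hashing scheme and names are illustrative):

```python
import numpy as np

def example_dropout_mask(example_id, width, keep=0.5, seed=0):
    """Deterministic per-example dropout mask (the 'turn-over' mask).

    The sub-network kept by the mask is updated whenever this example is
    trained on; the complementary sub-network (1 - mask) never sees it.
    """
    rng = np.random.default_rng(hash((example_id, seed)) % (2 ** 32))
    return (rng.uniform(size=width) < keep).astype(np.float32)

def influence_estimate(loss_on_flipped_subnet, loss_on_trained_subnet):
    # A large gap means the example strongly helped the sub-network that
    # trained on it, i.e., it is highly influential (or memorized).
    return loss_on_flipped_subnet - loss_on_trained_subnet
```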
arXiv Detail & Related papers (2021-12-16T11:34:23Z)
- Convolutional Sparse Coding Fast Approximation with Application to Seismic Reflectivity Estimation [9.005280130480308]
We propose a sped-up version of the classic iterative thresholding algorithm that produces a good approximation of the convolutional sparse code within 2-5 iterations.
The performance of the proposed solution is demonstrated via the seismic inversion problem in both synthetic and real data scenarios.
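For reference, the baseline being accelerated is the classic iterative soft-thresholding algorithm (ISTA); a minimal 1-D sketch follows. The step size, boundary handling, and names are illustrative, and this is the plain algorithm, not the paper's fast approximation.

```python
import numpy as np

def ista_csc(signal, kernel, lam=0.1, step=None, n_iters=5):
    """Plain ISTA for 1-D convolutional sparse coding:
    minimize 0.5 * ||signal - kernel * z||^2 + lam * ||z||_1,
    where '*' denotes convolution. Assumes an odd-length kernel.
    """
    z = np.zeros_like(signal, dtype=float)
    flipped = kernel[::-1]  # correlation = convolution with flipped kernel
    if step is None:
        # 1 / Lipschitz constant of the gradient (circular approximation).
        step = 1.0 / (np.abs(np.fft.fft(kernel, len(signal))) ** 2).max()
    for _ in range(n_iters):
        residual = signal - np.convolve(z, kernel, mode="same")
        grad = -np.convolve(residual, flipped, mode="same")
        z = z - step * grad
        # Soft-thresholding (shrinkage) step.
        z = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return z

# Usage: recover a sparse spike train from a wavelet-blurred trace,
# mirroring the seismic reflectivity setting.
rng = np.random.default_rng(0)
true_z = np.zeros(300); true_z[rng.choice(300, 8)] = rng.normal(size=8)
wavelet = np.exp(-0.5 * (np.arange(-10, 11) / 2.0) ** 2)
trace = np.convolve(true_z, wavelet, mode="same") + 0.01 * rng.normal(size=300)
sparse_code = ista_csc(trace, wavelet, lam=0.05, n_iters=5)
```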
arXiv Detail & Related papers (2021-06-29T12:19:07Z)
- Representation Learning for Sequence Data with Deep Autoencoding Predictive Components [96.42805872177067]
We propose a self-supervised representation learning method for sequence data, based on the intuition that useful representations of sequence data should exhibit a simple structure in the latent space.
We encourage this latent structure by maximizing an estimate of predictive information of latent feature sequences, which is the mutual information between past and future windows at each time step.
We demonstrate that our method recovers the latent space of noisy dynamical systems, extracts predictive features for forecasting tasks, and improves automatic speech recognition when used to pretrain the encoder on large amounts of unlabeled data.
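Under a Gaussian assumption, this predictive-information estimate reduces to log-determinants of window covariances; the sketch below assumes that simplified form (the windowing details and names are illustrative).

```python
import numpy as np

def gaussian_predictive_info(latents, window):
    """Estimate of predictive information I(past; future) of a latent
    sequence under a Gaussian assumption:
    I = H(past) + H(future) - H(past, future), entropies via log-dets.

    latents: (T, d) latent features; window: past/future window length.
    """
    # Stack a (past, future) segment of length 2*window at each step.
    segs = np.stack([latents[t - window:t + window].ravel()
                     for t in range(window, len(latents) - window + 1)])
    k = window * latents.shape[1]              # size of one window block
    cov = np.cov(segs, rowvar=False) + 1e-6 * np.eye(2 * k)
    logdet = lambda m: np.linalg.slogdet(m)[1]
    # 0.5 * (log|Cov_past| + log|Cov_future| - log|Cov_joint|)
    return 0.5 * (logdet(cov[:k, :k]) + logdet(cov[k:, k:]) - logdet(cov))
```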
arXiv Detail & Related papers (2020-10-07T03:34:01Z)
- Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift [81.74795324629712]
We evaluate a method we call prediction-time batch normalization, which significantly improves model accuracy and calibration under covariate shift.
We show that prediction-time batch normalization provides complementary benefits to existing state-of-the-art approaches for improving robustness.
The method has mixed results when used alongside pre-training, and does not seem to perform as well under more natural types of dataset shift.
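A minimal sketch of the normalization step, which uses the test batch's own statistics instead of the running averages stored during training (shapes and names are illustrative):

```python
import numpy as np

def prediction_time_batchnorm(x, gamma, beta, eps=1e-5):
    """Normalize a test batch with its *own* statistics rather than the
    training-time running mean/var used in standard inference mode.

    x: (batch, features) activations; gamma, beta: learned BN parameters.
    """
    mu = x.mean(axis=0)        # statistics from the (shifted) test batch
    var = x.var(axis=0)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta
```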
arXiv Detail & Related papers (2020-06-19T05:08:43Z)
- A machine learning approach for forecasting hierarchical time series [4.157415305926584]
We propose a machine learning approach for forecasting hierarchical time series.
Forecast reconciliation is the process of adjusting forecasts to make them coherent across the hierarchy.
We exploit the ability of a deep neural network to extract information capturing the structure of the hierarchy.
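For context, the simplest classical reconciliation is bottom-up aggregation through a summing matrix; the sketch below shows that baseline, whereas the paper instead learns the adjustment with a deep network.

```python
import numpy as np

def reconcile_bottom_up(bottom_forecasts, S):
    """Bottom-up reconciliation: propagate bottom-level forecasts through
    the summing matrix S so every level of the hierarchy is coherent.

    S: (n_total, n_bottom) 0/1 matrix mapping bottom series to all series.
    """
    return S @ bottom_forecasts

# Example: two bottom series and their total (3 series in the hierarchy).
S = np.array([[1, 1],    # total = A + B
              [1, 0],    # series A
              [0, 1]])   # series B
bottom = np.array([10.0, 5.0])
coherent = reconcile_bottom_up(bottom, S)   # -> [15., 10., 5.]
```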
arXiv Detail & Related papers (2020-05-31T22:26:16Z)
- Early Forecasting of Text Classification Accuracy and F-Measure with Active Learning [0.7614628596146599]
We investigate the difference in forecasting difficulty when using accuracy and F-measure as the text classification system performance metrics.
We find that forecasting is easiest for decision tree learning, moderate for Support Vector Machines, and most difficult for neural networks.
arXiv Detail & Related papers (2020-01-20T06:27:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.