Post-Estimation Smoothing: A Simple Baseline for Learning with Side Information
- URL: http://arxiv.org/abs/2003.05955v1
- Date: Thu, 12 Mar 2020 18:04:20 GMT
- Title: Post-Estimation Smoothing: A Simple Baseline for Learning with Side Information
- Authors: Esther Rolf, Michael I. Jordan, Benjamin Recht
- Abstract summary: We propose a post-estimation smoothing operator as a fast and effective method for incorporating structural index data into prediction.
Because the smoothing step is separate from the original predictor, it applies to a broad class of machine learning tasks.
Our experiments on large scale spatial and temporal datasets highlight the speed and accuracy of post-estimation smoothing in practice.
- Score: 102.18616819054368
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Observational data are often accompanied by natural structural indices, such
as time stamps or geographic locations, which are meaningful to prediction
tasks but are often discarded. We leverage semantically meaningful indexing
data while ensuring robustness to potentially uninformative or misleading
indices. We propose a post-estimation smoothing operator as a fast and
effective method for incorporating structural index data into prediction.
Because the smoothing step is separate from the original predictor, it applies
to a broad class of machine learning tasks, with no need to retrain models. Our
theoretical analysis details simple conditions under which post-estimation
smoothing will improve accuracy over that of the original predictor. Our
experiments on large scale spatial and temporal datasets highlight the speed
and accuracy of post-estimation smoothing in practice. Together, these results
illuminate a novel way to consider and incorporate the natural structure of
index variables in machine learning.
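As a concrete illustration, the sketch below applies a Gaussian-kernel smoother to the outputs of an already-fitted model, indexed by structural coordinates. The kernel choice, bandwidth, and function names are illustrative assumptions, not the paper's exact operator.

```python
import numpy as np

def smooth_predictions(preds, coords, bandwidth=1.0):
    """Kernel-weighted averaging of predictions over index coordinates.

    preds:  (n,) predictions from an already-fitted model.
    coords: (n, d) structural indices, e.g., latitude/longitude or time.
    """
    # Pairwise squared distances between all index coordinates.
    diffs = coords[:, None, :] - coords[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=-1)
    # Gaussian kernel weights, normalized so each row sums to 1.
    weights = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    weights /= weights.sum(axis=1, keepdims=True)
    # Each smoothed prediction is a weighted average over its neighbors.
    return weights @ preds

# Usage: post-process the outputs of any base model, with no retraining.
rng = np.random.default_rng(0)
coords = rng.uniform(size=(200, 2))      # e.g., geographic locations
base_preds = rng.normal(size=200)        # stand-in for a model's outputs
smoothed = smooth_predictions(base_preds, coords, bandwidth=0.1)
```

Because the smoother touches only the predictions, it composes with any base model; an uninformative index simply yields near-uniform weights.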
Related papers
- Drift-Resilient TabPFN: In-Context Learning Temporal Distribution Shifts on Tabular Data [39.40116554523575]
We present Drift-Resilient TabPFN, a fresh approach based on In-Context Learning with a Prior-Data Fitted Network.
It learns to approximate Bayesian inference on synthetic datasets drawn from a prior.
It improves accuracy from 0.688 to 0.744 and ROC AUC from 0.786 to 0.832 while also achieving stronger calibration.
arXiv Detail & Related papers (2024-11-15T23:49:23Z)
- Temporal Smoothness Regularisers for Neural Link Predictors [8.975480841443272]
We show that a simple method like TNTComplEx can produce significantly more accurate results than state-of-the-art methods.
We also evaluate the impact of a wide range of temporal smoothing regularisers on two state-of-the-art temporal link prediction models.
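A common form such a regulariser takes is an Lp penalty on differences between consecutive timestamp embeddings, added to the link-prediction loss; the sketch below assumes that form (the norm, weighting, and names are illustrative).

```python
import numpy as np

def temporal_smoothness_penalty(time_embeddings, p=2):
    """Sum of Lp norms of differences between consecutive timestamp
    embeddings, a typical regulariser for temporal KG embedding models
    such as TNTComplEx.

    time_embeddings: (T, d) array with one embedding per timestamp.
    """
    diffs = time_embeddings[1:] - time_embeddings[:-1]   # (T-1, d)
    return (np.abs(diffs) ** p).sum()

# Added to the link-prediction loss with a tunable weight:
#   total_loss = link_prediction_loss + lam * temporal_smoothness_penalty(E)
```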
arXiv Detail & Related papers (2023-09-16T16:52:49Z)
- On Measuring the Intrinsic Few-Shot Hardness of Datasets [49.37562545777455]
We show that few-shot hardness may be intrinsic to datasets, for a given pre-trained model.
We propose a simple and lightweight metric called "Spread" that captures the intuition that few-shot learning is made possible by exploiting feature-space invariances between training and test samples.
Our metric better accounts for few-shot hardness compared to existing notions of hardness, and is 8-100x faster to compute.
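The summary does not spell out Spread's formula; purely as an illustrative assumption, the sketch below scores how spread out a task's few-shot examples are in a pre-trained embedding space via mean pairwise cosine distance.

```python
import numpy as np

def spread_like_metric(embeddings):
    """Illustrative stand-in for a 'Spread'-style hardness score: the
    mean pairwise cosine distance between few-shot example embeddings.
    NOTE: assumed form for illustration; the paper defines its own metric.
    """
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    cos = z @ z.T                          # pairwise cosine similarities
    off_diag = cos[~np.eye(len(z), dtype=bool)]
    return 1.0 - off_diag.mean()           # higher = more spread out
```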
arXiv Detail & Related papers (2022-11-16T18:53:52Z)
- TACTiS: Transformer-Attentional Copulas for Time Series [76.71406465526454]
The estimation of time-varying quantities is a fundamental component of decision making in fields such as healthcare and finance.
We propose a versatile method that estimates joint distributions using an attention-based decoder.
We show that our model produces state-of-the-art predictions on several real-world datasets.
arXiv Detail & Related papers (2022-02-07T21:37:29Z)
- Understanding Memorization from the Perspective of Optimization via Efficient Influence Estimation [54.899751055620904]
We study the phenomenon of memorization with turn-over dropout, an efficient method for estimating influence and memorization, on data with true labels (real data) and data with random labels (random data).
Our main findings are: (i) for both real and random data, the network optimizes easy examples (e.g., real data) and difficult examples (e.g., random data) simultaneously, with easy ones learned at a higher speed; (ii) for real data, a correctly labeled difficult example in the training dataset is more informative than an easy one.
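A minimal sketch of the turn-over dropout bookkeeping, assuming a fixed per-example mask derived from the example's id (the hashing scheme and names are illustrative):

```python
import numpy as np

def example_dropout_mask(example_id, width, keep=0.5, seed=0):
    """Deterministic per-example dropout mask (the 'turn-over' mask).

    The sub-network kept by the mask is updated whenever this example is
    trained on; the complementary sub-network (1 - mask) never sees it.
    """
    rng = np.random.default_rng(hash((example_id, seed)) % (2 ** 32))
    return (rng.uniform(size=width) < keep).astype(np.float32)

def influence_estimate(loss_on_flipped_subnet, loss_on_trained_subnet):
    # A large gap means the example strongly helped the sub-network that
    # trained on it, i.e., it is highly influential (or memorized).
    return loss_on_flipped_subnet - loss_on_trained_subnet
```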
arXiv Detail & Related papers (2021-12-16T11:34:23Z)
- Convolutional Sparse Coding Fast Approximation with Application to Seismic Reflectivity Estimation [9.005280130480308]
We propose a sped-up version of the classic iterative thresholding algorithm that produces a good approximation of the convolutional sparse code within 2-5 iterations.
The performance of the proposed solution is demonstrated via the seismic inversion problem in both synthetic and real data scenarios.
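For reference, the baseline being accelerated is the classic iterative soft-thresholding algorithm (ISTA); a minimal 1-D sketch follows. The step size, boundary handling, and names are illustrative, and this is the plain algorithm, not the paper's fast approximation.

```python
import numpy as np

def ista_csc(signal, kernel, lam=0.1, step=None, n_iters=5):
    """Plain ISTA for 1-D convolutional sparse coding:
    minimize 0.5 * ||signal - kernel * z||^2 + lam * ||z||_1,
    where '*' denotes convolution. Assumes an odd-length kernel.
    """
    z = np.zeros_like(signal, dtype=float)
    flipped = kernel[::-1]  # correlation = convolution with flipped kernel
    if step is None:
        # 1 / Lipschitz constant of the gradient (circular approximation).
        step = 1.0 / (np.abs(np.fft.fft(kernel, len(signal))) ** 2).max()
    for _ in range(n_iters):
        residual = signal - np.convolve(z, kernel, mode="same")
        grad = -np.convolve(residual, flipped, mode="same")
        z = z - step * grad
        # Soft-thresholding (shrinkage) step.
        z = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return z

# Usage: recover a sparse spike train from a wavelet-blurred trace,
# mirroring the seismic reflectivity setting.
rng = np.random.default_rng(0)
true_z = np.zeros(300); true_z[rng.choice(300, 8)] = rng.normal(size=8)
wavelet = np.exp(-0.5 * (np.arange(-10, 11) / 2.0) ** 2)
trace = np.convolve(true_z, wavelet, mode="same") + 0.01 * rng.normal(size=300)
sparse_code = ista_csc(trace, wavelet, lam=0.05, n_iters=5)
```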
arXiv Detail & Related papers (2021-06-29T12:19:07Z)
- Representation Learning for Sequence Data with Deep Autoencoding Predictive Components [96.42805872177067]
We propose a self-supervised representation learning method for sequence data, based on the intuition that useful representations of sequence data should exhibit a simple structure in the latent space.
We encourage this latent structure by maximizing an estimate of predictive information of latent feature sequences, which is the mutual information between past and future windows at each time step.
We demonstrate that our method recovers the latent space of noisy dynamical systems, extracts predictive features for forecasting tasks, and improves automatic speech recognition when used to pretrain the encoder on large amounts of unlabeled data.
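Under a Gaussian assumption, this predictive-information estimate reduces to log-determinants of window covariances; the sketch below assumes that simplified form (the windowing details and names are illustrative).

```python
import numpy as np

def gaussian_predictive_info(latents, window):
    """Estimate of predictive information I(past; future) of a latent
    sequence under a Gaussian assumption:
    I = H(past) + H(future) - H(past, future), entropies via log-dets.

    latents: (T, d) latent features; window: past/future window length.
    """
    # Stack a (past, future) segment of length 2*window at each step.
    segs = np.stack([latents[t - window:t + window].ravel()
                     for t in range(window, len(latents) - window + 1)])
    k = window * latents.shape[1]              # size of one window block
    cov = np.cov(segs, rowvar=False) + 1e-6 * np.eye(2 * k)
    logdet = lambda m: np.linalg.slogdet(m)[1]
    # 0.5 * (log|Cov_past| + log|Cov_future| - log|Cov_joint|)
    return 0.5 * (logdet(cov[:k, :k]) + logdet(cov[k:, k:]) - logdet(cov))
```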
arXiv Detail & Related papers (2020-10-07T03:34:01Z)
- Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift [81.74795324629712]
We evaluate a method we call prediction-time batch normalization, which significantly improves model accuracy and calibration under covariate shift.
We show that prediction-time batch normalization provides complementary benefits to existing state-of-the-art approaches for improving robustness.
The method has mixed results when used alongside pre-training, and does not seem to perform as well under more natural types of dataset shift.
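A minimal sketch of the normalization step, which uses the test batch's own statistics instead of the running averages stored during training (shapes and names are illustrative):

```python
import numpy as np

def prediction_time_batchnorm(x, gamma, beta, eps=1e-5):
    """Normalize a test batch with its *own* statistics rather than the
    training-time running mean/var used in standard inference mode.

    x: (batch, features) activations; gamma, beta: learned BN parameters.
    """
    mu = x.mean(axis=0)        # statistics from the (shifted) test batch
    var = x.var(axis=0)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta
```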
arXiv Detail & Related papers (2020-06-19T05:08:43Z)
- A machine learning approach for forecasting hierarchical time series [4.157415305926584]
We propose a machine learning approach for forecasting hierarchical time series.
Forecast reconciliation is the process of adjusting forecasts to make them coherent across the hierarchy.
We exploit the ability of a deep neural network to extract information capturing the structure of the hierarchy.
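For context, the simplest classical reconciliation is bottom-up aggregation through a summing matrix; the sketch below shows that baseline, whereas the paper instead learns the adjustment with a deep network.

```python
import numpy as np

def reconcile_bottom_up(bottom_forecasts, S):
    """Bottom-up reconciliation: propagate bottom-level forecasts through
    the summing matrix S so every level of the hierarchy is coherent.

    S: (n_total, n_bottom) 0/1 matrix mapping bottom series to all series.
    """
    return S @ bottom_forecasts

# Example: two bottom series and their total (3 series in the hierarchy).
S = np.array([[1, 1],    # total = A + B
              [1, 0],    # series A
              [0, 1]])   # series B
bottom = np.array([10.0, 5.0])
coherent = reconcile_bottom_up(bottom, S)   # -> [15., 10., 5.]
```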
arXiv Detail & Related papers (2020-05-31T22:26:16Z)
- Early Forecasting of Text Classification Accuracy and F-Measure with Active Learning [0.7614628596146599]
We investigate the difference in forecasting difficulty when using accuracy and F-measure as the text classification system performance metrics.
We find that forecasting is easiest for decision tree learning, moderate for Support Vector Machines, and most difficult for neural networks.
arXiv Detail & Related papers (2020-01-20T06:27:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.