A Simple Baseline for Predicting Events with Auto-Regressive Tabular Transformers
- URL: http://arxiv.org/abs/2410.10648v3
- Date: Thu, 31 Oct 2024 19:26:43 GMT
- Title: A Simple Baseline for Predicting Events with Auto-Regressive Tabular Transformers
- Authors: Alex Stein, Samuel Sharpe, Doron Bergman, Senthil Kumar, C. Bayan Bruss, John Dickerson, Tom Goldstein, Micah Goldblum
- Abstract summary: Existing approaches to event prediction include time-aware positional embeddings, learned row and field encodings, and oversampling methods for addressing class imbalance.
We propose a simple but flexible baseline using standard autoregressive LLM-style transformers with elementary positional embeddings and a causal language modeling objective.
Our baseline outperforms existing approaches across popular datasets and can be employed for various use-cases.
- Score: 70.20477771578824
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many real-world applications of tabular data involve using historic events to predict properties of new ones, for example whether a credit card transaction is fraudulent or what rating a customer will assign a product on a retail platform. Existing approaches to event prediction include costly, brittle, and application-dependent techniques such as time-aware positional embeddings, learned row and field encodings, and oversampling methods for addressing class imbalance. Moreover, these approaches often assume specific use-cases, for example that we know the labels of all historic events or that we only predict a pre-specified label and not the data's features themselves. In this work, we propose a simple but flexible baseline using standard autoregressive LLM-style transformers with elementary positional embeddings and a causal language modeling objective. Our baseline outperforms existing approaches across popular datasets and can be employed for various use-cases. We demonstrate that the same model can predict labels, impute missing values, or model event sequences.
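The causal language modeling objective mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how rows are tokenized, so everything below is an assumption made for illustration: the toy events, the field names, the one-token-per-field vocabulary, and the helper names (`build_vocab`, `flatten`, `causal_lm_pairs`, `label_positions`) are all hypothetical and not taken from the paper. The sketch flattens a sequence of events into one token stream, forms the shifted input/target pairs used for next-token prediction, and marks which target positions correspond to each event's label.

```python
from typing import Dict, List, Tuple

# Hypothetical toy events; field names and values are illustrative only.
EVENTS = [
    {"merchant": "grocery", "amount_bin": "low", "is_fraud": "no"},
    {"merchant": "electronics", "amount_bin": "high", "is_fraud": "yes"},
    {"merchant": "grocery", "amount_bin": "low", "is_fraud": "no"},
]
# Assumption: the label is placed last in each row, so it is predicted
# after the model has seen that row's features and all earlier rows.
FIELDS = ["merchant", "amount_bin", "is_fraud"]


def build_vocab(events: List[Dict[str, str]], fields: List[str]) -> Dict[str, int]:
    """Assign one token id to each distinct field=value pair."""
    vocab: Dict[str, int] = {}
    for row in events:
        for field in fields:
            key = f"{field}={row[field]}"
            if key not in vocab:
                vocab[key] = len(vocab)
    return vocab


def flatten(events: List[Dict[str, str]], fields: List[str],
            vocab: Dict[str, int]) -> List[int]:
    """Concatenate all rows, in order, into a single token sequence."""
    return [vocab[f"{field}={row[field]}"] for row in events for field in fields]


def causal_lm_pairs(tokens: List[int]) -> Tuple[List[int], List[int]]:
    """Shift by one: the target at position i is the token at position i + 1."""
    return tokens[:-1], tokens[1:]


def label_positions(num_events: int, num_fields: int) -> List[int]:
    """Indices into the target sequence that hold each row's label token.

    Assumes num_fields >= 2 and that the label is the last field of a row.
    """
    return [r * num_fields + num_fields - 2 for r in range(num_events)]


vocab = build_vocab(EVENTS, FIELDS)
tokens = flatten(EVENTS, FIELDS, vocab)
inputs, targets = causal_lm_pairs(tokens)
label_idx = label_positions(len(EVENTS), len(FIELDS))
```

Under these assumptions, training on every target position models whole event sequences, while scoring only the positions in `label_idx` corresponds to label prediction. A real model would feed `inputs` to a transformer with a causal attention mask and apply a cross-entropy loss against `targets`.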
Related papers
- Online Performance Estimation with Unlabeled Data: A Bayesian Application of the Hui-Walter Paradigm [0.0]
We adapt the Hui-Walter paradigm, a method traditionally applied in epidemiology and medicine, to the field of machine learning.
We estimate key performance metrics such as false positive rate, false negative rate, and priors in scenarios where no ground truth is available.
We extend this paradigm for handling online data, opening up new possibilities for dynamic data environments.
arXiv Detail & Related papers (2024-01-17T17:46:10Z) - Ground Truth Inference for Weakly Supervised Entity Matching [76.6732856489872]
We propose a simple but powerful labeling model for weak supervision tasks.
We then tailor the labeling model specifically to the task of entity matching.
We show that our labeling model results in a 9% higher F1 score on average than the best existing method.
arXiv Detail & Related papers (2022-11-13T17:57:07Z) - Canary in a Coalmine: Better Membership Inference with Ensembled Adversarial Queries [53.222218035435006]
We use adversarial tools to optimize for queries that are discriminative and diverse.
Our improvements achieve significantly more accurate membership inference than existing methods.
arXiv Detail & Related papers (2022-10-19T17:46:50Z) - Query-Adaptive Predictive Inference with Partial Labels [0.0]
We propose a new methodology to construct predictive sets using only partially labeled data on top of black-box predictive models.
Our experiments highlight the validity of our predictive set construction as well as the attractiveness of a more flexible user-dependent loss framework.
arXiv Detail & Related papers (2022-06-15T01:48:42Z) - Lightweight Conditional Model Extrapolation for Streaming Data under Class-Prior Shift [27.806085423595334]
We introduce LIMES, a new method for learning with non-stationary streaming data.
We learn a single set of model parameters from which a specific classifier for any specific data distribution is derived.
Experiments on a set of exemplary tasks using Twitter data show that LIMES achieves higher accuracy than alternative approaches.
arXiv Detail & Related papers (2022-06-10T15:19:52Z) - Dash: Semi-Supervised Learning with Dynamic Thresholding [72.74339790209531]
We propose a semi-supervised learning (SSL) approach that uses unlabeled examples to train models.
Our proposed approach, Dash, is adaptive in how it selects unlabeled data.
arXiv Detail & Related papers (2021-09-01T23:52:29Z) - How to trust unlabeled data? Instance Credibility Inference for Few-Shot Learning [47.21354101796544]
This paper presents a statistical approach, dubbed Instance Credibility Inference (ICI), to exploit the support of unlabeled instances for few-shot visual recognition.
We rank the credibility of pseudo-labeled instances along the regularization path of their corresponding incidental parameters, and the most trustworthy pseudo-labeled examples are preserved as the augmented labeled instances.
arXiv Detail & Related papers (2020-07-15T03:38:09Z) - Document Ranking with a Pretrained Sequence-to-Sequence Model [56.44269917346376]
We show how a sequence-to-sequence model can be trained to generate relevance labels as "target words".
Our approach significantly outperforms an encoder-only model in a data-poor regime.
arXiv Detail & Related papers (2020-03-14T22:29:50Z) - Low-Budget Label Query through Domain Alignment Enforcement [48.06803561387064]
We tackle a new problem named low-budget label query.
We first improve an Unsupervised Domain Adaptation (UDA) method to better align source and target domains.
We then propose a simple yet effective selection method based on uniform sampling of the prediction consistency distribution.
arXiv Detail & Related papers (2020-01-01T16:52:44Z)