multivariateGPT: a decoder-only transformer for multivariate categorical and numeric data
- URL: http://arxiv.org/abs/2505.21680v2
- Date: Thu, 29 May 2025 20:51:28 GMT
- Title: multivariateGPT: a decoder-only transformer for multivariate categorical and numeric data
- Authors: Andrew J. Loza, Jun Yup Kim, Shangzheng Song, Yihang Liu, Joseph J. Y. Sung, R Andrew Taylor, Dennis L. Shung
- Abstract summary: We present a single architecture for modeling sequences of mixed categorical (including tokenized text) and numeric data. We demonstrate how this approach can efficiently learn to generalize patterns in simple physical systems and model complex time series.
- Score: 0.279207637545441
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Real-world processes often generate data that are a mix of categorical and numeric values recorded at irregular and informative intervals. Discrete token-based approaches are limited in numeric representation capacity, while methods like neural ordinary differential equations are not well suited for categorical data or informative sampling and require augmentation to handle certain classes of trajectories. Here, we present multivariateGPT, a single architecture for modeling sequences of mixed categorical (including tokenized text) and numeric data. This is accomplished with an autoregressive sequence decomposition, embedding scheme, and loss function that extend the next token prediction task to likelihood estimation of the joint distribution of next token class and value. We demonstrate how this approach can efficiently learn to generalize patterns in simple physical systems and model complex time series including electrocardiograms and multivariate electronic health record data. This work extends the utility of transformer-based models to additional classes of data.
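The abstract describes a loss that scores the joint distribution of the next token's class and its numeric value. A minimal sketch of one way such a loss could be written follows. It is an illustration only: the Gaussian value head, the per-class mean and log standard deviation, and every function name here are assumptions, not details taken from the paper.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def gaussian_nll(value, mean, log_std):
    """Negative log-density of `value` under N(mean, exp(log_std)^2)."""
    z = (value - mean) / math.exp(log_std)
    return 0.5 * z * z + log_std + 0.5 * math.log(2.0 * math.pi)


def joint_nll(logits, means, log_stds, target_class, target_value=None):
    """Hypothetical joint loss: -log p(class) - log p(value | class).

    Assumption: each class has its own Gaussian over the numeric value.
    Purely categorical tokens (e.g. text) pass target_value=None and
    contribute only the usual cross-entropy term.
    """
    class_nll = -log_softmax(logits)[target_class]
    if target_value is None:
        return class_nll
    value_nll = gaussian_nll(
        target_value, means[target_class], log_stds[target_class]
    )
    return class_nll + value_nll


# Two candidate classes; the observed token is class 1 with value 0.7.
loss = joint_nll(
    logits=[0.2, 1.5],
    means=[0.0, 0.5],
    log_stds=[0.0, -1.0],
    target_class=1,
    target_value=0.7,
)
print(loss)
```

When `target_value` is omitted the function reduces to ordinary next-token cross-entropy, which is how a single objective of this shape could cover both text tokens and numeric measurements.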
Related papers
- Minimally Supervised Learning using Topological Projections in Self-Organizing Maps [55.31182147885694]
We introduce a semi-supervised learning approach based on topological projections in self-organizing maps (SOMs).
Our proposed method first trains SOMs on unlabeled data; a minimal number of available labeled data points are then assigned to key best matching units (BMUs).
Our results indicate that the proposed minimally supervised model significantly outperforms traditional regression techniques.
arXiv Detail & Related papers (2024-01-12T22:51:48Z) - NuTime: Numerically Multi-Scaled Embedding for Large-Scale Time-Series Pretraining [28.595342663018627]
We make key technical contributions that are tailored to the numerical properties of time-series data.
We adopt the Transformer architecture by first partitioning the input into non-overlapping windows.
To embed scalar values that may possess arbitrary numerical amplitudes in a high-dimensional space, we propose a numerically multi-scaled embedding module.
arXiv Detail & Related papers (2023-10-11T11:38:18Z) - Convolutional autoencoder-based multimodal one-class classification [80.52334952912808]
One-class classification refers to approaches of learning using data from a single class only.
We propose a deep learning one-class classification method suitable for multimodal data.
arXiv Detail & Related papers (2023-09-25T12:31:18Z) - Probabilistic Imputation for Time-series Classification with Missing Data [17.956329906475084]
We propose a novel framework for classification with time series data with missing values.
Our deep generative model part is trained to impute the missing values in multiple plausible ways.
The classifier part takes the time series data along with the imputed missing values and classifies signals.
arXiv Detail & Related papers (2023-08-13T10:04:13Z) - Learning from aggregated data with a maximum entropy model [73.63512438583375]
We show how a new model, similar to a logistic regression, may be learned from aggregated data only by approximating the unobserved feature distribution with a maximum entropy hypothesis.
We present empirical evidence on several public datasets that the model learned this way can achieve performances comparable to those of a logistic model trained with the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z) - Autoencoder Based Iterative Modeling and Multivariate Time-Series Subsequence Clustering Algorithm [0.0]
This paper introduces an algorithm for the detection of change-points and the identification of the corresponding subsequences in transient multivariate time-series data (MTSD).
We use a recurrent neural network (RNN) based Autoencoder (AE) which is iteratively trained on incoming data.
A model of the identified subsequence is saved and used for recognition of repeating subsequences as well as fast offline clustering.
arXiv Detail & Related papers (2022-09-09T09:59:56Z) - Determination of class-specific variables in nonparametric multiple-class classification [0.0]
We propose a probability-based nonparametric multiple-class classification method and integrate it with the ability to identify high-impact variables for each individual class.
We report the properties of the proposed method, and use both synthesized and real data sets to illustrate its properties under different classification situations.
arXiv Detail & Related papers (2022-05-07T10:08:58Z) - TACTiS: Transformer-Attentional Copulas for Time Series [76.71406465526454]
The estimation of time-varying quantities is a fundamental component of decision making in fields such as healthcare and finance.
We propose a versatile method that estimates joint distributions using an attention-based decoder.
We show that our model produces state-of-the-art predictions on several real-world datasets.
arXiv Detail & Related papers (2022-02-07T21:37:29Z) - Learning from Incomplete Features by Simultaneous Training of Neural Networks and Sparse Coding [24.3769047873156]
This paper addresses the problem of training a classifier on a dataset with incomplete features.
We assume that different subsets of features (random or structured) are available at each data instance.
A new supervised learning method is developed to train a general classifier, using only a subset of features per sample.
arXiv Detail & Related papers (2020-11-28T02:20:39Z) - Theoretical Insights Into Multiclass Classification: A High-dimensional Asymptotic View [82.80085730891126]
We provide the first asymptotically precise analysis of linear multiclass classification.
Our analysis reveals that the classification accuracy is highly distribution-dependent.
The insights gained may pave the way for a precise understanding of other classification algorithms.
arXiv Detail & Related papers (2020-11-16T05:17:29Z) - Two-step penalised logistic regression for multi-omic data with an application to cardiometabolic syndrome [62.997667081978825]
We implement a two-step approach to multi-omic logistic regression in which variable selection is performed on each layer separately.
Our approach should be preferred if the goal is to select as many relevant predictors as possible.
Our proposed approach allows us to identify features that characterise cardiometabolic syndrome at the molecular level.
arXiv Detail & Related papers (2020-08-01T10:36:27Z) - Variational Hyper RNN for Sequence Modeling [69.0659591456772]
We propose a novel probabilistic sequence model that excels at capturing high variability in time series data.
Our method uses temporal latent variables to capture information about the underlying data pattern.
The efficacy of the proposed method is demonstrated on a range of synthetic and real-world sequential data.
arXiv Detail & Related papers (2020-02-24T19:30:32Z)