Deep Continuous-Time State-Space Models for Marked Event Sequences
- URL: http://arxiv.org/abs/2412.19634v2
- Date: Thu, 23 Oct 2025 00:49:05 GMT
- Title: Deep Continuous-Time State-Space Models for Marked Event Sequences
- Authors: Yuxin Chang, Alex Boyd, Cao Xiao, Taha Kass-Hout, Parminder Bhatia, Padhraic Smyth, Andrew Warrington
- Abstract summary: Marked temporal point processes (MTPPs) model sequences of events occurring at irregular time intervals. We propose the state-space point process (S2P2) model, a novel and performant model that overcomes limitations of existing MTPP models. S2P2 achieves state-of-the-art predictive likelihoods across eight real-world datasets, delivering an average improvement of 33% over the best existing approaches.
- Score: 32.68084329865821
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Marked temporal point processes (MTPPs) model sequences of events occurring at irregular time intervals, with wide-ranging applications in fields such as healthcare, finance and social networks. We propose the state-space point process (S2P2) model, a novel and performant model that leverages techniques derived for modern deep state-space models (SSMs) to overcome limitations of existing MTPP models, while simultaneously imbuing strong inductive biases for continuous-time event sequences that other discrete sequence models (i.e., RNNs, transformers) do not capture. Inspired by the classical linear Hawkes processes, we propose an architecture that interleaves stochastic jump differential equations with nonlinearities to create a highly expressive intensity-based MTPP model, without the need for restrictive parametric assumptions for the intensity. Our approach enables efficient training and inference with a parallel scan, bringing linear complexity and sublinear parallel scaling to MTPPs while retaining expressivity. Empirically, S2P2 achieves state-of-the-art predictive likelihoods across eight real-world datasets, delivering an average improvement of 33% over the best existing approaches.
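To make the intensity-based formulation concrete, the following is a minimal sketch, not the authors' implementation: a hidden state that decays linearly between events and jumps at each observed mark, a softplus readout producing per-mark intensities, and the standard point-process log-likelihood (log-intensities at events minus a numerically integrated compensator). All dimensions, parameter values, and the quadrature scheme are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes and parameters (assumptions, not values from the paper).
d, num_marks = 4, 3
decay_rates = rng.uniform(0.5, 2.0, d)         # state decays between events
B = 0.5 * rng.standard_normal((d, num_marks))  # per-mark jump added to the state
W = rng.standard_normal((num_marks, d))        # readout to per-mark intensities
b = np.full(num_marks, -1.0)

def softplus(x):
    # Numerically stable positive link.
    return np.log1p(np.exp(-np.abs(x))) + np.maximum(x, 0.0)

def intensities(h):
    """Per-mark intensities as a positive nonlinearity of the hidden state."""
    return softplus(W @ h + b)

def decay(h, dt):
    """Evolve h over an event-free interval (dh/dt = -diag(decay_rates) h)."""
    return np.exp(-decay_rates * dt) * h

def trapezoid(y, x):
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def log_likelihood(times, marks, T, n_quad=32):
    """Sum of log-intensities at events minus the integrated total intensity."""
    h, t_prev, ll = np.zeros(d), 0.0, 0.0
    for t, k in zip(times, marks):
        # Compensator over (t_prev, t], approximated by quadrature.
        grid = np.linspace(t_prev, t, n_quad)
        total = np.array([intensities(decay(h, s - t_prev)).sum() for s in grid])
        ll -= trapezoid(total, grid)
        # Log-intensity of the observed mark just before the event, then the jump.
        h = decay(h, t - t_prev)
        ll += np.log(intensities(h)[k])
        h = h + B[:, k]
        t_prev = t
    # Remaining event-free interval (t_last, T].
    grid = np.linspace(t_prev, T, n_quad)
    total = np.array([intensities(decay(h, s - t_prev)).sum() for s in grid])
    return ll - trapezoid(total, grid)

times = np.array([0.4, 1.1, 1.7, 2.9])
marks = np.array([0, 2, 1, 0])
print(log_likelihood(times, marks, T=3.5))
```

The between-event decay and at-event jump mirror the jump-differential-equation structure described in the abstract; a full model would learn these maps and replace the Python loop with a parallel scan.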
Related papers
- Mixture of Distributions Matters: Dynamic Sparse Attention for Efficient Video Diffusion Transformers [13.366686736005699]
We present MOD-DiT, a sampling-free dynamic attention framework. It accurately models evolving attention patterns through a two-stage process. It overcomes the computational limitations of traditional sparse attention approaches.
arXiv Detail & Related papers (2026-01-14T16:25:39Z) - Hybrid Autoregressive-Diffusion Model for Real-Time Streaming Sign Language Production [0.0]
We introduce a hybrid approach combining autoregressive and diffusion models for Sign Language Production (SLP). To capture fine-grained body movements, we design a Multi-Scale Pose Representation module that separately extracts detailed features from distinct articulators. We also introduce a Confidence-Aware Causal Attention mechanism that utilizes joint-level confidence scores to dynamically guide the pose generation process.
arXiv Detail & Related papers (2025-07-12T01:34:50Z) - Sequential-Parallel Duality in Prefix Scannable Models [68.39855814099997]
Recent developments have given rise to various models, such as Gated Linear Attention (GLA) and Mamba. This raises a natural question: can we characterize the full class of neural sequence models that support near-constant-time parallel evaluation and linear-time, constant-space sequential inference?
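The duality in question can be illustrated with a gated (diagonal) linear recurrence: because affine maps compose associatively, the same model admits a constant-space sequential loop and a tree-structured prefix-scan evaluation. The sketch below is illustrative only and is not taken from any of the listed papers; the recursive scan is written serially but exposes the associative structure that a parallel implementation exploits.

```python
import numpy as np

rng = np.random.default_rng(1)
T, d = 16, 4
a = rng.uniform(0.5, 0.99, size=(T, d))   # elementwise "gates" (diagonal transition)
b = rng.standard_normal((T, d))           # inputs

def run_sequential(a, b):
    """Sequential mode: constant memory, one step at a time (inference-friendly)."""
    h = np.zeros(d)
    out = []
    for t in range(len(a)):
        h = a[t] * h + b[t]
        out.append(h)
    return np.stack(out)

def combine(f, g):
    """Compose two affine maps: apply f, then g."""
    a1, b1 = f
    a2, b2 = g
    return a2 * a1, a2 * b1 + b2

def prefix_scan(pairs):
    """All prefix compositions; the tree structure is what a parallel scan exploits."""
    if len(pairs) == 1:
        return pairs
    left = prefix_scan(pairs[: len(pairs) // 2])
    right = prefix_scan(pairs[len(pairs) // 2 :])
    carry = left[-1]
    return left + [combine(carry, p) for p in right]

def run_parallel(a, b):
    prefixes = prefix_scan([(a[t], b[t]) for t in range(len(a))])
    # With h_0 = 0, the accumulated offset of each prefix equals the state at that step.
    return np.stack([bt for _, bt in prefixes])

assert np.allclose(run_sequential(a, b), run_parallel(a, b))
print("sequential and scan evaluations agree")
```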
arXiv Detail & Related papers (2025-06-12T17:32:02Z) - Preconditioned Inexact Stochastic ADMM for Deep Model [35.37705488695026]
This paper develops an algorithm, PISA, which enables scalable parallel computing and supports various preconditioners. It converges under the sole assumption of Lipschitz continuity of the gradient on a bounded region, removing the need for other conditions commonly imposed by comparable methods. Experiments demonstrate its superior numerical performance compared to various state-of-the-art methods.
arXiv Detail & Related papers (2025-02-15T12:28:51Z) - Merging Models on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging [75.93960998357812]
Deep model merging represents an emerging research direction that combines multiple fine-tuned models to harness their capabilities across different tasks and domains.
Current model merging techniques focus on merging all available models simultaneously, with weight-matrix-based methods being the predominant approach.
We propose a training-free projection-based continual merging method that processes models sequentially.
arXiv Detail & Related papers (2025-01-16T13:17:24Z) - Trajectory Flow Matching with Applications to Clinical Time Series Modeling [77.58277281319253]
Trajectory Flow Matching (TFM) trains a Neural SDE in a simulation-free manner, bypassing backpropagation through the dynamics. We demonstrate improved performance on three clinical time series datasets in terms of absolute performance and uncertainty prediction.
arXiv Detail & Related papers (2024-10-28T15:54:50Z) - SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z) - Longhorn: State Space Models are Amortized Online Learners [51.10124201221601]
State-space models (SSMs) offer linear decoding efficiency while maintaining parallelism during training.
In this work, we explore SSM design through the lens of online learning, conceptualizing SSMs as meta-modules for specific online learning problems.
We introduce a novel deep SSM architecture, Longhorn, whose update resembles the closed-form solution for solving the online associative recall problem.
arXiv Detail & Related papers (2024-07-19T11:12:08Z) - Chimera: Effectively Modeling Multivariate Time Series with 2-Dimensional State Space Models [5.37935922811333]
State Space Models (SSMs) are classical approaches for univariate time series modeling.
We present Chimera that uses two input-dependent 2-D SSM heads with different discretization processes to learn long-term progression and seasonal patterns.
Our experimental evaluation shows the superior performance of Chimera on extensive and diverse benchmarks.
arXiv Detail & Related papers (2024-06-06T17:58:09Z) - Adaptive Multi-Scale Decomposition Framework for Time Series Forecasting [26.141054975797868]
We propose a novel Adaptive Multi-Scale Decomposition (AMD) framework for time series forecasting (TSF).
Our framework decomposes time series into distinct temporal patterns at multiple scales, leveraging the Multi-Scale Decomposable Mixing (MDM) block.
Our approach effectively models both temporal and channel dependencies and utilizes autocorrelation to refine multi-scale data integration.
arXiv Detail & Related papers (2024-06-06T05:27:33Z) - A Poisson-Gamma Dynamic Factor Model with Time-Varying Transition Dynamics [51.147876395589925]
A non-stationary PGDS is proposed to allow the underlying transition matrices to evolve over time.
A fully-conjugate and efficient Gibbs sampler is developed to perform posterior simulation.
Experiments show that, in comparison with related models, the proposed non-stationary PGDS achieves improved predictive performance.
arXiv Detail & Related papers (2024-02-26T04:39:01Z) - Interacting Diffusion Processes for Event Sequence Forecasting [20.380620709345898]
We introduce a novel approach that incorporates a diffusion generative model.
The model facilitates sequence-to-sequence prediction, allowing multi-step predictions based on historical event sequences.
We demonstrate that our proposal outperforms state-of-the-art baselines for long-horizon forecasting of temporal point processes (TPPs).
arXiv Detail & Related papers (2023-10-26T22:17:25Z) - Generative Modeling with Phase Stochastic Bridges [49.4474628881673]
Diffusion models (DMs) represent state-of-the-art generative models for continuous inputs.
We introduce a novel generative modeling framework grounded in phase space dynamics.
Our framework demonstrates the capability to generate realistic data points at an early stage of dynamics propagation.
arXiv Detail & Related papers (2023-10-11T18:38:28Z) - On Optimizing the Communication of Model Parallelism [74.15423270435949]
We study a novel and important communication pattern in large-scale model-parallel deep learning (DL).
In cross-mesh resharding, a sharded tensor needs to be sent from a source device mesh to a destination device mesh.
We propose two contributions to address cross-mesh resharding: an efficient broadcast-based communication system, and an "overlapping-friendly" pipeline schedule.
arXiv Detail & Related papers (2022-11-10T03:56:48Z) - Gait Recognition in the Wild with Multi-hop Temporal Switch [81.35245014397759]
Gait recognition in the wild is a more practical problem that has attracted the attention of the multimedia and computer vision communities.
This paper presents a novel multi-hop temporal switch method to achieve effective temporal modeling of gait patterns in real-world scenes.
arXiv Detail & Related papers (2022-09-01T10:46:09Z) - Dynamically-Scaled Deep Canonical Correlation Analysis [77.34726150561087]
Canonical Correlation Analysis (CCA) is a method for feature extraction of two views by finding maximally correlated linear projections of them.
We introduce a novel dynamic scaling method for training an input-dependent canonical correlation model.
arXiv Detail & Related papers (2022-03-23T12:52:49Z) - Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z) - ES-dRNN: A Hybrid Exponential Smoothing and Dilated Recurrent Neural Network Model for Short-Term Load Forecasting [1.4502611532302039]
Short-term load forecasting (STLF) is challenging due to complex time series (TS).
This paper proposes a novel hybrid hierarchical deep learning model that deals with multiple seasonality.
It combines exponential smoothing (ES) and a recurrent neural network (RNN).
arXiv Detail & Related papers (2021-12-05T19:38:42Z) - Time Series Forecasting Using Manifold Learning [6.316185724124034]
We address a three-tier numerical framework based on manifold learning for the forecasting of high-dimensional time series.
At the first step, we embed the time series into a reduced low-dimensional space using a nonlinear manifold learning algorithm.
At the second step, we construct reduced-order regression models on the manifold to forecast the embedded dynamics.
At the final step, we lift the embedded time series back to the original high-dimensional space.
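As a hedged illustration of this three-step recipe, the sketch below uses PCA as a stand-in for the nonlinear manifold-learning step and a plain one-step linear regression as the reduced-order model; the toy data, model choices, and variable names are assumptions for illustration, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy high-dimensional series: D channels driven by a 2-D latent oscillation.
T, D = 400, 50
t = np.linspace(0, 40, T)
latent = np.stack([np.sin(t), np.cos(0.5 * t)], axis=1)
mixing = rng.standard_normal((2, D))
X = latent @ mixing + 0.01 * rng.standard_normal((T, D))

# Step 1: embed into a low-dimensional space (PCA via SVD stands in for manifold learning).
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 2
Z = Xc @ Vt[:k].T                         # reduced coordinates

# Step 2: fit a reduced-order one-step regression z_{t+1} ~ z_t @ A on the embedding.
A, *_ = np.linalg.lstsq(Z[:-1], Z[1:], rcond=None)

# Forecast a few steps ahead in the reduced space.
steps = 20
z = Z[-1]
preds = []
for _ in range(steps):
    z = z @ A
    preds.append(z)
preds = np.array(preds)

# Step 3: lift the embedded forecast back to the original high-dimensional space.
X_hat = preds @ Vt[:k] + X.mean(axis=0)
print(X_hat.shape)   # (steps, D)
```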
arXiv Detail & Related papers (2021-10-07T17:09:59Z) - Model Order Reduction based on Runge-Kutta Neural Network [0.0]
In this work, we apply modifications to both steps and investigate their impact by testing on three simulation models.
For the model reconstruction step, two types of neural network architectures are compared: Multilayer Perceptron (MLP) and Runge-Kutta Neural Network (RKNN).
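The sketch below illustrates the inductive bias an RKNN builds in: a classical fourth-order Runge-Kutta step whose right-hand side is a small, randomly initialized neural function standing in for a trained model. It is an illustrative sketch, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(3)

# A small random two-layer network standing in for the learned derivative f(x).
d, hidden = 3, 16
W1, b1 = rng.standard_normal((hidden, d)) * 0.3, np.zeros(hidden)
W2, b2 = rng.standard_normal((d, hidden)) * 0.3, np.zeros(d)

def f(x):
    return W2 @ np.tanh(W1 @ x + b1) + b2

def rk4_step(x, dt):
    """One classical Runge-Kutta step; an RKNN hard-wires this structure."""
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

x, dt = rng.standard_normal(d), 0.1
trajectory = [x]
for _ in range(50):
    trajectory.append(rk4_step(trajectory[-1], dt))
print(np.array(trajectory).shape)   # (51, d)
```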
arXiv Detail & Related papers (2021-03-25T13:02:16Z) - Learning Multivariate Hawkes Processes at Scale [17.17906360554892]
We show that our approach allows us to compute the exact likelihood and gradients of an MHP, independently of the ambient dimensions of the underlying network.
We show on synthetic and real-world datasets that our model not only achieves state-of-the-art predictive results, but also improves runtime performance by multiple orders of magnitude.
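For context, a standard way to evaluate an exponential-kernel Hawkes likelihood in linear time is the recursion below; this is background material rather than the paper's method (which handles the multivariate case), and the univariate parameter values are illustrative.

```python
import numpy as np

def hawkes_loglik(times, T, mu, alpha, beta):
    """Exact log-likelihood of a univariate exponential-kernel Hawkes process.

    lambda(t) = mu + alpha * sum_{t_i < t} exp(-beta * (t - t_i)).
    The running sum A_i = sum_{j < i} exp(-beta * (t_i - t_j)) obeys a
    recursion, so the likelihood costs O(N) rather than O(N^2).
    """
    times = np.asarray(times, dtype=float)
    ll, A, prev = 0.0, 0.0, None
    for t in times:
        if prev is not None:
            A = np.exp(-beta * (t - prev)) * (1.0 + A)
        ll += np.log(mu + alpha * A)
        prev = t
    # Compensator: integral of the intensity over [0, T].
    compensator = mu * T + (alpha / beta) * np.sum(1.0 - np.exp(-beta * (T - times)))
    return ll - compensator

times = [0.3, 0.9, 1.2, 2.5, 2.6, 3.8]
print(hawkes_loglik(times, T=4.0, mu=0.5, alpha=0.4, beta=1.0))
```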
arXiv Detail & Related papers (2020-02-28T01:18:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.