Leveraging State Space Models in Long Range Genomics
- URL: http://arxiv.org/abs/2504.06304v1
- Date: Mon, 07 Apr 2025 18:34:06 GMT
- Title: Leveraging State Space Models in Long Range Genomics
- Authors: Matvei Popov, Aymen Kallala, Anirudha Ramesh, Narimane Hennouni, Shivesh Khaitan, Rick Gentry, Alain-Sam Cohen,
- Abstract summary: Long-range dependencies are critical for understanding genomic structure and function, yet most conventional methods struggle with them. We explore State Space Models (SSMs) as a promising alternative by benchmarking two SSM-inspired architectures on long-range genomics modeling tasks. SSMs match transformer performance and exhibit impressive zero-shot extrapolation across multiple tasks, handling contexts 10 to 100 times longer than those seen during training.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Long-range dependencies are critical for understanding genomic structure and function, yet most conventional methods struggle with them. Widely adopted transformer-based models, while excelling at short-context tasks, are limited by the attention module's quadratic computational complexity and inability to extrapolate to sequences longer than those seen in training. In this work, we explore State Space Models (SSMs) as a promising alternative by benchmarking two SSM-inspired architectures, Caduceus and Hawk, on long-range genomics modeling tasks under conditions parallel to a 50M parameter transformer baseline. We discover that SSMs match transformer performance and exhibit impressive zero-shot extrapolation across multiple tasks, handling contexts 10 to 100 times longer than those seen during training, indicating more generalizable representations better suited for modeling the long and complex human genome. Moreover, we demonstrate that these models can efficiently process sequences of 1M tokens on a single GPU, allowing for modeling entire genomic regions at once, even in labs with limited compute. Our findings establish SSMs as efficient and scalable for long-context genomic analysis.
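The linear-time behavior the abstract relies on can be illustrated with a deliberately minimal sketch: a diagonal linear state-space recurrence scanned over a one-hot-encoded DNA string. This is not the Caduceus or Hawk architecture; the state size, decay values, and random projections below are placeholder assumptions, and the only point is that compute and recurrent memory grow linearly with sequence length, so a 1M-token region is just a longer scan rather than a quadratically larger attention matrix.

```python
import numpy as np

def one_hot_dna(seq: str) -> np.ndarray:
    """One-hot encode a DNA string into an (L, 4) array over A, C, G, T."""
    lookup = {"A": 0, "C": 1, "G": 2, "T": 3}
    x = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq):
        if base in lookup:          # unknown bases (e.g. N) stay all-zero
            x[i, lookup[base]] = 1.0
    return x

def diagonal_ssm_scan(u: np.ndarray, state_dim: int = 16, seed: int = 0) -> np.ndarray:
    """Toy diagonal linear SSM: h_t = a * h_{t-1} + B u_t,  y_t = c . h_t.

    Cost is O(L * state_dim) time and O(state_dim) recurrent memory,
    independent of the training context length, so the same recurrence
    can simply be rolled out over much longer sequences.
    """
    rng = np.random.default_rng(seed)
    L, d_in = u.shape
    a = np.exp(-rng.uniform(0.001, 0.1, size=state_dim)).astype(np.float32)  # stable per-channel decay (placeholder values)
    B = rng.normal(scale=0.1, size=(state_dim, d_in)).astype(np.float32)
    c = rng.normal(scale=0.1, size=state_dim).astype(np.float32)
    h = np.zeros(state_dim, dtype=np.float32)
    y = np.empty(L, dtype=np.float32)
    for t in range(L):              # sequential scan; real SSM kernels use fused/parallel scans
        h = a * h + B @ u[t]
        y[t] = c @ h
    return y

# A 1M-token context is just a longer loop, not a quadratically larger attention matrix.
sequence = "ACGT" * 250_000        # 1,000,000 bases
outputs = diagonal_ssm_scan(one_hot_dna(sequence))
print(outputs.shape)               # (1000000,)
```

Caduceus and Hawk replace these fixed matrices with learned, input-dependent parameterizations and efficient scan kernels, but the linear dependence on sequence length is the property that makes single-GPU inference over million-token genomic regions plausible.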
Related papers
- UniSTD: Towards Unified Spatio-Temporal Learning across Diverse Disciplines [64.84631333071728]
We introduce UniSTD, a unified Transformer-based framework for spatio-temporal modeling. Our work demonstrates that a task-specific vision-text model can build a generalizable model for spatio-temporal learning. We also introduce a temporal module to incorporate temporal dynamics explicitly.
arXiv Detail & Related papers (2025-03-26T17:33:23Z) - A Survey on Structured State Space Sequence (S4) Models [0.0]
Recent advancements in sequence modeling have led to the emergence of Structured State Space Models (SSMs).
SSMs leverage structured recurrence and state-space representations to achieve superior long-sequence processing with linear or near-linear complexity.
This survey serves as a structured guide for researchers and practitioners, detailing the advancements, trade-offs, and future directions of SSM-based architectures in AI and deep learning.
arXiv Detail & Related papers (2025-03-22T01:55:32Z) - Towards a Real-Time Simulation of Elastoplastic Deformation Using Multi-Task Neural Networks [0.0]
This study introduces a surrogate modeling framework merging proper orthogonal decomposition, long short-term memory networks, and multi-task learning to accurately predict elastoplastic deformations in real time.
The framework achieves a mean absolute error below 0.40% across various state variables.
In our use cases, a pre-trained multi-task model can effectively train additional variables with as few as 20 samples, demonstrating its deep understanding of complex scenarios.
arXiv Detail & Related papers (2024-11-08T14:04:17Z) - Longhorn: State Space Models are Amortized Online Learners [51.10124201221601]
State-space models (SSMs) offer linear decoding efficiency while maintaining parallelism during training.
In this work, we explore SSM design through the lens of online learning, conceptualizing SSMs as meta-modules for specific online learning problems.
We introduce a novel deep SSM architecture, Longhorn, whose update resembles the closed-form solution for solving the online associative recall problem.
arXiv Detail & Related papers (2024-07-19T11:12:08Z) - Chimera: Effectively Modeling Multivariate Time Series with 2-Dimensional State Space Models [5.37935922811333]
State Space Models (SSMs) are classical approaches for univariate time series modeling.
We present Chimera that uses two input-dependent 2-D SSM heads with different discretization processes to learn long-term progression and seasonal patterns.
Our experimental evaluation shows the superior performance of Chimera on extensive and diverse benchmarks.
arXiv Detail & Related papers (2024-06-06T17:58:09Z) - LongVQ: Long Sequence Modeling with Vector Quantization on Structured Memory [63.41820940103348]
The self-attention mechanism's computational cost limits its practicality for long sequences.
We propose a new method called LongVQ to compress the global abstraction into a fixed-length codebook.
LongVQ effectively maintains dynamic global and local patterns, which helps to address long-range dependency issues.
arXiv Detail & Related papers (2024-04-17T08:26:34Z) - Micro-Macro Consistency in Multiscale Modeling: Score-Based Model
Assisted Sampling of Fast/Slow Dynamical Systems [0.0]
In the study of physics-based multi-time-scale dynamical systems, techniques have been developed for enhancing sampling.
In the field of Machine Learning, a generic goal of generative models is to sample from a target density, after training on empirical samples from this density.
In this work, we show that score-based generative models (SGMs) can be used in such a coupling framework to improve sampling in multiscale dynamical systems.
arXiv Detail & Related papers (2023-12-10T00:46:37Z) - Convolutional State Space Models for Long-Range Spatiotemporal Modeling [65.0993000439043]
ConvS5 is an efficient variant for long-range spatiotemporal modeling.
It significantly outperforms Transformers and ConvLSTM on a long-horizon Moving-MNIST experiment while training 3X faster than ConvLSTM and generating samples 400X faster than Transformers.
arXiv Detail & Related papers (2023-10-30T16:11:06Z) - Efficiently Modeling Long Sequences with Structured State Spaces [15.456254157293836]
We propose S4, a new sequence model based on a new parameterization of the fundamental state space model (the underlying state-space recurrence is sketched after this list).
S4 achieves strong empirical results across a diverse range of established benchmarks, including 91% accuracy on sequential CIFAR-10 with no data augmentation or auxiliary losses, on par with a larger 2-D ResNet.
arXiv Detail & Related papers (2021-10-31T03:32:18Z) - S2RMs: Spatially Structured Recurrent Modules [105.0377129434636]
We take a step towards dynamic models that are capable of simultaneously exploiting both modular and spatio-temporal structures.
We find our models to be robust to the number of available views and better capable of generalization to novel tasks without additional training.
arXiv Detail & Related papers (2020-07-13T17:44:30Z) - Convolutional Tensor-Train LSTM for Spatio-temporal Learning [116.24172387469994]
We propose a higher-order LSTM model that can efficiently learn long-term correlations in the video sequence.
This is accomplished through a novel tensor train module that performs prediction by combining convolutional features across time.
Our results achieve state-of-the-art performance in a wide range of applications and datasets.
arXiv Detail & Related papers (2020-02-21T05:00:01Z)
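For context on the S4-based entries above, the state-space formulation they build on is the classical continuous-time linear system together with a discretization such as zero-order hold. The equations below are the standard textbook form, not the exact parameterization of any one of the cited papers:

```latex
% Continuous-time linear state space model
x'(t) = A\,x(t) + B\,u(t), \qquad y(t) = C\,x(t) + D\,u(t)

% One common discretization (zero-order hold) with step size \Delta
\bar{A} = e^{\Delta A}, \qquad
\bar{B} = (\Delta A)^{-1}\bigl(e^{\Delta A} - I\bigr)\,\Delta B

% Resulting linear recurrence over a length-L token sequence u_1, \dots, u_L
x_k = \bar{A}\,x_{k-1} + \bar{B}\,u_k, \qquad y_k = C\,x_k + D\,u_k
```

Because the recurrence is linear, it can also be unrolled as a convolution with a kernel derived from $(\bar{A}, \bar{B}, C)$, which is how S4-style models trade between parallelizable training and recurrent, constant-memory inference.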