Reparameterized Multi-Resolution Convolutions for Long Sequence Modelling
- URL: http://arxiv.org/abs/2408.09453v1
- Date: Sun, 18 Aug 2024 12:20:03 GMT
- Title: Reparameterized Multi-Resolution Convolutions for Long Sequence Modelling
- Authors: Harry Jake Cunningham, Giorgio Giannone, Mingtian Zhang, Marc Peter Deisenroth
- Abstract summary: We present a novel approach to parameterizing global convolutional kernels for long-sequence modelling.
Our experiments demonstrate state-of-the-art performance on the Long Range Arena, Sequential CIFAR, and Speech Commands tasks.
We also report improved performance on ImageNet classification by replacing 2D convolutions with 1D $\texttt{MRConv}$ layers.
- Score: 13.627888191693712
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Global convolutions have shown increasing promise as powerful general-purpose sequence models. However, training long convolutions is challenging, and kernel parameterizations must be able to learn long-range dependencies without overfitting. This work introduces reparameterized multi-resolution convolutions ($\texttt{MRConv}$), a novel approach to parameterizing global convolutional kernels for long-sequence modelling. By leveraging multi-resolution convolutions, incorporating structural reparameterization and introducing learnable kernel decay, $\texttt{MRConv}$ learns expressive long-range kernels that perform well across various data modalities. Our experiments demonstrate state-of-the-art performance on the Long Range Arena, Sequential CIFAR, and Speech Commands tasks among convolution models and linear-time transformers. Moreover, we report improved performance on ImageNet classification by replacing 2D convolutions with 1D $\texttt{MRConv}$ layers.
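To make the kernel construction described above concrete, here is a minimal, illustrative sketch of a multi-resolution global convolution with learnable kernel decay. The branch structure, the exponential decay form, the interpolation-based upsampling, and the FFT-based convolution are assumptions for illustration, not the authors' exact implementation; "structural reparameterization" is shown only as summing the branches into a single kernel before the convolution.

```python
# Hypothetical sketch of an MRConv-style layer; shapes and parameterization are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiResConv1d(nn.Module):
    def __init__(self, channels, n_branches=4, base_kernel=8):
        super().__init__()
        # Each branch holds a short depthwise kernel; branch i is upsampled to a
        # 2**i-times longer span (an assumed multi-resolution scheme).
        self.kernels = nn.ParameterList(
            [nn.Parameter(torch.randn(channels, base_kernel) * 0.02)
             for _ in range(n_branches)]
        )
        # One learnable decay rate per branch so long-range branches are smoothly attenuated.
        self.log_decay = nn.Parameter(torch.zeros(n_branches))

    def materialize_kernel(self, length):
        # Sum all branches into a single global kernel (reparameterization step).
        k = self.log_decay.new_zeros(self.kernels[0].shape[0], length)
        for i, w in enumerate(self.kernels):
            up = F.interpolate(w.unsqueeze(1), scale_factor=2 ** i,
                               mode="linear", align_corners=False).squeeze(1)
            t = torch.arange(up.shape[-1], dtype=up.dtype, device=up.device)
            up = up * torch.exp(-torch.exp(self.log_decay[i]) * t)  # learnable kernel decay
            k[:, : up.shape[-1]] += up[:, :length]
        return k

    def forward(self, x):  # x: (batch, channels, length)
        L = x.shape[-1]
        k = self.materialize_kernel(L)
        # Depthwise global convolution via FFT for O(L log L) cost.
        X = torch.fft.rfft(x, n=2 * L)
        K = torch.fft.rfft(k, n=2 * L)
        return torch.fft.irfft(X * K, n=2 * L)[..., :L]
```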
Related papers
- PRformer: Pyramidal Recurrent Transformer for Multivariate Time Series Forecasting [82.03373838627606]
The self-attention mechanism in the Transformer architecture requires positional embeddings to encode temporal order in time series prediction.
We argue that this reliance on positional embeddings restricts the Transformer's ability to effectively represent temporal sequences.
We present PRformer, a model integrating pyramidal recurrent embeddings (PRE) with a standard Transformer encoder, demonstrating state-of-the-art performance on various real-world datasets.
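A loose sketch of what a pyramidal recurrent embedding could look like, based only on the summary above: an RNN is run over the series at several temporal resolutions and the resulting states serve as the embedding instead of positional encodings. The pooling factors, GRU choice, and projection are assumptions, not the paper's design.

```python
# Hypothetical pyramid-RNN embedding for a univariate series; details assumed.
import torch
import torch.nn as nn

class PyramidRNNEmbedding(nn.Module):
    def __init__(self, d_model=64, levels=3):
        super().__init__()
        self.pools = nn.ModuleList(nn.AvgPool1d(2 ** i) for i in range(levels))
        self.grus = nn.ModuleList(nn.GRU(1, d_model, batch_first=True)
                                  for _ in range(levels))
        self.proj = nn.Linear(levels * d_model, d_model)

    def forward(self, x):  # x: (batch, length), a univariate series
        outs = []
        for pool, gru in zip(self.pools, self.grus):
            xs = pool(x.unsqueeze(1)).transpose(1, 2)  # (batch, length / 2**i, 1)
            _, h = gru(xs)                             # final hidden state (1, batch, d_model)
            outs.append(h.squeeze(0))
        return self.proj(torch.cat(outs, dim=-1))      # (batch, d_model)
```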
arXiv Detail & Related papers (2024-08-20T01:56:07Z)
- LongVQ: Long Sequence Modeling with Vector Quantization on Structured Memory [63.41820940103348]
The self-attention mechanism's computational cost limits its practicality for long sequences.
We propose a new method called LongVQ that uses vector quantization to compress the global abstraction into a length-fixed codebook.
LongVQ effectively maintains dynamic global and local patterns, which helps to compensate for the lack of long-range dependencies.
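Only the vector-quantization step of such a scheme is sketched below: sequence features are snapped to their nearest entries in a fixed-size codebook, with a straight-through estimator for gradients. Codebook size, the Euclidean assignment, and the estimator are assumptions, not LongVQ's exact procedure.

```python
# Hypothetical VQ step against a length-fixed codebook; details assumed.
import torch

def vector_quantize(features, codebook):
    """features: (batch, length, dim); codebook: (num_codes, dim)."""
    # Nearest-codebook-entry assignment by Euclidean distance.
    d = torch.cdist(features, codebook.unsqueeze(0).expand(features.shape[0], -1, -1))
    idx = d.argmin(dim=-1)        # (batch, length) code indices
    quantized = codebook[idx]     # (batch, length, dim)
    # Straight-through estimator keeps gradients flowing to the encoder.
    return features + (quantized - features).detach(), idx
```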
arXiv Detail & Related papers (2024-04-17T08:26:34Z)
- Sample Complexity Characterization for Linear Contextual MDPs [67.79455646673762]
Contextual Markov decision processes (CMDPs) describe a class of reinforcement learning problems in which the transition kernels and reward functions can change over time with different MDPs indexed by a context variable.
CMDPs serve as an important framework to model many real-world applications with time-varying environments.
We study CMDPs under two linear function approximation models: Model I with context-varying representations and common linear weights for all contexts; and Model II with common representations for all contexts and context-varying linear weights.
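The two linear parameterizations summarized above can be illustrated with a toy NumPy snippet (dimensions and random features are arbitrary, not from the paper): Model I uses context-specific features with one shared weight vector, while Model II uses a shared feature map with context-specific weights.

```python
# Toy illustration of the two linear CMDP models; all numbers are arbitrary.
import numpy as np

d, n_contexts, n_sa = 8, 5, 100  # feature dim, contexts, state-action pairs
rng = np.random.default_rng(0)

# Model I: context-varying features phi_c(s, a), common linear weights w.
phi_per_context = rng.normal(size=(n_contexts, n_sa, d))
w_shared = rng.normal(size=d)
q_model1 = phi_per_context @ w_shared            # (n_contexts, n_sa)

# Model II: common features phi(s, a), context-varying weights w_c.
phi_shared = rng.normal(size=(n_sa, d))
w_per_context = rng.normal(size=(n_contexts, d))
q_model2 = phi_shared @ w_per_context.T          # (n_sa, n_contexts)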
arXiv Detail & Related papers (2024-02-05T03:25:04Z)
- Sequence Modeling with Multiresolution Convolutional Memory [27.218134279968062]
We introduce a new building block for sequence modeling called the MultiresLayer.
The key component of our model is the multiresolution convolution, capturing multiscale trends in the input sequence.
Our model yields state-of-the-art performance on a number of sequence classification and autoregressive density estimation tasks.
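One common way to capture multiscale trends, sketched below, is a stack of dilated depthwise convolutions whose dilation doubles at each level, followed by a mixing step. The tree depth, causal cropping, and 1x1 mixing convolution here are assumptions, not the MultiresLayer's exact construction.

```python
# Hypothetical multiresolution convolution block; structure assumed.
import torch
import torch.nn as nn

class MultiresBlock(nn.Module):
    def __init__(self, channels, depth=4, kernel_size=2):
        super().__init__()
        self.convs = nn.ModuleList(
            [nn.Conv1d(channels, channels, kernel_size, groups=channels,
                       dilation=2 ** i, padding=(kernel_size - 1) * 2 ** i)
             for i in range(depth)]
        )
        self.mix = nn.Conv1d(channels * depth, channels, 1)  # combine the scales

    def forward(self, x):  # x: (batch, channels, length)
        L = x.shape[-1]
        scales = [conv(x)[..., :L] for conv in self.convs]   # crop to causal length
        return self.mix(torch.cat(scales, dim=1))
```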
arXiv Detail & Related papers (2023-05-02T17:50:54Z)
- SMPConv: Self-moving Point Representations for Continuous Convolution [4.652175470883851]
This paper presents an alternative approach to building a continuous convolution without neural networks.
We present self-moving point representations in which weight parameters move freely, together with interpolation schemes used to implement continuous functions.
Thanks to its lightweight structure, we are the first to demonstrate the effectiveness of continuous convolution in a large-scale setting.
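A very loose sketch of the general idea: a continuous kernel is defined by a small set of points with learnable positions and values, and the discrete kernel is read off the grid by interpolation. The Gaussian-style distance weighting used here is an assumption, not the paper's interpolation scheme; the returned kernel would then be used in an ordinary convolution.

```python
# Hypothetical point-based continuous kernel; interpolation scheme assumed.
import torch
import torch.nn as nn

class PointKernel1d(nn.Module):
    def __init__(self, n_points=16, kernel_size=255, bandwidth=0.05):
        super().__init__()
        self.pos = nn.Parameter(torch.rand(n_points))          # point positions in [0, 1]
        self.val = nn.Parameter(torch.randn(n_points) * 0.1)   # point weights
        self.bandwidth = bandwidth
        self.register_buffer("grid", torch.linspace(0, 1, kernel_size))

    def forward(self):
        # Each grid location takes a distance-weighted average of the point values.
        w = torch.exp(-((self.grid[:, None] - self.pos[None, :]) ** 2)
                      / (2 * self.bandwidth ** 2))
        return (w * self.val).sum(-1) / (w.sum(-1) + 1e-6)     # (kernel_size,)
```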
arXiv Detail & Related papers (2023-04-05T09:36:30Z)
- Incorporating Transformer Designs into Convolutions for Lightweight Image Super-Resolution [46.32359056424278]
Large convolutional kernels have become popular in designing convolutional neural networks.
The increase in kernel size also leads to a quadratic growth in the number of parameters, resulting in heavy computation and memory requirements.
We propose a neighborhood attention (NA) module that upgrades the standard convolution with a self-attention mechanism.
Building upon the NA module, we propose a lightweight single image super-resolution (SISR) network named TCSR.
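The core of a neighborhood-attention module is local-window self-attention: each position attends only to a small window around it, which is the sense in which a convolution is "upgraded" with attention. The 1D formulation, odd window size, and omission of query/key/value projections below are simplifying assumptions, not TCSR's NA module.

```python
# Hypothetical local-window attention over a 1D sequence; projections omitted.
import torch
import torch.nn.functional as F

def neighborhood_attention(x, window=7):
    """x: (batch, length, dim); window must be odd; returns (batch, length, dim)."""
    pad = window // 2
    padded = F.pad(x, (0, 0, pad, pad))                    # pad along the length axis
    neigh = padded.unfold(1, window, 1).transpose(-1, -2)  # (batch, length, window, dim)
    attn = torch.einsum("bld,blwd->blw", x, neigh) / x.shape[-1] ** 0.5
    attn = attn.softmax(dim=-1)                            # weights over each neighbourhood
    return torch.einsum("blw,blwd->bld", attn, neigh)
```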
arXiv Detail & Related papers (2023-03-25T01:32:18Z)
- DO-Conv: Depthwise Over-parameterized Convolutional Layer [66.46704754669169]
We propose to augment a convolutional layer with an additional depthwise convolution, where each input channel is convolved with a different 2D kernel.
We show with extensive experiments that the mere replacement of conventional convolutional layers with DO-Conv layers boosts the performance of CNNs.
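The key property is that composing the extra depthwise kernel with the conventional kernel is a linear operation, so the two can be folded into a single ordinary convolution at inference. The shapes and folding step below are a simplified assumption in the spirit of DO-Conv, not the authors' exact layer.

```python
# Hypothetical depthwise over-parameterized conv with kernel folding; shapes assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DOConv2dSketch(nn.Module):
    def __init__(self, in_ch, out_ch, k=3, depth_mul=4):
        super().__init__()
        # Depthwise part: maps each k*k patch of an input channel to depth_mul values.
        self.W_depth = nn.Parameter(torch.randn(in_ch, depth_mul, k * k) * 0.05)
        # Conventional part operating on the depthwise outputs.
        self.W_conv = nn.Parameter(torch.randn(out_ch, in_ch, depth_mul) * 0.05)
        self.k = k

    def folded_weight(self):
        # Collapse the two linear operators into one (out, in, k, k) kernel.
        w = torch.einsum("oid,idk->oik", self.W_conv, self.W_depth)
        return w.reshape(w.shape[0], w.shape[1], self.k, self.k)

    def forward(self, x):  # x: (batch, in_ch, H, W)
        return F.conv2d(x, self.folded_weight(), padding=self.k // 2)
```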
arXiv Detail & Related papers (2020-06-22T06:57:10Z)
- Convolutional Tensor-Train LSTM for Spatio-temporal Learning [116.24172387469994]
We propose a higher-order LSTM model that can efficiently learn long-term correlations in the video sequence.
This is accomplished through a novel tensor train module that performs prediction by combining convolutional features across time.
Our results achieve state-of-the-art performance across a wide range of applications and datasets.
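A toy sketch of the underlying idea, combining a window of past convolutional feature maps through a chain of low-rank (tensor-train style) cores; the ranks, contraction order, and absence of the LSTM gating are all assumptions, not the Conv-TT-LSTM module itself.

```python
# Hypothetical tensor-train style combination of feature maps across time.
import torch

def tt_combine(features, cores):
    """features: list of T tensors (batch, C, H, W);
    cores: list of T tensors of shape (r_prev, C, r_next), with r_0 = r_T = 1."""
    B, _, H, W = features[0].shape
    state = features[0].new_ones(B, 1, H, W)           # rank-1 starting state
    for x, g in zip(features, cores):
        step = torch.einsum("bchw,pcq->bpqhw", x, g)   # contract channels into the core
        state = torch.einsum("bphw,bpqhw->bqhw", state, step)  # carry rank index forward
    return state.squeeze(1)                            # (batch, H, W) when r_T = 1
```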
arXiv Detail & Related papers (2020-02-21T05:00:01Z)
- Supervised Learning for Non-Sequential Data: A Canonical Polyadic Decomposition Approach [85.12934750565971]
Efficient modelling of feature interactions underpins supervised learning for non-sequential tasks, but representing higher-order interactions explicitly causes the number of parameters to grow rapidly with the interaction order.
To alleviate this issue, it has been proposed to implicitly represent the model parameters as a tensor.
For enhanced expressiveness, we generalize the framework to allow feature mapping to arbitrarily high-dimensional feature vectors.
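A brief sketch of what a canonical polyadic (CP) parameterization of such a model looks like: the prediction is a sum over rank-one terms, each a product of per-mode inner products between a feature map and a factor column. The feature maps, rank, and dimensions below are illustrative assumptions.

```python
# Hypothetical CP-form multilinear model; feature maps and rank are arbitrary.
import numpy as np

def cp_predict(feature_maps, factors):
    """feature_maps: list of D arrays, each (dim_d,); factors: list of D arrays (dim_d, rank)."""
    # Inner product <phi_d(x), a_r^(d)> for every mode d and rank r, then multiply over modes.
    per_mode = np.stack([phi @ A for phi, A in zip(feature_maps, factors)])  # (D, rank)
    return per_mode.prod(axis=0).sum()   # scalar prediction

rng = np.random.default_rng(0)
dims, rank = [4, 6, 5], 3
factors = [rng.normal(size=(d, rank)) for d in dims]
x_maps = [rng.normal(size=d) for d in dims]
print(cp_predict(x_maps, factors))
```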
arXiv Detail & Related papers (2020-01-27T22:38:40Z)