Of All StrIPEs: Investigating Structure-informed Positional Encoding for Efficient Music Generation
- URL: http://arxiv.org/abs/2504.05364v1
- Date: Mon, 07 Apr 2025 11:51:29 GMT
- Title: Of All StrIPEs: Investigating Structure-informed Positional Encoding for Efficient Music Generation
- Authors: Manvi Agarwal, Changhong Wang, Gael Richard
- Abstract summary: We present a unified framework based on kernel methods to analyze both families of efficient PEs. We develop a novel PE method called RoPEPool, capable of extracting causal relationships from temporal sequences. For empirical validation, we use a symbolic music generation task, namely, melody harmonization.
- Score: 1.3108652488669736
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While music remains a challenging domain for generative models like Transformers, a two-pronged approach has recently proved successful: inserting musically-relevant structural information into the positional encoding (PE) module and using kernel approximation techniques based on Random Fourier Features (RFF) to lower the computational cost from quadratic to linear. Yet, it is not clear how such RFF-based efficient PEs compare with those based on rotation matrices, such as Rotary Positional Encoding (RoPE). In this paper, we present a unified framework based on kernel methods to analyze both families of efficient PEs. We use this framework to develop a novel PE method called RoPEPool, capable of extracting causal relationships from temporal sequences. Using RFF-based PEs and rotation-based PEs, we demonstrate how seemingly disparate PEs can be jointly studied by considering the content-context interactions they induce. For empirical validation, we use a symbolic music generation task, namely, melody harmonization. We show that RoPEPool, combined with highly-informative structural priors, outperforms all methods.
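The abstract contrasts RFF-based efficient PEs with rotation-based ones such as RoPE. As a rough illustration of the rotation-based family, here is a minimal NumPy sketch of rotary positional encoding; the function name and the half-split pairing layout are illustrative choices, not taken from the paper:

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply rotary positional encoding to a (seq_len, dim) array.

    Each feature pair is rotated by an angle growing linearly with
    position, so inner products between rotated queries and keys
    depend only on the relative offset between positions.
    """
    seq_len, dim = x.shape
    half = dim // 2
    # Per-pair frequencies, decaying geometrically as in the RoPE formulation.
    inv_freq = base ** (-np.arange(half) / half)
    angles = np.outer(np.arange(seq_len), inv_freq)  # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # 2-D rotation applied to each (x1_i, x2_i) pair.
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)
```

The relative-position property can be checked directly: placing the same query/key contents at positions (0, 2) and at (3, 5) yields the same inner product, since both pairs share the offset 2.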
Related papers
- F-StrIPE: Fast Structure-Informed Positional Encoding for Symbolic Music Generation [1.3108652488669736]
We propose F-StrIPE, a structure-informed PE scheme that works in linear complexity. We illustrate the empirical merits of F-StrIPE using melody harmonization for symbolic music.
arXiv Detail & Related papers (2025-02-14T13:15:18Z) - Fast Gradient Computation for RoPE Attention in Almost Linear Time [27.28314860714307]
We develop the first almost linear time algorithm for backward computations in RoPE-based attention under bounded entries. Our approach builds on recent advancements in fast RoPE attention computations.
arXiv Detail & Related papers (2024-12-23T06:20:22Z) - ALoRE: Efficient Visual Adaptation via Aggregating Low Rank Experts [71.91042186338163]
ALoRE is a novel PETL method that reuses the hypercomplex parameterized space constructed by Kronecker product to Aggregate Low Rank Experts.
Thanks to the artful design, ALoRE maintains negligible extra parameters and can be effortlessly merged into the frozen backbone.
arXiv Detail & Related papers (2024-12-11T12:31:30Z) - RoSA: Accurate Parameter-Efficient Fine-Tuning via Robust Adaptation [30.797422827190278]
We present a new PEFT method called Robust Adaptation (RoSA) inspired by robust principal component analysis.
RoSA trains $\textit{low-rank}$ and $\textit{highly-sparse}$ components on top of a set of fixed pretrained weights.
We show that RoSA outperforms LoRA, pure sparse fine-tuning, and alternative hybrid methods at the same parameter budget.
arXiv Detail & Related papers (2024-01-09T17:09:01Z) - NPEFF: Non-Negative Per-Example Fisher Factorization [52.44573961263344]
We introduce a novel interpretability method called NPEFF that is readily applicable to any end-to-end differentiable model.
We demonstrate that NPEFF has interpretable tunings through experiments on language and vision models.
arXiv Detail & Related papers (2023-10-07T02:02:45Z) - ASR: Attention-alike Structural Re-parameterization [53.019657810468026]
We propose a simple-yet-effective attention-alike structural re-parameterization (ASR) that allows us to achieve SRP for a given network while enjoying the effectiveness of the attention mechanism.
In this paper, we conduct extensive experiments from a statistical perspective and discover an interesting phenomenon Stripe Observation, which reveals that channel attention values quickly approach some constant vectors during training.
arXiv Detail & Related papers (2023-04-13T08:52:34Z) - Learning a Fourier Transform for Linear Relative Positional Encodings in Transformers [71.32827362323205]
We propose a new class of linear Transformers called Learner-Transformers (Learners).
They incorporate a wide range of relative positional encoding mechanisms (RPEs).
These include regular RPE techniques applied for sequential data, as well as novel RPEs operating on geometric data embedded in higher-dimensional Euclidean spaces.
arXiv Detail & Related papers (2023-02-03T18:57:17Z) - IFRNet: Intermediate Feature Refine Network for Efficient Frame Interpolation [44.04110765492441]
We devise an efficient encoder-decoder based network, termed IFRNet, for fast intermediate frame synthesis.
Experiments on various benchmarks demonstrate the excellent performance and fast inference speed of the proposed approaches.
arXiv Detail & Related papers (2022-05-29T10:18:18Z) - Functional Regularization for Reinforcement Learning via Learned Fourier Features [98.90474131452588]
We propose a simple architecture for deep reinforcement learning by embedding inputs into a learned Fourier basis.
We show that it improves the sample efficiency of both state-based and image-based RL.
arXiv Detail & Related papers (2021-12-06T18:59:52Z) - A Generalizable Model-and-Data Driven Approach for Open-Set RFF Authentication [74.63333951647581]
Radio-frequency fingerprints (RFFs) are a promising solution for realizing low-cost physical-layer authentication.
Machine learning-based methods have been proposed for RFF extraction and discrimination.
We propose a new end-to-end deep learning framework for extracting RFFs from raw received signals.
arXiv Detail & Related papers (2021-08-10T03:59:37Z) - Relative Positional Encoding for Transformers with Linear Complexity [30.48367640796256]
Relative positional encoding (RPE) was proposed as beneficial for classical Transformers.
RPE is not available for the recent linear variants of the Transformer, because it requires the explicit computation of the attention matrix.
In this paper, we present a way to generate PE that can be used as a replacement for the classical additive (sinusoidal) PE and provably behaves like RPE.
arXiv Detail & Related papers (2021-05-18T09:52:32Z) - Learning to Learn Kernels with Variational Random Features [118.09565227041844]
We introduce kernels with random Fourier features in the meta-learning framework to leverage their strong few-shot learning ability.
We formulate the optimization of MetaVRF as a variational inference problem.
We show that MetaVRF delivers much better, or at least competitive, performance compared to existing meta-learning alternatives.
arXiv Detail & Related papers (2020-06-11T18:05:29Z)
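Several entries above (RFF-based efficient PEs, MetaVRF) rest on the same kernel-approximation idea: random Fourier features turn an RBF kernel evaluation into an inner product of explicit feature maps. A minimal sketch of this standard construction, with illustrative names and parameters:

```python
import numpy as np

def rff_features(x, num_features, gamma=1.0, seed=0):
    """Map inputs to random Fourier features approximating an RBF kernel.

    k(x, y) = exp(-gamma * ||x - y||^2) is approximated by phi(x) @ phi(y),
    using cosines of random projections; frequencies are drawn from the
    Fourier transform of the kernel, here N(0, 2 * gamma * I).
    """
    rng = np.random.default_rng(seed)
    d = x.shape[-1]
    w = rng.normal(scale=np.sqrt(2 * gamma), size=(d, num_features))
    b = rng.uniform(0, 2 * np.pi, size=num_features)
    return np.sqrt(2.0 / num_features) * np.cos(x @ w + b)
```

Because the approximation error shrinks roughly as `1 / sqrt(num_features)`, a few thousand features already track the exact kernel closely; this explicit map is what lets kernelized attention replace the quadratic attention matrix with linear-time products of feature maps.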
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.