SaMam: Style-aware State Space Model for Arbitrary Image Style Transfer
- URL: http://arxiv.org/abs/2503.15934v1
- Date: Thu, 20 Mar 2025 08:18:27 GMT
- Title: SaMam: Style-aware State Space Model for Arbitrary Image Style Transfer
- Authors: Hongda Liu, Longguang Wang, Ye Zhang, Ziru Yu, Yulan Guo
- Abstract summary: We develop a Mamba-based style transfer framework, termed SaMam. Specifically, a Mamba encoder is designed to efficiently extract content and style information. To address the problems of local pixel forgetting, channel redundancy and spatial discontinuity of existing SSMs, we introduce both local enhancement and zigzag scan.
- Score: 41.09041735653436
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A global effective receptive field plays a crucial role in image style transfer (ST) for obtaining high-quality stylized results. However, existing ST backbones (e.g., CNNs and Transformers) incur high computational complexity to achieve global receptive fields. Recently, the State Space Model (SSM), especially the improved variant Mamba, has shown great potential for long-range dependency modeling with linear complexity, which offers an approach to resolve the above dilemma. In this paper, we develop a Mamba-based style transfer framework, termed SaMam. Specifically, a Mamba encoder is designed to efficiently extract content and style information. In addition, a style-aware Mamba decoder is developed to flexibly adapt to various styles. Moreover, to address the problems of local pixel forgetting, channel redundancy and spatial discontinuity of existing SSMs, we introduce both local enhancement and zigzag scan. Qualitative and quantitative results demonstrate that our SaMam outperforms state-of-the-art methods in terms of both accuracy and efficiency.
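The abstract names a zigzag scan as the remedy for spatial discontinuity but does not spell out the traversal. The following is a minimal sketch under one assumption: that "zigzag" means a serpentine (boustrophedon) ordering in which every other row is read in reverse, so consecutive tokens in the flattened sequence are always neighbouring pixels. The function names `zigzag_order` and `max_step` are hypothetical, and the scan paths actually used in SaMam may differ (for example, several directions may be combined).

```python
def zigzag_order(height, width):
    """Flat indices of an H x W grid, reading odd rows right-to-left.

    Assumed serpentine layout, for illustration only; not taken from the paper.
    """
    order = []
    for r in range(height):
        cols = range(width) if r % 2 == 0 else range(width - 1, -1, -1)
        order.extend(r * width + c for c in cols)
    return order


def max_step(order, width):
    """Largest Manhattan distance between consecutive positions in a scan."""
    steps = []
    for a, b in zip(order, order[1:]):
        ra, ca = divmod(a, width)
        rb, cb = divmod(b, width)
        steps.append(abs(ra - rb) + abs(ca - cb))
    return max(steps)


if __name__ == "__main__":
    h, w = 3, 4
    raster = list(range(h * w))   # plain row-major scan
    zigzag = zigzag_order(h, w)   # serpentine scan
    print("raster max step:", max_step(raster, w))
    print("zigzag max step:", max_step(zigzag, w))
```

In a plain row-major scan the sequence jumps from the end of one row to the start of the next, whereas in the serpentine order each step moves to an adjacent pixel, which is one way to read the "spatial discontinuity" the abstract refers to.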
Related papers
- TransMamba: Fast Universal Architecture Adaption from Transformers to Mamba [88.31117598044725]
We explore cross-architecture training to transfer the knowledge in existing Transformer models to the alternative Mamba architecture, an approach termed TransMamba. Our approach employs a two-stage strategy to expedite training new Mamba models, ensuring effectiveness across uni-modal and cross-modal tasks. For cross-modal learning, we propose a cross-Mamba module that integrates language awareness into Mamba's visual features, enhancing the cross-modal interaction capabilities of the Mamba architecture.
arXiv Detail & Related papers (2025-02-21T01:22:01Z)
- MatIR: A Hybrid Mamba-Transformer Image Restoration Model [95.17418386046054]
We propose a Mamba-Transformer hybrid image restoration model called MatIR. MatIR cross-cycles the blocks of the Transformer layer and the Mamba layer to extract features. In the Mamba module, we introduce the Image Restoration State Space (IRSS) module, which traverses along four scan paths.
arXiv Detail & Related papers (2025-01-30T14:55:40Z)
- Detail Matters: Mamba-Inspired Joint Unfolding Network for Snapshot Spectral Compressive Imaging [40.80197280147993]
We propose a Mamba-inspired Joint Unfolding Network (MiJUN) to overcome the inherent nonlinear and ill-posed characteristics of HSI reconstruction. We introduce an accelerated unfolding network scheme, which reduces the reliance on initial optimization stages. We refine the scanning strategy with Mamba by integrating the tensor mode-$k$ unfolding into the Mamba network.
arXiv Detail & Related papers (2025-01-02T13:56:23Z)
- StyleRWKV: High-Quality and High-Efficiency Style Transfer with RWKV-like Architecture [29.178246094092202]
Style transfer aims to generate a new image preserving the content but with the artistic representation of the style source. Most existing methods are based on Transformers or diffusion models; however, they suffer from quadratic computational complexity and high inference time. We present a novel framework, StyleRWKV, to achieve high-quality style transfer with limited memory usage and linear time complexity.
arXiv Detail & Related papers (2024-12-27T09:01:15Z)
- Cross-Scan Mamba with Masked Training for Robust Spectral Imaging [51.557804095896174]
We propose the Cross-Scanning Mamba, named CS-Mamba, that employs a Spatial-Spectral SSM for global-local balanced context encoding.
Experimental results show that our CS-Mamba achieves state-of-the-art performance and the masked training method can better reconstruct smooth features to improve the visual quality.
arXiv Detail & Related papers (2024-08-01T15:14:10Z)
- Deform-Mamba Network for MRI Super-Resolution [7.97504951029884]
We propose a new architecture, called Deform-Mamba, for MR image super-resolution.
We develop a Deform-Mamba encoder composed of two branches: a modulated deform block and a vision Mamba block.
Thanks to the extracted features of the encoder, which include content-adaptive local and efficient global information, the vision Mamba decoder finally generates high-quality MR images.
arXiv Detail & Related papers (2024-07-08T14:07:26Z)
- Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation [16.298890431384564]
We introduce Sigma, a Siamese Mamba network for multi-modal semantic segmentation utilizing the advanced Mamba.
By employing a Siamese encoder and innovating a Mamba-based fusion mechanism, we effectively select essential information from different modalities.
Our proposed method is rigorously evaluated on both RGB-Thermal and RGB-Depth semantic segmentation tasks.
arXiv Detail & Related papers (2024-04-05T17:59:44Z)
- MambaIR: A Simple Baseline for Image Restoration with State-Space Model [46.827053426281715]
We propose MambaIR, which adds both local enhancement and channel attention to improve the vanilla Mamba.
Our method outperforms SwinIR by up to 0.45dB on image SR, using similar computational cost but with a global receptive field.
arXiv Detail & Related papers (2024-02-23T23:15:54Z)
- PointMamba: A Simple State Space Model for Point Cloud Analysis [65.59944745840866]
We propose PointMamba, transferring the success of Mamba, a recent representative state space model (SSM), from NLP to point cloud analysis tasks.
Unlike traditional Transformers, PointMamba employs a linear complexity algorithm, presenting global modeling capacity while significantly reducing computational costs.
arXiv Detail & Related papers (2024-02-16T14:56:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.