Related papers: Semantics and Content Matter: Towards Multi-Prior Hierarchical Mamba for Image Deraining

Semantics and Content Matter: Towards Multi-Prior Hierarchical Mamba for Image Deraining

URL: http://arxiv.org/abs/2511.13113v1
Date: Mon, 17 Nov 2025 08:08:59 GMT
Title: Semantics and Content Matter: Towards Multi-Prior Hierarchical Mamba for Image Deraining
Authors: Zhaocheng Yu, Kui Jiang, Junjun Jiang, Xianming Liu, Guanglu Sun, Yi Xiao,
Abstract summary: Multi-Prior Hierarchical Mamba (MPHM) network for image deraining.<n>MPHM integrates macro-semantic textual priors (CLIP) for task-level semantic guidance and micro-structural visual priors (DINOv2) for scene-aware structural information.<n>Experiments demonstrate MPHM's state-of-the-art performance, achieving a 0.57 dB PSNR gain on the Rain200H dataset.
Score: 95.00432497331583
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Rain significantly degrades the performance of computer vision systems, particularly in applications like autonomous driving and video surveillance. While existing deraining methods have made considerable progress, they often struggle with fidelity of semantic and spatial details. To address these limitations, we propose the Multi-Prior Hierarchical Mamba (MPHM) network for image deraining. This novel architecture synergistically integrates macro-semantic textual priors (CLIP) for task-level semantic guidance and micro-structural visual priors (DINOv2) for scene-aware structural information. To alleviate potential conflicts between heterogeneous priors, we devise a progressive Priors Fusion Injection (PFI) that strategically injects complementary cues at different decoder levels. Meanwhile, we equip the backbone network with an elaborate Hierarchical Mamba Module (HMM) to facilitate robust feature representation, featuring a Fourier-enhanced dual-path design that concurrently addresses global context modeling and local detail recovery. Comprehensive experiments demonstrate MPHM's state-of-the-art performance, achieving a 0.57 dB PSNR gain on the Rain200H dataset while delivering superior generalization on real-world rainy scenarios.

Related papers

MammothModa2: A Unified AR-Diffusion Framework for Multimodal Understanding and Generation [20.14002849273559]
Unified multimodal models aim to integrate understanding and generation within a single framework.<n>We present MammothModa2 (Mammoth2), a unified autoregressive-diffusion (AR-Diffusion) framework.<n>Mammoth2 delivers strong text-to-image and instruction-based editing performance on public benchmarks.
arXiv Detail & Related papers (2025-11-23T03:25:39Z)
AFM-Net: Advanced Fusing Hierarchical CNN Visual Priors with Global Sequence Modeling for Remote Sensing Image Scene Classification [32.67944942908809]
AFM-Net is a novel framework that achieves effective local and global co-representation through two pathways.<n>The core innovation of AFM-Net lies in its Hierarchical Fusion Mechanism, which aggregates multi-scale features from both pathways.<n>Experiments on AID, NWPU-RESISC45, and UC Merced show that AFM-Net obtains 93.72, 95.54, and 96.92 percent accuracy, surpassing state-of-the-art methods with balanced performance and efficiency.
arXiv Detail & Related papers (2025-10-31T03:55:16Z)
SynergyNet: Fusing Generative Priors and State-Space Models for Facial Beauty Prediction [0.0]
This paper introduces the textbfMamba-Diffusion Network (MD-Net), a novel dual-stream architecture for predicting facial beauty.<n>MD-Net sets a new state-of-the-art, achieving a Pearson Correlation of textbf0.9235 and demonstrating the significant potential of hybrid architectures.
arXiv Detail & Related papers (2025-09-21T17:36:42Z)
Towards Efficient General Feature Prediction in Masked Skeleton Modeling [59.46799426434277]
We propose a novel General Feature Prediction framework (GFP) for efficient mask skeleton modeling.<n>Our key innovation is replacing conventional low-level reconstruction with high-level feature prediction that spans from local motion patterns to global semantic representations.
arXiv Detail & Related papers (2025-09-03T18:05:02Z)
GPSMamba: A Global Phase and Spectral Prompt-guided Mamba for Infrared Image Super-Resolution [7.842507196763463]
Infrared Image Super-Resolution is challenged by the low contrast and sparse textures of infrared data.<n>GPSMamba is a framework that synergizes architectural guidance with non-causal supervision.
arXiv Detail & Related papers (2025-07-25T06:56:16Z)
PMA: Towards Parameter-Efficient Point Cloud Understanding via Point Mamba Adapter [54.33433051500349]
We propose Point Mamba Adapter (PMA), which constructs an ordered feature sequence from all layers of the pre-trained model.<n>We also propose a geometry-constrained gate prompt generator (G2PG) shared across different layers.
arXiv Detail & Related papers (2025-05-27T09:27:16Z)
VRS-UIE: Value-Driven Reordering Scanning for Underwater Image Enhancement [104.78586859995333]
State Space Models (SSMs) have emerged as a promising backbone for vision tasks due to their linear complexity and global receptive field.<n>The predominance of large-portion, homogeneous but useless oceanic backgrounds can dilute the feature representation responses of sparse yet valuable targets.<n>We propose a novel Value-Driven Reordering Scanning framework for Underwater Image Enhancement (UIE)<n>Our framework sets a new state-of-the-art, delivering superior enhancement performance (surpassing WMamba by 0.89 dB on average) by effectively suppressing water bias and preserving structural and color fidelity.
arXiv Detail & Related papers (2025-05-02T12:21:44Z)
An Efficient and Mixed Heterogeneous Model for Image Restoration [71.85124734060665]
Current mainstream approaches are based on three architectural paradigms: CNNs, Transformers, and Mambas.<n>We propose RestorMixer, an efficient and general-purpose IR model based on mixed-architecture fusion.
arXiv Detail & Related papers (2025-04-15T08:19:12Z)
SIGMA: Selective Gated Mamba for Sequential Recommendation [56.85338055215429]
Mamba, a recent advancement, has exhibited exceptional performance in time series prediction.<n>We introduce a new framework named Selective Gated Mamba ( SIGMA) for Sequential Recommendation.<n>Our results indicate that SIGMA outperforms current models on five real-world datasets.
arXiv Detail & Related papers (2024-08-21T09:12:59Z)
Multi-Scale Hourglass Hierarchical Fusion Network for Single Image Deraining [8.964751500091005]
Rain streaks bring serious blurring and visual quality degradation, which often vary in size, direction and density. Current CNN-based methods achieve encouraging performance, while are limited to depict rain characteristics and recover image details in the poor visibility environment. We present a Multi-scale Hourglass Hierarchical Fusion Network (MH2F-Net) in end-to-end manner, to exactly captures rain streak features with multi-scale extraction, hierarchical distillation and information aggregation.
arXiv Detail & Related papers (2021-04-25T08:27:01Z)
A Model-driven Deep Neural Network for Single Image Rain Removal [52.787356046951494]
We propose a model-driven deep neural network for the task, with fully interpretable network structures. Based on the convolutional dictionary learning mechanism for representing rain, we propose a novel single image deraining model. All the rain kernels and operators can be automatically extracted, faithfully characterizing the features of both rain and clean background layers.
arXiv Detail & Related papers (2020-05-04T09:13:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.