Frequency Spectrum is More Effective for Multimodal Representation and
Fusion: A Multimodal Spectrum Rumor Detector
- URL: http://arxiv.org/abs/2312.11023v1
- Date: Mon, 18 Dec 2023 08:55:42 GMT
- Title: Frequency Spectrum is More Effective for Multimodal Representation and
Fusion: A Multimodal Spectrum Rumor Detector
- Authors: An Lao, Qi Zhang, Chongyang Shi, Longbing Cao, Kun Yi, Liang Hu,
Duoqian Miao
- Abstract summary: Multimodal content, such as mixing text with images, presents significant challenges to rumor detection in social media.
This work makes the first attempt at multimodal rumor detection in the frequency domain, which efficiently transforms spatial features into the frequency spectrum.
A novel Frequency Spectrum Representation and fUsion network (FSRU) with dual contrastive learning reveals the frequency spectrum is more effective for multimodal representation and fusion.
- Score: 42.079129968058275
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multimodal content, such as mixing text with images, presents significant
challenges to rumor detection in social media. Existing multimodal rumor
detection methods have focused on mixing tokens among spatial and sequential locations
for unimodal representation or fusing clues of rumor veracity across
modalities. However, they suffer from less discriminative unimodal
representation and are vulnerable to intricate location dependencies in the
time-consuming fusion of spatial and sequential tokens. This work makes the
first attempt at multimodal rumor detection in the frequency domain, which
efficiently transforms spatial features into the frequency spectrum and obtains
highly discriminative spectrum features for multimodal representation and
fusion. A novel Frequency Spectrum Representation and fUsion network (FSRU)
with dual contrastive learning reveals the frequency spectrum is more effective
for multimodal representation and fusion, extracting the informative components
for rumor detection. FSRU involves three novel mechanisms: utilizing the
Fourier transform to convert features in the spatial domain to the frequency
domain, the unimodal spectrum compression, and the cross-modal spectrum
co-selection module in the frequency domain. Substantial experiments show that
FSRU achieves satisfactory multimodal rumor detection performance.
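The three mechanisms named in the abstract can be illustrated with a small NumPy sketch. This is not the authors' FSRU implementation: the function names, the low-frequency truncation used as a stand-in for spectrum compression, and the magnitude-based gate used as a stand-in for co-selection are all assumptions made for illustration, and the real model uses learned components and dual contrastive learning.

```python
import numpy as np

def to_spectrum(features):
    """Real FFT along the token axis: (tokens, dim) real -> (tokens//2 + 1, dim) complex."""
    return np.fft.rfft(features, axis=0, norm="ortho")

def compress_spectrum(spectrum, keep):
    """Hypothetical compression: keep only the `keep` lowest-frequency bins."""
    return spectrum[:keep]

def co_select(text_spec, image_spec):
    """Hypothetical co-selection: scale each bin of one modality by the
    normalized mean magnitude of the same bin in the other modality."""
    text_gate = np.abs(image_spec).mean(axis=1, keepdims=True)
    image_gate = np.abs(text_spec).mean(axis=1, keepdims=True)
    text_gate = text_gate / (text_gate.max() + 1e-8)
    image_gate = image_gate / (image_gate.max() + 1e-8)
    return text_spec * text_gate, image_spec * image_gate

def fuse(text_feats, image_feats, keep=8):
    """Transform, compress, co-select, then return to the feature domain and pool.
    Assumes both inputs yield at least `keep` frequency bins."""
    t = compress_spectrum(to_spectrum(text_feats), keep)
    v = compress_spectrum(to_spectrum(image_feats), keep)
    t, v = co_select(t, v)
    n = 2 * (keep - 1)  # output length matching `keep` rfft bins
    t_back = np.fft.irfft(t, n=n, axis=0, norm="ortho")
    v_back = np.fft.irfft(v, n=n, axis=0, norm="ortho")
    # Mean-pool each modality over tokens and concatenate into one vector.
    return np.concatenate([t_back.mean(axis=0), v_back.mean(axis=0)])

# Example with random stand-ins for text tokens and image patches.
rng = np.random.default_rng(0)
text_feats = rng.normal(size=(32, 16))   # 32 text tokens, 16-dim features
image_feats = rng.normal(size=(49, 16))  # 49 image patches, 16-dim features
fused = fuse(text_feats, image_feats, keep=8)
```

Because both modalities are truncated to the same number of frequency bins, the gate in `co_select` can be applied bin by bin without aligning token positions, which loosely mirrors the abstract's point about avoiding location dependencies during fusion.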
Related papers
- F2former: When Fractional Fourier Meets Deep Wiener Deconvolution and Selective Frequency Transformer for Image Deblurring [8.296475046681696]
We propose a novel approach based on the Fractional Fourier Transform (FRFT), a unified spatial-frequency representation.
We show that the performance of our proposed method is superior to other state-of-the-art (SOTA) approaches.
arXiv Detail & Related papers (2024-09-03T17:05:12Z)
- Multiple Contexts and Frequencies Aggregation Network for Deepfake Detection [5.65128683992597]
Deepfake detection faces increasing challenges since the fast growth of generative models in developing massive and diverse Deepfake technologies.
Recent advances rely on introducing features from spatial or frequency domains rather than modeling general forgery features within backbones.
We propose an efficient network for face forgery detection named MkfaNet, which consists of two core modules.
arXiv Detail & Related papers (2024-08-03T05:34:53Z)
- MFF-EINV2: Multi-scale Feature Fusion across Spectral-Spatial-Temporal Domains for Sound Event Localization and Detection [4.165508411354963]
Event-Independent Network V2 (EINV2) has achieved outstanding performance on Sound Event Localization and Detection.
This paper proposes a three-stage network structure named Multi-scale Feature Fusion (MFF) module to fully extract multi-scale features across spectral, spatial, and temporal domains.
arXiv Detail & Related papers (2024-06-13T03:03:02Z)
- FourierMamba: Fourier Learning Integration with State Space Models for Image Deraining [71.46369218331215]
Image deraining aims to remove rain streaks from rainy images and restore clear backgrounds.
We propose a new framework termed FourierMamba, which performs image deraining with Mamba in the Fourier space.
arXiv Detail & Related papers (2024-05-29T18:58:59Z)
- Frequency-Aware Deepfake Detection: Improving Generalizability through Frequency Space Learning [81.98675881423131]
This research addresses the challenge of developing a universal deepfake detector that can effectively identify unseen deepfake images.
Existing frequency-based paradigms have relied on frequency-level artifacts introduced during the up-sampling in GAN pipelines to detect forgeries.
We introduce a novel frequency-aware approach called FreqNet, centered around frequency domain learning, specifically designed to enhance the generalizability of deepfake detectors.
arXiv Detail & Related papers (2024-03-12T01:28:00Z)
- Dual-path Frequency Discriminators for Few-shot Anomaly Detection [12.956761809902167]
We propose a Dual-Path Frequency Discriminators (DFD) network from a frequency perspective to tackle these issues.
The discriminators learn a joint representation with forms of pseudo-anomalies.
Experiments conducted on MVTec AD and VisA benchmarks demonstrate that our DFD surpasses current state-of-the-art methods.
arXiv Detail & Related papers (2024-03-07T02:17:59Z)
- A Dual Domain Multi-exposure Image Fusion Network based on the Spatial-Frequency Integration [57.14745782076976]
Multi-exposure image fusion aims to generate a single high-dynamic image by integrating images with different exposures.
We propose a novel perspective on multi-exposure image fusion via the Spatial-Frequency Integration Framework, named MEF-SFI.
Our method achieves visually appealing fusion results compared with state-of-the-art multi-exposure image fusion approaches.
arXiv Detail & Related papers (2023-12-17T04:45:15Z)
- Deep Fourier Up-Sampling [100.59885545206744]
Up-sampling in the Fourier domain is more challenging as it does not follow such a local property.
We propose a theoretically sound Deep Fourier Up-Sampling (FourierUp) to solve these issues.
arXiv Detail & Related papers (2022-10-11T06:17:31Z)
- Multi-Scale Wavelet Transformer for Face Forgery Detection [43.33712402517951]
We propose a multi-scale wavelet transformer framework for face forgery detection.
Frequency-based spatial attention is designed to guide the spatial feature extractor to concentrate more on forgery traces.
Cross-modality attention is proposed to fuse the frequency features with the spatial features.
arXiv Detail & Related papers (2022-10-08T03:39:36Z)
- Adaptive Frequency Learning in Two-branch Face Forgery Detection [66.91715092251258]
We propose to Adaptively learn Frequency information in the two-branch Detection framework, dubbed AFD.
We liberate our network from the fixed frequency transforms, and achieve better performance with our data- and task-dependent transform layers.
arXiv Detail & Related papers (2022-03-27T14:25:52Z)