Metamorphic Testing for Audio Content Moderation Software
- URL: http://arxiv.org/abs/2509.24215v1
- Date: Mon, 29 Sep 2025 02:55:07 GMT
- Title: Metamorphic Testing for Audio Content Moderation Software
- Authors: Wenxuan Wang, Yongjiang Wu, Junyuan Zhang, Shuqing Li, Yun Peng, Wenting Chen, Shuai Wang, Michael R. Lyu
- Abstract summary: We propose MTAM, a Metamorphic Testing framework for Audio content Moderation software. We conduct a pilot study on 2000 audio clips and define 14 metamorphic relations across two perturbation categories: Audio Features-Based and Heuristic perturbations. MTAM applies these metamorphic relations to toxic audio content to generate test cases that remain harmful while being more likely to evade detection.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rapid growth of audio-centric platforms and applications such as WhatsApp and Twitter has transformed the way people communicate and share audio content in modern society. However, these platforms are increasingly misused to disseminate harmful audio content, such as hate speech, deceptive advertisements, and explicit material, which can have significant negative consequences (e.g., detrimental effects on mental health). In response, researchers and practitioners have been actively developing and deploying audio content moderation tools to tackle this issue. Despite these efforts, malicious actors can bypass moderation systems by making subtle alterations to audio content, such as modifying pitch or inserting noise. Moreover, the effectiveness of modern audio moderation tools against such adversarial inputs remains insufficiently studied. To address these challenges, we propose MTAM, a Metamorphic Testing framework for Audio content Moderation software. Specifically, we conduct a pilot study on 2000 audio clips and define 14 metamorphic relations across two perturbation categories: Audio Features-Based and Heuristic perturbations. MTAM applies these metamorphic relations to toxic audio content to generate test cases that remain harmful while being more likely to evade detection. In our evaluation, we employ MTAM to test five commercial audio content moderation software and an academic model against three kinds of toxic content. The results show that MTAM achieves up to 38.6%, 18.3%, 35.1%, 16.7%, and 51.1% error finding rates (EFR) when testing commercial moderation software provided by Gladia, Assembly AI, Baidu, Nextdata, and Tencent, respectively, and it obtains up to 45.7% EFR when testing the state-of-the-art academic model.
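The metamorphic-testing idea in the abstract can be illustrated with a minimal sketch. The function names, perturbation parameters, and the toy moderation oracle below are hypothetical illustrations, not MTAM's actual implementation; the relation checked is the one the abstract describes: a clip labeled toxic must still be flagged as toxic after a label-preserving perturbation, and any flip to "safe" is an error-revealing test case.

```python
import numpy as np

def add_noise(audio, snr_db=20.0, rng=None):
    """Heuristic perturbation (illustrative): mix in Gaussian noise at a target SNR in dB."""
    rng = rng or np.random.default_rng(0)
    signal_power = np.mean(audio ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    return audio + rng.normal(0.0, np.sqrt(noise_power), size=audio.shape)

def shift_pitch(audio, factor=1.1):
    """Audio-features perturbation (illustrative): crude pitch/speed shift via
    linear-interpolation resampling; factor > 1 raises pitch and shortens the clip."""
    idx = np.arange(0, len(audio), factor)
    return np.interp(idx, np.arange(len(audio)), audio)

def check_metamorphic_relation(moderate, toxic_audio):
    """Metamorphic relation: a toxic clip must remain flagged after perturbation.
    `moderate` is any classifier mapping a waveform to "toxic" or "safe"."""
    assert moderate(toxic_audio) == "toxic", "seed input must already be flagged"
    for perturbed in (add_noise(toxic_audio), shift_pitch(toxic_audio)):
        if moderate(perturbed) != "toxic":
            return False  # relation violated: error-revealing test case found
    return True
```

A real harness would wrap a moderation API call in `moderate` and count violations to compute an error finding rate; resampling is only a rough stand-in for a proper pitch shift, which would preserve duration.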
Related papers
- Synthetic Voices, Real Threats: Evaluating Large Text-to-Speech Models in Generating Harmful Audio [63.18443674004945]
This work explores a content-centric threat: exploiting TTS systems to produce speech containing harmful content. We present HARMGEN, a suite of five attacks organized into two families that address these challenges.
arXiv Detail & Related papers (2025-11-14T03:00:04Z)
- AHELM: A Holistic Evaluation of Audio-Language Models [78.20477815156484]
Multimodal audio-language models (ALMs) take interleaved audio and text as input and output text. AHELM is a benchmark that aggregates various datasets -- including 2 new synthetic audio-text datasets called PARADE and CoRe-Bench. We also standardize the prompts, inference parameters, and evaluation metrics to ensure equitable comparisons across models.
arXiv Detail & Related papers (2025-08-29T07:40:39Z)
- Audio Does Matter: Importance-Aware Multi-Granularity Fusion for Video Moment Retrieval [58.640807985155554]
Video Moment Retrieval (VMR) aims to retrieve a specific moment semantically related to a given query. Most existing VMR methods solely focus on the visual and textual modalities while neglecting the complementary but important audio modality. We propose a novel Importance-aware Multi-Granularity fusion model (IMG), which learns to dynamically and selectively aggregate the audio-vision-text contexts for VMR.
arXiv Detail & Related papers (2025-08-06T09:58:43Z)
- Audio Is the Achilles' Heel: Red Teaming Audio Large Multimodal Models [50.89022445197919]
We show that open-source audio LMMs suffer an average attack success rate of 69.14% on harmful audio questions.
Our speech-specific jailbreaks on Gemini-1.5-Pro achieve an attack success rate of 70.67% on the harmful query benchmark.
arXiv Detail & Related papers (2024-10-31T12:11:17Z)
- AdVerb: Visually Guided Audio Dereverberation [49.958724234969445]
We present AdVerb, a novel audio-visual dereverberation framework.
It uses visual cues in addition to the reverberant sound to estimate clean audio.
arXiv Detail & Related papers (2023-08-23T18:20:59Z)
- An Image is Worth a Thousand Toxic Words: A Metamorphic Testing Framework for Content Moderation Software [64.367830425115]
Social media platforms are being increasingly misused to spread toxic content, including hate speech, malicious advertising, and pornography.
Despite tremendous efforts in developing and deploying content moderation methods, malicious users can evade moderation by embedding texts into images.
We propose a metamorphic testing framework for content moderation software.
arXiv Detail & Related papers (2023-08-18T20:33:06Z)
- Validating Multimedia Content Moderation Software via Semantic Fusion [16.322773343799575]
We introduce Semantic Fusion, a general, effective methodology for validating multimedia content moderation software.
We employ DUO to test five commercial content moderation software and two state-of-the-art models against three kinds of toxic content.
The results show that DUO achieves up to 100% error finding rate (EFR) when testing moderation software.
arXiv Detail & Related papers (2023-05-23T02:44:15Z)
- MTTM: Metamorphic Testing for Textual Content Moderation Software [11.759353169546646]
Social media platforms have been increasingly exploited to propagate toxic content.
However, malicious users can evade moderation by changing only a few words in the toxic content.
We propose MTTM, a Metamorphic Testing framework for Textual content Moderation software.
arXiv Detail & Related papers (2023-02-11T14:44:39Z)
- Multilingual and Multimodal Abuse Detection [3.4352862428120123]
This paper attempts abuse detection in conversational audio from a multimodal perspective in a multilingual social media setting.
Our proposed method, MADA, explicitly focuses on two modalities other than the audio itself.
We test the proposed approach on 10 different languages and observe consistent gains in the range of 0.6%-5.2% by leveraging multiple modalities.
arXiv Detail & Related papers (2022-04-03T13:28:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.