Related papers: Towards Probing Speech-Specific Risks in Large Multimodal Models: A Taxonomy, Benchmark, and Insights

Towards Probing Speech-Specific Risks in Large Multimodal Models: A Taxonomy, Benchmark, and Insights

URL: http://arxiv.org/abs/2406.17430v1
Date: Tue, 25 Jun 2024 10:08:45 GMT
Title: Towards Probing Speech-Specific Risks in Large Multimodal Models: A Taxonomy, Benchmark, and Insights
Authors: Hao Yang, Lizhen Qu, Ehsan Shareghi, Gholamreza Haffari,
Abstract summary: We propose a speech-specific risk taxonomy, covering 8 risk categories under hostility (malicious sarcasm and threats), malicious imitation (age, gender, ethnicity), and stereotypical biases (age, gender, ethnicity) Based on the taxonomy, we create a small-scale dataset for evaluating current LMMs capability in detecting these categories of risk.
Score: 50.89022445197919
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Multimodal Models (LMMs) have achieved great success recently, demonstrating a strong capability to understand multimodal information and to interact with human users. Despite the progress made, the challenge of detecting high-risk interactions in multimodal settings, and in particular in speech modality, remains largely unexplored. Conventional research on risk for speech modality primarily emphasises the content (e.g., what is captured as transcription). However, in speech-based interactions, paralinguistic cues in audio can significantly alter the intended meaning behind utterances. In this work, we propose a speech-specific risk taxonomy, covering 8 risk categories under hostility (malicious sarcasm and threats), malicious imitation (age, gender, ethnicity), and stereotypical biases (age, gender, ethnicity). Based on the taxonomy, we create a small-scale dataset for evaluating current LMMs capability in detecting these categories of risk. We observe even the latest models remain ineffective to detect various paralinguistic-specific risks in speech (e.g., Gemini 1.5 Pro is performing only slightly above random baseline). Warning: this paper contains biased and offensive examples.

Related papers

MIRAGE: Multimodal Immersive Reasoning and Guided Exploration for Red-Team Jailbreak Attacks [85.3303135160762]
MIRAGE is a novel framework that exploits narrative-driven context and role immersion to circumvent safety mechanisms in Multimodal Large Language Models. It achieves state-of-the-art performance, improving attack success rates by up to 17.5% over the best baselines. We demonstrate that role immersion and structured semantic reconstruction can activate inherent model biases, facilitating the model's spontaneous violation of ethical safeguards.
arXiv Detail & Related papers (2025-03-24T20:38:42Z)
Extreme Speech Classification in the Era of LLMs: Exploring Open-Source and Proprietary Models [0.30693357740321775]
ChatGPT has drawn global attention to the potential applications of Large Language Models (LLMs) We leverage the Indian subset of the extreme speech dataset from Maronikolakis et al. (2022) to develop an effective classification framework using LLMs. We evaluate open-source Llama models against closed-source OpenAI models, finding that while pre-trained LLMs show moderate efficacy, fine-tuning with domain-specific data significantly enhances performance.
arXiv Detail & Related papers (2025-02-21T02:31:05Z)
Unmasking Conversational Bias in AI Multiagent Systems [1.0705399532413618]
biases that may arise in multi-agent systems involving generative models remain under-researched. We present a framework designed to quantify biases within multi-agent systems of conversational Large Language Models. The bias observed in the echo-chamber experiment remains undetected by current state-of-the-art bias detection methods.
arXiv Detail & Related papers (2025-01-24T09:10:02Z)
On the Privacy Risk of In-context Learning [36.633860818454984]
We show that deploying prompted models presents a significant privacy risk for the data used within the prompt. We also observe that the privacy risk of prompted models exceeds fine-tuned models at the same utility levels.
arXiv Detail & Related papers (2024-11-15T17:11:42Z)
Spoken Stereoset: On Evaluating Social Bias Toward Speaker in Speech Large Language Models [50.40276881893513]
This study introduces Spoken Stereoset, a dataset specifically designed to evaluate social biases in Speech Large Language Models (SLLMs) By examining how different models respond to speech from diverse demographic groups, we aim to identify these biases. The findings indicate that while most models show minimal bias, some still exhibit slightly stereotypical or anti-stereotypical tendencies.
arXiv Detail & Related papers (2024-08-14T16:55:06Z)
Imposter.AI: Adversarial Attacks with Hidden Intentions towards Aligned Large Language Models [13.225041704917905]
This study unveils an attack mechanism that capitalizes on human conversation strategies to extract harmful information from large language models. Unlike conventional methods that target explicit malicious responses, our approach delves deeper into the nature of the information provided in responses.
arXiv Detail & Related papers (2024-07-22T06:04:29Z)
MOSSBench: Is Your Multimodal Language Model Oversensitive to Safe Queries? [70.77691645678804]
Humans are prone to cognitive distortions -- biased thinking patterns that lead to exaggerated responses to specific stimuli. This paper demonstrates that advanced Multimodal Large Language Models (MLLMs) exhibit similar tendencies. We identify three types of stimuli that trigger the oversensitivity of existing MLLMs: Exaggerated Risk, Negated Harm, and Counterintuitive.
arXiv Detail & Related papers (2024-06-22T23:26:07Z)
Unique Security and Privacy Threats of Large Language Model: A Comprehensive Survey [46.19229410404056]
Large language models (LLMs) have made remarkable advancements in natural language processing. These models are trained on vast datasets to exhibit powerful language understanding and generation capabilities. Privacy and security issues have been revealed throughout their life cycle.
arXiv Detail & Related papers (2024-06-12T07:55:32Z)
SHIELD : An Evaluation Benchmark for Face Spoofing and Forgery Detection with Multimodal Large Language Models [63.946809247201905]
We introduce a new benchmark, namely SHIELD, to evaluate the ability of MLLMs on face spoofing and forgery detection. We design true/false and multiple-choice questions to evaluate multimodal face data in these two face security tasks. The results indicate that MLLMs hold substantial potential in the face security domain.
arXiv Detail & Related papers (2024-02-06T17:31:36Z)
Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey [50.58063811745676]
This work provides a survey of practical methods for addressing potential threats and societal harms from language generation models. We draw on several prior works' of language model risks to present a structured overview of strategies for detecting and ameliorating different kinds of risks/harms of language generators.
arXiv Detail & Related papers (2022-10-14T10:43:39Z)
Two steps to risk sensitivity [4.974890682815778]
conditional value-at-risk (CVaR) is a risk measure for modeling human and animal planning. We adopt a conventional distributional approach to CVaR in a sequential setting and reanalyze the choices of human decision-makers. We then consider a further critical property of risk sensitivity, namely time consistency, showing alternatives to this form of CVaR.
arXiv Detail & Related papers (2021-11-12T16:27:47Z)
Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models [86.02610674750345]
Adversarial GLUE (AdvGLUE) is a new multi-task benchmark to explore and evaluate the vulnerabilities of modern large-scale language models under various types of adversarial attacks. We apply 14 adversarial attack methods to GLUE tasks to construct AdvGLUE, which is further validated by humans for reliable annotations. All the language models and robust training methods we tested perform poorly on AdvGLUE, with scores lagging far behind the benign accuracy.
arXiv Detail & Related papers (2021-11-04T12:59:55Z)

This list is automatically generated from the titles and abstracts of the papers in this site.