ALARM: Automated MLLM-Based Anomaly Detection in Complex-EnviRonment Monitoring with Uncertainty Quantification
- URL: http://arxiv.org/abs/2512.03101v1
- Date: Mon, 01 Dec 2025 19:03:14 GMT
- Title: ALARM: Automated MLLM-Based Anomaly Detection in Complex-EnviRonment Monitoring with Uncertainty Quantification
- Authors: Congjing Zhang, Feng Lin, Xinyi Zhao, Pei Guo, Wei Li, Lin Chen, Chaoyue Zhao, Shuai Huang,
- Abstract summary: In this paper, we introduce our UQ-supported MLLM-based visual anomaly detection framework called ALARM. ALARM integrates quality-assurance techniques like reasoning chain, self-reflection, and MLLM ensemble for robust and accurate performance. Extensive empirical evaluations are conducted using the real-world smart-home benchmark data and wound image classification data, which show ALARM's superior performance and its generic applicability across different domains for reliable decision-making.
- Score: 16.05388703860442
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The advance of Large Language Models (LLMs) has greatly stimulated research interest in developing multi-modal LLM (MLLM)-based visual anomaly detection (VAD) algorithms that can be deployed in complex environments. The challenge is that in these complex environments, the anomalies are sometimes highly contextual and also ambiguous; thereby, uncertainty quantification (UQ) is a crucial capability for an MLLM-based VAD system to succeed. In this paper, we introduce our UQ-supported MLLM-based VAD framework called ALARM. ALARM integrates UQ with quality-assurance techniques like reasoning chain, self-reflection, and MLLM ensemble for robust and accurate performance and is designed based on a rigorous probabilistic inference pipeline and computational process. Extensive empirical evaluations are conducted using the real-world smart-home benchmark data and wound image classification data, which show ALARM's superior performance and its generic applicability across different domains for reliable decision-making.
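The abstract does not spell out ALARM's computational pipeline, but the combination it names (MLLM ensemble plus UQ, with self-reflection triggered on uncertain cases) can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the stub `judges` stand in for real MLLM calls, and the entropy-of-the-mean uncertainty measure and threshold are generic choices, not the paper's actual probabilistic inference procedure.

```python
import math
from typing import Callable, List

def ensemble_anomaly_score(judges: List[Callable[[str], float]],
                           frame_description: str,
                           uncertainty_threshold: float = 0.9) -> dict:
    """Aggregate anomaly probabilities from several MLLM 'judges' and
    quantify disagreement via the binary entropy of the mean prediction."""
    probs = [judge(frame_description) for judge in judges]
    p = sum(probs) / len(probs)  # ensemble mean anomaly probability
    eps = 1e-12                  # guard against log2(0)
    # Binary entropy in bits: 0 = unanimous judges, 1 = maximal disagreement
    entropy = -(p * math.log2(p + eps) + (1 - p) * math.log2(1 - p + eps))
    return {
        "anomaly_prob": p,
        "uncertainty": entropy,
        # A high-uncertainty case would be routed to a self-reflection pass
        "needs_reflection": entropy > uncertainty_threshold,
    }

# Hypothetical judge outputs standing in for three MLLM responses
judges = [lambda d: 0.9, lambda d: 0.8, lambda d: 0.4]
result = ensemble_anomaly_score(judges, "person near window at 3am")
```

With these stub outputs the ensemble mean is 0.7, and the disagreement entropy sits just below the (arbitrary) reflection threshold, so this frame would be flagged as anomalous without a second pass.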
Related papers
- Uncertainty Quantification for Multimodal Large Language Models with Incoherence-adjusted Semantic Volume [45.38219855706969]
We introduce UMPIRE, a training-free uncertainty quantification framework for Multimodal Large Language Models (MLLMs). UMPIRE computes the incoherence-adjusted semantic volume of sampled MLLM responses for a given task instance. We show that UMPIRE consistently outperforms baseline metrics in error detection and uncertainty calibration across image, audio, and video-text benchmarks.
arXiv Detail & Related papers (2026-02-27T17:18:42Z) - SIMBA UQ: Similarity-Based Aggregation for Uncertainty Quantification in Large Language Models [17.805673311465295]
Uncertainty quantification (UQ) provides measures of uncertainty. Black-box UQ methods do not require access to internal model information. We propose a high-level non-verbalized similarity-based aggregation framework.
arXiv Detail & Related papers (2025-10-10T17:22:53Z) - LLM as an Algorithmist: Enhancing Anomaly Detectors via Programmatic Synthesis [40.82779720776548]
Large Language Models (LLMs) show remarkable reasoning capabilities. Our framework repositions the LLM from a "data processor" to an "algorithmist".
arXiv Detail & Related papers (2025-10-04T19:00:51Z) - Exploring LLM-based Frameworks for Fault Diagnosis [2.2562573557834686]
Large Language Model (LLM)-based systems present new opportunities for autonomous health monitoring in sensor-rich industrial environments. This study explores the potential of LLMs to detect and classify faults directly from sensor data, while producing inherently explainable outputs through natural language reasoning.
arXiv Detail & Related papers (2025-09-27T04:53:15Z) - Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs [78.09559830840595]
We present the first systematic study on quantizing diffusion-based language models. We identify the presence of activation outliers, characterized by abnormally large activation values. We implement state-of-the-art PTQ methods and conduct a comprehensive evaluation.
arXiv Detail & Related papers (2025-08-20T17:59:51Z) - Agentic Reinforced Policy Optimization [66.96989268893932]
Large-scale reinforcement learning with verifiable rewards (RLVR) has demonstrated its effectiveness in harnessing the potential of large language models (LLMs) for single-turn reasoning tasks. Current RL algorithms inadequately balance the models' intrinsic long-horizon reasoning capabilities and their proficiency in multi-turn tool interactions. We propose Agentic Reinforced Policy Optimization (ARPO), a novel agentic RL algorithm tailored for training multi-turn LLM-based agents.
arXiv Detail & Related papers (2025-07-26T07:53:11Z) - Discrete Tokenization for Multimodal LLMs: A Comprehensive Survey [69.45421620616486]
This work presents the first structured taxonomy and analysis of discrete tokenization methods designed for large language models (LLMs). We categorize 8 representative VQ variants that span classical and modern paradigms and analyze their algorithmic principles, training dynamics, and integration challenges with LLM pipelines. We identify key challenges including codebook collapse, unstable gradient estimation, and modality-specific encoding constraints.
arXiv Detail & Related papers (2025-07-21T10:52:14Z) - SPARQL Query Generation with LLMs: Measuring the Impact of Training Data Memorization and Knowledge Injection [81.78173888579941]
Large Language Models (LLMs) are considered a well-suited method to increase the quality of the question-answering functionality. LLMs are trained on web data, where researchers have no control over whether the benchmark or the knowledge graph was already included in the training data. This paper introduces a novel method that evaluates the quality of LLMs by generating a SPARQL query from a natural-language question.
arXiv Detail & Related papers (2025-07-18T12:28:08Z) - Uncertainty Quantification of Large Language Models through Multi-Dimensional Responses [4.505944978127014]
We introduce a multi-dimensional UQ framework that integrates semantic and knowledge-aware similarity analysis. This approach disentangles overlapping information from both semantic and knowledge dimensions, capturing both semantic variations and factual consistency. Our empirical evaluations demonstrate that our method outperforms existing techniques in identifying uncertain responses.
arXiv Detail & Related papers (2025-02-24T04:05:08Z) - Token-Level Density-Based Uncertainty Quantification Methods for Eliciting Truthfulness of Large Language Models [76.17975723711886]
Uncertainty quantification (UQ) is a prominent approach for eliciting truthful answers from large language models (LLMs). In this work, we adapt Mahalanobis Distance (MD) - a well-established UQ technique in classification tasks - for text generation. Our method extracts token embeddings from multiple layers of LLMs, computes MD scores for each token, and uses linear regression trained on these features to provide robust uncertainty scores.
arXiv Detail & Related papers (2025-02-20T10:25:13Z) - Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph [83.90988015005934]
Uncertainty quantification is a key element of machine learning applications. We introduce a novel benchmark that implements a collection of state-of-the-art UQ baselines. We conduct a large-scale empirical investigation of UQ and normalization techniques across eleven tasks, identifying the most effective approaches.
arXiv Detail & Related papers (2024-06-21T20:06:31Z)
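Among the related papers above, the token-level Mahalanobis Distance approach describes a concrete computation: score a generated answer by how far its token embeddings fall from the distribution of in-distribution embeddings. The sketch below is a minimal numerical illustration under stated assumptions: it uses synthetic Gaussian vectors in place of real LLM hidden states and omits the paper's multi-layer feature extraction and linear-regression head, keeping only the core MD scoring step.

```python
import numpy as np

def mahalanobis_uq(train_embeddings: np.ndarray,
                   token_embeddings: np.ndarray) -> float:
    """Average squared Mahalanobis distance of an answer's token embeddings
    from the training-embedding distribution (higher = more uncertain)."""
    mu = train_embeddings.mean(axis=0)
    cov = np.cov(train_embeddings, rowvar=False)
    # Small ridge term keeps the covariance invertible for few samples
    cov_inv = np.linalg.inv(cov + 1e-6 * np.eye(cov.shape[0]))
    diffs = token_embeddings - mu
    # Per-token squared Mahalanobis distance: diff^T @ cov_inv @ diff
    d2 = np.einsum("ij,jk,ik->i", diffs, cov_inv, diffs)
    return float(d2.mean())

rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=(500, 4))    # in-distribution "embeddings"
in_dist = rng.normal(0.0, 1.0, size=(10, 4))   # tokens of a typical answer
off_dist = rng.normal(5.0, 1.0, size=(10, 4))  # tokens of an atypical answer

score_in = mahalanobis_uq(train, in_dist)
score_off = mahalanobis_uq(train, off_dist)
```

As expected under this toy setup, the atypical answer receives a far larger uncertainty score than the typical one, which is the signal the density-based UQ methods above exploit.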
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.