Probabilistic ML Verification via Weighted Model Integration
- URL: http://arxiv.org/abs/2402.04892v2
- Date: Wed, 23 Oct 2024 09:04:57 GMT
- Title: Probabilistic ML Verification via Weighted Model Integration
- Authors: Paolo Morettin, Andrea Passerini, Roberto Sebastiani
- Abstract summary: Probabilistic formal verification (PFV) of machine learning models is in its infancy.
We propose a unifying framework for the PFV of ML systems based on Weighted Model Integration (WMI).
We show how successful scaling techniques in the ML verification literature can be generalized beyond their original scope.
- Score: 11.812078181471634
- Abstract: In machine learning (ML) verification, the majority of procedures are non-quantitative and therefore cannot be used for verifying probabilistic models, or be applied in domains where hard guarantees are practically unachievable. The probabilistic formal verification (PFV) of ML models is in its infancy, with the existing approaches limited to specific ML models, properties, or both. This contrasts with standard formal methods techniques, whose successful adoption in real-world scenarios is also due to their support for a wide range of properties and diverse systems. We propose a unifying framework for the PFV of ML systems based on Weighted Model Integration (WMI), a relatively recent formalism for probabilistic inference with algebraic and logical constraints. Crucially, reducing the PFV of ML models to WMI enables the verification of many properties of interest over a wide range of systems, addressing multiple limitations of deterministic verification and ad-hoc algorithms. We substantiate the generality of the approach on prototypical tasks involving the verification of group fairness, monotonicity, robustness to noise, probabilistic local robustness and equivalence among predictors. We characterize the challenges related to the scalability of the approach and, through our WMI-based perspective, we show how successful scaling techniques in the ML verification literature can be generalized beyond their original scope.
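To make the reduction concrete: a PFV query asks for the probability mass of the inputs on which a property holds, and WMI expresses exactly that as the integral of a weight function (the input density) over the region satisfying a logical formula. Below is a minimal sketch of this idea in a toy one-dimensional setting; the threshold classifier and the Gaussian input distribution are illustrative assumptions, not the paper's setup.

```python
# A minimal sketch of the WMI reduction in a toy 1-D setting (illustrative
# only; real WMI solvers work on SMT(LRA) formulas, not numeric quadrature).
from scipy import integrate
from scipy.stats import norm

# Weight function w(x): the input density, here a standard normal.
w = norm(loc=0.0, scale=1.0).pdf

# Toy "ML model": a threshold classifier M(x) = 1 iff x > 0.5.
# Property phi: "the model outputs class 1".
# WMI(phi, w) = integral of w over the region where phi holds.
numerator, _ = integrate.quad(w, 0.5, float("inf"))

# Normalization constant: WMI over the whole input space.
denominator, _ = integrate.quad(w, -float("inf"), float("inf"))

# P(M(x) = 1) as a ratio of two weighted model integrals (~0.3085).
print(numerator / denominator)
```

In the framework itself, the model, the property, and the support of the input distribution are encoded as formulas with algebraic and logical constraints, and a WMI solver computes such integrals over the satisfying regions rather than by numeric quadrature.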
Related papers
- Amortized Bayesian Multilevel Models [9.831471158899644]
Multilevel models (MLMs) are a central building block of the Bayesian workflow.
MLMs pose significant computational challenges, often rendering their estimation and evaluation intractable within reasonable time constraints.
Recent advances in simulation-based inference offer promising solutions for addressing complex probabilistic models using deep generative networks.
We explore a family of neural network architectures that leverage the probabilistic factorization of multilevel models to facilitate efficient neural network training and subsequent near-instant posterior inference on unseen data sets.
arXiv Detail & Related papers (2024-08-23T17:11:04Z)
- Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode.
We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
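As a crude, hypothetical proxy for this idea, the sketch below estimates confidence from answer agreement across sampled generations (plain self-consistency); the paper's stable-explanation measure is more refined, and `sample_answer` merely stands in for an LLM call.

```python
# Self-consistency as a stand-in for explanation-based confidence
# (hypothetical sketch; not the paper's actual scoring rule).
import random
from collections import Counter

def sample_answer(question: str) -> str:
    # Stand-in for sampling an (explanation, answer) pair from an LLM.
    return random.choice(["A", "A", "A", "B"])

samples = [sample_answer("Which option is correct?") for _ in range(100)]
top_answer, count = Counter(samples).most_common(1)[0]
print(top_answer, count / len(samples))  # majority answer and agreement rate
```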
arXiv Detail & Related papers (2024-06-05T16:35:30Z)
- Chain-of-Thought Prompting for Demographic Inference with Large Multimodal Models [58.58594658683919]
Large multimodal models (LMMs) have shown transformative potential across various research tasks.
Our findings indicate LMMs possess advantages in zero-shot learning, interpretability, and handling uncurated 'in-the-wild' inputs.
We propose a Chain-of-Thought augmented prompting approach, which effectively mitigates the off-target prediction issue.
arXiv Detail & Related papers (2024-05-24T16:26:56Z)
- OSM: Leveraging Model Checking for Observing Dynamic Behaviors in Aspect-Oriented Applications [0.0]
An observation-based statistical model-checking (OSM) framework is devised to craft executable formal models directly from foundational system code.
This ensures dependable performance of electronic health record systems under shifting preconditions.
arXiv Detail & Related papers (2024-03-03T00:03:34Z)
- Model Composition for Multimodal Large Language Models [71.5729418523411]
We propose a new paradigm through the model composition of existing MLLMs to create a new model that retains the modal understanding capabilities of each original model.
Our basic implementation, NaiveMC, demonstrates the effectiveness of this paradigm by reusing modality encoders and merging LLM parameters.
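For intuition about the parameter-merging step, here is a hypothetical sketch that interpolates two models' weights key by key; the actual NaiveMC procedure and its treatment of modality encoders may differ.

```python
# Hypothetical key-by-key parameter interpolation in the spirit of NaiveMC
# (illustrative only; plain floats stand in for model tensors).
def merge_state_dicts(sd_a: dict, sd_b: dict, alpha: float = 0.5) -> dict:
    assert sd_a.keys() == sd_b.keys(), "models must share an architecture"
    return {k: alpha * sd_a[k] + (1 - alpha) * sd_b[k] for k in sd_a}

merged = merge_state_dicts({"w": 1.0, "b": 0.0}, {"w": 3.0, "b": 2.0})
print(merged)  # {'w': 2.0, 'b': 1.0}
```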
arXiv Detail & Related papers (2024-02-20T06:38:10Z)
- Machine learning for structure-property relationships: Scalability and limitations [3.664479980617018]
We present a scalable machine learning (ML) framework for predicting intensive properties and particularly classifying phases of many-body systems.
Based on the locality assumption, an ML model is developed for predicting intensive properties of a finite-size block.
We show that the applicability of this approach depends on whether the block-size of the ML model is greater than the characteristic length scale of the system.
arXiv Detail & Related papers (2023-04-11T21:17:28Z)
- Provable Generalization of Overparameterized Meta-learning Trained with SGD [62.892930625034374]
We study the generalization of a widely used meta-learning approach, Model-Agnostic Meta-Learning (MAML).
We provide both upper and lower bounds on the excess risk of MAML, characterizing how the SGD dynamics affect these generalization bounds.
Our theoretical findings are further validated by experiments.
arXiv Detail & Related papers (2022-06-18T07:22:57Z)
- Algorithmic Fairness Verification with Graphical Models [24.8005399877574]
We propose an efficient fairness verifier, called FVGM, that encodes correlations among features as a Bayesian network.
We show that FVGM leads to an accurate and scalable assessment for more diverse families of fairness-enhancing algorithms.
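For intuition, the sketch below audits demographic parity on a tiny hand-specified network by direct enumeration; FVGM's encoding, the networks it handles, and the classifiers it supports are far more general.

```python
# Toy fairness audit over a two-node Bayesian network (hand-specified CPT;
# a hypothetical stand-in for FVGM's graphical-model encoding).

# P(feature | group) for a single binary feature.
p_feat_given_group = {("A", 1): 0.7, ("A", 0): 0.3,
                      ("B", 1): 0.4, ("B", 0): 0.6}

def classify(feat: int) -> int:
    return 1 if feat == 1 else 0  # the predictor under audit

def positive_rate(group: str) -> float:
    # P(prediction = 1 | group), by enumerating the feature's values.
    return sum(p for (g, f), p in p_feat_given_group.items()
               if g == group and classify(f) == 1)

# Demographic parity gap between the two groups.
print(abs(positive_rate("A") - positive_rate("B")))  # 0.3
```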
arXiv Detail & Related papers (2021-09-20T12:05:14Z)
- Practical Machine Learning Safety: A Survey and Primer [81.73857913779534]
Open-world deployment of Machine Learning algorithms in safety-critical applications such as autonomous vehicles needs to address a variety of ML vulnerabilities.
We survey new models and training techniques that reduce generalization error, achieve domain adaptation, and detect outlier examples and adversarial attacks.
This organization maps state-of-the-art ML techniques to safety strategies in order to enhance the dependability of ML algorithms from multiple angles.
arXiv Detail & Related papers (2021-06-09T05:56:42Z)
- Fair Bayesian Optimization [25.80374249896801]
We introduce a general constrained Bayesian optimization framework to optimize the performance of any machine learning (ML) model.
We apply BO with fairness constraints to a range of popular models, including random forests, boosting, and neural networks.
We show that our approach is competitive with specialized techniques that enforce model-specific fairness constraints.
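The sketch below conveys the constrained-search idea with random search standing in for Bayesian optimization; `accuracy` and `unfairness` are hypothetical placeholders for a validation metric and a fairness-violation measure.

```python
# Fairness-constrained hyperparameter search (random search as a crude
# stand-in for the paper's constrained Bayesian optimization).
import random

def accuracy(lr: float) -> float:        # hypothetical validation accuracy
    return 1.0 - (lr - 0.1) ** 2

def unfairness(lr: float) -> float:      # hypothetical violation measure
    return 2.0 * abs(lr - 0.05)

EPSILON = 0.2                            # fairness constraint threshold
random.seed(0)
candidates = [random.uniform(0.0, 0.3) for _ in range(100)]
feasible = [c for c in candidates if unfairness(c) <= EPSILON]
best = max(feasible, key=accuracy)
print(f"best lr = {best:.3f}, accuracy = {accuracy(best):.3f}")
```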
arXiv Detail & Related papers (2020-06-09T08:31:08Z)
- Theoretical Convergence of Multi-Step Model-Agnostic Meta-Learning [63.64636047748605]
We develop a new theoretical framework to provide convergence guarantee for the general multi-step MAML algorithm.
In particular, our results suggest that the inner-stage stepsize needs to be chosen inversely proportional to the number $N$ of inner-stage steps in order for $N$-step MAML to have guaranteed convergence.
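A minimal numeric illustration of that stepsize condition, assuming a toy quadratic task loss rather than anything from the paper:

```python
# N-step inner loop with stepsize alpha = c / N, per the convergence condition.
def inner_loop(theta: float, grad, n_steps: int, c: float = 0.5) -> float:
    alpha = c / n_steps              # inversely proportional to N
    for _ in range(n_steps):
        theta -= alpha * grad(theta)
    return theta

# Toy task: loss(theta) = 0.5 * (theta - 2)^2, so grad(theta) = theta - 2.
adapted = inner_loop(5.0, lambda t: t - 2.0, n_steps=10)
print(adapted)  # moves from 5.0 toward the task optimum at 2.0
```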
arXiv Detail & Related papers (2020-02-18T19:17:54Z)