Quantifying perturbation impacts for large language models
- URL: http://arxiv.org/abs/2412.00868v1
- Date: Sun, 01 Dec 2024 16:13:09 GMT
- Title: Quantifying perturbation impacts for large language models
- Authors: Paulius Rauba, Qiyao Wei, Mihaela van der Schaar
- Abstract summary: We introduce Distribution-Based Perturbation Analysis (DBPA), a framework that reformulates perturbation analysis as a frequentist hypothesis testing problem.
We demonstrate the effectiveness of DBPA in evaluating perturbation impacts, showing its versatility for perturbation analysis.
- Score: 49.1574468325115
- Abstract: We consider the problem of quantifying how an input perturbation impacts the outputs of large language models (LLMs), a fundamental task for model reliability and post-hoc interpretability. A key obstacle in this domain is disentangling the meaningful changes in model responses from the intrinsic stochasticity of LLM outputs. To overcome this, we introduce Distribution-Based Perturbation Analysis (DBPA), a framework that reformulates LLM perturbation analysis as a frequentist hypothesis testing problem. DBPA constructs empirical null and alternative output distributions within a low-dimensional semantic similarity space via Monte Carlo sampling. Comparisons of Monte Carlo estimates in the reduced-dimensionality space enable tractable frequentist inference without relying on restrictive distributional assumptions. The framework is model-agnostic, supports the evaluation of arbitrary input perturbations on any black-box LLM, yields interpretable p-values, supports multiple-perturbation testing with controlled error rates, and provides scalar effect sizes for any chosen similarity or distance metric. We demonstrate the effectiveness and versatility of DBPA in evaluating perturbation impacts.
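A minimal sketch of the idea described in the abstract, with a hypothetical sample_llm placeholder for the black-box sampler, an off-the-shelf sentence embedder as the semantic similarity space, and a Mann-Whitney U test as the nonparametric comparison (all illustrative choices, not necessarily the paper's exact algorithm):

```python
# Hedged sketch of a DBPA-style test, not the paper's exact algorithm.
# `sample_llm` is a hypothetical placeholder for any black-box LLM sampler;
# the embedding model and the Mann-Whitney U test are illustrative choices.
import numpy as np
from scipy.stats import mannwhitneyu
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

def sample_llm(prompt: str, n: int) -> list[str]:
    """Placeholder: draw n stochastic completions from a black-box LLM."""
    raise NotImplementedError

def dbpa_test(prompt: str, perturbed_prompt: str, n: int = 50):
    # Monte Carlo sampling: outputs under the original input form the
    # empirical null; outputs under the perturbed input form the alternative.
    null_texts = sample_llm(prompt, n)
    alt_texts = sample_llm(perturbed_prompt, n)
    ref_emb = embedder.encode(sample_llm(prompt, n), normalize_embeddings=True)

    def to_scores(texts):
        # Project each output into a 1-D semantic similarity space:
        # mean cosine similarity to an independent reference sample.
        emb = embedder.encode(texts, normalize_embeddings=True)
        return (emb @ ref_emb.T).mean(axis=1)

    s_null, s_alt = to_scores(null_texts), to_scores(alt_texts)
    # Nonparametric comparison in the reduced space: no distributional
    # assumptions on the raw LLM outputs are required.
    _, p_value = mannwhitneyu(s_null, s_alt)
    effect_size = float(np.mean(s_null) - np.mean(s_alt))  # scalar effect size
    return p_value, effect_size
```

When several perturbations are tested at once, the resulting p-values would additionally be corrected for multiple comparisons, in line with the controlled error rates the abstract mentions.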
Related papers
- Ensemble based approach to quantifying uncertainty of LLM based classifications [1.6231286831423648]
Finetuning the model reduces the sensitivity of its outputs to lexical variations in the input.
A probabilistic method is proposed for estimating the certainties of the predicted classes.
arXiv Detail & Related papers (2025-02-12T18:42:42Z)
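The entry above lends itself to a small sketch: estimate class certainties from the vote frequencies of repeated stochastic classifications. The classify call is a hypothetical placeholder, and frequency counting is a simple stand-in for the paper's probabilistic method:

```python
# Hedged sketch: estimate class certainty from repeated LLM classifications.
# `classify` is a hypothetical black-box call; vote frequencies are a simple
# stand-in for the paper's probabilistic estimator.
from collections import Counter

def classify(text: str, seed: int) -> str:
    """Placeholder: one stochastic LLM classification of `text`."""
    raise NotImplementedError

def class_certainties(text: str, n: int = 30) -> dict[str, float]:
    # Repeat the classification n times and turn vote counts into
    # empirical class probabilities.
    votes = Counter(classify(text, seed=i) for i in range(n))
    return {label: count / n for label, count in votes.items()}
```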
- Model-free Methods for Event History Analysis and Efficient Adjustment (PhD Thesis) [55.2480439325792]
This thesis is a series of independent contributions to statistics unified by a model-free perspective.
The first chapter elaborates on how a model-free perspective can be used to formulate flexible methods that leverage prediction techniques from machine learning.
The second chapter studies the concept of local independence, which describes whether the evolution of one process is directly influenced by another.
arXiv Detail & Related papers (2025-02-11T19:24:09Z)
- Bridging Internal Probability and Self-Consistency for Effective and Efficient LLM Reasoning [53.25336975467293]
We present the first theoretical error decomposition analysis of methods such as perplexity and self-consistency.
Our analysis reveals a fundamental trade-off: perplexity methods suffer from substantial model error due to the absence of a proper consistency function.
We propose Reasoning-Pruning Perplexity Consistency (RPC), which combines Perplexity Consistency, integrating perplexity with self-consistency, with Reasoning Pruning, which eliminates low-probability reasoning paths.
arXiv Detail & Related papers (2025-02-01T18:09:49Z)
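A hedged sketch in the spirit of this entry, with illustrative pruning and weighting choices rather than the paper's exact scheme: prune low-probability reasoning paths, then take a probability-weighted vote over the survivors:

```python
# Hedged sketch in the spirit of RPC: prune low-probability reasoning paths,
# then aggregate the survivors by their model probabilities. The thresholding
# and weighting choices here are illustrative, not the paper's exact scheme.
import math
from collections import defaultdict

def rpc_answer(paths: list[tuple[str, float]], keep_frac: float = 0.5) -> str:
    """paths: (final_answer, sequence_log_probability) per sampled path."""
    # Reasoning Pruning: keep only the higher-probability reasoning paths.
    ranked = sorted(paths, key=lambda p: p[1], reverse=True)
    kept = ranked[: max(1, int(len(ranked) * keep_frac))]
    # Perplexity-weighted self-consistency: each surviving path votes for
    # its final answer with weight exp(log p) = p.
    weights: dict[str, float] = defaultdict(float)
    for answer, log_p in kept:
        weights[answer] += math.exp(log_p)
    return max(weights, key=weights.get)
```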
- Unveiling the Statistical Foundations of Chain-of-Thought Prompting Methods [59.779795063072655]
Chain-of-Thought (CoT) prompting and its variants have gained popularity as effective methods for solving multi-step reasoning problems.
We analyze CoT prompting from a statistical estimation perspective, providing a comprehensive characterization of its sample complexity.
arXiv Detail & Related papers (2024-08-25T04:07:18Z)
- Quantile-constrained Wasserstein projections for robust interpretability of numerical and machine learning models [18.771531343438227]
The study of black-box models is often based on sensitivity analysis involving a probabilistic structure imposed on the inputs.
Our work aims to unify the UQ and ML interpretability approaches by providing relevant and easy-to-use tools for both paradigms.
arXiv Detail & Related papers (2022-09-23T11:58:03Z)
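The quantile-constrained projection itself is beyond a short sketch, but its basic building block, the one-dimensional Wasserstein distance between a nominal input distribution and a perturbed candidate, is easy to illustrate:

```python
# Illustration only: the 1-D Wasserstein distance underlying such projections.
# The quantile-constrained projection itself is more involved and not shown.
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
baseline = rng.normal(loc=0.0, scale=1.0, size=10_000)   # nominal input law
perturbed = rng.normal(loc=0.3, scale=1.1, size=10_000)  # candidate perturbation
print(wasserstein_distance(baseline, perturbed))  # transport cost between them
```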
- Robust Output Analysis with Monte-Carlo Methodology [0.0]
In predictive modeling with simulation or machine learning, it is critical to accurately assess the quality of estimated values.
We propose a unified output analysis framework for simulation and machine learning outputs through the lens of Monte Carlo sampling.
arXiv Detail & Related papers (2022-07-27T16:21:59Z)
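As a hedged illustration of Monte Carlo output analysis, here is a generic bootstrap confidence interval for the mean of simulation or model outputs; the paper's unified framework is not reproduced here:

```python
# Hedged illustration: a bootstrap confidence interval for the mean of
# simulation or model outputs; a generic tool, not the paper's exact method.
import numpy as np

def bootstrap_ci(outputs, n_boot: int = 5_000, alpha: float = 0.05, seed: int = 0):
    rng = np.random.default_rng(seed)
    outputs = np.asarray(outputs)
    # Resample the outputs with replacement and recompute the estimate.
    idx = rng.integers(0, len(outputs), size=(n_boot, len(outputs)))
    estimates = outputs[idx].mean(axis=1)
    return np.quantile(estimates, [alpha / 2, 1 - alpha / 2])

# Example: uncertainty of a mean estimated from noisy simulation outputs.
rng = np.random.default_rng(1)
print(bootstrap_ci(rng.normal(2.0, 0.5, size=200)))
```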
- A variational inference framework for inverse problems [0.39373541926236766]
A framework is presented for fitting inverse problem models via variational Bayes approximations.
This methodology guarantees flexibility in statistical model specification for a broad range of applications.
An image processing application and a simulation exercise motivated by biomedical problems reveal the computational advantage offered by variational Bayes.
arXiv Detail & Related papers (2021-03-10T07:37:20Z)
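Variational Bayes itself is easy to demonstrate on a textbook problem. The sketch below runs mean-field coordinate ascent for a Gaussian with unknown mean and precision, a stand-in that shows the machinery rather than the paper's inverse-problem models:

```python
# Hedged toy example of mean-field variational Bayes (coordinate ascent) for a
# Gaussian with unknown mean and precision; a textbook stand-in illustrating
# the VB machinery, not the paper's inverse-problem models.
import numpy as np

def vb_gaussian(x, mu0=0.0, lam0=1.0, a0=1.0, b0=1.0, iters=50):
    n, xbar = len(x), float(np.mean(x))
    e_tau = a0 / b0  # initial guess for E[precision]
    for _ in range(iters):
        # Update q(mu) = Normal(mu_n, 1 / lam_n)
        mu_n = (lam0 * mu0 + n * xbar) / (lam0 + n)
        lam_n = (lam0 + n) * e_tau
        # Update q(tau) = Gamma(a_n, b_n) using current moments of q(mu)
        a_n = a0 + (n + 1) / 2
        b_n = b0 + 0.5 * (np.sum((x - mu_n) ** 2) + n / lam_n
                          + lam0 * ((mu_n - mu0) ** 2 + 1 / lam_n))
        e_tau = a_n / b_n
    return mu_n, lam_n, a_n, b_n

rng = np.random.default_rng(2)
print(vb_gaussian(rng.normal(1.5, 0.8, size=500)))
```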
- Latent Causal Invariant Model [128.7508609492542]
Current supervised learning methods can learn spurious correlations during the data-fitting process.
We propose a Latent Causal Invariance Model (LaCIM) which pursues causal prediction.
arXiv Detail & Related papers (2020-11-04T10:00:27Z)
- Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z)