Optimizing Decomposition for Optimal Claim Verification
- URL: http://arxiv.org/abs/2503.15354v2
- Date: Sun, 25 May 2025 22:54:47 GMT
- Title: Optimizing Decomposition for Optimal Claim Verification
- Authors: Yining Lu, Noah Ziems, Hy Dang, Meng Jiang
- Abstract summary: We find that existing decomposition policies, typically hand-crafted demonstrations, do not align well with downstream verifiers in terms of atomicity. We propose dynamic decomposition, a reinforcement learning framework that leverages verifier feedback to learn a policy for dynamically decomposing claims to verifier-preferred atomicity. Experimental results show that dynamic decomposition outperforms existing decomposition policies, improving verification confidence by 0.07 and accuracy by 0.12 on average across varying verifiers, datasets, and atomicities of input claims.
- Score: 15.68967195914405
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current research on the \textit{Decompose-Then-Verify} paradigm for evaluating the factuality of long-form text typically treats decomposition and verification in isolation, overlooking their interactions and potential misalignment. We find that existing decomposition policies, typically hand-crafted demonstrations, do not align well with downstream verifiers in terms of atomicity -- a novel metric quantifying information density -- leading to suboptimal verification results. We formulate finding the optimal decomposition policy for optimal verification as a bilevel optimization problem. To approximate a solution for this strongly NP-hard problem, we propose dynamic decomposition, a reinforcement learning framework that leverages verifier feedback to learn a policy for dynamically decomposing claims to verifier-preferred atomicity. Experimental results show that dynamic decomposition outperforms existing decomposition policies, improving verification confidence by 0.07 and accuracy by 0.12 (on a 0-1 scale) on average across varying verifiers, datasets, and atomicities of input claims.
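To make the verifier-in-the-loop idea concrete, here is a minimal, self-contained sketch of the reward signal such a framework could optimize: the mean verifier confidence over a proposed decomposition. Everything here (the toy verifier, the helper name, the example claims) is illustrative and not taken from the paper.

```python
# Illustrative sketch: verifier feedback as an RL reward for decomposition.
# The toy verifier below simply prefers shorter, more atomic subclaims.
from typing import Callable, List

def decomposition_reward(subclaims: List[str],
                         verifier: Callable[[str], float]) -> float:
    """Mean verifier confidence over subclaims; serves as the policy reward."""
    if not subclaims:
        return 0.0
    return sum(verifier(s) for s in subclaims) / len(subclaims)

# Stand-in verifier: confidence decays with subclaim length.
toy_verifier = lambda s: 1.0 / (1.0 + len(s.split()) / 10.0)

coarse = ["Paris is the capital of France and has 2.1 million residents."]
atomic = ["Paris is the capital of France.",
          "Paris has 2.1 million residents."]

print(decomposition_reward(coarse, toy_verifier))  # lower reward
print(decomposition_reward(atomic, toy_verifier))  # higher: verifier-preferred atomicity
```

A policy-gradient learner would sample decompositions from the current policy, score them with this reward, and update toward the atomicity the verifier rewards most.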
Related papers
- Distill and Align Decomposition for Enhanced Claim Verification [51.93960785128124]
Complex claim verification requires decomposing sentences into verifiable subclaims. We propose a reinforcement learning approach that optimises decomposition quality and verifier alignment. Our framework enables smaller language models to achieve state-of-the-art claim verification.
arXiv Detail & Related papers (2026-02-25T12:32:04Z) - How Sampling Shapes LLM Alignment: From One-Shot Optima to Iterative Dynamics [65.67654005892469]
We show that proper instance-dependent sampling can yield stronger ranking guarantees, while skewed on-policy sampling can induce excessive concentration under structured preferences. We then analyze iterative alignment dynamics in which the learned policy feeds back into future sampling and reference policies. Our theoretical insights extend to Direct Preference Optimization, indicating that the phenomena we capture are common to a broader class of preference-alignment methods.
arXiv Detail & Related papers (2026-02-12T17:11:08Z) - Outlier-aware Tensor Robust Principal Component Analysis with Self-guided Data Augmentation [21.981038455329013]
We propose a self-guided data augmentation approach that employs adaptive weighting to suppress outlier influence.
We show improvements in both accuracy and computational efficiency compared to state-of-the-art methods.
arXiv Detail & Related papers (2025-04-25T13:03:35Z) - When Claims Evolve: Evaluating and Enhancing the Robustness of Embedding Models Against Misinformation Edits [5.443263983810103]
As users interact with claims online, they often introduce edits, and it remains unclear whether current embedding models are robust to such edits. We introduce a perturbation framework that generates valid and natural claim variations, enabling us to assess the robustness of a wide range of sentence embedding models. Our evaluation reveals that standard embedding models exhibit notable performance drops on edited claims, while LLM-distilled embedding models offer improved robustness at a higher computational cost.
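For intuition, the "edits" at issue can be as small as casing changes, word substitutions, or a hedging rewrite. The snippet below fabricates a few such variants for illustration; the paper's perturbation framework is more principled than these ad hoc string edits.

```python
# Toy claim edits for probing embedding robustness (illustrative only).
claim = "The Eiffel Tower was completed in 1889."

variants = [
    claim.lower(),                                            # casing edit
    claim.replace("completed", "finished"),                   # lexical substitution
    "I read somewhere that " + claim[0].lower() + claim[1:],  # hedging rewrite
]
# A robust sentence embedding model should place all variants
# close to the original claim in embedding space.
```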
arXiv Detail & Related papers (2025-03-05T11:47:32Z) - In-context Demonstration Matters: On Prompt Optimization for Pseudo-Supervision Refinement [71.60563181678323]
Large language models (LLMs) have achieved great success across diverse tasks, and fine-tuning is sometimes needed to further enhance generation quality. To handle these challenges, a direct solution is to generate "high-confidence" data from unsupervised downstream tasks. We propose a novel approach, the pseudo-supervised demonstrations aligned prompt optimization (PAPO) algorithm, which jointly refines both the prompt and the overall pseudo-supervision.
arXiv Detail & Related papers (2024-10-04T03:39:28Z) - Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer [52.09480867526656]
We identify the source of misalignment as a form of distributional shift and uncertainty in learning human preferences. To mitigate overoptimization, we first propose a theoretical algorithm that chooses the best policy for an adversarially chosen reward model. Using the equivalence between reward models and the corresponding optimal policy, the algorithm features a simple objective that combines a preference optimization loss and a supervised learning loss.
arXiv Detail & Related papers (2024-05-26T05:38:50Z) - Optimal Baseline Corrections for Off-Policy Contextual Bandits [61.740094604552475]
We aim to learn decision policies that optimize an unbiased offline estimate of an online reward metric.
We propose a single framework built on their equivalence in learning scenarios.
Our framework enables us to characterize the variance-optimal unbiased estimator and provide a closed-form solution for it.
arXiv Detail & Related papers (2024-05-09T12:52:22Z) - Search for Concepts: Discovering Visual Concepts Using Direct Optimization [48.51514897866221]
We show that using direct optimization is more generalizable, misses fewer correct decompositions, and typically requires less data than methods based on amortized inference.
This highlights a weakness of the current prevalent practice of using amortized inference that can potentially be improved by integrating more direct optimization elements.
arXiv Detail & Related papers (2022-10-25T15:55:24Z) - Globally Convergent Policy Search over Dynamic Filters for Output Estimation [64.90951294952094]
We introduce the first direct policy search algorithm that converges to the globally optimal $\textit{dynamic}$ filter.
We show that informativity overcomes the aforementioned degeneracy.
arXiv Detail & Related papers (2022-02-23T18:06:20Z) - Extension of Dynamic Mode Decomposition for dynamic systems with incomplete information based on t-model of optimal prediction [69.81996031777717]
The Dynamic Mode Decomposition has proved to be a very efficient technique to study dynamic data.
The application of this approach becomes problematic if the available data is incomplete because some smaller-scale dimensions are either missing or unmeasured.
We consider a first-order approximation of the Mori-Zwanzig decomposition, state the corresponding optimization problem and solve it with the gradient-based optimization method.
arXiv Detail & Related papers (2022-02-23T11:23:59Z) - Optimizing Information-theoretical Generalization Bounds via Anisotropic Noise in SGLD [73.55632827932101]
We optimize the information-theoretical generalization bound by manipulating the noise structure in SGLD.
We prove that, under a constraint guaranteeing low empirical risk, the optimal noise covariance is the square root of the expected gradient covariance.
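Schematically, writing $g$ for the stochastic gradient, the stated result reads $\Sigma^{*} \propto \left(\mathrm{Cov}[g]\right)^{1/2}$, with constants and the empirical-risk constraint omitted; this is a paraphrase of the summary above, not the paper's exact statement.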
arXiv Detail & Related papers (2021-10-26T15:02:27Z) - Deviance Matrix Factorization [6.509665408765348]
We investigate a general matrix factorization for deviance-based data losses, extending the ubiquitous singular value decomposition beyond squared error loss.
Our method leverages classical statistical methodology from generalized linear models (GLMs) and provides an efficient algorithm that is flexible enough to allow for structural zeros via entry weights.
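As a simplified illustration of factorizing under a GLM deviance rather than squared error, here is a toy Poisson-deviance factorizer with entry weights (structural zeros get weight 0). The log link, plain gradient descent, and all names are choices made for this sketch, not details of the paper's algorithm.

```python
# Toy deviance-based matrix factorization: X ~ Poisson(exp(U @ V.T)),
# fit by gradient descent on the weighted Poisson deviance.
import numpy as np

def poisson_factorize(X, W, rank=2, lr=1e-3, iters=2000, seed=0):
    """W is an entry-weight matrix; zeros mark structural zeros to ignore."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = 0.01 * rng.standard_normal((m, rank))
    V = 0.01 * rng.standard_normal((n, rank))
    for _ in range(iters):
        M = np.exp(U @ V.T)       # mean under the log link
        G = W * (M - X)           # deviance gradient w.r.t. U @ V.T (up to a factor of 2)
        U, V = U - lr * (G @ V), V - lr * (G.T @ U)
    return U, V
```

With squared-error loss in place of the Poisson deviance, the same loop recovers an SVD-like fit, which is the sense in which this family extends the SVD.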
arXiv Detail & Related papers (2021-10-12T01:27:55Z) - Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
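For orientation, a reconstruction-plus-regularization objective of this general shape is $\min_{W} \|X - X W W^{\top}\|_F^2 + \lambda \|W\|_{2,p}^p$, where rows of $W$ with small $l_2$ norm flag features to discard; this is a generic form written to make the summary concrete, and the paper's exact formulation may differ.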
arXiv Detail & Related papers (2020-12-29T04:08:38Z) - Optimally adaptive Bayesian spectral density estimation for stationary
and nonstationary processes [0.0]
This article improves on existing methods to estimate the spectral density of stationary and nonstationary time series assuming a Gaussian process prior.
By optimising an appropriate eigendecomposition, our method more appropriately models data with both simple and complex periodic structure.
arXiv Detail & Related papers (2020-03-04T23:35:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.