Explainable Compliance Detection with Multi-Hop Natural Language Inference on Assurance Case Structure
- URL: http://arxiv.org/abs/2506.08713v2
- Date: Thu, 03 Jul 2025 13:39:37 GMT
- Title: Explainable Compliance Detection with Multi-Hop Natural Language Inference on Assurance Case Structure
- Authors: Fariz Ikhwantri, Dusica Marijan
- Abstract summary: We propose a compliance detection approach based on Natural Language Inference (NLI). We formulate the claim-argument-evidence structure of an assurance case as a multi-hop inference for explainable and traceable compliance detection. Our results highlight the potential of NLI-based approaches in automating the regulatory compliance process.
- Score: 1.5653612447564105
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ensuring that complex systems meet regulations typically requires checking the validity of assurance cases through a claim-argument-evidence framework. Challenges in this process include the complicated nature of legal and technical texts, the need for model explanations, and limited access to assurance case data. We propose a compliance detection approach based on Natural Language Inference (NLI): EXplainable CompLiance detection with Argumentative Inference of Multi-hop reasoning (EXCLAIM). We formulate the claim-argument-evidence structure of an assurance case as a multi-hop inference for explainable and traceable compliance detection. We address the limited number of assurance cases by generating them using large language models (LLMs), and we introduce metrics that measure their coverage and structural consistency. As a case study, we demonstrate the effectiveness of assurance cases generated from GDPR requirements on a multi-hop inference task. Our results highlight the potential of NLI-based approaches for automating the regulatory compliance process.
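A minimal sketch of how a claim-argument-evidence chain could be scored with off-the-shelf NLI, in the spirit of EXCLAIM; the model choice (roberta-large-mnli), the hop order, and the threshold are assumptions for illustration, not the paper's configuration.
```python
# Multi-hop NLI over a claim-argument-evidence chain (hedged sketch).
from transformers import pipeline

nli = pipeline("text-classification", model="roberta-large-mnli")

def entails(premise: str, hypothesis: str, threshold: float = 0.5):
    """Return (entailed, score) for a single premise/hypothesis hop."""
    out = nli({"text": premise, "text_pair": hypothesis})
    res = out[0] if isinstance(out, list) else out
    return res["label"] == "ENTAILMENT" and res["score"] >= threshold, res["score"]

def check_compliance(requirement: str, claim: str, argument: str, evidence: str) -> dict:
    """Chain the hops evidence -> argument -> claim -> requirement.

    Compliance is reported only if every hop is an entailment, which keeps
    the decision traceable to individual assurance-case elements.
    """
    hops = [("evidence->argument", evidence, argument),
            ("argument->claim", argument, claim),
            ("claim->requirement", claim, requirement)]
    trace, compliant = {}, True
    for name, premise, hypothesis in hops:
        ok, score = entails(premise, hypothesis)
        trace[name] = {"entailed": ok, "score": round(float(score), 3)}
        compliant = compliant and ok
    return {"compliant": compliant, "trace": trace}

# Hypothetical GDPR-style example (all strings invented for illustration).
print(check_compliance(
    requirement="Personal data must be erased without undue delay upon request.",
    claim="The system satisfies the data subject's right to erasure.",
    argument="Deletion requests are propagated to all data stores within 24 hours.",
    evidence="Audit logs show erasure jobs completing within 24 hours of each request.",
))
```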
Related papers
- What's in a Proof? Analyzing Expert Proof-Writing Processes in F* and Verus [2.8003002159083237]
We conduct a user study involving the collection and analysis of fine-grained source code telemetry from eight experts working with two languages. Results reveal interesting trends and patterns about how experts reason about proofs and key challenges encountered during the proof development process. We translate these findings into concrete design guidance for AI proof assistants.
arXiv Detail & Related papers (2025-08-01T22:16:30Z)
- Federated In-Context Learning: Iterative Refinement for Improved Answer Quality [62.72381208029899]
In-context learning (ICL) enables language models to generate responses without modifying their parameters by leveraging examples provided in the input. We propose Federated In-Context Learning (Fed-ICL), a general framework that enhances ICL through an iterative, collaborative process. Fed-ICL progressively refines responses by leveraging multi-round interactions between clients and a central server, improving answer quality without the need to transmit model parameters.
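A minimal sketch of the client-server refinement loop that Fed-ICL describes, assuming a toy stand-in for the local model call (`client_generate`) and majority voting as the server-side aggregation; neither is claimed to be the paper's actual protocol.
```python
# Iterative federated in-context refinement (hedged sketch): only candidate
# answers travel between clients and the server, never model parameters.
from collections import Counter

def client_generate(question: str, local_examples: list, feedback: str) -> str:
    """Toy stand-in for a client's in-context LLM call (assumed, not Fed-ICL's API)."""
    basis = feedback or local_examples[0]
    return f"Answer to '{question}' drawing on: {basis}"

def server_aggregate(candidates: list) -> str:
    """Server-side consolidation; majority voting is one simple choice."""
    return Counter(candidates).most_common(1)[0][0]

def fed_icl(question: str, client_examples: list, rounds: int = 3) -> str:
    feedback, consensus = "", ""
    for _ in range(rounds):
        candidates = [client_generate(question, ex, feedback) for ex in client_examples]
        consensus = server_aggregate(candidates)
        # The consensus is broadcast back as extra context for the next round.
        feedback = f"Previous consensus: {consensus}"
    return consensus

print(fed_icl("Which regulation governs EU data protection?",
              [["GDPR, Art. 1 example"], ["GDPR recital example"], ["GDPR overview example"]]))
```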
arXiv Detail & Related papers (2025-06-09T05:33:28Z)
- CLATTER: Comprehensive Entailment Reasoning for Hallucination Detection [60.98964268961243]
We propose that guiding models to perform a systematic and comprehensive reasoning process allows models to execute much finer-grained and accurate entailment decisions. We define a 3-step reasoning process, consisting of (i) claim decomposition, (ii) sub-claim attribution and entailment classification, and (iii) aggregated classification, showing that such guided reasoning indeed yields improved hallucination detection.
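A minimal sketch of that three-step recipe with placeholder functions standing in for the model calls; splitting on "and" and substring matching are illustrative stand-ins, not CLATTER's decomposition or NLI components.
```python
# Decompose -> classify sub-claims -> aggregate (hedged sketch).

def decompose(claim: str) -> list:
    """Placeholder: an LLM would split the claim into atomic sub-claims."""
    return [part.strip() for part in claim.split(" and ")]

def classify_subclaim(source: str, sub_claim: str) -> str:
    """Placeholder NLI call returning 'entailment', 'neutral', or 'contradiction'."""
    return "entailment" if sub_claim.lower() in source.lower() else "neutral"

def detect_hallucination(source: str, claim: str) -> dict:
    labels = {sc: classify_subclaim(source, sc) for sc in decompose(claim)}
    # Aggregated classification: any non-entailed sub-claim flags the claim.
    return {"hallucinated": any(v != "entailment" for v in labels.values()),
            "sub_claims": labels}

print(detect_hallucination(
    source="the policy was adopted in 2018 and applies to all eu member states",
    claim="The policy was adopted in 2018 and it applies worldwide",
))
```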
arXiv Detail & Related papers (2025-06-05T17:02:52Z)
- VerifyThisBench: Generating Code, Specifications, and Proofs All at Once [5.783301542485619]
We introduce a new benchmark designed to evaluate large language models (LLMs) on end-to-end program verification tasks. Our evaluation reveals that even state-of-the-art (SOTA) models, such as o3-mini, achieve a pass rate of less than 4%, with many outputs failing to compile.
arXiv Detail & Related papers (2025-05-25T19:00:52Z)
- Improving Multilingual Retrieval-Augmented Language Models through Dialectic Reasoning Argumentations [65.11348389219887]
We introduce Dialectic-RAG (DRAG), a modular approach that evaluates retrieved information by comparing, contrasting, and resolving conflicting perspectives. We show the impact of our framework both as an in-context learning strategy and for constructing demonstrations to instruct smaller models.
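A minimal sketch of what such a dialectic step could look like as an in-context prompt; the instruction wording is an assumption, not the DRAG template.
```python
# Build a compare/contrast/resolve prompt over retrieved passages (hedged sketch).

def build_dialectic_prompt(question: str, passages: list) -> str:
    numbered = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        f"Question: {question}\n"
        f"Retrieved passages:\n{numbered}\n\n"
        "Step 1: List the claims on which the passages agree.\n"
        "Step 2: List the claims on which they conflict.\n"
        "Step 3: Resolve the conflicts, citing passage numbers.\n"
        "Step 4: Answer the question based on the resolution."
    )

print(build_dialectic_prompt(
    "When must a data breach be reported under the GDPR?",
    ["Article 33 requires notification to the supervisory authority within 72 hours.",
     "Some summaries claim reporting is only needed for high-risk breaches."],
))
```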
arXiv Detail & Related papers (2025-04-07T06:55:15Z)
- Elevating Legal LLM Responses: Harnessing Trainable Logical Structures and Semantic Knowledge with Legal Reasoning [19.477062052536887]
We propose the Logical-Semantic Integration Model (LSIM), a supervised framework that bridges semantic and logical coherence. LSIM comprises three components: reinforcement learning predicts a structured fact-rule chain for each question, a trainable Deep Structured Semantic Model (DSSM) retrieves the most relevant candidate questions, and in-answer learning generates the final answer. Our experiments on a real-world legal QA dataset, validated through both automated metrics and human evaluation, demonstrate that LSIM significantly enhances accuracy and reliability compared to existing methods.
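A structural sketch of that three-stage pipeline with placeholder components; the hard-coded fact-rule chain, the token-overlap retriever, and the echo-style answerer are stand-ins for the trained RL, DSSM, and in-answer-learning modules, respectively.
```python
# Fact-rule chain -> candidate retrieval -> answer generation (hedged sketch).

def predict_fact_rule_chain(question: str) -> list:
    """Placeholder for the RL-trained predictor of a structured fact-rule chain."""
    return ["fact: the parties signed a contract",
            "rule: breach entitles the injured party to damages"]

def retrieve_candidates(chain: list, corpus: list, k: int = 2) -> list:
    """Placeholder for DSSM retrieval; naive token-overlap scoring stands in."""
    def score(doc: str) -> int:
        return sum(tok in doc.lower() for item in chain for tok in item.lower().split())
    return sorted(corpus, key=score, reverse=True)[:k]

def generate_answer(question: str, candidates: list) -> str:
    """Placeholder for in-answer learning; echo the top candidate."""
    return f"Likely yes -- based on: {candidates[0]}"

corpus = ["A party that breaches a contract must compensate the other party.",
          "Tax filings are due annually."]
question = "Can I claim damages for breach of contract?"
chain = predict_fact_rule_chain(question)
print(generate_answer(question, retrieve_candidates(chain, corpus)))
```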
arXiv Detail & Related papers (2025-02-11T19:33:07Z)
- Few-shot Policy (de)composition in Conversational Question Answering [54.259440408606515]
We propose a neuro-symbolic framework to detect policy compliance using large language models (LLMs) in a few-shot setting. We show that our approach soundly reasons about policy compliance conversations by extracting sub-questions to be answered, assigning truth values from contextual information, and explicitly producing a set of logic statements from the given policies. We apply this approach to ShARC, a popular policy compliance detection and conversational machine reading benchmark, and show competitive performance with no task-specific finetuning.
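A minimal sketch of that decompose-answer-combine pattern; the hard-coded sub-questions and answering rules are stand-ins for the few-shot LLM calls the paper relies on.
```python
# Decompose a policy, answer sub-questions from context, combine logically (hedged sketch).

def extract_subquestions(policy: str) -> list:
    """Placeholder: an LLM would decompose the policy into sub-questions."""
    return ["Is the applicant over 18?", "Does the applicant reside in the EU?"]

def answer_from_context(question: str, context: dict) -> bool:
    """Placeholder: assign a truth value to a sub-question from dialogue context."""
    if "over 18" in question:
        return context.get("age", 0) >= 18
    return bool(context.get("eu_resident", False))

def check_policy(policy: str, context: dict) -> dict:
    answers = {q: answer_from_context(q, context) for q in extract_subquestions(policy)}
    # Explicit logic statement: this toy policy is a conjunction of its sub-questions.
    return {"compliant": all(answers.values()), "sub_answers": answers}

print(check_policy("Eligible if the applicant is over 18 and resides in the EU.",
                   {"age": 21, "eu_resident": True}))
```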
arXiv Detail & Related papers (2025-01-20T08:40:15Z)
- Rethinking State Disentanglement in Causal Reinforcement Learning [78.12976579620165]
Causality provides rigorous theoretical support for ensuring that the underlying states can be uniquely recovered through identifiability.
We revisit this research line and find that incorporating RL-specific context can reduce unnecessary assumptions in previous identifiability analyses for latent states.
We propose a novel approach for general partially observable Markov Decision Processes (POMDPs) by replacing the complicated structural constraints in previous methods with two simple constraints for transition and reward preservation.
arXiv Detail & Related papers (2024-08-24T06:49:13Z)
- From Chaos to Clarity: Claim Normalization to Empower Fact-Checking [57.024192702939736]
Claim Normalization (aka ClaimNorm) aims to decompose complex and noisy social media posts into more straightforward and understandable forms.
We propose CACN, a pioneering approach that leverages chain-of-thought and claim check-worthiness estimation.
Our experiments demonstrate that CACN outperforms several baselines across various evaluation measures.
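A minimal sketch of how check-worthiness filtering and a chain-of-thought normalization prompt could be combined; the scorer and the prompt wording are assumptions, not CACN's trained components.
```python
# Filter by check-worthiness, then prompt for a normalized claim (hedged sketch).

def check_worthiness(post: str) -> float:
    """Placeholder scorer; CACN estimates this with a trained model."""
    return 0.9 if any(ch.isdigit() for ch in post) else 0.2

def normalization_prompt(post: str) -> str:
    return ("Rewrite the post as a single, clear, verifiable claim.\n"
            "Think step by step about who did what, when, and where.\n"
            f"Post: {post}\nNormalized claim:")

post = "omg!!! they say 5G towers caused 300 outages last week, unreal"
if check_worthiness(post) > 0.5:
    print(normalization_prompt(post))
```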
arXiv Detail & Related papers (2023-10-22T16:07:06Z)
- Trusta: Reasoning about Assurance Cases with Formal Methods and Large Language Models [4.005483185111992]
Trustworthiness Derivation Tree Analyzer (Trusta) is a desktop application designed to automatically construct and verify TDTs.
It has a built-in Prolog interpreter in its backend, and is supported by the constraint solvers Z3 and MONA.
Trusta can extract formal constraints from text in natural languages, facilitating an easier interpretation and validation process.
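Since Trusta is backed by Z3, a small sketch of how an extracted constraint might be checked with the Z3 Python bindings; the requirement and evidence formulas are hand-written here, whereas Trusta derives them from natural-language text.
```python
# Check a formalized requirement against recorded evidence with Z3 (hedged sketch).
from z3 import Real, Solver, sat

response_time = Real("response_time")

# Constraint corresponding to "the system shall respond within 200 ms".
requirement = response_time < 200

# Consistent evidence: a measured response time of 180 ms.
solver = Solver()
solver.add(requirement, response_time == 180)
print("evidence satisfies requirement:", solver.check() == sat)   # True

# Violating evidence: a measured response time of 250 ms.
solver = Solver()
solver.add(requirement, response_time == 250)
print("evidence satisfies requirement:", solver.check() == sat)   # False (unsat)
```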
arXiv Detail & Related papers (2023-09-22T15:42:43Z)
- ADC: Adversarial attacks against object Detection that evade Context consistency checks [55.8459119462263]
We show that even context consistency checks can be brittle to properly crafted adversarial examples.
We propose an adaptive framework to generate examples that subvert such defenses.
Our results suggest that how to robustly model context and check its consistency is still an open problem.
arXiv Detail & Related papers (2021-10-24T00:25:09Z)
- Case-Based Abductive Natural Language Inference [4.726777092009554]
Case-Based Abductive Natural Language Inference (CB-ANLI)
arXiv Detail & Related papers (2020-09-30T09:50:39Z)