Few-shot Policy (de)composition in Conversational Question Answering
- URL: http://arxiv.org/abs/2501.11335v1
- Date: Mon, 20 Jan 2025 08:40:15 GMT
- Title: Few-shot Policy (de)composition in Conversational Question Answering
- Authors: Kyle Erwin, Guy Axelrod, Maria Chang, Achille Fokoue, Maxwell Crouse, Soham Dan, Tian Gao, Rosario Uceda-Sosa, Ndivhuwo Makondo, Naweed Khan, Alexander Gray,
- Abstract summary: We propose a neuro-symbolic framework to detect policy compliance using large language models (LLMs) in a few-shot setting.
We show that our approach soundly reasons about policy compliance conversations by extracting sub-questions to be answered, assigning truth values from contextual information, and explicitly producing a set of logic statements from the given policies.
We apply this approach to the popular PCD and conversational machine reading benchmark, ShARC, and show competitive performance with no task-specific finetuning.
- Score: 54.259440408606515
- License:
- Abstract: The task of policy compliance detection (PCD) is to determine if a scenario is in compliance with respect to a set of written policies. In a conversational setting, the results of PCD can indicate if clarifying questions must be asked to determine compliance status. Existing approaches usually claim to have reasoning capabilities that are latent or require a large amount of annotated data. In this work, we propose logical decomposition for policy compliance (LDPC): a neuro-symbolic framework to detect policy compliance using large language models (LLMs) in a few-shot setting. By selecting only a few exemplars alongside recently developed prompting techniques, we demonstrate that our approach soundly reasons about policy compliance conversations by extracting sub-questions to be answered, assigning truth values from contextual information, and explicitly producing a set of logic statements from the given policies. The formulation of explicit logic graphs can in turn help answer PCDrelated questions with increased transparency and explainability. We apply this approach to the popular PCD and conversational machine reading benchmark, ShARC, and show competitive performance with no task-specific finetuning. We also leverage the inherently interpretable architecture of LDPC to understand where errors occur, revealing ambiguities in the ShARC dataset and highlighting the challenges involved with reasoning for conversational question answering.
Related papers
- Prompting Strategies for Enabling Large Language Models to Infer Causation from Correlation [68.58373854950294]
We focus on causal reasoning and address the task of establishing causal relationships based on correlation information.
We introduce a prompting strategy for this problem that breaks the original task into fixed subquestions.
We evaluate our approach on an existing causal benchmark, Corr2Cause.
arXiv Detail & Related papers (2024-12-18T15:32:27Z) - On the Loss of Context-awareness in General Instruction Fine-tuning [101.03941308894191]
We investigate the loss of context awareness after supervised fine-tuning.
We find that the performance decline is associated with a bias toward different roles learned during conversational instruction fine-tuning.
We propose a metric to identify context-dependent examples from general instruction fine-tuning datasets.
arXiv Detail & Related papers (2024-11-05T00:16:01Z) - QUDSELECT: Selective Decoding for Questions Under Discussion Parsing [90.92351108691014]
Question Under Discussion (QUD) is a discourse framework that uses implicit questions to reveal discourse relationships between sentences.
We introduce QUDSELECT, a joint-training framework that selectively decodes the QUD dependency structures considering the QUD criteria.
Our method outperforms the state-of-the-art baseline models by 9% in human evaluation and 4% in automatic evaluation.
arXiv Detail & Related papers (2024-08-02T06:46:08Z) - Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training [33.57497419019826]
Action-Based Contrastive Self-Training allows for sample-efficient dialogue policy learning in multi-turn conversation.
ACT demonstrates substantial conversation modeling improvements over standard approaches to supervised fine-tuning and DPO.
arXiv Detail & Related papers (2024-05-31T22:44:48Z) - Controlled Query Evaluation through Epistemic Dependencies [7.502796412126707]
We show the expressive abilities of our framework and study the data complexity of CQE for (unions of) conjunctive queries.
We prove tractability for the case of acyclic dependencies by providing a suitable query algorithm.
arXiv Detail & Related papers (2024-05-03T19:48:07Z) - Clarify When Necessary: Resolving Ambiguity Through Interaction with LMs [58.620269228776294]
We propose a task-agnostic framework for resolving ambiguity by asking users clarifying questions.
We evaluate systems across three NLP applications: question answering, machine translation and natural language inference.
We find that intent-sim is robust, demonstrating improvements across a wide range of NLP tasks and LMs.
arXiv Detail & Related papers (2023-11-16T00:18:50Z) - Cross-Policy Compliance Detection via Question Answering [13.373804837863155]
We propose to address policy compliance detection via decomposing it into question answering.
We demonstrate that this approach results in better accuracy, especially in the cross-policy setup.
It explicitly identifies the information missing from a scenario in case policy compliance cannot be determined.
arXiv Detail & Related papers (2021-09-08T15:47:41Z) - Interrogating the Black Box: Transparency through Information-Seeking
Dialogues [9.281671380673306]
We propose to construct an investigator agent to query a learning agent to investigate its adherence to an ethical policy.
This formal dialogue framework is the main contribution of this paper.
We argue that the introduced formal dialogue framework opens many avenues both in the area of compliance checking and in the analysis of properties of opaque systems.
arXiv Detail & Related papers (2021-02-09T09:14:04Z) - Multi-Stage Conversational Passage Retrieval: An Approach to Fusing Term
Importance Estimation and Neural Query Rewriting [56.268862325167575]
We tackle conversational passage retrieval (ConvPR) with query reformulation integrated into a multi-stage ad-hoc IR system.
We propose two conversational query reformulation (CQR) methods: (1) term importance estimation and (2) neural query rewriting.
For the former, we expand conversational queries using important terms extracted from the conversational context with frequency-based signals.
For the latter, we reformulate conversational queries into natural, standalone, human-understandable queries with a pretrained sequence-tosequence model.
arXiv Detail & Related papers (2020-05-05T14:30:20Z) - Fast Compliance Checking with General Vocabularies [0.0]
We introduce an OWL2 profile for representing data protection policies.
With this language, a company's data usage policy can be checked for compliance with data subjects' consent.
We exploit IBQ reasoning to integrate specialized reasoners for the policy language and the vocabulary's language.
arXiv Detail & Related papers (2020-01-16T09:08:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.