Logical Reasoning for Task Oriented Dialogue Systems
- URL: http://arxiv.org/abs/2202.04161v1
- Date: Tue, 8 Feb 2022 21:46:27 GMT
- Title: Logical Reasoning for Task Oriented Dialogue Systems
- Authors: Sajjad Beygi, Maryam Fazel-Zarandi, Alessandra Cervone, Prakash
Krishnan, Siddhartha Reddy Jonnalagadda
- Abstract summary: We propose a novel method to fine-tune transformer models such as RoBERTa and T5 to reason over a set of facts in a given dialogue context.
Our method includes a synthetic data generation mechanism which helps the model learn logical relations.
We show that the transformer based model can perform logical reasoning to answer questions when the dialogue context contains all the required information.
- Score: 57.440956636333325
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, large pretrained models have been used in dialogue systems
to improve task completion rates. However, the lack of reasoning capabilities in
dialogue platforms makes it difficult to provide relevant and fluent responses,
unless the designers of a conversational experience spend a considerable amount
of time implementing these capabilities in external rule-based modules. In this
work, we propose a novel method to fine-tune pretrained transformer models such
as RoBERTa and T5 to reason over a set of facts in a given dialogue context. Our
method includes a synthetic data generation mechanism which helps the model
learn logical relations, such as comparisons between lists of numerical values,
inverse relations (and negation), inclusion and exclusion for categorical
attributes, application of combinations of attributes over both numerical and
categorical values, and spoken forms of numerical values, without the need for
an additional training dataset. We show that the transformer-based model can
perform logical reasoning to answer questions when the dialogue context contains
all the required information; otherwise, it is able to extract appropriate
constraints to pass to downstream components (e.g., a knowledge base) when only
partial information is available. We observe that transformer-based models such
as UnifiedQA-T5 can be fine-tuned to perform logical reasoning (such as
comparison of numerical and categorical attributes) over attributes that have
been seen at training time (e.g., accuracy of 90%+ for comparison of lists of
fewer than $k_{\max}=5$ values over a held-out test dataset).
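The abstract's synthetic data generation mechanism can be illustrated with a minimal sketch for one of the listed relations, numerical comparison. The attribute names, templates, and `generate_comparison_example` helper below are hypothetical assumptions for illustration, not the authors' actual generator.

```python
import random

# Hypothetical generator for one numerical-comparison example, in the spirit
# of the paper's synthetic data mechanism. Attribute names and template
# wording are illustrative assumptions.
ATTRIBUTES = ["price", "rating", "distance"]

def generate_comparison_example(k_max=5):
    """Build one (context, question, answer) triple over k <= k_max values."""
    attribute = random.choice(ATTRIBUTES)
    k = random.randint(2, k_max)
    names = [f"option_{i}" for i in range(k)]
    values = random.sample(range(1, 100), k)  # distinct numeric values
    facts = [f"{n} has {attribute} {v}." for n, v in zip(names, values)]
    question = f"Which option has the lowest {attribute}?"
    answer = names[values.index(min(values))]
    return {"context": " ".join(facts), "question": question, "answer": answer}

example = generate_comparison_example()
```

Triples of this form can then be serialized as text and used to fine-tune a sequence-to-sequence model such as T5 without collecting any additional human-labeled data.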
Related papers
- Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation [65.16137964758612]
We explore the use of long-context capabilities in large language models to create synthetic reading comprehension data from entire books.
Our objective is to test the capabilities of LLMs to analyze, understand, and reason over problems that require a detailed comprehension of long spans of text.
arXiv Detail & Related papers (2024-05-31T20:15:10Z) - Stabilized In-Context Learning with Pre-trained Language Models for Few-Shot Dialogue State Tracking [57.92608483099916]
Large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks.
For more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial.
We introduce a saliency model to limit dialogue text length, allowing us to include more exemplars per query.
arXiv Detail & Related papers (2023-02-12T15:05:10Z) - CGoDial: A Large-Scale Benchmark for Chinese Goal-oriented Dialog Evaluation [75.60156479374416]
CGoDial is a new challenging and comprehensive Chinese benchmark for Goal-oriented Dialog evaluation.
It contains 96,763 dialog sessions and 574,949 dialog turns in total, covering three datasets with different knowledge sources.
To bridge the gap between academic benchmarks and spoken dialog scenarios, we either collect data from real conversations or add spoken features to existing datasets via crowd-sourcing.
arXiv Detail & Related papers (2022-11-21T16:21:41Z) - Controllable Dialogue Simulation with In-Context Learning [39.04491297557292]
Dialogic is a dialogue simulation method based on in-context learning with large language models.
Our method can rapidly expand a small set of dialogue data with minimum or zero human involvement.
Our simulated dialogues have near-human fluency and annotation accuracy.
arXiv Detail & Related papers (2022-10-09T06:32:58Z) - Prompting for a conversation: How to control a dialog model? [9.268682116424518]
Dialog models are trained on a large amount of text, yet their responses need to be limited to a desired scope and style of a dialog agent.
Because the datasets used to achieve the former contain language that is not compatible with the latter, pre-trained dialog models are fine-tuned on smaller curated datasets.
In this paper we investigate if prompting can mitigate the above trade-off.
arXiv Detail & Related papers (2022-09-22T14:59:55Z) - What Can Transformers Learn In-Context? A Case Study of Simple Function Classes [67.06980111346245]
In-context learning refers to the ability of a model to condition on a prompt sequence consisting of in-context examples.
We show that standard Transformers can be trained from scratch to perform in-context learning of linear functions.
We also show that we can train Transformers to in-context learn more complex function classes with performance that matches or exceeds task-specific learning algorithms.
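The setup described in that summary can be sketched as follows: each training prompt interleaves inputs with their labels under a freshly sampled function, and the model must predict the label of a final query input. The scalar inputs, Gaussian sampling, and `make_prompt` helper are assumptions for illustration, not the paper's exact configuration.

```python
import random

# Minimal sketch of in-context learning of linear functions: a prompt is
# (x1, f(x1), ..., xk, f(xk), x_query) and the training target is f(x_query).
# Dimensions and sampling choices are illustrative assumptions.
def make_prompt(num_examples=5):
    w = random.gauss(0, 1)                      # fresh function f(x) = w * x
    xs = [random.gauss(0, 1) for _ in range(num_examples + 1)]
    tokens = []
    for x in xs[:-1]:
        tokens.extend([x, w * x])               # interleave inputs and labels
    x_query = xs[-1]
    return tokens + [x_query], w * x_query      # prompt tokens, target

prompt, target = make_prompt()
```

A Transformer trained from scratch on many such prompts, each with a different `w`, must infer the function from the in-context examples rather than memorize any single one.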
arXiv Detail & Related papers (2022-08-01T18:01:40Z) - GODEL: Large-Scale Pre-Training for Goal-Directed Dialog [119.1397031992088]
We introduce GODEL, a large pre-trained language model for dialog.
We show that GODEL outperforms state-of-the-art pre-trained dialog models in few-shot fine-tuning setups.
A novel feature of our evaluation methodology is the introduction of a notion of utility that assesses the usefulness of responses.
arXiv Detail & Related papers (2022-06-22T18:19:32Z) - Transformer Models for Text Coherence Assessment [14.132559978971377]
Coherence is an important aspect of text quality and is crucial for ensuring its readability.
Previous work has leveraged entity-based methods, syntactic patterns, discourse relations, and more recently traditional deep learning architectures for text coherence assessment.
We propose four different Transformer-based architectures for the task: vanilla Transformer, hierarchical Transformer, multi-task learning-based model, and a model with fact-based input representation.
arXiv Detail & Related papers (2021-09-05T22:27:17Z) - Turning Tables: Generating Examples from Semi-structured Tables for Endowing Language Models with Reasoning Skills [32.55545292360155]
We propose to leverage semi-structured tables to automatically generate question-paragraph pairs at scale.
We add a pre-training step over this synthetic data, which includes examples that require 16 different reasoning skills.
We show that our model, PReasM, substantially outperforms T5, a popular pre-trained encoder-decoder model.
arXiv Detail & Related papers (2021-07-15T11:37:14Z)
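The table-to-QA generation idea above can be sketched for one of the reasoning skills, a superlative (argmax) question. The example table, the template wording, and the `superlative_example` helper are hypothetical, chosen only to show the shape of a generated question-paragraph pair.

```python
# Illustrative sketch of turning a semi-structured table into one
# question-paragraph pair requiring superlative (argmax) reasoning.
# The table contents and templates are assumptions for illustration.
table = {
    "title": "Olympic medals",
    "columns": ["country", "gold"],
    "rows": [["Norway", 16], ["Germany", 14], ["Canada", 11]],
}

def superlative_example(table):
    """Generate a question whose answer is the argmax over a numeric column."""
    paragraph = "; ".join(f"{c} won {g} gold medals" for c, g in table["rows"])
    best = max(table["rows"], key=lambda r: r[1])[0]
    question = f"In {table['title']}, which country won the most gold medals?"
    return {"paragraph": paragraph, "question": question, "answer": best}

pair = superlative_example(table)
```

Applying many such templates across many tables yields pre-training examples at scale, each exercising a specific reasoning skill.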
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.