Domain-specific Language Pre-training for Dialogue Comprehension on
Clinical Inquiry-Answering Conversations
- URL: http://arxiv.org/abs/2206.02428v1
- Date: Mon, 6 Jun 2022 08:45:03 GMT
- Title: Domain-specific Language Pre-training for Dialogue Comprehension on
Clinical Inquiry-Answering Conversations
- Authors: Zhengyuan Liu, Pavitra Krishnaswamy, Nancy F. Chen
- Abstract summary: Recent developments in natural language processing suggest that large-scale pre-trained language backbones could be leveraged for machine comprehension and information extraction tasks.
Yet, due to the gap between pre-training and downstream clinical domains, it remains challenging to exploit the generic backbones for domain-specific applications.
We propose a domain-specific language pre-training approach to improve performance on downstream tasks such as dialogue comprehension.
- Score: 28.567701055153385
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There is growing interest in the automated extraction of relevant information
from clinical dialogues. However, it is difficult to collect and construct
large annotated resources for clinical dialogue tasks. Recent developments in
natural language processing suggest that large-scale pre-trained language
backbones could be leveraged for such machine comprehension and information
extraction tasks. Yet, due to the gap between pre-training and downstream
clinical domains, it remains challenging to exploit the generic backbones for
domain-specific applications. Therefore, in this work, we propose a
domain-specific language pre-training approach to improve performance on
downstream tasks like dialogue comprehension. In addition to the common
token-level masking pre-training method, we further propose sample generation
strategies with speaker and utterance manipulation, motivated by the nature of
human conversations and the interactive flow of multi-topic inquiry-answering
dialogues. The
conversational pre-training guides the language backbone to reconstruct the
utterances coherently based on the remaining context, thus bridging the gap
between general and specific domains. Experiments are conducted on a clinical
conversation dataset for symptom checking, where nurses inquire and discuss
symptom information with patients. We empirically show that a neural model
with our proposed approach yields improvements on the dialogue comprehension
task and achieves favorable results in the low-resource training scenario.
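As a concrete illustration of the pre-training objective above, here is a minimal sketch of how utterance masking and speaker manipulation could generate conversational pre-training samples. The `mask_utterance` and `swap_speakers` helpers and the sample format are illustrative assumptions, not the authors' released code.

```python
import random

MASK = "[MASK]"

def mask_utterance(dialogue, idx):
    """Replace every token of one utterance with [MASK]; the backbone
    must reconstruct it coherently from the remaining context."""
    speaker, text = dialogue[idx]
    corrupted = list(dialogue)
    corrupted[idx] = (speaker, " ".join(MASK for _ in text.split()))
    return corrupted, text  # corrupted dialogue, reconstruction target

def swap_speakers(dialogue, i, j):
    """Exchange the speaker roles of two utterances, producing a
    corrupted sample whose roles the model should restore."""
    corrupted = list(dialogue)
    corrupted[i] = (dialogue[j][0], dialogue[i][1])
    corrupted[j] = (dialogue[i][0], dialogue[j][1])
    return corrupted

def make_pretraining_samples(dialogue, n_samples=4):
    """Build conversational pre-training samples from one dialogue,
    given as a list of (speaker, utterance) pairs."""
    samples = []
    for _ in range(n_samples):
        if random.random() < 0.5:
            idx = random.randrange(len(dialogue))
            samples.append(("utterance_restoration",) + mask_utterance(dialogue, idx))
        else:
            i, j = random.sample(range(len(dialogue)), 2)
            samples.append(("speaker_swap", swap_speakers(dialogue, i, j), (i, j)))
    return samples

dialogue = [("NURSE", "Do you feel any chest pain today?"),
            ("PATIENT", "No, but I am a bit short of breath."),
            ("NURSE", "Since when have you felt breathless?")]
for task, sample, target in make_pretraining_samples(dialogue):
    print(task, sample, target, sep="\n")
```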
Related papers
- Picking the Underused Heads: A Network Pruning Perspective of Attention
Head Selection for Fusing Dialogue Coreference Information [50.41829484199252]
Transformer-based models with the multi-head self-attention mechanism are widely used in natural language processing.
We investigate the attention head selection and manipulation strategy for feature injection from a network pruning perspective.
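To make the pruning-perspective idea concrete, here is a hedged sketch of scoring heads by a usage proxy and picking the least-used ones as candidates for feature injection; the concentration-based score is a common pruning heuristic assumed here, not necessarily the paper's criterion.

```python
import torch

def underused_heads(attn_weights, k):
    """attn_weights: tensor [batch, n_heads, q_len, k_len] of attention
    probabilities from one layer. Rank heads by a simple usage proxy
    and return the k least-used heads, which are candidates for
    injecting dialogue coreference features."""
    # Mean of the maximum attention each query assigns: heads that
    # spread attention thinly score low under this proxy.
    concentration = attn_weights.max(dim=-1).values.mean(dim=(0, 2))
    return torch.argsort(concentration)[:k].tolist()

attn = torch.softmax(torch.randn(2, 12, 16, 16), dim=-1)
print(underused_heads(attn, k=3))  # e.g. [7, 2, 9]
```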
arXiv Detail & Related papers (2023-12-15T05:27:24Z)
- PK-Chat: Pointer Network Guided Knowledge Driven Generative Dialogue Model [79.64376762489164]
PK-Chat is a Pointer network guided generative dialogue model, incorporating a unified pretrained language model and a pointer network over knowledge graphs.
The words PK-Chat generates in the dialogue are derived both from prediction over word lists and from direct prediction over the external knowledge graph.
Based on PK-Chat, a dialogue system is built for academic scenarios in the geosciences domain.
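The pointer-network mixture can be illustrated with a short sketch in the pointer-generator style; the function below and its signature are illustrative assumptions rather than PK-Chat's actual layer.

```python
import torch
import torch.nn.functional as F

def pointer_mixture(vocab_logits, kg_scores, p_gen, kg_to_vocab):
    """Combine a generative vocabulary distribution with a pointer
    distribution over knowledge-graph entries (an illustrative
    pointer-generator-style sketch, not PK-Chat's exact layer).

    vocab_logits: [vocab_size] decoder logits
    kg_scores:    [n_kg] attention scores over KG entries
    p_gen:        scalar in (0, 1), probability of generating vs copying
    kg_to_vocab:  [n_kg] long tensor mapping KG entries to vocab ids
    """
    p_vocab = F.softmax(vocab_logits, dim=-1) * p_gen
    p_copy = F.softmax(kg_scores, dim=-1) * (1.0 - p_gen)
    # Scatter-add copy probabilities onto the vocabulary distribution.
    return p_vocab.scatter_add(0, kg_to_vocab, p_copy)

probs = pointer_mixture(torch.randn(100), torch.randn(5),
                        p_gen=torch.tensor(0.7),
                        kg_to_vocab=torch.tensor([3, 17, 42, 8, 99]))
print(probs.sum())  # ~1.0, a valid distribution over the vocabulary
```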
arXiv Detail & Related papers (2023-04-02T18:23:13Z)
- Improve Retrieval-based Dialogue System via Syntax-Informed Attention [46.79601705850277]
We propose SIA, Syntax-Informed Attention, considering both intra- and inter-sentence syntax information.
We evaluate our method on three widely used benchmarks and experimental results demonstrate the general superiority of our method on dialogue response selection.
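One plausible reading of syntax-informed attention is an additive bias on the attention logits derived from the dependency parse; the sketch below follows that reading, and the `alpha` weight and adjacency input are assumptions.

```python
import torch
import torch.nn.functional as F

def syntax_biased_attention(q, k, v, syntax_adj, alpha=1.0):
    """Scaled dot-product attention with an additive syntax bias
    (an illustrative formulation, not necessarily SIA's exact one).
    syntax_adj: [len, len] 0/1 matrix, 1 where two tokens are linked
    in the dependency parse."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5
    scores = scores + alpha * syntax_adj  # boost syntactically linked pairs
    return F.softmax(scores, dim=-1) @ v

n, d = 6, 16
q = k = v = torch.randn(n, d)
adj = torch.eye(n)          # stand-in for a real dependency adjacency matrix
out = syntax_biased_attention(q, k, v, adj)
print(out.shape)  # torch.Size([6, 16])
```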
arXiv Detail & Related papers (2023-03-12T08:14:16Z)
- Channel-aware Decoupling Network for Multi-turn Dialogue Comprehension [81.47133615169203]
We propose compositional learning for holistic interaction across utterances beyond the sequential contextualization from PrLMs.
We employ domain-adaptive training strategies to help the model adapt to the dialogue domains.
Experimental results show that our method substantially boosts the strong PrLM baselines in four public benchmark datasets.
arXiv Detail & Related papers (2023-01-10T13:18:25Z)
- Emotion Recognition in Conversation using Probabilistic Soft Logic [17.62924003652853]
Emotion recognition in conversation (ERC) is a sub-field of emotion recognition that focuses on conversations containing two or more utterances.
We implement our approach in a framework called Probabilistic Soft Logic (PSL), a declarative templating language.
PSL provides functionality for the incorporation of results from neural models into PSL models.
We compare our method with state-of-the-art purely neural ERC systems, and see almost a 20% improvement.
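PSL grounds logical rules into hinge-loss potentials via the Lukasiewicz relaxation; the toy sketch below shows how a rule over neural emotion scores is penalized by its distance to satisfaction. The rule itself and all truth values are hypothetical, for illustration only.

```python
def luk_and(a, b):
    """Lukasiewicz t-norm: the soft conjunction PSL uses."""
    return max(0.0, a + b - 1.0)

def luk_implies(a, b):
    """Soft implication a -> b; value 1.0 when fully satisfied."""
    return min(1.0, 1.0 - a + b)

def rule_distance(body, head):
    """PSL penalizes a rule by its distance to satisfaction."""
    return 1.0 - luk_implies(body, head)

# Hypothetical rule: if utterance t is ANGRY and t+1 replies to it,
# then t+1 tends not to be HAPPY. Truth values come from a neural model.
neural_angry_t = 0.9    # hypothetical neural output for utterance t
neural_happy_t1 = 0.7   # hypothetical neural output for utterance t+1
replies_to = 1.0        # observed dialogue structure

body = luk_and(neural_angry_t, replies_to)
penalty = rule_distance(body, 1.0 - neural_happy_t1)
print(f"distance to satisfaction: {penalty:.2f}")  # 0.60
```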
arXiv Detail & Related papers (2022-07-14T23:59:06Z)
- Achieving Conversational Goals with Unsupervised Post-hoc Knowledge Injection [37.15893335147598]
A limitation of current neural dialog models is that they tend to suffer from a lack of specificity and informativeness in generated responses.
We propose a post-hoc knowledge-injection technique where we first retrieve a diverse set of relevant knowledge snippets conditioned on both the dialog history and an initial response from an existing dialog model.
We construct multiple candidate responses, individually injecting each retrieved snippet into the initial response using a gradient-based decoding method, and then select the final response with an unsupervised ranking step.
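The retrieve-inject-rank pipeline can be sketched as follows. Note the hedges: the paper injects snippets with gradient-based decoding rather than string concatenation, and the overlap-based ranker here is a simplified stand-in for its unsupervised ranking step.

```python
def inject_snippet(response, snippet):
    """Naive injection stand-in: the paper rewrites the response around
    the snippet via gradient-based decoding; here we just append."""
    return f"{response} By the way, {snippet}"

def unsupervised_score(candidate, history):
    """Toy relevance proxy: token overlap with the dialog history.
    The paper's unsupervised ranker is more sophisticated."""
    hist = set(" ".join(history).lower().split())
    cand = set(candidate.lower().split())
    return len(hist & cand) / max(1, len(cand))

def post_hoc_injection(history, initial_response, snippets):
    """Inject each retrieved snippet, then rank candidates unsupervised."""
    candidates = [inject_snippet(initial_response, s) for s in snippets]
    return max(candidates, key=lambda c: unsupervised_score(c, history))

history = ["I want to visit Kyoto in spring."]
snippets = ["Kyoto's cherry blossoms usually peak in early April.",
            "Tokyo has the busiest railway station in the world."]
print(post_hoc_injection(history, "That sounds like a great trip!", snippets))
```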
arXiv Detail & Related papers (2022-03-22T00:42:27Z)
- Comparison of Speaker Role Recognition and Speaker Enrollment Protocol for conversational Clinical Interviews [9.728371067160941]
We train end-to-end neural network architectures to adapt to each task and evaluate each approach under the same metric.
Results do not depend on the demographics of the interviewee, highlighting the clinical relevance of our methods.
arXiv Detail & Related papers (2020-10-30T09:07:37Z)
- Learning an Effective Context-Response Matching Model with Self-Supervised Tasks for Retrieval-based Dialogues [88.73739515457116]
We introduce four self-supervised tasks including next session prediction, utterance restoration, incoherence detection and consistency discrimination.
We jointly train the PLM-based response selection model with these auxiliary tasks in a multi-task manner.
Experiment results indicate that the proposed auxiliary self-supervised tasks bring significant improvement for multi-turn response selection.
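The joint multi-task objective amounts to a weighted sum of the response-selection loss and the four auxiliary self-supervised losses; the equal weights in the sketch below are an assumption, not the paper's tuned values.

```python
import torch

def multitask_loss(losses, weights=None):
    """Combine the response-selection loss with the auxiliary
    self-supervised losses (next-session prediction, utterance
    restoration, incoherence detection, consistency discrimination)
    as a weighted sum; equal weights by default."""
    weights = weights or {name: 1.0 for name in losses}
    return sum(weights[name] * value for name, value in losses.items())

losses = {
    "response_selection": torch.tensor(0.52),
    "next_session_prediction": torch.tensor(0.31),
    "utterance_restoration": torch.tensor(0.44),
    "incoherence_detection": torch.tensor(0.27),
    "consistency_discrimination": torch.tensor(0.35),
}
total = multitask_loss(losses)
print(total)  # single scalar loss for one backward() pass
```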
arXiv Detail & Related papers (2020-09-14T08:44:46Z)
- Masking Orchestration: Multi-task Pretraining for Multi-role Dialogue Representation Learning [50.5572111079898]
Multi-role dialogue understanding comprises a wide range of diverse tasks such as question answering, act classification, dialogue summarization etc.
While dialogue corpora are abundantly available, labeled data for specific learning tasks can be highly scarce and expensive.
In this work, we investigate dialogue context representation learning with various types of unsupervised pretraining tasks.
arXiv Detail & Related papers (2020-02-27T04:36:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.