Leveraging pre-trained language models for conversational information
seeking from text
- URL: http://arxiv.org/abs/2204.03542v1
- Date: Thu, 31 Mar 2022 09:00:46 GMT
- Title: Leveraging pre-trained language models for conversational information
seeking from text
- Authors: Patrizio Bellan, Mauro Dragoni and Chiara Ghidini
- Abstract summary: In this paper we investigate the usage of in-context learning and pre-trained language representation models to address the problem of information extraction from process description documents.
The results highlight the potential of the approach and the usefulness of the in-context learning customizations.
- Score: 2.8425118603312
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent advances in Natural Language Processing, and in particular on the
construction of very large pre-trained language representation models, are
opening up new perspectives on the construction of conversational information
seeking (CIS) systems. In this paper we investigate the usage of in-context
learning and pre-trained language representation models to address the problem
of information extraction from process description documents, in an incremental
question and answering oriented fashion. In particular we investigate the usage
of the native GPT-3 (Generative Pre-trained Transformer 3) model, together with
two in-context learning customizations that inject conceptual definitions and a
limited number of samples in a few-shot learning fashion. The results highlight
the potential of the approach and the usefulness of the in-context learning
customizations, which can substantially contribute to addressing the "training
data challenge" of deep learning based NLP techniques in the BPM field. They also
highlight the challenge posed by control flow relations, for which further
training needs to be devised.
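As a rough illustration of the approach described in the abstract, the sketch below prepends conceptual definitions and a few samples to each incremental question over a process description. It is a minimal sketch, not the authors' actual prompts or code: the definitions, few-shot samples and the `text-davinci-002` engine name are illustrative assumptions, and it presumes the pre-1.0 `openai` Python client with an API key configured in the environment.
```python
# A minimal sketch of the in-context learning setup described in the abstract:
# conceptual definitions plus a few samples are prepended to each incremental
# question over a process description. Definitions, samples, engine name and
# prompt wording are illustrative assumptions, not the authors' actual prompts.
import openai  # pre-1.0 client; reads OPENAI_API_KEY from the environment

DEFINITIONS = (
    "An activity is a unit of work performed within the process.\n"
    "An actor is the person or role that performs an activity.\n"
)

FEW_SHOT_SAMPLES = (
    "Text: The clerk checks the invoice.\n"
    "Question: Who performs the activity 'checks the invoice'?\n"
    "Answer: the clerk\n\n"
)

def ask(process_text: str, question: str) -> str:
    """Build a definitions + few-shot prompt and query the model."""
    prompt = (
        DEFINITIONS
        + "\n"
        + FEW_SHOT_SAMPLES
        + f"Text: {process_text}\nQuestion: {question}\nAnswer:"
    )
    response = openai.Completion.create(
        engine="text-davinci-002",  # illustrative engine name
        prompt=prompt,
        max_tokens=64,
        temperature=0,
    )
    return response["choices"][0]["text"].strip()

# Incremental question answering over a process description.
description = "After the manager approves the request, the clerk archives it."
print(ask(description, "Which activities are described in the text?"))
print(ask(description, "Who performs the activity 'archives the request'?"))
```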
Related papers
- Procedural Text Mining with Large Language Models [0.21756081703275998]
We tackle the problem of extracting procedures from unstructured PDF text in an incremental question-answering fashion.
We leverage the current state-of-the-art GPT-4 (Generative Pre-trained Transformer 4) model, accompanied by two variations of in-context learning.
The findings highlight both the promise of this approach and the value of the in-context learning customisations.
arXiv Detail & Related papers (2023-10-05T08:27:33Z) - SINC: Self-Supervised In-Context Learning for Vision-Language Tasks [64.44336003123102]
We propose a framework to enable in-context learning in large language models.
A meta-model can learn on self-supervised prompts consisting of tailored demonstrations.
Experiments show that SINC outperforms gradient-based methods in various vision-language tasks.
arXiv Detail & Related papers (2023-07-15T08:33:08Z) - Pre-Training to Learn in Context [138.0745138788142]
The ability of in-context learning is not fully exploited because language models are not explicitly trained to learn in context.
We propose PICL (Pre-training for In-Context Learning), a framework to enhance the language models' in-context learning ability.
Our experiments show that PICL is more effective and task-generalizable than a range of baselines, outperforming larger language models with nearly 4x the parameters.
arXiv Detail & Related papers (2023-05-16T03:38:06Z) - Leveraging Natural Supervision for Language Representation Learning and
Generation [8.083109555490475]
We describe three lines of work that seek to improve the training and evaluation of neural models using naturally-occurring supervision.
We first investigate self-supervised training losses to help enhance the performance of pretrained language models for various NLP tasks.
We propose a framework that uses paraphrase pairs to disentangle semantics and syntax in sentence representations.
arXiv Detail & Related papers (2022-07-21T17:26:03Z) - A Survey of Knowledge-Intensive NLP with Pre-Trained Language Models [185.08295787309544]
We aim to summarize the current progress of pre-trained language model-based knowledge-enhanced models (PLMKEs).
We present the challenges of PLMKEs based on the discussion regarding the three elements and attempt to provide NLP practitioners with potential directions for further research.
arXiv Detail & Related papers (2022-02-17T17:17:43Z) - A Survey of Knowledge Enhanced Pre-trained Models [28.160826399552462]
We refer to pre-trained language models with knowledge injection as knowledge-enhanced pre-trained language models (KEPLMs).
These models demonstrate deep understanding and logical reasoning and introduce interpretability.
arXiv Detail & Related papers (2021-10-01T08:51:58Z) - Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z) - A Survey on Recent Approaches for Natural Language Processing in
Low-Resource Scenarios [30.391291221959545]
Deep neural networks and huge language models are becoming omnipresent in natural language applications.
As they are known for requiring large amounts of training data, there is a growing body of work to improve the performance in low-resource settings.
Motivated by the recent fundamental changes towards neural models and the popular pre-train and fine-tune paradigm, we survey promising approaches for low-resource natural language processing.
arXiv Detail & Related papers (2020-10-23T11:22:01Z) - Data Augmentation for Spoken Language Understanding via Pretrained
Language Models [113.56329266325902]
Training of spoken language understanding (SLU) models often faces the problem of data scarcity.
We put forward a data augmentation method using pretrained language models to boost the variability and accuracy of generated utterances.
arXiv Detail & Related papers (2020-04-29T04:07:12Z) - Exploring the Limits of Transfer Learning with a Unified Text-to-Text
Transformer [64.22926988297685]
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP).
In this paper, we explore the landscape of introducing transfer learning techniques for NLP by a unified framework that converts all text-based language problems into a text-to-text format.
arXiv Detail & Related papers (2019-10-23T17:37:36Z)
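To make the text-to-text framing from the last entry above concrete, the sketch below runs two different tasks through one model simply by changing the input prefix. It is an illustrative example built on the public t5-small checkpoint and the documented Hugging Face `transformers` API, not the paper's original codebase.
```python
# A small text-to-text example: the same model handles translation and
# summarization, with the task selected purely by the text prefix.
# Illustrative only; uses the public t5-small checkpoint from Hugging Face
# `transformers`, not the original paper's code.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

def text_to_text(task_prefix: str, text: str) -> str:
    """Cast any task as 'input text in, output text out'."""
    inputs = tokenizer(task_prefix + text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_length=64)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(text_to_text("translate English to German: ", "The house is wonderful."))
print(text_to_text("summarize: ", "Transfer learning, where a model is first "
                   "pre-trained on a data-rich task and then fine-tuned on a "
                   "downstream task, has emerged as a powerful technique in NLP."))
```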
This list is automatically generated from the titles and abstracts of the papers on this site.