A Comparative Analysis of Conversational Large Language Models in
Knowledge-Based Text Generation
- URL: http://arxiv.org/abs/2402.01495v1
- Date: Fri, 2 Feb 2024 15:26:39 GMT
- Title: A Comparative Analysis of Conversational Large Language Models in
Knowledge-Based Text Generation
- Authors: Phillip Schneider, Manuel Klettner, Elena Simperl, Florian Matthes
- Abstract summary: We conduct an empirical analysis of conversational large language models in generating natural language text from semantic triples.
We compare four large language models of varying sizes with different prompting techniques.
Our findings show that the capabilities of large language models in triple verbalization can be significantly improved through few-shot prompting, post-processing, and efficient fine-tuning techniques.
- Score: 5.661396828160973
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating natural language text from graph-structured data is essential for
conversational information seeking. Semantic triples derived from knowledge
graphs can serve as a valuable source for grounding responses from
conversational agents by providing a factual basis for the information they
communicate. This is especially relevant in the context of large language
models, which offer great potential for conversational interaction but are
prone to hallucinating, omitting, or producing conflicting information. In this
study, we conduct an empirical analysis of conversational large language models
in generating natural language text from semantic triples. We compare four
large language models of varying sizes with different prompting techniques.
Through a series of benchmark experiments on the WebNLG dataset, we analyze the
models' performance and identify the most common issues in the generated
predictions. Our findings show that the capabilities of large language models
in triple verbalization can be significantly improved through few-shot
prompting, post-processing, and efficient fine-tuning techniques, particularly
for smaller models that exhibit lower zero-shot performance.
Related papers
- Linguistically Grounded Analysis of Language Models using Shapley Head Values [2.914115079173979]
We investigate the processing of morphosyntactic phenomena by leveraging a recently proposed method for probing language models via Shapley Head Values (SHVs)
Using the English language BLiMP dataset, we test our approach on two widely used models, BERT and RoBERTa, and compare how linguistic constructions are handled.
Our results show that SHV-based attributions reveal distinct patterns across both models, providing insights into how language models organize and process linguistic information.
arXiv Detail & Related papers (2024-10-17T09:48:08Z) - Instruction Data Generation and Unsupervised Adaptation for Speech Language Models [21.56355461403427]
We propose three methods for generating synthetic samples to train and evaluate multimodal large language models.
Synthetic data generation emerges as a crucial strategy to enhance the performance of such systems.
We highlight the potential of using unlabeled speech data to generate synthetic samples comparable in quality to those with available transcriptions.
arXiv Detail & Related papers (2024-06-18T08:27:00Z) - Evaluating Large Language Models in Semantic Parsing for Conversational
Question Answering over Knowledge Graphs [6.869834883252353]
This paper evaluates the performance of large language models that have not been explicitly pre-trained on this task.
Our results demonstrate that large language models are capable of generating graph queries from dialogues.
arXiv Detail & Related papers (2024-01-03T12:28:33Z) - Large Language Models as Zero-Shot Conversational Recommenders [52.57230221644014]
We present empirical studies on conversational recommendation tasks using representative large language models in a zero-shot setting.
We construct a new dataset of recommendation-related conversations by scraping a popular discussion website.
We observe that even without fine-tuning, large language models can outperform existing fine-tuned conversational recommendation models.
arXiv Detail & Related papers (2023-08-19T15:29:45Z) - Model Criticism for Long-Form Text Generation [113.13900836015122]
We apply a statistical tool, model criticism in latent space, to evaluate the high-level structure of generated text.
We perform experiments on three representative aspects of high-level discourse -- coherence, coreference, and topicality.
We find that transformer-based language models are able to capture topical structures but have a harder time maintaining structural coherence or modeling coreference.
arXiv Detail & Related papers (2022-10-16T04:35:58Z) - An Empirical Investigation of Commonsense Self-Supervision with
Knowledge Graphs [67.23285413610243]
Self-supervision based on the information extracted from large knowledge graphs has been shown to improve the generalization of language models.
We study the effect of knowledge sampling strategies and sizes that can be used to generate synthetic data for adapting language models.
arXiv Detail & Related papers (2022-05-21T19:49:04Z) - GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation [9.501648136713694]
Large-scale language models such as GPT-3 are excellent few-shot learners, allowing them to be controlled via natural text prompts.
This paper proposes a novel data augmentation technique that leverages large-scale language models to generate realistic text samples.
arXiv Detail & Related papers (2021-04-18T11:39:33Z) - Robustness Testing of Language Understanding in Dialog Systems [33.30143655553583]
We conduct comprehensive evaluation and analysis with respect to the robustness of natural language understanding models.
We introduce three important aspects related to language understanding in real-world dialog systems, namely, language variety, speech characteristics, and noise perturbation.
We propose a model-agnostic toolkit LAUG to approximate natural perturbation for testing the robustness issues in dialog systems.
arXiv Detail & Related papers (2020-12-30T18:18:47Z) - Comparison of Interactive Knowledge Base Spelling Correction Models for
Low-Resource Languages [81.90356787324481]
Spelling normalization for low resource languages is a challenging task because the patterns are hard to predict.
This work shows a comparison of a neural model and character language models with varying amounts on target language data.
Our usage scenario is interactive correction with nearly zero amounts of training examples, improving models as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z) - Data Augmentation for Spoken Language Understanding via Pretrained
Language Models [113.56329266325902]
Training of spoken language understanding (SLU) models often faces the problem of data scarcity.
We put forward a data augmentation method using pretrained language models to boost the variability and accuracy of generated utterances.
arXiv Detail & Related papers (2020-04-29T04:07:12Z) - Limits of Detecting Text Generated by Large-Scale Language Models [65.46403462928319]
Some consider large-scale language models that can generate long and coherent pieces of text as dangerous, since they may be used in misinformation campaigns.
Here we formulate large-scale language model output detection as a hypothesis testing problem to classify text as genuine or generated.
arXiv Detail & Related papers (2020-02-09T19:53:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.