Related papers: Rewriting Conversational Utterances with Instructed Large Language Models

URL: http://arxiv.org/abs/2410.07797v1
Date: Thu, 10 Oct 2024 10:30:28 GMT
Title: Rewriting Conversational Utterances with Instructed Large Language Models
Authors: Elnara Galimzhanova, Cristina Ioana Muntean, Franco Maria Nardini, Raffaele Perego, Guido Rocchietti
Abstract summary: Large language models (LLMs) can achieve state-of-the-art performance on many NLP tasks. We study which prompts provide the most informative utterances that lead to the best retrieval performance. The results show that rewriting conversational utterances with instructed LLMs achieves significant improvements of up to 25.2% in MRR, 31.7% in Precision@1, 27% in NDCG@3, and 11.5% in Recall@500 over state-of-the-art techniques.
Score: 9.38751103209178
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Many recent studies have shown the ability of large language models (LLMs) to achieve state-of-the-art performance on many NLP tasks, such as question answering, text summarization, coding, and translation. In some cases, the results provided by LLMs are on par with those of human experts. These models' most disruptive innovation is their ability to perform tasks via zero-shot or few-shot prompting. This capability has been successfully exploited to train instructed LLMs, where reinforcement learning with human feedback is used to guide the model to follow the user's requests directly. In this paper, we investigate the ability of instructed LLMs to improve conversational search effectiveness by rewriting user questions in a conversational setting. We study which prompts provide the most informative rewritten utterances that lead to the best retrieval performance. Reproducible experiments are conducted on publicly-available TREC CAST datasets. The results show that rewriting conversational utterances with instructed LLMs achieves significant improvements of up to 25.2% in MRR, 31.7% in Precision@1, 27% in NDCG@3, and 11.5% in Recall@500 over state-of-the-art techniques.
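The abstract reports gains in MRR, Precision@1, and Recall@500. As a reading aid, here is a minimal sketch of how such rank-based metrics are commonly computed. It is not the paper's evaluation code: it assumes binary relevance judgments, and the function names, document ids, and toy run below are all hypothetical. NDCG is omitted because it depends on graded relevance labels.

```python
from typing import Dict, List, Set


def reciprocal_rank(ranked: List[str], relevant: Set[str]) -> float:
    """Return 1/rank of the first relevant document, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def precision_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k results that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for doc_id in ranked[:k] if doc_id in relevant) / k


def recall_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def mean_reciprocal_rank(runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]) -> float:
    """Average the reciprocal rank over all queries in the run."""
    scores = [reciprocal_rank(ranked, qrels.get(qid, set())) for qid, ranked in runs.items()]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: two rewritten utterances and their ranked results.
runs = {"q1": ["d3", "d1", "d7"], "q2": ["d2", "d9", "d4"]}
qrels = {"q1": {"d1"}, "q2": {"d2", "d4"}}

mrr = mean_reciprocal_rank(runs, qrels)              # (1/2 + 1/1) / 2
p_at_1 = precision_at_k(runs["q2"], qrels["q2"], 1)  # top result d2 is relevant
r_at_2 = recall_at_k(runs["q2"], qrels["q2"], 2)     # d2 found, d4 missed
```

In this toy run, a better rewrite of q1 would move d1 to the first position and lift both MRR and Precision@1, which is the kind of effect the reported improvements measure.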

Related papers

A Survey of Prompt Engineering Methods in Large Language Models for Different NLP Tasks [0.0]
Large language models (LLMs) have shown remarkable performance on many different Natural Language Processing (NLP) tasks. Prompt engineering plays a key role in extending the existing abilities of LLMs to achieve significant performance gains. This paper summarizes different prompting techniques and groups them by the NLP tasks they have been used for.
arXiv Detail & Related papers (2024-07-17T20:23:19Z)
SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts. We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM. We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z)
KIWI: A Dataset of Knowledge-Intensive Writing Instructions for Answering Research Questions [63.307317584926146]
Large language models (LLMs) adapted to follow user instructions are now widely deployed as conversational agents. In this work, we examine one increasingly common instruction-following task: providing writing assistance to compose a long-form answer. We construct KIWI, a dataset of knowledge-intensive writing instructions in the scientific domain.
arXiv Detail & Related papers (2024-03-06T17:16:44Z)
Unsupervised Information Refinement Training of Large Language Models for Retrieval-Augmented Generation [128.01050030936028]
We propose an information refinement training method named InFO-RAG. InFO-RAG is low-cost and general across various tasks. It improves the performance of LLaMA2 by an average of 9.39% relative points.
arXiv Detail & Related papers (2024-02-28T08:24:38Z)
Large Language Models: A Survey [69.72787936480394]
Large Language Models (LLMs) have drawn a lot of attention due to their strong performance on a wide range of natural language tasks. LLMs' ability for general-purpose language understanding and generation is acquired by training billions of model parameters on massive amounts of text data.
arXiv Detail & Related papers (2024-02-09T05:37:09Z)
Are Large Language Models Really Robust to Word-Level Perturbations? [68.60618778027694]
We propose a novel rational evaluation approach that leverages pre-trained reward models as diagnostic tools. Longer conversations manifest the comprehensive grasp of language models in terms of their proficiency in understanding questions. Our results demonstrate that LLMs frequently exhibit vulnerability to word-level perturbations that are commonplace in daily language usage.
arXiv Detail & Related papers (2023-09-20T09:23:46Z)
Contextual Biasing of Named-Entities with Large Language Models [12.396054621526643]
This paper studies contextual biasing with Large Language Models (LLMs). During second-pass rescoring, additional contextual information is provided to an LLM to boost Automatic Speech Recognition (ASR) performance. We propose to leverage prompts for an LLM without fine-tuning during rescoring, which incorporate a biasing list and few-shot examples.
arXiv Detail & Related papers (2023-09-01T20:15:48Z)
Language Model Self-improvement by Reinforcement Learning Contemplation [13.152789365858812]
This paper introduces a novel unsupervised method called Language Model Self-Improvement by Reinforcement Learning Contemplation (SIRLC). As a student, the model generates answers to unlabeled questions, while as a teacher, it evaluates the generated text and assigns scores accordingly. We demonstrate that SIRLC can be applied to various NLP tasks, such as reasoning problems, text generation, and machine translation.
arXiv Detail & Related papers (2023-05-23T19:25:52Z)
Benchmarking Large Language Models for News Summarization [79.37850439866938]
Large language models (LLMs) have shown promise for automatic summarization, but the reasons behind their successes are poorly understood. We find that instruction tuning, and not model size, is the key to the LLM's zero-shot summarization capability.
arXiv Detail & Related papers (2023-01-31T18:46:19Z)

This list is automatically generated from the titles and abstracts of the papers on this site.