Prompting and Fine-Tuning of Small LLMs for Length-Controllable Telephone Call Summarization
- URL: http://arxiv.org/abs/2410.18624v1
- Date: Thu, 24 Oct 2024 10:32:10 GMT
- Title: Prompting and Fine-Tuning of Small LLMs for Length-Controllable Telephone Call Summarization
- Authors: David Thulke, Yingbo Gao, Rricha Jalota, Christian Dugast, Hermann Ney
- Abstract summary: This paper explores the rapid development of a telephone call summarization system utilizing large language models (LLMs).
Our results show that a fine-tuned Llama-2-7B-based summarization model performs on par with GPT-4 in terms of factual accuracy, completeness, and conciseness.
- Score: 33.67670065326008
- License:
- Abstract: This paper explores the rapid development of a telephone call summarization system utilizing large language models (LLMs). Our approach involves initial experiments with prompting existing LLMs to generate summaries of telephone conversations, followed by the creation of a tailored synthetic training dataset utilizing stronger frontier models. We place special focus on the diversity of the generated data and on the ability to control the length of the generated summaries to meet various use-case specific requirements. The effectiveness of our method is evaluated using two state-of-the-art LLM-as-a-judge-based evaluation techniques to ensure the quality and relevance of the summaries. Our results show that a fine-tuned Llama-2-7B-based summarization model performs on par with GPT-4 in terms of factual accuracy, completeness, and conciseness. Our findings demonstrate the potential for quickly bootstrapping a practical and efficient call summarization system.
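The abstract outlines three steps: prompting existing LLMs for length-controlled summaries, generating a synthetic training set with a stronger frontier model, and fine-tuning a small model on that data. Below is a minimal sketch of the prompting step; the prompt wording, the length buckets, and the `summarize_call` helper are illustrative assumptions, not the authors' actual setup.

```python
# Hypothetical sketch of length-controlled call summarization via prompting.
# Prompt wording, length buckets, and model choice are assumptions for
# illustration; the paper's exact prompts are not reproduced here.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

LENGTH_INSTRUCTIONS = {
    "brief": "Summarize the call in at most 2 sentences.",
    "standard": "Summarize the call in roughly 4 to 6 sentences.",
    "detailed": "Summarize the call in 8 to 12 sentences, covering every discussed item.",
}

def summarize_call(transcript: str, length: str = "standard", model: str = "gpt-4") -> str:
    """Prompt an LLM for a call summary of the requested length."""
    prompt = (
        "You are an assistant that summarizes telephone calls. "
        f"{LENGTH_INSTRUCTIONS[length]} "
        "Only include facts stated in the transcript.\n\n"
        f"Transcript:\n{transcript}"
    )
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.2,
    )
    return resp.choices[0].message.content
```

Pairs of (transcript, length tag, frontier-model summary) produced in this way could then serve as the synthetic fine-tuning data for the smaller Llama-2-7B student described above.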
Related papers
- Unleashing the Power of Large Language Models in Zero-shot Relation Extraction via Self-Prompting [21.04933334040135]
We introduce the Self-Prompting framework, a novel method designed to fully harness the embedded relation extraction (RE) knowledge within Large Language Models.
Our framework employs a three-stage diversity approach to prompt LLMs, generating multiple synthetic samples that encapsulate specific relations from scratch.
Experimental evaluations on benchmark datasets show our approach outperforms existing LLM-based zero-shot RE methods.
arXiv Detail & Related papers (2024-10-02T01:12:54Z)
- Information-Theoretic Distillation for Reference-less Summarization [67.51150817011617]
We present a novel framework to distill a powerful summarizer based on the information-theoretic objective for summarization.
We start off from Pythia-2.8B as the teacher model, which is not yet capable of summarization.
We arrive at a compact but powerful summarizer with only 568M parameters that performs competitively against ChatGPT.
arXiv Detail & Related papers (2024-03-20T17:42:08Z)
- TriSum: Learning Summarization Ability from Large Language Models with Structured Rationale [66.01943465390548]
We introduce TriSum, a framework for distilling large language models' text summarization abilities into a compact, local model.
Our method enhances local model performance on various benchmarks.
It also improves interpretability by providing insights into the summarization rationale.
arXiv Detail & Related papers (2024-03-15T14:36:38Z)
- Benchmarking LLMs on the Semantic Overlap Summarization Task [9.656095701778975]
This paper comprehensively evaluates Large Language Models (LLMs) on the Semantic Overlap Summarization (SOS) task.
We report well-established metrics like ROUGE, BERTScore, and SEM-F1 on two different datasets of alternative narratives.
arXiv Detail & Related papers (2024-02-26T20:33:50Z)
- When Parameter-efficient Tuning Meets General-purpose Vision-language Models [65.19127815275307]
PETAL revolutionizes the training process by requiring only 0.5% of the total parameters, achieved through a unique mode approximation technique.
Our experiments reveal that PETAL not only outperforms current state-of-the-art methods in most scenarios but also surpasses full fine-tuning models in effectiveness.
arXiv Detail & Related papers (2023-12-16T17:13:08Z)
- MM-Narrator: Narrating Long-form Videos with Multimodal In-Context Learning [120.95150400119705]
We present MM-Narrator, a novel system leveraging GPT-4 with multimodal in-context learning for the generation of audio descriptions (AD).
MM-Narrator excels in generating precise audio descriptions for videos of extensive lengths, even beyond hours, in an autoregressive manner.
We introduce the first segment-based evaluator for recurrent text generation.
arXiv Detail & Related papers (2023-11-29T08:27:00Z)
- Summarization is (Almost) Dead [49.360752383801305]
We develop new datasets and conduct human evaluation experiments to evaluate the zero-shot generation capability of large language models (LLMs).
Our findings indicate a clear preference among human evaluators for LLM-generated summaries over human-written summaries and summaries generated by fine-tuned models.
arXiv Detail & Related papers (2023-09-18T08:13:01Z)
- Query-Dependent Prompt Evaluation and Optimization with Offline Inverse RL [62.824464372594576]
We aim to enhance the arithmetic reasoning ability of Large Language Models (LLMs) through zero-shot prompt optimization.
We identify a previously overlooked objective of query dependency in such optimization.
We introduce Prompt-OIRL, which harnesses offline inverse reinforcement learning to draw insights from offline prompting demonstration data.
arXiv Detail & Related papers (2023-09-13T01:12:52Z)
- PromptSum: Parameter-Efficient Controllable Abstractive Summarization [4.145362426026615]
We introduce PromptSum, a method combining prompt tuning (PT) with a multi-task objective and discrete entity prompts for abstractive summarization.
Our model achieves competitive ROUGE results on popular abstractive summarization benchmarks, coupled with a strong level of controllability through entities.
arXiv Detail & Related papers (2023-08-06T13:54:14Z)
- Evaluating Factual Consistency of Summaries with Large Language Models [24.416837319515896]
We explore evaluating factual consistency of summaries by directly prompting large language models (LLMs).
Our experiments demonstrate that prompting LLMs outperforms the previous best factuality systems in all settings (a minimal judge-prompt sketch follows this list).
arXiv Detail & Related papers (2023-05-23T13:48:32Z)
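Both the main paper's evaluation and the last entry above rely on prompting an LLM to judge summaries. The sketch below shows one possible judge prompt for factual consistency; the rubric, the 1-to-5 scale, and the `judge_factual_consistency` helper are assumptions, not the protocols used in any of these papers.

```python
# Hypothetical sketch of LLM-as-a-judge factual-consistency scoring.
# The rubric, scale, and judge model are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

JUDGE_TEMPLATE = """You are evaluating a call summary for factual consistency.

Transcript:
{transcript}

Summary:
{summary}

Rate the summary from 1 (contains unsupported claims) to 5 (fully supported
by the transcript). Answer with a single integer."""

def judge_factual_consistency(transcript: str, summary: str, model: str = "gpt-4") -> int:
    """Ask a judge LLM to rate a summary; returns an integer score from 1 to 5."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": JUDGE_TEMPLATE.format(
            transcript=transcript, summary=summary)}],
        temperature=0.0,
    )
    return int(resp.choices[0].message.content.strip())
```

Averaging such scores over a held-out set of calls gives a rough factual-consistency comparison between, for example, a fine-tuned Llama-2-7B summarizer and GPT-4.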