Building Real-World Meeting Summarization Systems using Large Language
Models: A Practical Perspective
- URL: http://arxiv.org/abs/2310.19233v3
- Date: Wed, 8 Nov 2023 02:23:28 GMT
- Title: Building Real-World Meeting Summarization Systems using Large Language
Models: A Practical Perspective
- Authors: Md Tahmid Rahman Laskar, Xue-Yong Fu, Cheng Chen, Shashi Bhushan TN
- Abstract summary: This paper studies how to effectively build meeting summarization systems for real-world usage using large language models (LLMs).
Our findings reveal that most closed-source LLMs are generally better in terms of performance.
Much smaller open-source models like LLaMA-2 (7B and 13B) could still achieve performance comparable to the large closed-source models even in zero-shot scenarios.
- Score: 8.526956860672698
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper studies how to effectively build meeting summarization systems for
real-world usage using large language models (LLMs). For this purpose, we
conduct an extensive evaluation and comparison of various closed-source and
open-source LLMs, namely, GPT-4, GPT-3.5, PaLM-2, and LLaMA-2. Our findings
reveal that most closed-source LLMs are generally better in terms of
performance. However, much smaller open-source models like LLaMA-2 (7B and
13B) could still achieve performance comparable to the large closed-source
models even in zero-shot scenarios. Considering the privacy concerns raised
by closed-source models being accessible only via API, alongside the high
cost of using fine-tuned versions of the closed-source models, the
open-source models that can achieve competitive performance are more
advantageous for industrial use. Balancing performance with associated costs
and privacy concerns, the LLaMA-2-7B model looks the most promising for
industrial usage. In sum, this paper offers practical insights on using LLMs for
real-world business meeting summarization, shedding light on the trade-offs
between performance and cost.
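As a purely illustrative companion to the zero-shot setting evaluated in the paper, the sketch below prompts an open-source LLaMA-2-7B chat checkpoint through Hugging Face transformers. The model ID, prompt wording, and decoding settings are our assumptions, not the authors' exact setup.

```python
# Minimal zero-shot meeting-summarization sketch (illustrative only).
# Assumes local access to the LLaMA-2-7B chat checkpoint on Hugging Face;
# the prompt and decoding settings below are NOT the paper's exact setup.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-2-7b-chat-hf",  # any comparable open LLM works
    device_map="auto",
)

def summarize_meeting(transcript: str, max_new_tokens: int = 256) -> str:
    """Zero-shot prompting: no fine-tuning, no in-context examples."""
    prompt = (
        "Summarize the following business meeting transcript in a short "
        "paragraph, covering key decisions and action items.\n\n"
        f"Transcript:\n{transcript}\n\nSummary:"
    )
    output = generator(
        prompt,
        max_new_tokens=max_new_tokens,
        do_sample=False,         # greedy decoding for reproducibility
        return_full_text=False,  # return only the generated summary
    )
    return output[0]["generated_text"].strip()
```

The same function could instead call a closed-source API such as GPT-4, which is essentially the comparison the paper carries out when weighing performance against cost and privacy.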
Related papers
- Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization [65.64108848398696]
We introduce a preference optimization process to enhance the multimodal reasoning capabilities of MLLMs.
We develop a simple yet effective method, termed Mixed Preference Optimization (MPO), which boosts multimodal CoT performance.
Our model, InternVL2-8B-MPO, achieves an accuracy of 67.0 on MathVista, outperforming InternVL2-8B by 8.7 points and achieving performance comparable to the 10x larger InternVL2-76B.
arXiv Detail & Related papers (2024-11-15T18:59:27Z) - Rethinking Scale: The Efficacy of Fine-Tuned Open-Source LLMs in Large-Scale Reproducible Social Science Research [0.0]
Large Language Models (LLMs) are distinguished by their architecture, which dictates their parameter size and performance capabilities.
Social scientists have increasingly adopted LLMs for text classification tasks, which are difficult to scale with human coders.
This study demonstrates that small, fine-tuned open-source LLMs can achieve equal or superior performance to models such as ChatGPT-4.
arXiv Detail & Related papers (2024-10-31T20:26:30Z) - EasyJudge: an Easy-to-use Tool for Comprehensive Response Evaluation of LLMs [6.179084469089114]
This paper presents EasyJudge, a model developed to evaluate large language model responses.
It is lightweight, precise, efficient, and user-friendly, featuring an intuitive visualization interface for ease of deployment and use.
arXiv Detail & Related papers (2024-10-13T08:24:12Z) - HLLM: Enhancing Sequential Recommendations via Hierarchical Large Language Models for Item and User Modeling [21.495443162191332]
Large Language Models (LLMs) have achieved remarkable success in various fields, prompting several studies to explore their potential in recommendation systems.
We propose a novel Hierarchical Large Language Model (HLLM) architecture designed to enhance sequential recommendation systems.
HLLM achieves excellent scalability, with the largest configuration utilizing 7B parameters for both item feature extraction and user interest modeling.
arXiv Detail & Related papers (2024-09-19T13:03:07Z) - MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series [86.31735321970481]
We open-source MAP-Neo, a bilingual language model with 7B parameters trained from scratch on 4.5T high-quality tokens.
Our MAP-Neo is the first fully open-sourced bilingual LLM with performance comparable to existing state-of-the-art LLMs.
arXiv Detail & Related papers (2024-05-29T17:57:16Z) - MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT [87.4910758026772]
"Bigger the better" has been the predominant trend in recent Large Language Models (LLMs) development.
This paper explores the "less is more" paradigm by addressing the challenge of designing accurate yet efficient Small Language Models (SLMs) for resource constrained devices.
arXiv Detail & Related papers (2024-02-26T18:59:03Z) - Tiny Titans: Can Smaller Large Language Models Punch Above Their Weight in the Real World for Meeting Summarization? [7.674972936853123]
Large Language Models (LLMs) have demonstrated impressive capabilities to solve a wide range of tasks without being explicitly fine-tuned on task-specific datasets.
We investigate whether smaller, compact LLMs are a good alternative to comparatively larger LLMs for addressing the significant costs associated with utilizing LLMs in the real world.
arXiv Detail & Related papers (2024-02-01T18:31:34Z) - Knowledge Fusion of Large Language Models [73.28202188100646]
This paper introduces the notion of knowledge fusion for large language models (LLMs).
We externalize their collective knowledge and unique strengths, thereby elevating the capabilities of the target model beyond those of any individual source LLM.
Our findings confirm that the fusion of LLMs can improve the performance of the target model across a range of capabilities such as reasoning, commonsense, and code generation.
arXiv Detail & Related papers (2024-01-19T05:02:46Z) - Cache me if you Can: an Online Cost-aware Teacher-Student framework to
Reduce the Calls to Large Language Models [13.799197575126442]
Small and medium-sized enterprises (SMEs) cannot afford the cost of creating large task-specific training datasets.
Third-party services that allow them to prompt Large Language Models currently require a payment per call.
We propose a framework that reduces calls to LLMs by caching previous responses and using them to train an inexpensive local model (see the sketch below).
arXiv Detail & Related papers (2023-10-20T10:05:07Z)
- FederatedScope-LLM: A Comprehensive Package for Fine-tuning Large Language Models in Federated Learning [70.38817963253034]
This paper first discusses the challenges of federated fine-tuning of LLMs, and introduces our package FS-LLM as its main contribution.
We provide comprehensive federated parameter-efficient fine-tuning algorithm implementations and versatile programming interfaces for future extension in FL scenarios.
We conduct extensive experiments to validate the effectiveness of FS-LLM and benchmark advanced LLMs with state-of-the-art parameter-efficient fine-tuning algorithms in FL settings.
arXiv Detail & Related papers (2023-09-01T09:40:36Z) - On Learning to Summarize with Large Language Models as References [101.79795027550959]
Summaries generated by large language models (LLMs) are favored by human annotators over the original reference summaries in commonly used summarization datasets.
We study an LLM-as-reference learning setting for smaller text summarization models to investigate whether their performance can be substantially improved (see the sketch below).
arXiv Detail & Related papers (2023-05-23T16:56:04Z)