Hybrid Long Document Summarization using C2F-FAR and ChatGPT: A
Practical Study
- URL: http://arxiv.org/abs/2306.01169v1
- Date: Thu, 1 Jun 2023 21:58:33 GMT
- Title: Hybrid Long Document Summarization using C2F-FAR and ChatGPT: A
Practical Study
- Authors: Guang Lu, Sylvia B. Larcher, Tu Tran
- Abstract summary: ChatGPT is the latest breakthrough in the field of large language models (LLMs).
We propose a hybrid extraction and summarization pipeline for long documents such as business articles and books.
Our results show that the use of ChatGPT is a very promising but not yet mature approach for summarizing long documents.
- Score: 1.933681537640272
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Text summarization is a downstream natural language processing (NLP) task
that challenges the understanding and generation capabilities of language
models. Considerable progress has been made in automatically summarizing short
texts, such as news articles, often leading to satisfactory results. However,
summarizing long documents remains a major challenge. This is due to the
complex contextual information in the text and the lack of open-source
benchmarking datasets and evaluation frameworks that can be used to develop and
test model performance. In this work, we use ChatGPT, the latest breakthrough
in the field of large language models (LLMs), together with the extractive
summarization model C2F-FAR (Coarse-to-Fine Facet-Aware Ranking) to propose a
hybrid extraction and summarization pipeline for long documents such as
business articles and books. We work with the world-renowned company
getAbstract AG and leverage their expertise and experience in professional book
summarization. A practical study has shown that machine-generated summaries can
perform at least as well as human-written summaries when evaluated using
current automated evaluation metrics. However, a closer examination of the
texts generated by ChatGPT through human evaluations has shown that there are
still critical issues in terms of text coherence, faithfulness, and style.
Overall, our results show that the use of ChatGPT is a very promising but not
yet mature approach for summarizing long documents and can at best serve as an
inspiration for human editors. We anticipate that our work will inform NLP
researchers about the extent to which ChatGPT's capabilities for summarizing
long documents overlap with practitioners' needs. Further work is needed to
test the proposed hybrid summarization pipeline, in particular involving GPT-4,
and to propose a new evaluation framework tailored to the task of summarizing
long documents.
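To make the extract-then-abstract idea concrete, the following sketch outlines one way such a hybrid pipeline could look in Python. It is a minimal illustration under stated assumptions, not the authors' implementation: the `extract_salient_sentences` heuristic merely stands in for the C2F-FAR extractor, and the abstractive step assumes the OpenAI chat completions client (openai>=1.0) with a ChatGPT-class model.

```python
# Minimal sketch of an extract-then-abstract pipeline (not the authors' code).
# The extractive step is a placeholder for C2F-FAR; the abstractive step
# assumes the OpenAI chat completions API (openai>=1.0).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def extract_salient_sentences(document: str, top_k: int = 20) -> list[str]:
    """Placeholder for an extractive model such as C2F-FAR.

    Keeps the longest sentences as a crude proxy for salience so the sketch
    stays self-contained; a real pipeline would rank sentences/facets with
    the trained extractor instead.
    """
    sentences = [s.strip() for s in document.split(".") if s.strip()]
    keep = set(sorted(sentences, key=len, reverse=True)[:top_k])
    # Restore the original order of the selected sentences.
    return [s for s in sentences if s in keep]


def abstractive_summary(sentences: list[str], model: str = "gpt-3.5-turbo") -> str:
    """Ask a ChatGPT-style model to fuse the extracted sentences into a summary."""
    prompt = (
        "Write a coherent, faithful summary of the following sentences "
        "extracted from a long business document:\n\n" + "\n".join(sentences)
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.3,
    )
    return response.choices[0].message.content


def summarize_long_document(document: str) -> str:
    return abstractive_summary(extract_salient_sentences(document))
```

In the paper's setting, the extractive stage would be the trained C2F-FAR ranker rather than a length heuristic, and the prompt, model, and temperature would be tuned to the target editorial style.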
Related papers
- Integrating Planning into Single-Turn Long-Form Text Generation [66.08871753377055]
We propose to use planning to generate long form content.
Our main novelty lies in a single auxiliary task that does not require multiple rounds of prompting or planning.
Our experiments on two datasets from different domains demonstrate that LLMs fine-tuned with the auxiliary task generate higher-quality documents.
arXiv Detail & Related papers (2024-10-08T17:02:40Z)
- Write Summary Step-by-Step: A Pilot Study of Stepwise Summarization [48.57273563299046]
We propose the task of Stepwise Summarization, which aims to generate a new appended summary each time a new document is proposed.
The appended summary should not only summarize the newly added content but also be coherent with the previous summary.
We show that SSG achieves state-of-the-art performance in terms of both automatic metrics and human evaluations.
arXiv Detail & Related papers (2024-06-08T05:37:26Z)
- Exploring Precision and Recall to assess the quality and diversity of LLMs [82.21278402856079]
We introduce a novel evaluation framework for Large Language Models (LLMs) such as Llama-2 and Mistral.
This approach allows for a nuanced assessment of the quality and diversity of generated text without the need for aligned corpora.
arXiv Detail & Related papers (2024-02-16T13:53:26Z)
- Investigating Consistency in Query-Based Meeting Summarization: A Comparative Study of Different Embedding Methods [0.0]
Text summarization is one of the best-known applications in the Natural Language Processing (NLP) field.
It aims to automatically generate a summary that captures the important information of a given context.
In this paper, we are inspired by "QMSum: A New Benchmark for Query-based Multi-domain Meeting Summarization" proposed by Microsoft.
We also propose a Locater model designed to extract relevant spans from a given transcript and query, which are then summarized by a Summarizer model.
arXiv Detail & Related papers (2024-02-10T08:25:30Z)
- Generative Judge for Evaluating Alignment [84.09815387884753]
We propose a generative judge with 13B parameters, Auto-J, designed to address these challenges.
Our model is trained on user queries and LLM-generated responses drawn from a wide range of real-world scenarios.
Experimentally, Auto-J outperforms a series of strong competitors, including both open-source and closed-source models.
arXiv Detail & Related papers (2023-10-09T07:27:15Z)
- ChatGPT vs State-of-the-Art Models: A Benchmarking Study in Keyphrase Generation Task [0.0]
Transformer-based language models, including ChatGPT, have demonstrated exceptional performance in various natural language generation tasks.
This study compares ChatGPT's keyphrase generation performance with state-of-the-art models, while also testing its potential as a solution for two significant challenges in the field.
arXiv Detail & Related papers (2023-04-27T13:25:43Z)
- Large Language Models are Diverse Role-Players for Summarization Evaluation [82.31575622685902]
A document summary's quality can be assessed by human annotators on various criteria, both objective ones like grammar and correctness, and subjective ones like informativeness, succinctness, and appeal.
Most automatic evaluation methods, such as BLEU/ROUGE, may not be able to adequately capture these dimensions.
We propose a new LLM-based framework that provides a comprehensive evaluation by comparing generated text and reference text from both objective and subjective aspects.
arXiv Detail & Related papers (2023-03-27T10:40:59Z)
- Lay Text Summarisation Using Natural Language Processing: A Narrative Literature Review [1.8899300124593648]
The aim of this literature review is to describe and compare the different text summarisation approaches used to generate lay summaries.
We screened 82 articles and included eight relevant papers published between 2020 and 2021, using the same dataset.
A combination of extractive and abstractive summarisation methods in a hybrid approach was found to be most effective.
arXiv Detail & Related papers (2023-03-24T18:30:50Z)
- Exploring the Limits of ChatGPT for Query or Aspect-based Text Summarization [28.104696513516117]
Large language models (LLMs) like GPT-3 and ChatGPT have recently created significant interest in using these models for text summarization tasks.
Recent studies (Goyal et al., 2022; Zhang et al., 2023) have shown that LLM-generated news summaries are already on par with human-written ones.
Our experiments reveal that ChatGPT's performance is comparable to traditional fine-tuning methods in terms of ROUGE scores.
arXiv Detail & Related papers (2023-02-16T04:41:30Z)
- Summ^N: A Multi-Stage Summarization Framework for Long Input Dialogues and Documents [13.755637074366813]
Summ^N is a simple, flexible, and effective multi-stage framework for input texts longer than the maximum context lengths of typical pretrained LMs.
It can process input text of arbitrary length by adjusting the number of stages while keeping the LM context size fixed; a minimal code sketch of this multi-stage idea appears after this list.
Our experiments demonstrate that Summ^N significantly outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2021-10-16T06:19:54Z)
- From Standard Summarization to New Tasks and Beyond: Summarization with Manifold Information [77.89755281215079]
Text summarization is the research area aiming at creating a short and condensed version of the original document.
In real-world applications, most of the data is not in a plain-text format.
This paper surveys these new summarization tasks and approaches in real-world applications.
arXiv Detail & Related papers (2020-05-10T14:59:36Z)
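As referenced in the Summ^N entry above, the multi-stage idea can be sketched in a few lines: split the text into chunks that fit the model's context, summarize each chunk, and repeat until the whole text fits. The sketch below is a minimal illustration, not the authors' implementation; the character-based chunking, the `max_chars` budget, and the injected `summarize_fn` are placeholder assumptions (a real system would split on token counts and use a trained summarizer, e.g. the ChatGPT helper sketched earlier).

```python
# Minimal sketch of a Summ^N-style multi-stage summarization loop.
# `summarize_fn` is any summarizer with a bounded input size; it is
# assumed to return text shorter than its input, so the loop terminates.
from typing import Callable


def multi_stage_summarize(
    text: str,
    summarize_fn: Callable[[str], str],
    max_chars: int = 8000,
) -> str:
    """Summarize arbitrarily long text with a fixed-context summarizer.

    Each stage splits the current text into fixed-size chunks, summarizes
    every chunk, and concatenates the results; stages repeat until the
    text fits within the budget, then a final pass produces the summary.
    """
    while len(text) > max_chars:
        chunks = [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
        text = "\n".join(summarize_fn(chunk) for chunk in chunks)
    return summarize_fn(text)  # final fine-grained stage
```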
This list is automatically generated from the titles and abstracts of the papers on this site.