How Ready are Pre-trained Abstractive Models and LLMs for Legal Case
Judgement Summarization?
- URL: http://arxiv.org/abs/2306.01248v2
- Date: Wed, 14 Jun 2023 07:25:42 GMT
- Title: How Ready are Pre-trained Abstractive Models and LLMs for Legal Case
Judgement Summarization?
- Authors: Aniket Deroy, Kripabandhu Ghosh, Saptarshi Ghosh
- Abstract summary: In recent years, abstractive summarization models are gaining popularity.
Legal domain-specific pre-trained abstractive summarization models are now available.
General-domain pre-trained Large Language Models (LLMs) are known to generate high-quality text.
- Score: 4.721618284417204
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automatic summarization of legal case judgements has traditionally been
attempted by using extractive summarization methods. However, in recent years,
abstractive summarization models are gaining popularity since they can generate
more natural and coherent summaries. Legal domain-specific pre-trained
abstractive summarization models are now available. Moreover, general-domain
pre-trained Large Language Models (LLMs), such as ChatGPT, are known to
generate high-quality text and have the capacity for text summarization. Hence
it is natural to ask if these models are ready for off-the-shelf application to
automatically generate abstractive summaries for case judgements. To explore
this question, we apply several state-of-the-art domain-specific abstractive
summarization models and general-domain LLMs on Indian court case judgements,
and check the quality of the generated summaries. In addition to standard
metrics for summary quality, we check for inconsistencies and hallucinations in
the summaries. We see that abstractive summarization models generally achieve
slightly higher scores than extractive models in terms of standard summary
evaluation metrics such as ROUGE and BLEU. However, we often find inconsistent
or hallucinated information in the generated abstractive summaries. Overall,
our investigation indicates that the pre-trained abstractive summarization
models and LLMs are not yet ready for fully automatic deployment for case
judgement summarization; rather a human-in-the-loop approach including manual
checks for inconsistencies is more suitable at present.
Related papers
- Applicability of Large Language Models and Generative Models for Legal Case Judgement Summarization [5.0645491201288495]
In recent years, generative models including abstractive summarization models and Large language models (LLMs) have gained huge popularity.
In this paper, we explore the applicability of such models for legal case judgement summarization.
arXiv Detail & Related papers (2024-07-06T04:49:40Z) - AugSumm: towards generalizable speech summarization using synthetic
labels from large language model [61.73741195292997]
Abstractive speech summarization (SSUM) aims to generate human-like summaries from speech.
conventional SSUM models are mostly trained and evaluated with a single ground-truth (GT) human-annotated deterministic summary.
We propose AugSumm, a method to leverage large language models (LLMs) as a proxy for human annotators to generate augmented summaries.
arXiv Detail & Related papers (2024-01-10T18:39:46Z) - Abstractive Text Summarization Using the BRIO Training Paradigm [2.102846336724103]
This paper presents a technique to improve abstractive summaries by fine-tuning pre-trained language models.
We build a text summarization dataset for Vietnamese, called VieSum.
We perform experiments with abstractive summarization models trained with the BRIO paradigm on the CNNDM and the VieSum datasets.
arXiv Detail & Related papers (2023-05-23T05:09:53Z) - Correcting Diverse Factual Errors in Abstractive Summarization via
Post-Editing and Language Model Infilling [56.70682379371534]
We show that our approach vastly outperforms prior methods in correcting erroneous summaries.
Our model -- FactEdit -- improves factuality scores by over 11 points on CNN/DM and over 31 points on XSum.
arXiv Detail & Related papers (2022-10-22T07:16:19Z) - Abstractive Query Focused Summarization with Query-Free Resources [60.468323530248945]
In this work, we consider the problem of leveraging only generic summarization resources to build an abstractive QFS system.
We propose Marge, a Masked ROUGE Regression framework composed of a novel unified representation for summaries and queries.
Despite learning from minimal supervision, our system achieves state-of-the-art results in the distantly supervised setting.
arXiv Detail & Related papers (2020-12-29T14:39:35Z) - Constrained Abstractive Summarization: Preserving Factual Consistency
with Constrained Generation [93.87095877617968]
We propose Constrained Abstractive Summarization (CAS), a general setup that preserves the factual consistency of abstractive summarization.
We adopt lexically constrained decoding, a technique generally applicable to autoregressive generative models, to fulfill CAS.
We observe up to 13.8 ROUGE-2 gains when only one manual constraint is used in interactive summarization.
arXiv Detail & Related papers (2020-10-24T00:27:44Z) - Multi-Fact Correction in Abstractive Text Summarization [98.27031108197944]
Span-Fact is a suite of two factual correction models that leverages knowledge learned from question answering models to make corrections in system-generated summaries via span selection.
Our models employ single or multi-masking strategies to either iteratively or auto-regressively replace entities in order to ensure semantic consistency w.r.t. the source text.
Experiments show that our models significantly boost the factual consistency of system-generated summaries without sacrificing summary quality in terms of both automatic metrics and human evaluation.
arXiv Detail & Related papers (2020-10-06T02:51:02Z) - Generating (Factual?) Narrative Summaries of RCTs: Experiments with
Neural Multi-Document Summarization [22.611879349101596]
We evaluate modern neural models for abstractive summarization of relevant article abstracts from systematic reviews.
We find that modern summarization systems yield consistently fluent and relevant synopses, but that they are not always factual.
arXiv Detail & Related papers (2020-08-25T22:22:50Z) - Few-Shot Learning for Opinion Summarization [117.70510762845338]
Opinion summarization is the automatic creation of text reflecting subjective information expressed in multiple documents.
In this work, we show that even a handful of summaries is sufficient to bootstrap generation of the summary text.
Our approach substantially outperforms previous extractive and abstractive methods in automatic and human evaluation.
arXiv Detail & Related papers (2020-04-30T15:37:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.