Robust Deep Reinforcement Learning for Extractive Legal Summarization
- URL: http://arxiv.org/abs/2111.07158v1
- Date: Sat, 13 Nov 2021 17:27:49 GMT
- Title: Robust Deep Reinforcement Learning for Extractive Legal Summarization
- Authors: Duy-Hung Nguyen, Bao-Sinh Nguyen, Nguyen Viet Dung Nghiem, Dung Tien
Le, Mim Amina Khatun, Minh-Tien Nguyen, and Hung Le
- Abstract summary: We propose to use reinforcement learning to train current deep summarization models to improve their performance in the legal domain.
We observe a consistent and significant performance gain across 3 public legal datasets.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automatic summarization of legal texts is an important and still
challenging task, since legal documents are often long and complicated, with
unusual structures and styles. Recent deep models trained end-to-end with
differentiable losses can summarize natural text well, yet when applied to the
legal domain they show limited results. In this paper, we propose to use
reinforcement learning to train current deep summarization models to improve
their performance in the legal domain. To this end, we adopt proximal policy
optimization methods and introduce novel reward functions that encourage the
generation of candidate summaries satisfying both lexical and semantic
criteria. We apply our method to training different summarization backbones and
observe a consistent and significant performance gain across three public legal
datasets.
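The abstract describes rewards that combine lexical and semantic criteria for PPO training. As a rough illustration only (not the paper's actual reward functions), the sketch below scores a candidate summary with a ROUGE-1-style unigram F1 for the lexical criterion and a bag-of-words cosine similarity as a crude stand-in for embedding-based semantic similarity, blended by an assumed weight `alpha`; a PPO trainer would use the resulting scalar as the episode reward:

```python
from collections import Counter
import math

def lexical_reward(candidate: str, reference: str) -> float:
    """Unigram-overlap F1, a ROUGE-1-style lexical criterion."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if not cand or not ref or overlap == 0:
        return 0.0
    p, r = overlap / len(cand), overlap / len(ref)
    return 2 * p * r / (p + r)

def semantic_reward(candidate: str, reference: str) -> float:
    """Cosine similarity of bag-of-words count vectors; a crude
    stand-in for sentence-embedding similarity."""
    c = Counter(candidate.lower().split())
    r = Counter(reference.lower().split())
    dot = sum(c[w] * r[w] for w in c)
    norm = (math.sqrt(sum(v * v for v in c.values()))
            * math.sqrt(sum(v * v for v in r.values())))
    return dot / norm if norm else 0.0

def summary_reward(candidate: str, reference: str, alpha: float = 0.5) -> float:
    """Scalar reward blending the lexical and semantic criteria."""
    return (alpha * lexical_reward(candidate, reference)
            + (1 - alpha) * semantic_reward(candidate, reference))
```

Both component scores lie in [0, 1], so the blended reward stays bounded, which keeps PPO advantage estimates on a stable scale.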
Related papers
- DELTA: Pre-train a Discriminative Encoder for Legal Case Retrieval via Structural Word Alignment [55.91429725404988]
We introduce DELTA, a discriminative model designed for legal case retrieval.
We leverage shallow decoders to create information bottlenecks, aiming to enhance the representation ability.
Our approach can outperform existing state-of-the-art methods in legal case retrieval.
arXiv Detail & Related papers (2024-03-27T10:40:14Z)
- FENICE: Factuality Evaluation of summarization based on Natural language Inference and Claim Extraction [85.26780391682894]
We propose Factuality Evaluation of summarization based on Natural language Inference and Claim Extraction (FENICE)
FENICE leverages an NLI-based alignment between information in the source document and a set of atomic facts, referred to as claims, extracted from the summary.
Our metric sets a new state of the art on AGGREFACT, the de facto benchmark for factuality evaluation.
arXiv Detail & Related papers (2024-03-04T17:57:18Z)
- A Deep Learning-Based System for Automatic Case Summarization [2.9141777969894966]
This paper presents a deep learning-based system for efficient automatic case summarization.
The system offers both supervised and unsupervised methods to generate concise and relevant summaries of lengthy legal case documents.
Future work will focus on refining summarization techniques and exploring the application of our methods to other types of legal texts.
arXiv Detail & Related papers (2023-12-13T01:18:10Z)
- Enhancing Pre-Trained Language Models with Sentence Position Embeddings for Rhetorical Roles Recognition in Legal Opinions [0.16385815610837165]
The size of legal opinions continues to grow, making it increasingly challenging to develop a model that can accurately predict the rhetorical roles of sentences in legal opinions.
We propose a novel model architecture for automatically predicting rhetorical roles using pre-trained language models (PLMs) enhanced with knowledge of sentence position information.
Based on an annotated corpus from the LegalEval@SemEval2023 competition, we demonstrate that our approach requires fewer parameters, resulting in lower computational costs.
arXiv Detail & Related papers (2023-10-08T20:33:55Z)
- Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback [57.816210168909286]
We leverage recent progress in textual entailment models to address factual inconsistency in abstractive summarization systems.
We use reinforcement learning with reference-free, textual entailment rewards to optimize for factual consistency.
Our results, according to both automatic metrics and human evaluation, show that our method considerably improves the faithfulness, salience, and conciseness of the generated summaries.
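The reference-free entailment reward described in this entry can be sketched in miniature. The `entailment_prob` function below is a hypothetical stand-in (simple word overlap) for a real NLI model's entailment probability; only the shape of the reward, scoring each summary sentence against the source rather than against a gold summary, is the point:

```python
def entailment_prob(premise: str, hypothesis: str) -> float:
    """Hypothetical stand-in for an NLI model's entailment probability.
    Here: fraction of hypothesis words that appear in the premise."""
    prem = set(premise.lower().split())
    hyp = hypothesis.lower().split()
    return sum(w in prem for w in hyp) / len(hyp) if hyp else 0.0

def factuality_reward(source: str, summary_sentences: list[str]) -> float:
    """Reference-free reward: average entailment of each summary
    sentence by the source document; no gold summary is needed."""
    if not summary_sentences:
        return 0.0
    return (sum(entailment_prob(source, s) for s in summary_sentences)
            / len(summary_sentences))
```

Because the reward depends only on the source, it can be computed for any sampled summary during RL training, with no reference annotations required.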
arXiv Detail & Related papers (2023-05-31T21:04:04Z)
- Extractive Summarization of Legal Decisions using Multi-task Learning and Maximal Marginal Relevance [3.6847375967256295]
This paper presents techniques for extractive summarization of legal decisions in a low-resource setting using limited expert annotated data.
We test a set of models that locate relevant content using a sequential model and tackle redundancy by leveraging maximal marginal relevance to compose summaries.
Our results show that the proposed approaches achieve ROUGE scores against expert-extracted summaries comparable to those obtained by inter-annotator comparison.
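Maximal marginal relevance, used in this entry to tackle redundancy, is a standard greedy procedure: each step picks the sentence maximizing λ·relevance − (1 − λ)·(max similarity to sentences already selected). The sketch below uses Jaccard word overlap as an illustrative similarity and assumes relevance scores are supplied by an upstream model; both are placeholders, not this paper's components:

```python
def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity; an illustrative stand-in for a learned one."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def mmr_select(sentences, relevance, similarity, k=3, lam=0.7):
    """Greedy MMR: repeatedly pick the sentence maximizing
    lam * relevance[i] - (1 - lam) * max similarity to selected ones."""
    selected, remaining = [], list(range(len(sentences)))
    while remaining and len(selected) < k:
        def mmr_score(i):
            redundancy = max(
                (similarity(sentences[i], sentences[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return [sentences[i] for i in selected]
```

A duplicate of an already-selected sentence incurs a similarity penalty of 1.0, so MMR skips it in favor of novel content even when its relevance score is high.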
arXiv Detail & Related papers (2022-10-22T12:51:52Z)
- ArgLegalSumm: Improving Abstractive Summarization of Legal Documents with Argument Mining [0.2538209532048867]
We introduce a technique to capture the argumentative structure of legal documents by integrating argument role labeling into the summarization process.
Experiments with pretrained language models show that our proposed approach improves performance over strong baselines.
arXiv Detail & Related papers (2022-09-04T15:55:56Z)
- Text Revision by On-the-Fly Representation Optimization [76.11035270753757]
Current state-of-the-art methods formulate these tasks as sequence-to-sequence learning problems.
We present an iterative in-place editing approach for text revision, which requires no parallel data.
It achieves competitive and even better performance than state-of-the-art supervised methods on text simplification.
arXiv Detail & Related papers (2022-04-15T07:38:08Z)
- Controllable Text Simplification with Explicit Paraphrasing [88.02804405275785]
Text Simplification improves the readability of sentences through several rewriting transformations, such as lexical paraphrasing, deletion, and splitting.
Current simplification systems are predominantly sequence-to-sequence models that are trained end-to-end to perform all these operations simultaneously.
We propose a novel hybrid approach that leverages linguistically-motivated rules for splitting and deletion, and couples them with a neural paraphrasing model to produce varied rewriting styles.
arXiv Detail & Related papers (2020-10-21T13:44:40Z)
- Multi-Fact Correction in Abstractive Text Summarization [98.27031108197944]
Span-Fact is a suite of two factual correction models that leverages knowledge learned from question answering models to make corrections in system-generated summaries via span selection.
Our models employ single or multi-masking strategies to either iteratively or auto-regressively replace entities in order to ensure semantic consistency w.r.t. the source text.
Experiments show that our models significantly boost the factual consistency of system-generated summaries without sacrificing summary quality in terms of both automatic metrics and human evaluation.
arXiv Detail & Related papers (2020-10-06T02:51:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.