Transformer-based Approaches for Legal Text Processing
- URL: http://arxiv.org/abs/2202.06397v1
- Date: Sun, 13 Feb 2022 19:59:15 GMT
- Title: Transformer-based Approaches for Legal Text Processing
- Authors: Ha-Thanh Nguyen, Minh-Phuong Nguyen, Thi-Hai-Yen Vuong, Minh-Quan Bui,
Minh-Chau Nguyen, Tran-Binh Dang, Vu Tran, Le-Minh Nguyen, Ken Satoh
- Abstract summary: We introduce our approaches using Transformer-based models for different problems of the COLIEE 2021 automatic legal text processing competition.
We find that Transformer-based pretrained language models can perform well on automated legal text processing problems with appropriate approaches.
- Score: 3.4630926944621643
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we introduce our approaches using Transformer-based models for
different problems of the COLIEE 2021 automatic legal text processing
competition. Automated processing of legal documents is a challenging task
because of the characteristics of legal documents as well as the limited
amount of available data. With our detailed experiments, we found that
Transformer-based pretrained language models can perform well on automated
legal text processing problems when combined with appropriate approaches. We
describe in detail the processing steps for each task, such as problem
formulation, data processing and augmentation, pretraining, and finetuning. In
addition, we introduce to the community two pretrained models that take
advantage of parallel translations in the legal domain, NFSP and NMSP, of
which NFSP achieves the state-of-the-art result in Task 5 of the competition.
Although the paper focuses on technical reporting, the novelty of its
approaches can also serve as a useful reference for automated legal document
processing using Transformer-based models.
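As a rough illustration of the general recipe the abstract outlines (formulating a competition task as sentence-pair classification and finetuning a pretrained Transformer), the sketch below finetunes a generic multilingual BERT on toy statute/query entailment pairs. The model name, example data, and hyperparameters are illustrative assumptions, not the paper's actual NFSP/NMSP configuration or the COLIEE data.

```python
# Illustrative sketch: finetuning a pretrained Transformer for a binary
# legal-entailment task (COLIEE-style statute/query pairs).
# Model name, example data, and hyperparameters are assumptions for
# illustration, not the configuration reported in the paper.
import torch
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-multilingual-cased"  # assumed base model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

# Toy (article, query, label) pairs standing in for the competition data.
pairs = [
    ("A contract is formed when an offer is accepted.",
     "Acceptance of an offer forms a contract.", 1),
    ("A minor may rescind a contract.",
     "A contract with a minor can never be rescinded.", 0),
]

def collate(batch):
    # Encode each (article, query) pair jointly as a sentence pair.
    articles, queries, labels = zip(*batch)
    enc = tokenizer(list(articles), list(queries), padding=True,
                    truncation=True, max_length=512, return_tensors="pt")
    enc["labels"] = torch.tensor(labels)
    return enc

loader = DataLoader(pairs, batch_size=2, shuffle=True, collate_fn=collate)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for epoch in range(3):
    for batch in loader:
        out = model(**batch)  # cross-entropy loss on the entailment labels
        out.loss.backward()
        optimizer.step()
        optimizer.zero_grad()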
Related papers
- Improving Legal Entity Recognition Using a Hybrid Transformer Model and Semantic Filtering Approach [0.0]
This paper proposes a novel hybrid model that enhances the accuracy and precision of Legal-BERT, a transformer model fine-tuned for legal text processing.
We evaluate the model on a dataset of 15,000 annotated legal documents, achieving an F1 score of 93.4%, demonstrating significant improvements in precision and recall over previous methods.
arXiv Detail & Related papers (2024-10-11T04:51:28Z)
- Thread: A Logic-Based Data Organization Paradigm for How-To Question Answering with Retrieval Augmented Generation [49.36436704082436]
How-to questions are integral to decision-making processes and require dynamic, step-by-step answers.
We propose Thread, a novel data organization paradigm aimed at enabling current systems to handle how-to questions more effectively.
arXiv Detail & Related papers (2024-06-19T09:14:41Z)
- Refining Joint Text and Source Code Embeddings for Retrieval Task with Parameter-Efficient Fine-Tuning [0.0]
We propose a fine-tuning framework that leverages Parameter-Efficient Fine-Tuning (PEFT) techniques.
We demonstrate that the proposed fine-tuning framework has the potential to improve code-text retrieval performance by tuning at most 0.4% of the parameters.
arXiv Detail & Related papers (2024-05-07T08:50:25Z)
- Fighting crime with Transformers: Empirical analysis of address parsing methods in payment data [0.01499944454332829]
This paper explores the performance of Transformers and Generative Large Language Models (LLMs) on this task.
We show the need for training robust models capable of dealing with real-world noisy transactional data.
Our results suggest that a carefully fine-tuned Transformer model using early stopping significantly outperforms other approaches.
arXiv Detail & Related papers (2024-04-08T16:04:26Z)
- Extensive Evaluation of Transformer-based Architectures for Adverse Drug Events Extraction [6.78974856327994]
Adverse Event (ADE) extraction is one of the core tasks in digital pharmacovigilance.
We evaluate 19 Transformer-based models for ADE extraction on informal texts.
At the end of our analyses, we identify a list of take-home messages that can be derived from the experimental data.
arXiv Detail & Related papers (2023-06-08T15:25:24Z)
- Optimizing Non-Autoregressive Transformers with Contrastive Learning [74.46714706658517]
Non-autoregressive Transformers (NATs) reduce the inference latency of Autoregressive Transformers (ATs) by predicting words all at once rather than in sequential order.
In this paper, we propose to ease the difficulty of modality learning via sampling from the model distribution instead of the data distribution.
arXiv Detail & Related papers (2023-05-23T04:20:13Z)
- Zero-Shot Text Matching for Automated Auditing using Sentence Transformers [0.3078691410268859]
We study the efficiency of unsupervised text matching using Sentence-BERT, a transformer-based model, by applying it to the semantic similarity of financial passages.
Experimental results show that this model is robust to documents from in- and out-of-domain data.
arXiv Detail & Related papers (2022-10-28T11:52:16Z)
- An Empirical Study of Automatic Post-Editing [56.86393786396992]
APE aims to reduce manual post-editing efforts by automatically correcting errors in machine-translated output.
To alleviate the lack of genuine training data, most of the current APE systems employ data augmentation methods to generate large-scale artificial corpora.
We study the outputs of the state-of-the-art APE model on a difficult APE dataset to analyze the problems in existing APE systems.
arXiv Detail & Related papers (2022-09-16T07:38:27Z)
- CoCoMoT: Conformance Checking of Multi-Perspective Processes via SMT (Extended Version) [62.96267257163426]
We introduce the CoCoMoT (Computing Conformance Modulo Theories) framework.
First, we show how SAT-based encodings studied in the pure control-flow setting can be lifted to our data-aware case.
Second, we introduce a novel preprocessing technique based on a notion of property-preserving clustering.
arXiv Detail & Related papers (2021-03-18T20:22:50Z)
- Pretrained Transformers for Text Ranking: BERT and Beyond [53.83210899683987]
This survey provides an overview of text ranking with neural network architectures known as transformers.
The combination of transformers and self-supervised pretraining has been responsible for a paradigm shift in natural language processing.
arXiv Detail & Related papers (2020-10-13T15:20:32Z)
- Knowledge-Aware Procedural Text Understanding with Multi-Stage Training [110.93934567725826]
We focus on the task of procedural text understanding, which aims to comprehend such documents and track entities' states and locations during a process.
Two challenges, the difficulty of commonsense reasoning and data insufficiency, still remain unsolved.
We propose a novel KnOwledge-Aware proceduraL text understAnding (KOALA) model, which effectively leverages multiple forms of external knowledge.
arXiv Detail & Related papers (2020-09-28T10:28:40Z)