Transformer-based Entity Legal Form Classification
- URL: http://arxiv.org/abs/2310.12766v1
- Date: Thu, 19 Oct 2023 14:11:43 GMT
- Title: Transformer-based Entity Legal Form Classification
- Authors: Alexander Arimond and Mauro Molteni and Dominik Jany and Zornitsa
Manolova and Damian Borth and Andreas G.F. Hoepner
- Abstract summary: We propose the application of Transformer-based language models for classifying legal forms.
We employ various BERT variants and compare their performance against multiple traditional baselines.
Our findings demonstrate that pre-trained BERT variants outperform traditional text classification approaches in terms of F1 score.
- Score: 43.75590166844617
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: We propose the application of Transformer-based language models for
classifying entity legal forms from raw legal entity names. Specifically, we
employ various BERT variants and compare their performance against multiple
traditional baselines. Our evaluation encompasses a substantial subset of
freely available Legal Entity Identifier (LEI) data, comprising over 1.1
million legal entities from 30 different legal jurisdictions. The ground truth
labels for classification per jurisdiction are taken from the Entity Legal Form
(ELF) code standard (ISO 20275). Our findings demonstrate that pre-trained BERT
variants outperform traditional text classification approaches in terms of F1
score, while also performing comparably well in the Macro F1 Score. Moreover,
the validity of our proposal is supported by the outcome of third-party expert
reviews conducted in ten selected jurisdictions. This study highlights the
significant potential of Transformer-based models in advancing data
standardization and data integration. The presented approaches can greatly
benefit financial institutions, corporations, governments and other
organizations in assessing business relationships, understanding risk exposure,
and promoting effective governance.
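The abstract reports both an F1 score and a Macro F1 score. The sketch below illustrates how the two can diverge on imbalanced classes. It assumes the plain "F1 score" is a micro-averaged (or similarly frequency-weighted) figure, which the abstract does not state. The label strings and predictions are made-up placeholders, not real ELF codes or results from the paper.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted label p, but it was wrong
            fn[t] += 1  # true label t was missed
    scores = {}
    for label in set(y_true) | set(y_pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def micro_f1(y_true, y_pred):
    """For single-label multi-class data, micro F1 equals accuracy."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical toy data: one dominant legal form and two rare ones.
y_true = ["FORM_A"] * 8 + ["FORM_B"] + ["FORM_C"]
# A degenerate classifier that always predicts the dominant form.
y_pred = ["FORM_A"] * 10

micro = micro_f1(y_true, y_pred)
macro = macro_f1(y_true, y_pred)
print(f"micro F1 = {micro:.3f}")  # 0.800
print(f"macro F1 = {macro:.3f}")  # 0.296
```

In this toy case the always-predict-the-majority classifier looks strong on micro F1 but weak on macro F1, which is why reporting both is informative when some legal forms are rare within a jurisdiction.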
Related papers
- LegalPro-BERT: Classification of Legal Provisions by fine-tuning BERT Large Language Model [0.0]
Contract analysis requires the identification and classification of key provisions and paragraphs within an agreement.
LegalPro-BERT is a BERT transformer architecture model that we fine-tune to efficiently handle the classification task for legal provisions.
arXiv Detail & Related papers (2024-04-15T19:08:48Z)
- Query-driven Relevant Paragraph Extraction from Legal Judgments [1.2562034805037443]
Legal professionals often grapple with navigating lengthy legal judgements to pinpoint information that directly addresses their queries.
This paper focuses on the task of extracting relevant paragraphs from legal judgements based on the query.
We construct a specialized dataset for this task from the European Court of Human Rights (ECtHR) using the case law guides.
arXiv Detail & Related papers (2024-03-31T08:03:39Z)
- DELTA: Pre-train a Discriminative Encoder for Legal Case Retrieval via Structural Word Alignment [55.91429725404988]
We introduce DELTA, a discriminative model designed for legal case retrieval.
We leverage shallow decoders to create information bottlenecks, aiming to enhance the representation ability.
Our approach can outperform existing state-of-the-art methods in legal case retrieval.
arXiv Detail & Related papers (2024-03-27T10:40:14Z)
- The Right Model for the Job: An Evaluation of Legal Multi-Label Classification Baselines [4.5054837824245215]
Multi-Label Classification (MLC) is a common task in the legal domain, where more than one label may be assigned to a legal document.
In this work, we perform an evaluation of different MLC methods using two public legal datasets.
arXiv Detail & Related papers (2024-01-22T11:15:07Z)
- Identification of Regulatory Requirements Relevant to Business Processes: A Comparative Study on Generative AI, Embedding-based Ranking, Crowd and Expert-driven Methods [10.899912290518648]
This work examines how legal and domain experts can be assisted in the assessment of relevant requirements.
We compare an embedding-based NLP ranking method, a generative AI method using GPT-4, and a crowdsourced method with the purely manual method of creating labels by experts.
A gold standard is created for both BPMN2.0 processes and matched to real-world requirements from multiple regulatory documents.
arXiv Detail & Related papers (2024-01-02T12:08:31Z)
- Precedent-Enhanced Legal Judgment Prediction with LLM and Domain-Model Collaboration [52.57055162778548]
Legal Judgment Prediction (LJP) has become an increasingly crucial task in Legal AI.
Precedents are the previous legal cases with similar facts, which are the basis for the judgment of the subsequent case in national legal systems.
Recent advances in deep learning have enabled a variety of techniques to be used to solve the LJP task.
arXiv Detail & Related papers (2023-10-13T16:47:20Z)
- SAILER: Structure-aware Pre-trained Language Model for Legal Case Retrieval [75.05173891207214]
Legal case retrieval plays a core role in the intelligent legal system.
Most existing language models have difficulty understanding the long-distance dependencies between different structures.
We propose a new Structure-Aware pre-traIned language model for LEgal case Retrieval.
arXiv Detail & Related papers (2023-04-22T10:47:01Z)
- Flexible categorization for auditing using formal concept analysis and Dempster-Shafer theory [55.878249096379804]
We study different ways to categorize according to different extents of interest in different financial accounts.
The framework developed in this paper provides a formal ground to obtain and study explainable categorizations.
arXiv Detail & Related papers (2022-10-31T13:49:16Z)
- Analysing similarities between legal court documents using natural language processing approaches based on Transformers [0.0]
This work targets the problem of detecting the degree of similarity between judicial documents that can be achieved in the inference group.
It applies six NLP techniques based on the transformers architecture to a case study of legal proceedings in the Brazilian judicial system.
arXiv Detail & Related papers (2022-04-14T18:25:56Z)
- Equality before the Law: Legal Judgment Consistency Analysis for Fairness [55.91612739713396]
In this paper, we propose an evaluation metric for judgment inconsistency, the Legal Inconsistency Coefficient (LInCo).
We simulate judges from different groups with legal judgment prediction (LJP) models and measure the judicial inconsistency with the disagreement of the judgment results given by LJP models trained on different groups.
We employ LInCo to explore the inconsistency in real cases and come to the following observations: (1) Both regional and gender inconsistency exist in the legal system, but gender inconsistency is much less than regional inconsistency.
arXiv Detail & Related papers (2021-03-25T14:28:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.