FlairNLP at SemEval-2023 Task 6b: Extraction of Legal Named Entities
from Legal Texts using Contextual String Embeddings
- URL: http://arxiv.org/abs/2306.02182v1
- Date: Sat, 3 Jun 2023 19:38:04 GMT
- Title: FlairNLP at SemEval-2023 Task 6b: Extraction of Legal Named Entities
from Legal Texts using Contextual String Embeddings
- Authors: Vinay N Ramesh, Rohan Eswara
- Abstract summary: We employ knowledge extraction techniques, especially the named entity extraction of legal entities within court case judgements.
We evaluate several state-of-the-art architectures in the realm of sequence labeling using models trained on a curated dataset of legal texts.
A Bi-LSTM model trained on Flair Embeddings achieves the best results.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Indian court legal texts and processes are essential towards the integrity of
the judicial system and towards maintaining the social and political order of
the nation. Due to the increase in number of pending court cases, there is an
urgent need to develop tools to automate many of the legal processes with the
knowledge of artificial intelligence. In this paper, we employ knowledge
extraction techniques, especially the named entity extraction of legal entities
within court case judgements. We evaluate several state-of-the-art
architectures in the realm of sequence labeling using models trained on a
curated dataset of legal texts. We observe that a Bi-LSTM model trained on
Flair Embeddings achieves the best results, and we also publish the BIO
formatted dataset as part of this paper.
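The BIO format mentioned in the abstract tags each token as the Beginning of an entity, Inside an entity, or Outside any entity. The following is a minimal sketch of that tagging scheme, not the authors' conversion code: the function name `spans_to_bio`, the example sentence, and the labels `JUDGE` and `COURT` are illustrative assumptions.

```python
from typing import List, Tuple


def spans_to_bio(tokens: List[str], spans: List[Tuple[int, int, str]]) -> List[str]:
    """Convert token-level entity spans to BIO tags.

    Each span is (start, end_exclusive, label), indexed over `tokens`.
    Spans are assumed not to overlap.
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the entity
    return tags


# Hypothetical sentence and labels, for illustration only.
tokens = ["Justice", "A", "Kumar", "of", "the", "Delhi", "High", "Court", "presided", "."]
spans = [(1, 3, "JUDGE"), (5, 8, "COURT")]
tags = spans_to_bio(tokens, spans)

# One token and its tag per line, tab-separated, as in CoNLL-style files.
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

A sequence labeling model is then trained to predict one such tag per token.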
Related papers
- InternLM-Law: An Open Source Chinese Legal Large Language Model [72.2589401309848]
InternLM-Law is a specialized LLM tailored for addressing diverse legal queries related to Chinese laws.
We meticulously construct a dataset in the Chinese legal domain, encompassing over 1 million queries.
InternLM-Law achieves the highest average performance on LawBench, outperforming state-of-the-art models, including GPT-4, on 13 out of 20 subtasks.
arXiv Detail & Related papers (2024-06-21T06:19:03Z)
- Judgement Citation Retrieval using Contextual Similarity [0.0]
We propose a methodology that combines natural language processing (NLP) and machine learning techniques to enhance the organization and utilization of legal case descriptions.
Our methodology addresses two primary objectives: unsupervised clustering and supervised citation retrieval.
Our methodology achieved an impressive accuracy rate of 90.9%.
arXiv Detail & Related papers (2024-05-28T04:22:28Z)
- Enhancing Pre-Trained Language Models with Sentence Position Embeddings
for Rhetorical Roles Recognition in Legal Opinions [0.16385815610837165]
The size of legal opinions continues to grow, making it increasingly challenging to develop a model that can accurately predict the rhetorical roles of legal opinions.
We propose a novel model architecture for automatically predicting rhetorical roles using pre-trained language models (PLMs) enhanced with knowledge of sentence position information.
Based on an annotated corpus from the LegalEval@SemEval2023 competition, we demonstrate that our approach requires fewer parameters, resulting in lower computational costs.
arXiv Detail & Related papers (2023-10-08T20:33:55Z)
- LEEC: A Legal Element Extraction Dataset with an Extensive
Domain-Specific Label System [0.4764641468273235]
Legal Element ExtraCtion dataset (LEEC) represents the most extensive and domain-specific legal element extraction dataset for the Chinese legal system.
We introduce a more comprehensive, large-scale criminal element extraction dataset, comprising 15,831 judicial documents and 159 labels.
arXiv Detail & Related papers (2023-10-02T15:16:31Z)
- Datasets for Portuguese Legal Semantic Textual Similarity: Comparing
weak supervision and an annotation process approaches [1.9244230111838758]
The Brazilian National Council of Justice has established in Resolution 469/2022 formal guidance for document and process digitalization.
This article contributes four datasets from the legal domain: two with documents and metadata but unlabeled, and two labeled for use in textual similarity tasks.
The analysis of ground truth labels highlights that semantic analysis of domain text can be challenging even for domain experts.
arXiv Detail & Related papers (2023-05-29T18:27:10Z)
- SAILER: Structure-aware Pre-trained Language Model for Legal Case
Retrieval [75.05173891207214]
Legal case retrieval plays a core role in the intelligent legal system.
Most existing language models have difficulty understanding the long-distance dependencies between different structures.
We propose a new Structure-Aware pre-traIned language model for LEgal case Retrieval.
arXiv Detail & Related papers (2023-04-22T10:47:01Z)
- Indian Legal Text Summarization: A Text Normalisation-based Approach [0.0]
There are more than 4 crore cases outstanding in the Indian court system.
Many state-of-the-art models for text summarization have emerged as machine learning has progressed.
Domain-independent models don't do well with legal texts.
Authors have proposed a methodology for normalising legal texts in the Indian context.
arXiv Detail & Related papers (2022-06-13T15:16:50Z)
- Lawformer: A Pre-trained Language Model for Chinese Legal Long Documents [56.40163943394202]
We release a Longformer-based pre-trained language model, named Lawformer, for understanding long Chinese legal documents.
We evaluate Lawformer on a variety of LegalAI tasks, including judgment prediction, similar case retrieval, legal reading comprehension, and legal question answering.
arXiv Detail & Related papers (2021-05-09T09:39:25Z)
- StateCensusLaws.org: A Web Application for Consuming and
Annotating Legal Discourse Learning [89.77347919191774]
We create a web application to highlight the output of NLP models trained to parse and label discourse segments in law text.
We focus on state-level law that uses U.S. Census population numbers to allocate resources and organize government.
arXiv Detail & Related papers (2021-04-20T22:00:54Z)
- Knowledge-Aware Procedural Text Understanding with Multi-Stage Training [110.93934567725826]
We focus on the task of procedural text understanding, which aims to comprehend such documents and track entities' states and locations during a process.
Two challenges, the difficulty of commonsense reasoning and data insufficiency, still remain unsolved.
We propose a novel KnOwledge-Aware proceduraL text understAnding (KOALA) model, which effectively leverages multiple forms of external knowledge.
arXiv Detail & Related papers (2020-09-28T10:28:40Z)
- How Does NLP Benefit Legal System: A Summary of Legal Artificial
Intelligence [81.04070052740596]
Legal Artificial Intelligence (LegalAI) focuses on applying the technology of artificial intelligence, especially natural language processing, to benefit tasks in the legal domain.
This paper introduces the history, the current state, and the future directions of research in LegalAI.
arXiv Detail & Related papers (2020-04-25T14:45:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.