Towards Grammatical Tagging for the Legal Language of Cybersecurity
- URL: http://arxiv.org/abs/2306.17042v1
- Date: Thu, 29 Jun 2023 15:39:20 GMT
- Title: Towards Grammatical Tagging for the Legal Language of Cybersecurity
- Authors: Gianpietro Castiglione, Giampaolo Bella, Daniele Francesco Santamaria
- Abstract summary: Legal language can be understood as the language typically used by those engaged in the legal profession.
Recent legislation on cybersecurity obviously uses legal language in writing.
This paper faces the challenge of the essential interpretation of the legal language of cybersecurity.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Legal language can be understood as the language typically used by those
engaged in the legal profession and, as such, it may come in both spoken and
written form. Recent legislation on cybersecurity obviously uses legal language
in writing, thus inheriting all its interpretative complications due to the
typical abundance of cases and sub-cases as well as to the general richness in
detail. This paper faces the challenge of the essential interpretation of the
legal language of cybersecurity, namely of the extraction of the essential
Parts of Speech (POS) from the legal documents concerning cybersecurity. The
challenge is overcome by our methodology for POS tagging of legal language. It
leverages state-of-the-art open-source tools for Natural Language Processing
(NLP) as well as manual analysis to validate the outcomes of the tools. As a
result, the methodology is automated and, arguably, general for any legal
language following minor tailoring of the preprocessing step. It is
demonstrated over the most relevant EU legislation on cybersecurity, namely on
the NIS 2 directive, producing the first, albeit essential, structured
interpretation of such a relevant document. Moreover, our findings indicate
that tools such as SpaCy and ClausIE reach their limits over the legal language
of the NIS 2.
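The POS extraction described in the abstract can be illustrated with a minimal, self-contained sketch. The hand-built lexicon and the NIS 2 style clause below are hypothetical stand-ins for illustration only; the paper's actual pipeline relies on pretrained tools such as SpaCy and ClausIE rather than a dictionary lookup.

```python
# Toy dictionary-based POS tagger illustrating the kind of structured output
# the paper extracts from legal clauses. The lexicon and clause are
# hypothetical; a real pipeline would use spaCy's pretrained models.
LEXICON = {
    "member": "NOUN", "states": "NOUN", "shall": "AUX",
    "adopt": "VERB", "national": "ADJ",
    "cybersecurity": "NOUN", "strategies": "NOUN",
}

def pos_tag(clause):
    """Return (token, tag) pairs, defaulting to 'X' for unknown tokens."""
    tokens = clause.lower().rstrip(".").split()
    return [(t, LEXICON.get(t, "X")) for t in tokens]

tags = pos_tag("Member States shall adopt national cybersecurity strategies.")
```

Even this simplified view shows why legal language is hard for such tools: real directive clauses nest cases and sub-cases, so tokens outside any lexicon (tagged `X` here) are exactly where manual validation, as proposed in the paper, becomes necessary.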
Related papers
- InternLM-Law: An Open Source Chinese Legal Large Language Model [72.2589401309848]
InternLM-Law is a specialized LLM tailored for addressing diverse legal queries related to Chinese laws.
We meticulously construct a dataset in the Chinese legal domain, encompassing over 1 million queries.
InternLM-Law achieves the highest average performance on LawBench, outperforming state-of-the-art models, including GPT-4, on 13 out of 20 subtasks.
arXiv Detail & Related papers (2024-06-21T06:19:03Z)
- Legal Documents Drafting with Fine-Tuned Pre-Trained Large Language Model [1.3812010983144798]
This paper shows that we can leverage a large number of annotation-free legal documents without Chinese word segmentation to fine-tune a large-scale language model.
It can also perform the legal document drafting task while preserving information privacy and improving information security.
arXiv Detail & Related papers (2024-06-06T16:00:20Z)
- DELTA: Pre-train a Discriminative Encoder for Legal Case Retrieval via Structural Word Alignment [55.91429725404988]
We introduce DELTA, a discriminative model designed for legal case retrieval.
We leverage shallow decoders to create information bottlenecks, aiming to enhance the representation ability.
Our approach can outperform existing state-of-the-art methods in legal case retrieval.
arXiv Detail & Related papers (2024-03-27T10:40:14Z)
- LLM vs. Lawyers: Identifying a Subset of Summary Judgments in a Large UK Case Law Dataset [0.0]
This study addresses a gap in the literature on large legal corpora: how to isolate a subset of cases, here summary judgments, from a large corpus of UK court decisions.
We use the Cambridge Law Corpus of 356,011 UK court decisions and determine that the large language model achieves a weighted F1 score of 0.94 versus 0.78 for keywords.
We identify and extract 3,102 summary judgment cases, enabling us to map their distribution across various UK courts over a temporal span.
arXiv Detail & Related papers (2024-03-04T10:13:30Z)
- Large Language Models and Explainable Law: a Hybrid Methodology [44.99833362998488]
The paper advocates for LLMs to enhance the accessibility, usage and explainability of rule-based legal systems.
A methodology is developed to explore the potential use of LLMs for translating the explanations produced by rule-based systems.
arXiv Detail & Related papers (2023-11-20T14:47:20Z)
- Precedent-Enhanced Legal Judgment Prediction with LLM and Domain-Model Collaboration [52.57055162778548]
Legal Judgment Prediction (LJP) has become an increasingly crucial task in Legal AI.
Precedents are the previous legal cases with similar facts, which are the basis for the judgment of the subsequent case in national legal systems.
Recent advances in deep learning have enabled a variety of techniques to be used to solve the LJP task.
arXiv Detail & Related papers (2023-10-13T16:47:20Z)
- An automated method for the ontological representation of security directives [0.0]
The paper frames this problem in the context of recent European security directives.
The complexity of their language is tackled here by extracting the relevant information, namely the parts of speech, from each clause.
The method is showcased on a practical problem, namely to derive an ontology representing the NIS 2 directive, which is the peak of cybersecurity prescripts at the European level.
arXiv Detail & Related papers (2023-06-30T09:04:47Z)
- SAILER: Structure-aware Pre-trained Language Model for Legal Case Retrieval [75.05173891207214]
Legal case retrieval plays a core role in the intelligent legal system.
Most existing language models have difficulty understanding the long-distance dependencies between different structures.
We propose a new Structure-Aware pre-traIned language model for LEgal case Retrieval.
arXiv Detail & Related papers (2023-04-22T10:47:01Z)
- LexGLUE: A Benchmark Dataset for Legal Language Understanding in English [15.026117429782996]
We introduce the Legal General Language Evaluation (LexGLUE) benchmark, a collection of datasets for evaluating model performance across a diverse set of legal NLU tasks.
We also provide an evaluation and analysis of several generic and legal-oriented models demonstrating that the latter consistently offer performance improvements across multiple tasks.
arXiv Detail & Related papers (2021-10-03T10:50:51Z)
- Lawformer: A Pre-trained Language Model for Chinese Legal Long Documents [56.40163943394202]
We release the Longformer-based pre-trained language model, named as Lawformer, for Chinese legal long documents understanding.
We evaluate Lawformer on a variety of LegalAI tasks, including judgment prediction, similar case retrieval, legal reading comprehension, and legal question answering.
arXiv Detail & Related papers (2021-05-09T09:39:25Z)
- A Dataset for Statutory Reasoning in Tax Law Entailment and Question Answering [37.66486350122862]
This paper investigates the performance of natural language understanding approaches on statutory reasoning.
We introduce a dataset, together with a legal-domain text corpus.
We contrast this with a hand-constructed Prolog-based system, designed to fully solve the task.
arXiv Detail & Related papers (2020-05-11T16:54:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.