uOttawa at LegalLens-2024: Transformer-based Classification Experiments
- URL: http://arxiv.org/abs/2410.21139v2
- Date: Thu, 31 Oct 2024 03:29:15 GMT
- Title: uOttawa at LegalLens-2024: Transformer-based Classification Experiments
- Authors: Nima Meghdadi, Diana Inkpen
- Abstract summary: This paper presents the methods used for the LegalLens-2024 shared task.
It focused on detecting legal violations within unstructured textual data and associating these violations with potentially affected individuals.
Our results were 86.3% in the L-NER subtask and 88.25% in the L-NLI subtask.
- Score: 7.710171464368831
- Abstract: This paper presents the methods used for the LegalLens-2024 shared task, which focused on detecting legal violations within unstructured textual data and associating these violations with potentially affected individuals. The shared task included two subtasks: A) Legal Named Entity Recognition (L-NER) and B) Legal Natural Language Inference (L-NLI). For subtask A, we utilized the spaCy library, while for subtask B, we employed a combined model incorporating RoBERTa and CNN. Our results were 86.3% in the L-NER subtask and 88.25% in the L-NLI subtask. Overall, our paper demonstrates the effectiveness of transformer models in addressing complex tasks in the legal domain. The source code for our implementation is publicly available at https://github.com/NimaMeghdadi/uOttawa-at-LegalLens-2024-Transformer-based-Classification
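The L-NLI system pairs a RoBERTa encoder with a CNN head over the token embeddings. As a rough illustration of the CNN-over-embeddings idea only (the filter width, dimensions, and weights below are made-up toy values, not the authors' actual architecture or hyperparameters), a single convolutional filter with max-pooling can be sketched in plain Python:

```python
# Toy sketch of one CNN filter applied over a sequence of token embeddings,
# as a RoBERTa+CNN head might do before classification. All values are
# illustrative assumptions, not the paper's configuration.

def conv1d_max_pool(token_embeddings, kernel, stride=1):
    """Slide a 1-D kernel along the token axis and max-pool the responses.

    token_embeddings: list of per-token vectors (seq_len x dim)
    kernel: list of weight vectors (width x dim)
    Returns the max-pooled scalar response of this single filter.
    """
    width = len(kernel)
    responses = []
    for start in range(0, len(token_embeddings) - width + 1, stride):
        window = token_embeddings[start:start + width]
        score = sum(
            w * x
            for w_vec, x_vec in zip(kernel, window)
            for w, x in zip(w_vec, x_vec)
        )
        responses.append(score)
    return max(responses)

# Tiny example: 4 tokens with 2-dim "embeddings", one width-2 filter.
emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
kern = [[1.0, 0.0], [0.0, 1.0]]
print(conv1d_max_pool(emb, kern))  # -> 2.0
```

In a real model there would be many such filters, and the pooled responses would feed a dense classification layer over the NLI labels.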
Related papers
- Bonafide at LegalLens 2024 Shared Task: Using Lightweight DeBERTa Based Encoder For Legal Violation Detection and Resolution [1.2283121128307906]
We present two systems -- Named Entity Recognition (NER) and Natural Language Inference (NLI) -- for detecting legal violations within unstructured textual data.
Both systems are lightweight DeBERTa based encoders that outperform the LLM baselines.
arXiv Detail & Related papers (2024-10-30T12:42:38Z) - Bayesian scaling laws for in-context learning [72.17734205418502]
In-context learning (ICL) is a powerful technique for getting language models to perform complex tasks with no training updates.
We show that ICL approximates a Bayesian learner and develop a family of novel Bayesian scaling laws for ICL.
arXiv Detail & Related papers (2024-10-21T21:45:22Z) - LegalLens Shared Task 2024: Legal Violation Identification in Unstructured Text [7.839638824275218]
This paper focuses on detecting legal violations within text in the wild across two sub-tasks: LegalLens-NER for identifying legal violation entities and LegalLens-NLI for associating these violations with relevant legal contexts and affected individuals.
The top-performing team achieved a 7.11% improvement in NER over the baseline, while NLI saw a more modest improvement of 5.7%.
arXiv Detail & Related papers (2024-10-15T21:02:44Z) - Mitigating Copy Bias in In-Context Learning through Neuron Pruning [74.91243772654519]
Large language models (LLMs) have demonstrated impressive few-shot in-context learning abilities.
They are sometimes prone to a 'copying bias', where they copy answers from provided examples instead of learning the underlying patterns.
We propose a novel and simple method to mitigate such copying bias.
arXiv Detail & Related papers (2024-10-02T07:18:16Z) - InternLM-Law: An Open Source Chinese Legal Large Language Model [72.2589401309848]
InternLM-Law is a specialized LLM tailored for addressing diverse legal queries related to Chinese laws.
We meticulously construct a dataset in the Chinese legal domain, encompassing over 1 million queries.
InternLM-Law achieves the highest average performance on LawBench, outperforming state-of-the-art models, including GPT-4, on 13 out of 20 subtasks.
arXiv Detail & Related papers (2024-06-21T06:19:03Z) - NeCo@ALQAC 2023: Legal Domain Knowledge Acquisition for Low-Resource
Languages through Data Enrichment [2.441072488254427]
This paper presents NeCo Team's solutions to the Vietnamese text processing tasks provided in the Automated Legal Question Answering Competition 2023 (ALQAC 2023)
Our methods for the legal document retrieval task employ a combination of similarity ranking and deep learning models, while for the second task, we propose a range of adaptive techniques to handle different question types.
Our approaches achieve outstanding results on both tasks of the competition, demonstrating the potential benefits and effectiveness of question answering systems in the legal field.
arXiv Detail & Related papers (2023-09-11T14:43:45Z) - SemEval 2023 Task 6: LegalEval - Understanding Legal Texts [2.172613863157655]
There is a need for developing NLP-based techniques for processing and automatically understanding legal documents.
LegalEval task has three sub-tasks: Task-A (Rhetorical Roles Labeling) is about automatically structuring legal documents into semantically coherent units, Task-B (Legal Named Entity Recognition) deals with identifying relevant entities in a legal document, Task-C (Court Judgement Prediction with Explanation) explores the possibility of automatically predicting the outcome of a legal case.
In each of the sub-tasks, the proposed systems outperformed the baselines; however, there is still a lot of scope for improvement.
arXiv Detail & Related papers (2023-04-19T10:28:32Z) - Falsesum: Generating Document-level NLI Examples for Recognizing Factual Inconsistency in Summarization [63.21819285337555]
We show that NLI models can be effective for this task when the training data is augmented with high-quality task-oriented examples.
We introduce Falsesum, a data generation pipeline leveraging a controllable text generation model to perturb human-annotated summaries.
We show that models trained on a Falsesum-augmented NLI dataset improve the state-of-the-art performance across four benchmarks for detecting factual inconsistency in summarization.
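Falsesum's actual pipeline uses a controllable text generation model to perturb summaries; as a much simpler stand-in for the same idea, a factually inconsistent hypothesis can be produced by swapping a named entity in a reference summary for a distractor. The function, example sentence, and entity names below are all hypothetical:

```python
# Toy entity-swap perturbation (NOT Falsesum's controllable-generation
# pipeline): turn a faithful summary into a contradicting NLI hypothesis.
import random

def perturb_summary(summary, entity, distractors, seed=0):
    """Replace `entity` in `summary` with a different distractor entity,
    yielding a factually inconsistent hypothesis for NLI training."""
    rng = random.Random(seed)
    replacement = rng.choice([d for d in distractors if d != entity])
    return summary.replace(entity, replacement)

doc_summary = "Acme Corp recalled 10,000 vehicles in March."
print(perturb_summary(doc_summary, "Acme Corp", ["Acme Corp", "Bolt Motors"]))
# -> "Bolt Motors recalled 10,000 vehicles in March."
```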
arXiv Detail & Related papers (2022-05-12T10:43:42Z) - Visual Transformer for Task-aware Active Learning [49.903358393660724]
We present a novel pipeline for pool-based Active Learning.
Our method exploits accessible unlabelled examples during training to estimate their correlation with the labelled examples.
A Visual Transformer models non-local visual concept dependencies between labelled and unlabelled examples.
arXiv Detail & Related papers (2021-06-07T17:13:59Z) - Legal Document Classification: An Application to Law Area Prediction of Petitions to Public Prosecution Service [6.696983725360808]
This paper proposes the use of NLP techniques for textual classification.
Our main goal is to automate the process of assigning petitions to their respective areas of law.
The best results were obtained with a combination of Word2Vec trained on a domain-specific corpus and a Recurrent Neural Network architecture.
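The winning combination above, domain-trained Word2Vec vectors fed into a recurrent network, can be sketched minimally as follows. The Elman-style cell, the one-dimensional vectors, and the weights are illustrative assumptions, not the paper's configuration:

```python
# Minimal Elman-style RNN over pre-trained word vectors, sketching the
# Word2Vec + RNN pipeline for law-area classification. All dimensions
# and weights here are toy assumptions.
import math

def rnn_last_hidden(word_vectors, w_in, w_rec):
    """Run a tanh recurrence over a sequence of word vectors and return
    the final hidden state, which a classifier head would map to a label."""
    dim = len(w_rec)
    h = [0.0] * dim
    for x in word_vectors:
        h = [
            math.tanh(
                sum(w_in[i][j] * x[j] for j in range(len(x)))
                + sum(w_rec[i][j] * h[j] for j in range(dim))
            )
            for i in range(dim)
        ]
    return h

# One "word" with a 1-dim vector, identity input weights, no recurrence:
print(rnn_last_hidden([[1.0]], [[1.0]], [[0.0]]))  # -> [0.7615941559557649]
```

In practice the word vectors would come from a Word2Vec model trained on the petition corpus, and the recurrence would be a trained LSTM/GRU layer rather than hand-set weights.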
arXiv Detail & Related papers (2020-10-13T18:05:37Z) - Local Additivity Based Data Augmentation for Semi-supervised NER [59.90773003737093]
Named Entity Recognition (NER) is one of the first stages in deep language understanding.
Current NER models heavily rely on human-annotated data.
We propose a Local Additivity based Data Augmentation (LADA) method for semi-supervised NER.
arXiv Detail & Related papers (2020-10-04T20:46:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.