The Right Model for the Job: An Evaluation of Legal Multi-Label
Classification Baselines
- URL: http://arxiv.org/abs/2401.11852v1
- Date: Mon, 22 Jan 2024 11:15:07 GMT
- Authors: Martina Forster, Claudia Schulz, Prudhvi Nokku, Melicaalsadat
Mirsafian, Jaykumar Kasundra, Stavroula Skylaki
- Abstract summary: Multi-Label Classification (MLC) is a common task in the legal domain, where more than one label may be assigned to a legal document.
In this work, we perform an evaluation of different MLC methods using two public legal datasets.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-Label Classification (MLC) is a common task in the legal domain, where
more than one label may be assigned to a legal document. A wide range of
methods can be applied, ranging from traditional ML approaches to the latest
Transformer-based architectures. In this work, we perform an evaluation of
different MLC methods using two public legal datasets, POSTURE50K and
EURLEX57K. By varying the amount of training data and the number of labels, we
explore the comparative advantage offered by different approaches in relation
to the dataset properties. Our findings highlight DistilRoBERTa and LegalBERT
as performing consistently well in legal MLC with reasonable computational
demands. T5 also demonstrates comparable performance while offering advantages
as a generative model in the presence of changing label sets. Finally, we show
that the CrossEncoder exhibits potential for notable macro-F1 score
improvements, albeit with increased computational costs.
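The abstract singles out macro-F1 as the metric where the CrossEncoder may help. As a minimal sketch (not the authors' evaluation code), the snippet below computes macro- and micro-averaged F1 for a multi-label task in plain Python. The label names and the toy gold/predicted label sets are hypothetical and are not drawn from POSTURE50K or EURLEX57K; they only illustrate why a missed rare label lowers macro-F1 more than micro-F1.

```python
from typing import Dict, List, Sequence, Set


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def multilabel_f1(
    gold: Sequence[Set[str]],
    pred: Sequence[Set[str]],
    labels: List[str],
) -> Dict[str, float]:
    """Return macro- and micro-averaged F1 over a fixed label set."""
    # Per-label counts stored as [true positives, false positives, false negatives].
    counts = {label: [0, 0, 0] for label in labels}
    for gold_set, pred_set in zip(gold, pred):
        for label in labels:
            if label in pred_set and label in gold_set:
                counts[label][0] += 1
            elif label in pred_set:
                counts[label][1] += 1
            elif label in gold_set:
                counts[label][2] += 1
    # Macro: average the per-label F1 scores, so every label weighs the same.
    macro = sum(f1(*c) for c in counts.values()) / len(labels)
    # Micro: pool the counts over all labels first, so frequent labels dominate.
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return {"macro": macro, "micro": f1(tp, fp, fn)}


# Hypothetical toy data: one frequent label, one mid-frequency label, one rare label.
labels = ["appeal", "motion_to_dismiss", "rare_posture"]
gold = [{"appeal"}, {"appeal", "motion_to_dismiss"}, {"appeal"}, {"rare_posture"}]
pred = [{"appeal"}, {"appeal", "motion_to_dismiss"}, {"appeal"}, set()]

scores = multilabel_f1(gold, pred, labels)
print(scores)
```

In this toy case the only error is the missed rare label, which costs macro-F1 a full third while micro-F1 stays close to 0.89. This is the kind of gap that makes macro-F1 sensitive to performance on infrequent labels.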
Related papers
- Interpretable Cross-Examination Technique (ICE-T): Using highly informative features to boost LLM performance [1.1961645395911131]
In domains where interpretability is crucial, such as medicine and law, standard models often fall short due to their "black-box" nature.
ICE-T addresses these limitations by using a series of generated prompts that allow an LLM to approach the problem from multiple directions.
We demonstrate the effectiveness of ICE-T across a diverse set of data sources, including medical records and legal documents.
(2024-05-08T19:20:34Z)
- COCO is "ALL" You Need for Visual Instruction Fine-tuning [39.438410070172125]
Visual instruction fine-tuning (IFT) is a vital process for aligning MLLMs' output with user's intentions.
Recent studies propose to construct visual IFT datasets through a multifaceted approach.
We establish a new IFT dataset, with images sourced from the COCO dataset along with more diverse instructions.
(2024-01-17T04:43:45Z)
- Transformer-based Entity Legal Form Classification [43.75590166844617]
We propose the application of Transformer-based language models for classifying legal forms.
We employ various BERT variants and compare their performance against multiple traditional baselines.
Our findings demonstrate that pre-trained BERT variants outperform traditional text classification approaches in terms of F1 score.
(2023-10-19T14:11:43Z)
- Retrieval-augmented Multi-label Text Classification [20.100081284294973]
Multi-label text classification is a challenging task in settings of large label sets.
Retrieval augmentation aims to improve the sample efficiency of classification models.
We evaluate this approach on four datasets from the legal and biomedical domains.
(2023-05-22T14:16:23Z)
- Ground Truth Inference for Weakly Supervised Entity Matching [76.6732856489872]
We propose a simple but powerful labeling model for weak supervision tasks.
We then tailor the labeling model specifically to the task of entity matching.
We show that our labeling model results in a 9% higher F1 score on average than the best existing method.
(2022-11-13T17:57:07Z)
- MemSAC: Memory Augmented Sample Consistency for Large Scale Unsupervised Domain Adaptation [71.4942277262067]
We propose MemSAC, which exploits sample level similarity across source and target domains to achieve discriminative transfer.
We provide in-depth analysis and insights into the effectiveness of MemSAC.
(2022-07-25T17:55:28Z)
- A Lagrangian Duality Approach to Active Learning [119.36233726867992]
We consider the batch active learning problem, where only a subset of the training data is labeled.
We formulate the learning problem using constrained optimization, where each constraint bounds the performance of the model on labeled samples.
We show, via numerical experiments, that our proposed approach performs similarly to or better than state-of-the-art active learning methods.
(2022-02-08T19:18:49Z)
- Scaling up Multi-domain Semantic Segmentation with Sentence Embeddings [81.09026586111811]
We propose an approach to semantic segmentation that achieves state-of-the-art supervised performance when applied in a zero-shot setting.
This is achieved by replacing each class label with a vector-valued embedding of a short paragraph that describes the class.
The resulting merged semantic segmentation dataset of over 2 Million images enables training a model that achieves performance equal to that of state-of-the-art supervised methods on 7 benchmark datasets.
(2022-02-04T07:19:09Z)
- Visual Transformer for Task-aware Active Learning [49.903358393660724]
We present a novel pipeline for pool-based Active Learning.
Our method exploits accessible unlabelled examples during training to estimate their co-relation with the labelled examples.
Visual Transformer models non-local visual concept dependency between labelled and unlabelled examples.
(2021-06-07T17:13:59Z)
- An Empirical Study on Large-Scale Multi-Label Text Classification Including Few and Zero-Shot Labels [49.036212158261215]
Large-scale Multi-label Text Classification (LMTC) has a wide range of Natural Language Processing (NLP) applications.
Current state-of-the-art LMTC models employ Label-Wise Attention Networks (LWANs).
We show that hierarchical methods based on Probabilistic Label Trees (PLTs) outperform LWANs.
We propose a new state-of-the-art method which combines BERT with LWANs.
(2020-10-04T18:55:47Z)