To Aggregate or Not to Aggregate. That is the Question: A Case Study on Annotation Subjectivity in Span Prediction
- URL: http://arxiv.org/abs/2408.02257v1
- Date: Mon, 5 Aug 2024 06:16:31 GMT
- Title: To Aggregate or Not to Aggregate. That is the Question: A Case Study on Annotation Subjectivity in Span Prediction
- Authors: Kemal Kurniawan, Meladel Mistica, Timothy Baldwin, Jey Han Lau,
- Abstract summary: We use a corpus of problem descriptions written by laypeople in English that is annotated by practising lawyers.
Inherent subjectivity exists in our task because legal area categorisation is a complex task.
- Score: 44.5492443909544
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper explores the task of automatic prediction of text spans in a legal problem description that support a legal area label. We use a corpus of problem descriptions written by laypeople in English that is annotated by practising lawyers. Inherent subjectivity exists in our task because legal area categorisation is a complex task, and lawyers often have different views on a problem, especially in the face of legally-imprecise descriptions of issues. Experiments show that training on majority-voted spans outperforms training on disaggregated ones.
Related papers
- Topic Modelling Case Law Using a Large Language Model and a New Taxonomy for UK Law: AI Insights into Summary Judgment [0.0]
This paper develops and applies a novel taxonomy for topic modelling summary judgment cases in the United Kingdom.
Using a curated dataset of summary judgment cases, we use the Large Language Model Claude 3 Opus to explore functional topics and trends.
We find that Claude 3 Opus correctly classified the topic with an accuracy of 87.10%.
arXiv Detail & Related papers (2024-05-21T16:30:25Z) - DELTA: Pre-train a Discriminative Encoder for Legal Case Retrieval via Structural Word Alignment [55.91429725404988]
We introduce DELTA, a discriminative model designed for legal case retrieval.
We leverage shallow decoders to create information bottlenecks, aiming to enhance the representation ability.
Our approach can outperform existing state-of-the-art methods in legal case retrieval.
arXiv Detail & Related papers (2024-03-27T10:40:14Z) - MUSER: A Multi-View Similar Case Retrieval Dataset [65.36779942237357]
Similar case retrieval (SCR) is a representative legal AI application that plays a pivotal role in promoting judicial fairness.
Existing SCR datasets only focus on the fact description section when judging the similarity between cases.
We present M, a similar case retrieval dataset based on multi-view similarity measurement and comprehensive legal element with sentence-level legal element annotations.
arXiv Detail & Related papers (2023-10-24T08:17:11Z) - Interpretable Long-Form Legal Question Answering with
Retrieval-Augmented Large Language Models [10.834755282333589]
Long-form Legal Question Answering dataset comprises 1,868 expert-annotated legal questions in the French language.
Our experimental results demonstrate promising performance on automatic evaluation metrics.
As one of the only comprehensive, expert-annotated long-form LQA dataset, LLeQA has the potential to not only accelerate research towards resolving a significant real-world issue, but also act as a rigorous benchmark for evaluating NLP models in specialized domains.
arXiv Detail & Related papers (2023-09-29T08:23:19Z) - CiteCaseLAW: Citation Worthiness Detection in Caselaw for Legal
Assistive Writing [44.75251805925605]
We introduce a labeled dataset of 178M sentences for citation-worthiness detection in the legal domain from the Caselaw Access Project (CAP)
The performance of various deep learning models was examined on this novel dataset.
The domain-specific pre-trained model tends to outperform other models, with an 88% F1-score for the citation-worthiness detection task.
arXiv Detail & Related papers (2023-05-03T04:20:56Z) - SAILER: Structure-aware Pre-trained Language Model for Legal Case
Retrieval [75.05173891207214]
Legal case retrieval plays a core role in the intelligent legal system.
Most existing language models have difficulty understanding the long-distance dependencies between different structures.
We propose a new Structure-Aware pre-traIned language model for LEgal case Retrieval.
arXiv Detail & Related papers (2023-04-22T10:47:01Z) - Exploiting Contrastive Learning and Numerical Evidence for Confusing
Legal Judgment Prediction [46.71918729837462]
Given the fact description text of a legal case, legal judgment prediction aims to predict the case's charge, law article and penalty term.
Previous studies fail to distinguish different classification errors with a standard cross-entropy classification loss.
We propose a moco-based supervised contrastive learning to learn distinguishable representations.
We further enhance the representation of the fact description with extracted crime amounts which are encoded by a pre-trained numeracy model.
arXiv Detail & Related papers (2022-11-15T15:53:56Z) - Legal Element-oriented Modeling with Multi-view Contrastive Learning for
Legal Case Retrieval [3.909749182759558]
We propose an interaction-focused network for legal case retrieval with a multi-view contrastive learning objective.
Case-view contrastive learning minimizes the hidden space distance between relevant legal case representations.
We employ a legal element knowledge-aware indicator to detect legal elements of cases.
arXiv Detail & Related papers (2022-10-11T06:47:23Z) - Important Sentence Identification in Legal Cases Using Multi-Class
Classification [0.1499944454332829]
This research explores the usage of sentence embeddings for multi-class classification to identify important sentences in a legal case.
A task-specific loss function is defined in order to improve the accuracy restricted by the straightforward use of categorical cross entropy loss.
arXiv Detail & Related papers (2021-11-10T14:58:29Z) - Aspect Classification for Legal Depositions [0.0]
It is important to know not only about liability, but also about events, accidents, physical conditions, and treatments.
A legal deposition consists of various aspects that are discussed as part of the deponent testimony.
Our methods have achieved a classification F1 score of 0.83.
arXiv Detail & Related papers (2020-09-09T18:00:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.