Are manual annotations necessary for statutory interpretations retrieval?
- URL: http://arxiv.org/abs/2506.13965v1
- Date: Mon, 16 Jun 2025 20:15:57 GMT
- Title: Are manual annotations necessary for statutory interpretations retrieval?
- Authors: Aleksander Smywiński-Pohl, Tomer Libal, Adam Kaczmarczyk, Magdalena Król,
- Abstract summary: We try to determine the optimal number of annotations per legal concept. We also check whether the sentences for annotation can be drawn randomly, or whether the model performs better when only the best candidates are annotated.
- Score: 41.94295877935867
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One of the elements of legal research is looking for cases where judges have extended the meaning of a legal concept by providing interpretations of what a concept means or does not mean. This allows legal professionals to use such interpretations as precedents, and laymen to better understand the legal concept. The state-of-the-art approach for retrieving the most relevant interpretations of these concepts currently depends on ranking sentences and training language models over annotated examples. That manual annotation process can be quite expensive and needs to be repeated for each new concept, which has prompted recent research into automating it. In this paper, we present the results of various experiments conducted to determine the volume, scope, and even the need for manual annotation. First, we check what the optimal number of annotations per legal concept is. Second, we check whether the sentences for annotation can be drawn randomly, or whether the model performs better when only the best candidates are annotated. Finally, we check the outcome of automating the annotation process with the help of an LLM.
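The abstract does not spell out the selection procedures; as a rough illustration of the second experiment, a minimal sketch of the two ways of choosing sentences for annotation (random draw versus annotating only the best candidates) might look like the following. The function names and the placeholder scorer are illustrative assumptions, not the authors' implementation.

```python
import random
from typing import Callable, List, Optional

def select_for_annotation(
    candidates: List[str],
    budget: int,
    strategy: str = "random",
    score_fn: Optional[Callable[[str], float]] = None,
) -> List[str]:
    """Pick `budget` sentences from the candidate pool to send to annotators.

    strategy="random": draw sentences uniformly from the pool.
    strategy="best":   annotate only the top-ranked candidates, using some
                       baseline relevance scorer (e.g. BM25 or an off-the-shelf
                       cross-encoder) passed in as `score_fn`.
    """
    if strategy == "random":
        return random.sample(candidates, k=min(budget, len(candidates)))
    if strategy == "best":
        if score_fn is None:
            raise ValueError("best-candidate selection needs a scorer")
        return sorted(candidates, key=score_fn, reverse=True)[:budget]
    raise ValueError(f"unknown strategy: {strategy!r}")

# Toy usage for a single legal concept; a real pool would hold thousands of sentences.
pool = [
    "The court held that 'consumer' also covers a sole trader acting outside their trade.",
    "The hearing was adjourned until further notice.",
    "For the purposes of this act, 'dwelling' does not include a houseboat.",
]
naive_score = lambda s: float("held that" in s or "does not include" in s)  # placeholder scorer
print(select_for_annotation(pool, budget=2, strategy="random"))
print(select_for_annotation(pool, budget=2, strategy="best", score_fn=naive_score))
```

In the setup described above, the selected batch would then be annotated and used to train the ranking model; the experiments vary the budget to find the optimal number of annotations per concept, compare the two strategies, and finally replace the human annotator with an LLM.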
Related papers
- Automating Legal Interpretation with LLMs: Retrieval, Generation, and Evaluation [27.345475442620746]
ATRIE consists of a legal concept interpreter and a legal concept interpretation evaluator.
The quality of our interpretations is comparable to those written by legal experts, with superior comprehensiveness and readability.
Although there remains a slight gap in accuracy, it can already assist legal practitioners in improving the efficiency of legal interpretation.
arXiv Detail & Related papers (2025-01-03T10:11:38Z)
- DELTA: Pre-train a Discriminative Encoder for Legal Case Retrieval via Structural Word Alignment [55.91429725404988]
We introduce DELTA, a discriminative model designed for legal case retrieval.
We leverage shallow decoders to create information bottlenecks, aiming to enhance the representation ability.
Our approach can outperform existing state-of-the-art methods in legal case retrieval.
arXiv Detail & Related papers (2024-03-27T10:40:14Z)
- Using Natural Language Explanations to Rescale Human Judgments [81.66697572357477]
We propose a method to rescale ordinal annotations and explanations using large language models (LLMs).
We feed annotators' Likert ratings and corresponding explanations into an LLM and prompt it to produce a numeric score anchored in a scoring rubric.
Our method rescales the raw judgments without impacting agreement and brings the scores closer to human judgments grounded in the same scoring rubric.
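The summary above describes the rescaling step only at a high level; a minimal sketch of what the prompting might look like is given below. The rubric wording, prompt text, and the `call_llm` stub are assumptions for illustration, not the prompt actually used in the paper.

```python
def build_rescaling_prompt(rubric: str, likert_rating: int, explanation: str) -> str:
    """Assemble a prompt asking an LLM to map a Likert rating plus the
    annotator's free-text explanation onto a rubric-anchored 0-100 score."""
    return (
        "You are rescaling human judgments.\n"
        f"Scoring rubric (0-100):\n{rubric}\n\n"
        f"Annotator's Likert rating (1-5): {likert_rating}\n"
        f"Annotator's explanation: {explanation}\n\n"
        "Return only one integer between 0 and 100 that is consistent "
        "with the rubric and the explanation."
    )

def rescale(call_llm, rubric: str, likert_rating: int, explanation: str) -> int:
    # `call_llm` is a placeholder for whatever chat/completion client is used.
    reply = call_llm(build_rescaling_prompt(rubric, likert_rating, explanation))
    return int(reply.strip())
```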
arXiv Detail & Related papers (2023-05-24T06:19:14Z)
- Unlocking Practical Applications in Legal Domain: Evaluation of GPT for Zero-Shot Semantic Annotation of Legal Texts [0.0]
We evaluate the capability of a state-of-the-art generative pre-trained transformer (GPT) model to perform semantic annotation of short text snippets.
We found that the GPT model performs surprisingly well in zero-shot settings on diverse types of documents.
arXiv Detail & Related papers (2023-05-08T01:55:53Z)
- Exploiting Contrastive Learning and Numerical Evidence for Confusing Legal Judgment Prediction [46.71918729837462]
Given the fact description text of a legal case, legal judgment prediction aims to predict the case's charge, law article and penalty term.
Previous studies fail to distinguish different classification errors with a standard cross-entropy classification loss.
We propose a MoCo-based supervised contrastive learning approach to learn distinguishable representations.
We further enhance the representation of the fact description with extracted crime amounts which are encoded by a pre-trained numeracy model.
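The paper pairs a MoCo-style memory queue with a supervised contrastive objective; a queue-free sketch of that objective over a single batch is shown below (the simplification and all names are assumptions, not the authors' code).

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features: torch.Tensor,
                                labels: torch.Tensor,
                                temperature: float = 0.07) -> torch.Tensor:
    """Supervised contrastive loss over a batch of case embeddings.

    Cases that share a label (e.g. the same charge) are pulled together,
    cases with different labels are pushed apart.
    """
    features = F.normalize(features, dim=1)
    sim = features @ features.T / temperature                    # (B, B) similarities
    self_mask = torch.eye(len(labels), dtype=torch.bool, device=features.device)
    sim = sim.masked_fill(self_mask, float("-inf"))              # ignore self-similarity
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)   # log-softmax per anchor
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)                # avoid division by zero
    loss = -log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1) / pos_counts
    return loss.mean()
```

The numeric-evidence part of the paper (crime amounts encoded by a pre-trained numeracy model) is a separate component and is not covered by this sketch.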
arXiv Detail & Related papers (2022-11-15T15:53:56Z)
- Fine-grained Intent Classification in the Legal Domain [2.088409822555567]
We introduce a dataset of 93 legal documents, belonging to the case categories of either Murder, Land Dispute, Robbery, or Corruption.
We annotate fine-grained intents for each such phrase to enable a deeper understanding of the case for a reader.
We analyze the performance of several transformer-based models in automating the process of extracting intent phrases.
arXiv Detail & Related papers (2022-05-06T23:57:17Z)
- Sentence Embeddings and High-speed Similarity Search for Fast Computer Assisted Annotation of Legal Documents [0.5249805590164901]
We introduce a proof-of-concept system for annotating sentences "laterally".
The approach is based on the observation that sentences that are similar in meaning often have the same label in terms of a particular type system.
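A minimal sketch of this "lateral" suggestion loop, using an off-the-shelf sentence encoder and brute-force cosine similarity, could look like the following; the model choice, labels, and example sentences are assumptions for illustration, not the system described in the paper.

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # assumed choice of encoder

model = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence encoder would do

# Sentences the annotator has already labelled, with their type-system labels.
labelled = [
    ("The term 'vehicle' was held to include electric scooters.", "interpretation"),
    ("The appeal was dismissed with costs.", "other"),
]
labelled_vecs = model.encode([s for s, _ in labelled], normalize_embeddings=True)

def suggest_label(sentence: str) -> str:
    """Propose the label of the most similar already-annotated sentence."""
    vec = model.encode([sentence], normalize_embeddings=True)[0]
    sims = labelled_vecs @ vec          # cosine similarity, since vectors are normalised
    return labelled[int(np.argmax(sims))][1]

print(suggest_label("The court found that 'vehicle' also covers e-bikes."))
```

For a corpus of any realistic size the brute-force dot product would be replaced by an approximate nearest-neighbour index, which is what makes high-speed, computer-assisted annotation practical.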
arXiv Detail & Related papers (2021-12-21T19:27:21Z)
- Discovering Explanatory Sentences in Legal Case Decisions Using Pre-trained Language Models [0.7614628596146599]
Legal texts routinely use concepts that are difficult to understand.
Lawyers elaborate on the meaning of such concepts by, among other things, carefully investigating how they have been used in the past.
Finding text snippets that mention a particular concept in a useful way is tedious, time-consuming, and, hence, expensive.
arXiv Detail & Related papers (2021-12-14T04:56:39Z)
- Fine-Grained Opinion Summarization with Minimal Supervision [48.43506393052212]
FineSum aims to profile a target by extracting opinions from multiple documents.
FineSum automatically identifies opinion phrases from the raw corpus, classifies them into different aspects and sentiments, and constructs multiple fine-grained opinion clusters under each aspect/sentiment.
Both automatic evaluation on the benchmark and quantitative human evaluation validate the effectiveness of our approach.
arXiv Detail & Related papers (2021-10-17T15:16:34Z)
- Annotation Curricula to Implicitly Train Non-Expert Annotators [56.67768938052715]
Voluntary studies often require annotators to familiarize themselves with the task, its annotation scheme, and the data domain.
This can be overwhelming in the beginning, mentally taxing, and can induce errors in the resulting annotations.
We propose annotation curricula, a novel approach to implicitly train annotators.
arXiv Detail & Related papers (2021-06-04T09:48:28Z)
- Weakly- and Semi-supervised Evidence Extraction [107.47661281843232]
We propose new methods to combine few evidence annotations with abundant document-level labels for the task of evidence extraction.
Our approach yields substantial gains with as few as a hundred evidence annotations.
arXiv Detail & Related papers (2020-11-03T04:05:00Z)
- Are Interpretations Fairly Evaluated? A Definition Driven Pipeline for Post-Hoc Interpretability [54.85658598523915]
We propose that a concrete definition of interpretation is needed before the faithfulness of an interpretation can be evaluated.
We find that although interpretation methods perform differently under a certain evaluation metric, such a difference may not result from interpretation quality or faithfulness.
arXiv Detail & Related papers (2020-09-16T06:38:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.