CiteCaseLAW: Citation Worthiness Detection in Caselaw for Legal
Assistive Writing
- URL: http://arxiv.org/abs/2305.03508v1
- Date: Wed, 3 May 2023 04:20:56 GMT
- Authors: Mann Khatri, Pritish Wadhwa, Gitansh Satija, Reshma Sheik, Yaman
Kumar, Rajiv Ratn Shah, Ponnurangam Kumaraguru
- Abstract summary: We introduce a labeled dataset of 178M sentences for citation-worthiness detection in the legal domain from the Caselaw Access Project (CAP).
The performance of various deep learning models was examined on this novel dataset.
The domain-specific pre-trained model tends to outperform other models, with an 88% F1-score for the citation-worthiness detection task.
- Score: 44.75251805925605
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In legal document writing, one of the key elements is properly citing the
case laws and other sources to substantiate claims and arguments. Understanding
the legal domain and identifying appropriate citation context or cite-worthy
sentences are challenging tasks that demand expensive manual annotation. The
presence of jargon, language semantics, and high domain specificity makes legal
language complex, making any associated legal task hard for automation. The
current work focuses on the problem of citation-worthiness identification. It
is designed as the initial step in today's citation recommendation systems to
lighten the burden of extracting an adequate set of citation contexts. To
accomplish this, we introduce a labeled dataset of 178M sentences for
citation-worthiness detection in the legal domain from the Caselaw Access
Project (CAP). The performance of various deep learning models was examined on
this novel dataset. The domain-specific pre-trained model tends to outperform
other models, with an 88% F1-score for the citation-worthiness detection task.
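The abstract frames citation-worthiness detection as a per-sentence decision (cite-worthy or not) and reports an F1-score. As a minimal sketch of how such a score is computed, the snippet below evaluates F1 for the positive class on toy labels. The labels are made up for illustration, and the abstract does not state which F1 variant (binary, macro, or micro) the authors report, so the positive-class form used here is an assumption.

```python
from typing import Sequence


def f1_score(gold: Sequence[int], pred: Sequence[int]) -> float:
    """F1 for the positive class (1 = cite-worthy, 0 = not cite-worthy)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold labels and model predictions for eight sentences.
gold = [1, 0, 1, 1, 0, 0, 1, 0]
pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(f"F1 = {f1_score(gold, pred):.2f}")
```

With these toy labels there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75 and the F1-score is 0.75.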
Related papers
- ALiiCE: Evaluating Positional Fine-grained Citation Generation [54.19617927314975]
We propose ALiiCE, the first automatic evaluation framework for fine-grained citation generation.
Our framework first parses the sentence claim into atomic claims via dependency analysis and then calculates citation quality at the atomic claim level.
We evaluate the positional fine-grained citation generation performance of several Large Language Models on two long-form QA datasets.
arXiv Detail & Related papers (2024-06-19T09:16:14Z)
- Judgement Citation Retrieval using Contextual Similarity [0.0]
We propose a methodology that combines natural language processing (NLP) and machine learning techniques to enhance the organization and utilization of legal case descriptions.
Our methodology addresses two primary objectives: unsupervised clustering and supervised citation retrieval.
Our methodology achieved an impressive accuracy rate of 90.9%.
arXiv Detail & Related papers (2024-05-28T04:22:28Z)
- DELTA: Pre-train a Discriminative Encoder for Legal Case Retrieval via Structural Word Alignment [55.91429725404988]
We introduce DELTA, a discriminative model designed for legal case retrieval.
We leverage shallow decoders to create information bottlenecks, aiming to enhance the representation ability.
Our approach can outperform existing state-of-the-art methods in legal case retrieval.
arXiv Detail & Related papers (2024-03-27T10:40:14Z)
- SAILER: Structure-aware Pre-trained Language Model for Legal Case Retrieval [75.05173891207214]
Legal case retrieval plays a core role in the intelligent legal system.
Most existing language models have difficulty understanding the long-distance dependencies between different structures.
We propose a new Structure-Aware pre-traIned language model for LEgal case Retrieval.
arXiv Detail & Related papers (2023-04-22T10:47:01Z)
- Deep Graph Learning for Anomalous Citation Detection [55.81334139806342]
We propose a novel deep graph learning model, namely GLAD (Graph Learning for Anomaly Detection), to identify anomalies in citation networks.
Within the GLAD framework, we propose an algorithm called CPU (Citation PUrpose) to discover the purpose of citation based on citation texts.
arXiv Detail & Related papers (2022-02-23T09:05:28Z)
- Towards generating citation sentences for multiple references with intent control [86.53829532976303]
We build a novel generation model with the Fusion-in-Decoder approach to cope with multiple long inputs.
Experiments demonstrate that the proposed approaches provide much more comprehensive features for generating citation sentences.
arXiv Detail & Related papers (2021-12-02T15:32:24Z)
- Important Sentence Identification in Legal Cases Using Multi-Class Classification [0.1499944454332829]
This research explores the usage of sentence embeddings for multi-class classification to identify important sentences in a legal case.
A task-specific loss function is defined in order to improve the accuracy restricted by the straightforward use of categorical cross entropy loss.
arXiv Detail & Related papers (2021-11-10T14:58:29Z)
- VerbCL: A Dataset of Verbatim Quotes for Highlight Extraction in Case Law [12.080138272647144]
This paper presents a new dataset that consists of the citation graph of court opinions.
We focus on the verbatim quotes, where the text of the original opinion is directly reused.
We introduce the task of highlight extraction as a single-document summarization task based on the citation graph.
arXiv Detail & Related papers (2021-08-23T12:41:41Z)
- CiteWorth: Cite-Worthiness Detection for Improved Scientific Document Understanding [23.930041685595775]
We present an in-depth study of cite-worthiness detection in English, where a sentence is labelled for whether or not it cites an external source.
CiteWorth is high-quality, challenging, and suitable for studying problems such as domain adaptation.
arXiv Detail & Related papers (2021-05-23T11:08:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.