CaseEncoder: A Knowledge-enhanced Pre-trained Model for Legal Case
Encoding
- URL: http://arxiv.org/abs/2305.05393v1
- Date: Tue, 9 May 2023 12:40:19 GMT
- Title: CaseEncoder: A Knowledge-enhanced Pre-trained Model for Legal Case
Encoding
- Authors: Yixiao Ma, Yueyue Wu, Weihang Su, Qingyao Ai, Yiqun Liu
- Abstract summary: CaseEncoder is a legal document encoder that leverages fine-grained legal knowledge in both the data sampling and pre-training phases.
CaseEncoder significantly outperforms both existing general pre-training models and legal-specific pre-training models in zero-shot legal case retrieval.
- Score: 15.685369142294693
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Legal case retrieval is a critical process for modern legal information
systems. While recent studies have utilized pre-trained language models (PLMs)
based on the general domain self-supervised pre-training paradigm to build
models for legal case retrieval, there are limitations in using general domain
PLMs as backbones. Specifically, these models may not fully capture the
underlying legal features in legal case documents. To address this issue, we
propose CaseEncoder, a legal document encoder that leverages fine-grained legal
knowledge in both the data sampling and pre-training phases. In the data
sampling phase, we enhance the quality of the training data by utilizing
fine-grained law article information to guide the selection of positive and
negative examples. In the pre-training phase, we design legal-specific
pre-training tasks that align with the judging criteria of relevant legal
cases. Based on these tasks, we introduce an innovative loss function called
Biased Circle Loss to enhance the model's ability to recognize fine-grained case
relevance. Experimental results on multiple benchmarks demonstrate that
CaseEncoder significantly outperforms both existing general pre-training models
and legal-specific pre-training models in zero-shot legal case retrieval.
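The abstract outlines, but does not specify, how law articles guide example sampling or how the Biased Circle Loss is defined. The following minimal PyTorch sketch illustrates one plausible reading: graded positives chosen by law-article overlap, and a Circle-Loss-style objective whose margins depend on the relevance grade. The function names, grading scheme, and grade-to-margin mapping are illustrative assumptions, not the paper's released implementation.

```python
import torch
import torch.nn.functional as F


def grade_by_article_overlap(query_articles, candidate_articles):
    # Assumption: candidates citing more of the same law articles as the query
    # are treated as stronger positives; no overlap means a negative (grade 0).
    overlap = len(set(query_articles) & set(candidate_articles))
    return min(overlap, 3)  # cap grades for this sketch


def biased_circle_loss(query_emb, doc_embs, grades, gamma=32.0, base_margin=0.25):
    """Circle-Loss-style objective with relevance-grade-dependent margins.

    query_emb: (d,) encoded query case; doc_embs: (n, d) candidate cases;
    grades: (n,) integer relevance grades, 0 = negative.
    """
    sims = F.cosine_similarity(query_emb.unsqueeze(0), doc_embs, dim=-1)  # (n,)
    pos_mask, neg_mask = grades > 0, grades == 0

    # "Biased" margin (assumption): higher-grade positives get a smaller slack,
    # so they must end up closer to the query than weakly relevant positives.
    pos_margin = base_margin / grades[pos_mask].clamp(min=1).float()

    s_p, s_n = sims[pos_mask], sims[neg_mask]
    # Standard Circle Loss weighting and decision boundaries (Sun et al., 2020).
    alpha_p = torch.clamp(1.0 + pos_margin - s_p, min=0.0)
    alpha_n = torch.clamp(s_n + base_margin, min=0.0)
    delta_p, delta_n = 1.0 - pos_margin, base_margin

    logit_p = -gamma * alpha_p * (s_p - delta_p)
    logit_n = gamma * alpha_n * (s_n - delta_n)
    # loss = log(1 + sum_n exp(logit_n) * sum_p exp(logit_p))
    return F.softplus(torch.logsumexp(logit_n, dim=0) + torch.logsumexp(logit_p, dim=0))
```

A hypothetical pre-training step would encode a query case and its sampled candidates with the document encoder, grade each candidate with grade_by_article_overlap, and back-propagate biased_circle_loss over the resulting similarities.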
Related papers
- LawLLM: Law Large Language Model for the US Legal System [43.13850456765944]
We introduce the Law Large Language Model (LawLLM), a multi-task model specifically designed for the US legal domain.
LawLLM excels at Similar Case Retrieval (SCR), Precedent Case Recommendation (PCR), and Legal Judgment Prediction (LJP)
We propose customized data preprocessing techniques for each task that transform raw legal data into a trainable format.
arXiv Detail & Related papers (2024-07-27T21:51:30Z)
- DELTA: Pre-train a Discriminative Encoder for Legal Case Retrieval via Structural Word Alignment [55.91429725404988]
We introduce DELTA, a discriminative model designed for legal case retrieval.
We leverage shallow decoders to create information bottlenecks, aiming to enhance the representation ability.
Our approach can outperform existing state-of-the-art methods in legal case retrieval.
arXiv Detail & Related papers (2024-03-27T10:40:14Z)
- Leveraging Large Language Models for Relevance Judgments in Legal Case Retrieval [18.058942674792604]
We propose a novel few-shot workflow tailored to the relevance judgment of legal cases.
By comparing the relevance judgments of LLMs and human experts, we empirically show that we can obtain reliable relevance judgments.
arXiv Detail & Related papers (2024-03-27T09:46:56Z)
- Towards Explainability in Legal Outcome Prediction Models [64.00172507827499]
We argue that precedent is a natural way of facilitating explainability for legal NLP models.
By developing a taxonomy of legal precedent, we are able to compare human judges and neural models.
We find that while the models learn to predict outcomes reasonably well, their use of precedent is unlike that of human judges.
arXiv Detail & Related papers (2024-03-25T15:15:41Z)
- PILOT: Legal Case Outcome Prediction with Case Law [43.680862577060765]
We identify two unique challenges in making legal case outcome predictions with case law.
First, it is crucial to identify relevant precedent cases that serve as fundamental evidence for judges during decision-making.
Second, it is necessary to consider the evolution of legal principles over time, as early cases may adhere to different legal contexts.
arXiv Detail & Related papers (2024-01-28T21:18:05Z) - Precedent-Enhanced Legal Judgment Prediction with LLM and Domain-Model
Collaboration [52.57055162778548]
Legal Judgment Prediction (LJP) has become an increasingly crucial task in Legal AI.
Precedents are previous legal cases with similar facts that serve as the basis for judging subsequent cases in national legal systems.
Recent advances in deep learning have enabled a variety of techniques to be used to solve the LJP task.
arXiv Detail & Related papers (2023-10-13T16:47:20Z)
- Automated Refugee Case Analysis: An NLP Pipeline for Supporting Legal Practitioners [0.0]
We introduce an end-to-end pipeline for retrieving, processing, and extracting targeted information from legal cases.
We investigate an under-studied legal domain with a case study on refugee law in Canada.
arXiv Detail & Related papers (2023-05-24T19:37:23Z)
- SAILER: Structure-aware Pre-trained Language Model for Legal Case Retrieval [75.05173891207214]
Legal case retrieval plays a core role in the intelligent legal system.
Most existing language models have difficulty understanding the long-distance dependencies between different structures.
We propose a new Structure-Aware pre-traIned language model for LEgal case Retrieval.
arXiv Detail & Related papers (2023-04-22T10:47:01Z)
- Do Charge Prediction Models Learn Legal Theory? [59.74220430434435]
We argue that trustworthy charge prediction models should take legal theories into consideration.
We propose three principles that trustworthy models should follow in this task: sensitivity, selectivity, and presumption of innocence.
Our findings indicate that, while existing charge prediction models meet the selective principle on a benchmark dataset, most of them are still not sensitive enough and do not satisfy the presumption of innocence.
arXiv Detail & Related papers (2022-10-31T07:32:12Z)
- Legal Element-oriented Modeling with Multi-view Contrastive Learning for Legal Case Retrieval [3.909749182759558]
We propose an interaction-focused network for legal case retrieval with a multi-view contrastive learning objective.
Case-view contrastive learning minimizes the hidden space distance between relevant legal case representations.
We employ a legal element knowledge-aware indicator to detect legal elements of cases.
arXiv Detail & Related papers (2022-10-11T06:47:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.