MUSER: A Multi-View Similar Case Retrieval Dataset
- URL: http://arxiv.org/abs/2310.15602v1
- Date: Tue, 24 Oct 2023 08:17:11 GMT
- Title: MUSER: A Multi-View Similar Case Retrieval Dataset
- Authors: Qingquan Li, Yiran Hu, Feng Yao, Chaojun Xiao, Zhiyuan Liu, Maosong Sun, Weixing Shen
- Abstract summary: Similar case retrieval (SCR) is a representative legal AI application that plays a pivotal role in promoting judicial fairness.
Existing SCR datasets only focus on the fact description section when judging the similarity between cases.
We present MUSER, a similar case retrieval dataset based on multi-view similarity measurement and comprehensive legal element with sentence-level legal element annotations.
- Score: 65.36779942237357
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Similar case retrieval (SCR) is a representative legal AI application that
plays a pivotal role in promoting judicial fairness. However, existing SCR
datasets only focus on the fact description section when judging the similarity
between cases, ignoring other valuable sections (e.g., the court's opinion)
that can provide insight into the underlying reasoning process. Furthermore, the case
similarities are typically measured solely by the textual semantics of the fact
descriptions, which may fail to capture the full complexity of legal cases from
the perspective of legal knowledge. In this work, we present MUSER, a similar
case retrieval dataset based on multi-view similarity measurement and
comprehensive legal element with sentence-level legal element annotations.
Specifically, we select three perspectives (legal fact, dispute focus, and law
statutory) and build a comprehensive and structured label schema of legal
elements for each of them, to enable accurate and knowledgeable evaluation of
case similarities. The constructed dataset originates from Chinese civil cases
and contains 100 query cases and 4,024 candidate cases. We implement several
text classification algorithms for legal element prediction and various
retrieval methods for retrieving similar cases on MUSER. The experimental
results indicate that incorporating legal elements can benefit the performance
of SCR models, but further efforts are still required to address the remaining
challenges posed by MUSER. The source code and dataset are released at
https://github.com/THUlawtech/MUSER.
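The multi-view idea described in the abstract can be illustrated with a minimal sketch. It assumes each case has already been annotated with a set of legal element labels for each of the three perspectives, and it scores a pair of cases by a weighted Jaccard overlap per view. The view keys, label strings, weights, and function names below are hypothetical and chosen only for illustration; this is not the MUSER authors' similarity measure or released code.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical view keys, named after the three perspectives in the abstract.
VIEWS = ("legal_fact", "dispute_focus", "law_statutory")

Case = Dict[str, Set[str]]  # view name -> set of legal element labels


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two label sets; two empty sets score 0.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def multi_view_similarity(
    query: Case, candidate: Case, weights: Optional[Dict[str, float]] = None
) -> float:
    """Weighted sum of per-view Jaccard overlaps (equal weights by default)."""
    if weights is None:
        weights = {view: 1.0 / len(VIEWS) for view in VIEWS}
    return sum(
        weights[view] * jaccard(query.get(view, set()), candidate.get(view, set()))
        for view in VIEWS
    )


def rank_candidates(query: Case, candidates: Dict[str, Case]) -> List[Tuple[str, float]]:
    """Return (candidate_id, score) pairs sorted from most to least similar."""
    scored = [(cid, multi_view_similarity(query, case)) for cid, case in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy, made-up annotations; real labels would come from the dataset's schema.
query_case: Case = {
    "legal_fact": {"loan_agreement", "overdue_payment"},
    "dispute_focus": {"interest_rate"},
    "law_statutory": {"statute_a"},
}
candidate_pool: Dict[str, Case] = {
    "c1": {
        "legal_fact": {"loan_agreement", "overdue_payment"},
        "dispute_focus": {"interest_rate"},
        "law_statutory": {"statute_a"},
    },
    "c2": {
        "legal_fact": {"loan_agreement"},
        "dispute_focus": {"guarantee_liability"},
        "law_statutory": {"statute_b"},
    },
}
ranking = rank_candidates(query_case, candidate_pool)
```

Set overlap is used here only because it is the simplest way to compare label sets; a learned retrieval model could consume the same per-view annotations as additional features instead.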
Related papers
- Enhancing Legal Case Retrieval via Scaling High-quality Synthetic Query-Candidate Pairs [67.54302101989542]
Legal case retrieval aims to provide similar cases as references for a given fact description.
Existing works mainly focus on case-to-case retrieval using lengthy queries.
Data scale is insufficient to satisfy the training requirements of existing data-hungry neural models.
arXiv Detail & Related papers (2024-10-09T06:26:39Z)
- SparseCL: Sparse Contrastive Learning for Contradiction Retrieval [87.02936971689817]
Contradiction retrieval refers to identifying and extracting documents that explicitly disagree with or refute the content of a query.
Existing methods such as similarity search and crossencoder models exhibit significant limitations.
We introduce SparseCL that leverages specially trained sentence embeddings designed to preserve subtle, contradictory nuances between sentences.
arXiv Detail & Related papers (2024-06-15T21:57:03Z)
- DELTA: Pre-train a Discriminative Encoder for Legal Case Retrieval via Structural Word Alignment [55.91429725404988]
We introduce DELTA, a discriminative model designed for legal case retrieval.
We leverage shallow decoders to create information bottlenecks, aiming to enhance the representation ability.
Our approach can outperform existing state-of-the-art methods in legal case retrieval.
arXiv Detail & Related papers (2024-03-27T10:40:14Z)
- An Intent Taxonomy of Legal Case Retrieval [43.22489520922202]
Legal case retrieval is a special Information Retrieval (IR) task focusing on legal case documents.
We present a novel hierarchical intent taxonomy of legal case retrieval.
We reveal significant differences in user behavior and satisfaction under different search intents in legal case retrieval.
arXiv Detail & Related papers (2023-07-25T07:27:32Z)
- SAILER: Structure-aware Pre-trained Language Model for Legal Case Retrieval [75.05173891207214]
Legal case retrieval plays a core role in the intelligent legal system.
Most existing language models have difficulty understanding the long-distance dependencies between different structures.
We propose a new Structure-Aware pre-traIned language model for LEgal case Retrieval.
arXiv Detail & Related papers (2023-04-22T10:47:01Z)
- Exploiting Contrastive Learning and Numerical Evidence for Confusing Legal Judgment Prediction [46.71918729837462]
Given the fact description text of a legal case, legal judgment prediction aims to predict the case's charge, law article and penalty term.
Previous studies fail to distinguish different classification errors with a standard cross-entropy classification loss.
We propose a MoCo-based supervised contrastive learning method to learn distinguishable representations.
We further enhance the representation of the fact description with extracted crime amounts which are encoded by a pre-trained numeracy model.
arXiv Detail & Related papers (2022-11-15T15:53:56Z)
- Legal Element-oriented Modeling with Multi-view Contrastive Learning for Legal Case Retrieval [3.909749182759558]
We propose an interaction-focused network for legal case retrieval with a multi-view contrastive learning objective.
Case-view contrastive learning minimizes the hidden space distance between relevant legal case representations.
We employ a legal element knowledge-aware indicator to detect legal elements of cases.
arXiv Detail & Related papers (2022-10-11T06:47:23Z)
- Aspect Classification for Legal Depositions [0.0]
It is important to know not only about liability, but also about events, accidents, physical conditions, and treatments.
A legal deposition consists of various aspects that are discussed as part of the deponent testimony.
Our methods have achieved a classification F1 score of 0.83.
arXiv Detail & Related papers (2020-09-09T18:00:15Z)