Related papers: CaseFacts: A Benchmark for Legal Fact-Checking and Precedent Retrieval

URL: http://arxiv.org/abs/2601.17230v1
Date: Fri, 23 Jan 2026 23:41:46 GMT
Title: CaseFacts: A Benchmark for Legal Fact-Checking and Precedent Retrieval
Authors: Akshith Reddy Putta, Jacob Devasier, Chengkai Li
Abstract summary: CaseFacts is a benchmark for verifying legal claims against U.S. Supreme Court precedents. The dataset consists of 6,294 claims categorized as Supported, Refuted, or Overruled.
Score: 5.305110876082343
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Automated Fact-Checking has largely focused on verifying general knowledge against static corpora, overlooking high-stakes domains like law where truth is evolving and technically complex. We introduce CaseFacts, a benchmark for verifying colloquial legal claims against U.S. Supreme Court precedents. Unlike existing resources that map formal texts to formal texts, CaseFacts challenges systems to bridge the semantic gap between layperson assertions and technical jurisprudence while accounting for temporal validity. The dataset consists of 6,294 claims categorized as Supported, Refuted, or Overruled. We construct this benchmark using a multi-stage pipeline that leverages Large Language Models (LLMs) to synthesize claims from expert case summaries, employing a novel semantic similarity heuristic to efficiently identify and verify complex legal overrulings. Experiments with state-of-the-art LLMs reveal that the task remains challenging; notably, augmenting models with unrestricted web search degrades performance compared to closed-book baselines due to the retrieval of noisy, non-authoritative precedents. We release CaseFacts to spur research into legal fact verification systems.

Related papers

LegalOne: A Family of Foundation Models for Reliable Legal Reasoning [54.57434222018289]
We present LegalOne, a family of foundational models specifically tailored for the Chinese legal domain. LegalOne is developed through a comprehensive three-phase pipeline designed to master legal reasoning. We publicly release the LegalOne weights and the LegalKit evaluation framework to advance the field of Legal AI.
arXiv Detail & Related papers (2026-01-31T10:18:32Z)
AppellateGen: A Benchmark for Appellate Legal Judgment Generation [30.9030336647868]
We introduce AppellateGen, a benchmark for second-instance legal judgment generation comprising 7,351 case pairs. The task requires models to draft legally binding judgments by reasoning over the initial verdict and evidentiary updates. We propose a judicial Standard Operating Procedure (SOP)-based Legal Multi-Agent System (SLMAS) to simulate judicial, which decomposes the generation process into discrete stages of issue identification, retrieval, and drafting.
arXiv Detail & Related papers (2026-01-04T02:15:17Z)
ReaKase-8B: Legal Case Retrieval via Knowledge and Reasoning Representations with LLMs [37.688405624086315]
A novel ReaKase-8B framework is proposed to leverage extracted legal facts, legal issues, legal relation triplets and legal reasoning for effective legal case retrieval. Experiments on two benchmark datasets from COLIEE 2022 and COLIEE 2023 demonstrate that our knowledge and reasoning augmented embeddings substantially improve retrieval performance.
arXiv Detail & Related papers (2025-10-30T06:35:36Z)
ClaimGen-CN: A Large-scale Chinese Dataset for Legal Claim Generation [56.79698529022327]
Legal claims refer to the plaintiff's demands in a case and are essential to guiding judicial reasoning and case resolution. This paper explores the problem of legal claim generation based on the given case's facts. We construct ClaimGen-CN, the first dataset for the Chinese legal claim generation task.
arXiv Detail & Related papers (2025-08-24T07:19:25Z)
A Law Reasoning Benchmark for LLM with Tree-Organized Structures including Factum Probandum, Evidence and Experiences [76.73731245899454]
We propose a transparent law reasoning schema enriched with hierarchical factum probandum, evidence, and implicit experience. Inspired by this schema, we introduce a challenging task that takes a textual case description and outputs a hierarchical structure justifying the final decision. This benchmark paves the way for transparent and accountable AI-assisted law reasoning in the "Intelligent Court".
arXiv Detail & Related papers (2025-03-02T10:26:54Z)
AnnoCaseLaw: A Richly-Annotated Dataset For Benchmarking Explainable Legal Judgment Prediction [56.797874973414636]
AnnoCaseLaw is a first-of-its-kind dataset of 471 meticulously annotated U.S. Appeals Court negligence cases. Our dataset lays the groundwork for more human-aligned, explainable Legal Judgment Prediction models. Results demonstrate that LJP remains a formidable task, with application of legal precedent proving particularly difficult.
arXiv Detail & Related papers (2025-02-28T19:14:48Z)
LawLLM: Law Large Language Model for the US Legal System [43.13850456765944]
We introduce the Law Large Language Model (LawLLM), a multi-task model specifically designed for the US legal domain. LawLLM excels at Similar Case Retrieval (SCR), Precedent Case Recommendation (PCR), and Legal Judgment Prediction (LJP). We propose customized data preprocessing techniques for each task that transform raw legal data into a trainable format.
arXiv Detail & Related papers (2024-07-27T21:51:30Z)
Learning Interpretable Legal Case Retrieval via Knowledge-Guided Case Reformulation [22.85652668826498]
This paper introduces KELLER, a legal knowledge-guided case reformulation approach based on large language models (LLMs). By incorporating professional legal knowledge about crimes and law articles, we enable large language models to accurately reformulate the original legal case into concise sub-facts of crimes.
arXiv Detail & Related papers (2024-06-28T08:59:45Z)
DELTA: Pre-train a Discriminative Encoder for Legal Case Retrieval via Structural Word Alignment [55.91429725404988]
We introduce DELTA, a discriminative model designed for legal case retrieval. We leverage shallow decoders to create information bottlenecks, aiming to enhance the representation ability. Our approach can outperform existing state-of-the-art methods in legal case retrieval.
arXiv Detail & Related papers (2024-03-27T10:40:14Z)
Leveraging Large Language Models for Relevance Judgments in Legal Case Retrieval [16.29803062332164]
We propose a few-shot approach where large language models assist in generating expert-aligned relevance judgments. The proposed approach decomposes the judgment process into several stages, mimicking the workflow of human annotators. It also ensures interpretable data labeling, providing transparency and clarity in the relevance assessment process.
arXiv Detail & Related papers (2024-03-27T09:46:56Z)
MUSER: A Multi-View Similar Case Retrieval Dataset [65.36779942237357]
Similar case retrieval (SCR) is a representative legal AI application that plays a pivotal role in promoting judicial fairness. Existing SCR datasets only focus on the fact description section when judging the similarity between cases. We present MUSER, a similar case retrieval dataset based on multi-view similarity measurement and comprehensive legal elements, with sentence-level legal element annotations.
arXiv Detail & Related papers (2023-10-24T08:17:11Z)
SAILER: Structure-aware Pre-trained Language Model for Legal Case Retrieval [75.05173891207214]
Legal case retrieval plays a core role in the intelligent legal system. Most existing language models have difficulty understanding the long-distance dependencies between different structures. We propose a new Structure-Aware pre-traIned language model for LEgal case Retrieval.
arXiv Detail & Related papers (2023-04-22T10:47:01Z)

This list is automatically generated from the titles and abstracts of the papers on this site.