LawThinker: A Deep Research Legal Agent in Dynamic Environments
- URL: http://arxiv.org/abs/2602.12056v1
- Date: Thu, 12 Feb 2026 15:19:11 GMT
- Title: LawThinker: A Deep Research Legal Agent in Dynamic Environments
- Authors: Xinyu Yang, Chenlong Deng, Tongyu Wen, Binyu Xie, Zhicheng Dou,
- Abstract summary: LawThinker is an autonomous legal research agent. It enforces verification as an atomic operation after every knowledge exploration step. LawThinker achieves a 24% improvement over direct reasoning.
- Score: 51.782293183431676
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Legal reasoning requires not only correct outcomes but also procedurally compliant reasoning processes. However, existing methods lack mechanisms to verify intermediate reasoning steps, allowing errors such as inapplicable statute citations to propagate undetected through the reasoning chain. To address this, we propose LawThinker, an autonomous legal research agent that adopts an Explore-Verify-Memorize strategy for dynamic judicial environments. The core idea is to enforce verification as an atomic operation after every knowledge exploration step. A DeepVerifier module examines each retrieval result along three dimensions of knowledge accuracy, fact-law relevance, and procedural compliance, with a memory module for cross-round knowledge reuse in long-horizon tasks. Experiments on the dynamic benchmark J1-EVAL show that LawThinker achieves a 24% improvement over direct reasoning and an 11% gain over workflow-based methods, with particularly strong improvements on process-oriented metrics. Evaluations on three static benchmarks further confirm its generalization capability. The code is available at https://github.com/yxy-919/LawThinker-agent .
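The Explore-Verify-Memorize loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation (their code is at the repository linked above). Every name here (`Verdict`, `Memory`, `explore_verify_memorize`, `toy_explore`, `toy_verify`) is hypothetical, and the toy retrieval and verification functions stand in for real components. It assumes that a result failing any of the three checks is discarded and that only verified results are stored for reuse.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Verdict:
    """Outcome of the three checks named in the abstract (hypothetical type)."""
    knowledge_accuracy: bool
    fact_law_relevance: bool
    procedural_compliance: bool

    @property
    def passed(self) -> bool:
        # Assumption: a result is accepted only if all three checks pass.
        return (self.knowledge_accuracy
                and self.fact_law_relevance
                and self.procedural_compliance)


@dataclass
class Memory:
    """Stores verified results so later rounds can reuse them (hypothetical)."""
    store: Dict[str, str] = field(default_factory=dict)

    def recall(self, query: str) -> Optional[str]:
        return self.store.get(query)

    def memorize(self, query: str, result: str) -> None:
        self.store[query] = result


def explore_verify_memorize(
    queries: List[str],
    explore: Callable[[str], str],
    verify: Callable[[str, str], Verdict],
    memory: Memory,
) -> List[str]:
    """Run one exploration per query; verification always follows exploration."""
    accepted: List[str] = []
    for query in queries:
        cached = memory.recall(query)
        if cached is not None:
            # Already verified in an earlier round; reuse without re-exploring.
            accepted.append(cached)
            continue
        result = explore(query)
        verdict = verify(query, result)  # never skipped after an exploration
        if verdict.passed:
            memory.memorize(query, result)
            accepted.append(result)
        # Unverified results are dropped so they cannot propagate further.
    return accepted


def toy_explore(query: str) -> str:
    # Stand-in for a real retrieval step.
    return f"statute text for {query}"


def toy_verify(query: str, result: str) -> Verdict:
    # Stand-in verifier: flags queries marked as inapplicable.
    relevant = "inapplicable" not in query
    return Verdict(True, relevant, True)
```

Calling `explore_verify_memorize(["theft", "inapplicable rule"], toy_explore, toy_verify, Memory())` keeps only the result for `"theft"`; passing the same `Memory` object to a second call reuses that result without exploring again.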
Related papers
- LegalOne: A Family of Foundation Models for Reliable Legal Reasoning [54.57434222018289]
We present LegalOne, a family of foundational models specifically tailored for the Chinese legal domain. LegalOne is developed through a comprehensive three-phase pipeline designed to master legal reasoning. We publicly release the LegalOne weights and the LegalKit evaluation framework to advance the field of Legal AI.
arXiv Detail & Related papers (2026-01-31T10:18:32Z)
- Gaming the Judge: Unfaithful Chain-of-Thought Can Undermine Agent Evaluation [76.5533899503582]
Large language models (LLMs) are increasingly used as judges to evaluate agent performance. We show this paradigm implicitly assumes that the agent's chain-of-thought (CoT) reasoning faithfully reflects both its internal reasoning and the underlying environment state. We demonstrate that manipulated reasoning alone can inflate false positive rates of state-of-the-art VLM judges by up to 90% across 800 trajectories spanning diverse web tasks.
arXiv Detail & Related papers (2026-01-21T06:07:43Z)
- AppellateGen: A Benchmark for Appellate Legal Judgment Generation [30.9030336647868]
We introduce AppellateGen, a benchmark for second-instance legal judgment generation comprising 7,351 case pairs. The task requires models to draft legally binding judgments by reasoning over the initial verdict and evidentiary updates. We propose a judicial Standard Operating Procedure (SOP)-based Legal Multi-Agent System (SLMAS) to simulate the judicial process; it decomposes generation into discrete stages of issue identification, retrieval, and drafting.
arXiv Detail & Related papers (2026-01-04T02:15:17Z)
- On Verifiable Legal Reasoning: A Multi-Agent Framework with Formalized Knowledge Representations [0.0]
This paper introduces a modular multi-agent framework that decomposes legal reasoning into distinct knowledge acquisition and application stages. In the first stage, specialized agents extract legal concepts and formalize rules to create verifiable intermediate representations of statutes. The second stage applies this knowledge to specific cases through three steps: analyzing queries to map case facts onto the schema, performing symbolic inference to derive logically entailed conclusions, and generating final answers.
arXiv Detail & Related papers (2025-08-31T06:03:00Z)
- GLARE: Agentic Reasoning for Legal Judgment Prediction [60.13483016810707]
Legal judgment prediction (LJP) has become increasingly important in the legal field. Existing large language models (LLMs) suffer from insufficient reasoning due to a lack of legal knowledge. We introduce GLARE, an agentic legal reasoning framework that dynamically acquires key legal knowledge by invoking different modules.
arXiv Detail & Related papers (2025-08-22T13:38:12Z)
- Can Language Models Discover Scaling Laws? [57.794209392781845]
This paper introduces SLDAgent, an evolution-based agent that co-optimizes the scaling law model and the parameters, enabling it to autonomously explore complex relationships between variables. For the first time, we demonstrate that SLDAgent can automatically discover laws that exhibit consistently more accurate extrapolation than their established, human-derived counterparts.
arXiv Detail & Related papers (2025-07-27T05:45:26Z)
- CLAIM: An Intent-Driven Multi-Agent Framework for Analyzing Manipulation in Courtroom Dialogues [0.0]
Despite the growing advancements in NLP, its application in detecting and analyzing manipulation within the legal domain remains largely unexplored. Our work addresses this gap by introducing LegalCon, a dataset of 1,063 annotated courtroom conversations labeled for manipulation detection. We propose CLAIM, a two-stage, Intent-driven Multi-agent framework designed to enhance manipulation analysis by enabling context-aware and informed decision-making.
arXiv Detail & Related papers (2025-06-04T16:22:59Z)
- RLJP: Legal Judgment Prediction via First-Order Logic Rule-enhanced with Large Language Models [58.69183479148083]
Legal Judgment Prediction (LJP) is a pivotal task in legal AI. Existing LJP models integrate judicial precedents and legal knowledge for high performance, but they neglect legal reasoning logic, a critical component of legal judgments requiring rigorous logical analysis. This paper proposes a rule-enhanced legal judgment prediction framework based on first-order logic (FOL) formalism and comparative learning (CL).
arXiv Detail & Related papers (2025-05-27T14:50:21Z)
- Reasoning Court: Combining Reasoning, Action, and Judgment for Multi-Hop Reasoning [17.829990749622496]
Reasoning Court (RC) is a novel framework that extends iterative reasoning-and-retrieval methods, such as ReAct, with a dedicated LLM judge. RC consistently outperforms state-of-the-art few-shot prompting methods without task-specific fine-tuning.
arXiv Detail & Related papers (2025-04-14T00:56:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.