Debate-Feedback: A Multi-Agent Framework for Efficient Legal Judgment Prediction
- URL: http://arxiv.org/abs/2504.05358v1
- Date: Mon, 07 Apr 2025 09:34:14 GMT
- Title: Debate-Feedback: A Multi-Agent Framework for Efficient Legal Judgment Prediction
- Authors: Xi Chen, Mao Mao, Shuo Li, Haotian Shangguan
- Abstract summary: We propose a novel legal judgment prediction model based on the Debate-Feedback architecture. Unlike traditional methods, our model achieves significant improvements in efficiency by minimizing the need for large historical datasets.
- Score: 7.196065223124077
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The use of AI in legal analysis and prediction (LegalAI) has gained widespread attention, with past research focusing on retrieval-based methods and fine-tuning large models. However, these approaches often require large datasets and underutilize the capabilities of modern large language models (LLMs). In this paper, inspired by the debate phase of real courtroom trials, we propose a novel legal judgment prediction model based on the Debate-Feedback architecture, which integrates LLM multi-agent debate and reliability evaluation models. Unlike traditional methods, our model achieves significant improvements in efficiency by minimizing the need for large historical datasets, thus offering a lightweight yet robust solution. Comparative experiments show that it outperforms several general-purpose and domain-specific legal models, offering a dynamic reasoning process and a promising direction for future LegalAI research.
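The abstract's debate-then-evaluate cycle can be sketched as a simple loop: an initial prediction is challenged by an opposing agent, revised, and a reliability evaluator decides whether to accept the revision. This is a minimal illustrative sketch only; the agent roles, prompts, and `llm` callable are placeholders, not the paper's exact architecture.

```python
from typing import Callable

def debate_feedback(case_facts: str, llm: Callable[[str], str], rounds: int = 2) -> str:
    """Toy Debate-Feedback loop: predict, debate, revise, evaluate reliability."""
    # Initial judgment prediction from the case facts.
    prediction = llm(f"Predict the judgment for this case:\n{case_facts}")
    for _ in range(rounds):
        # An opposing agent argues against the current prediction (debate phase).
        critique = llm(
            f"Case:\n{case_facts}\nPrediction:\n{prediction}\n"
            "Argue against this prediction as opposing counsel."
        )
        # The predicting agent revises in light of the critique (feedback phase).
        revised = llm(
            f"Case:\n{case_facts}\nPrediction:\n{prediction}\n"
            f"Critique:\n{critique}\nRevise the prediction if the critique is convincing."
        )
        # A reliability evaluator picks the better-supported prediction.
        verdict = llm(
            f"Which prediction is better supported?\nA: {prediction}\nB: {revised}\n"
            "Answer A or B."
        )
        if verdict.strip().upper().startswith("B"):
            prediction = revised
    return prediction
```

Because the loop only calls a generic `llm` function, any chat-completion backend (or a stub for testing) can be plugged in.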
Related papers
- Leveraging Reasoning Model Answers to Enhance Non-Reasoning Model Capability [16.441081996257576]
We propose leveraging reasoning-intensive models to improve less computationally demanding, non-reasoning models.
We demonstrate consistent improvements across various benchmarks, underscoring the potential of this approach for advancing the ability of models to answer questions directly.
arXiv Detail & Related papers (2025-04-13T16:26:56Z) - Exploring Training and Inference Scaling Laws in Generative Retrieval [50.82554729023865]
We investigate how model size, training data scale, and inference-time compute jointly influence generative retrieval performance. Our experiments show that n-gram-based methods demonstrate strong alignment with both training and inference scaling laws. We find that LLaMA models consistently outperform T5 models, suggesting a particular advantage for larger decoder-only models in generative retrieval.
arXiv Detail & Related papers (2025-03-24T17:59:03Z) - A Survey of Direct Preference Optimization [103.59317151002693]
Large Language Models (LLMs) have demonstrated unprecedented generative capabilities. Their alignment with human values remains critical for ensuring helpful and harmless deployments. Direct Preference Optimization (DPO) has recently gained prominence as a streamlined alternative.
arXiv Detail & Related papers (2025-03-12T08:45:15Z) - Adversarial Multi-Agent Evaluation of Large Language Models through Iterative Debates [0.0]
We propose a framework that interprets large language models (LLMs) as advocates within an ensemble of interacting agents.
This approach offers a more dynamic and comprehensive evaluation process compared to traditional human-based assessments or automated metrics.
arXiv Detail & Related papers (2024-10-07T00:22:07Z) - Multi-Reference Preference Optimization for Large Language Models [56.84730239046117]
We introduce a novel closed-form formulation for direct preference optimization using multiple reference models.
The resulting algorithm, Multi-Reference Preference Optimization (MRPO), leverages broader prior knowledge from diverse reference models.
Our experiments demonstrate that LLMs finetuned with MRPO generalize better in various preference data, regardless of data scarcity or abundance.
arXiv Detail & Related papers (2024-05-26T00:29:04Z) - When Parameter-efficient Tuning Meets General-purpose Vision-language Models [65.19127815275307]
PETAL revolutionizes the training process by requiring only 0.5% of the total parameters, achieved through a unique mode approximation technique.
Our experiments reveal that PETAL not only outperforms current state-of-the-art methods in most scenarios but also surpasses full fine-tuning models in effectiveness.
arXiv Detail & Related papers (2023-12-16T17:13:08Z) - QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights.
We demonstrate that leveraging its insights improves the performance of the Llama 2 model by up to 15 percentage points.
arXiv Detail & Related papers (2023-11-06T00:21:44Z) - Benchmarking Sequential Visual Input Reasoning and Prediction in Multimodal Large Language Models [21.438427686724932]
We introduce a novel benchmark that assesses the predictive reasoning capabilities of MLLMs across diverse scenarios.
Our benchmark targets three important domains: abstract pattern reasoning, human activity prediction, and physical interaction prediction.
Empirical experiments confirm the soundness of the proposed benchmark and evaluation methods.
arXiv Detail & Related papers (2023-10-20T13:14:38Z) - Domain Generalization Using Large Pretrained Models with Mixture-of-Adapters [33.401355417911084]
This study leverages the knowledge of large pretrained models to improve handling of OOD scenarios and tackle domain generalization problems. We employ parameter-efficient fine-tuning (PEFT) techniques to effectively preserve OOD robustness while working with large models. Our experiments and analysis confirm that the most effective approaches involve ensembling diverse models and increasing the scale of pretraining.
arXiv Detail & Related papers (2023-10-17T07:01:24Z) - Scaling Vision-Language Models with Sparse Mixture of Experts [128.0882767889029]
We show that mixture-of-experts (MoE) techniques can achieve state-of-the-art performance on a range of benchmarks over dense models of equivalent computational cost.
Our research offers valuable insights into stabilizing the training of MoE models, understanding the impact of MoE on model interpretability, and balancing the trade-off between compute cost and performance when scaling vision-language models.
arXiv Detail & Related papers (2023-03-13T16:00:31Z)
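The sparse mixture-of-experts idea in the entry above can be illustrated with a toy top-k routing layer: a gate scores the experts, only the k highest-scoring experts run, and their outputs are combined by softmax weights. This is a generic NumPy sketch of sparse MoE routing, not the paper's specific architecture; all names and shapes are illustrative.

```python
import numpy as np

def moe_layer(x, gate_w, expert_ws, k=1):
    """Toy top-k sparse mixture-of-experts forward pass.

    x: input vector of shape (d,)
    gate_w: gating weights of shape (d, n_experts)
    expert_ws: list of per-expert weight matrices, each (d, d)
    k: number of experts activated per input (sparsity)
    """
    logits = x @ gate_w
    topk = np.argsort(logits)[-k:]            # indices of the k highest-scoring experts
    probs = np.exp(logits[topk] - logits[topk].max())
    probs /= probs.sum()                      # softmax over only the selected experts
    # Only the selected experts run, so compute scales with k, not n_experts.
    return sum(p * (x @ expert_ws[i]) for p, i in zip(probs, topk))
```

With k much smaller than the number of experts, parameter count grows while per-input compute stays roughly equal to that of a dense layer, which is the trade-off the entry above refers to.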
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.