Agents on the Bench: Large Language Model Based Multi Agent Framework for Trustworthy Digital Justice
- URL: http://arxiv.org/abs/2412.18697v1
- Date: Tue, 24 Dec 2024 23:13:37 GMT
- Title: Agents on the Bench: Large Language Model Based Multi Agent Framework for Trustworthy Digital Justice
- Authors: Cong Jiang, Xiaolei Yang
- Abstract summary: We propose a large language model based multi-agent framework named AgentsBench.
Our approach leverages multiple LLM-driven agents that simulate the collaborative deliberation and decision-making process of a judicial bench.
Our framework reflects real-world judicial processes more closely, enhancing accuracy, fairness, and societal consideration.
- Score: 0.5217870815854703
- Abstract: The justice system has increasingly employed AI techniques to enhance efficiency, yet limitations remain in improving the quality of decision-making, particularly regarding the transparency and explainability needed to uphold public trust in legal AI. To address these challenges, we propose a large language model based multi-agent framework named AgentsBench, which aims to simultaneously improve both efficiency and quality in judicial decision-making. Our approach leverages multiple LLM-driven agents that simulate the collaborative deliberation and decision-making process of a judicial bench. We conducted experiments on the legal judgment prediction task, and the results show that our framework outperforms existing LLM-based methods in terms of performance and decision quality. By incorporating these elements, our framework reflects real-world judicial processes more closely, enhancing accuracy, fairness, and societal consideration. AgentsBench provides a more nuanced and realistic method for trustworthy AI decision-making, with strong potential for application across various case types and legal scenarios.
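The abstract describes the bench-deliberation loop but includes no implementation. A minimal sketch of what a bench of LLM agents could look like follows; the `call_llm` stub, the `BenchMember` structure, and the final synthesis step are illustrative assumptions, not the paper's actual design.

```python
from dataclasses import dataclass

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for any LLM client; wire up a real model here."""
    raise NotImplementedError

@dataclass
class BenchMember:
    role: str  # e.g. "presiding judge", "lay assessor"

    def deliberate(self, facts: str, transcript: list[str]) -> str:
        prior = "\n".join(transcript) or "(no prior opinions)"
        prompt = (
            f"You are the {self.role} on a judicial bench.\n"
            f"Case facts: {facts}\n"
            f"Opinions so far:\n{prior}\n"
            "State your reasoned opinion and a proposed verdict."
        )
        return call_llm(prompt)

def bench_decision(facts: str, members: list[BenchMember], rounds: int = 2) -> str:
    transcript: list[str] = []
    for _ in range(rounds):  # collaborative deliberation rounds
        for m in members:
            transcript.append(f"[{m.role}] {m.deliberate(facts, transcript)}")
    # synthesise the full deliberation transcript into a final judgment
    return call_llm("Synthesise the bench's deliberation into a final judgment:\n"
                    + "\n".join(transcript))
```

A bench would then be assembled as, for example, `bench_decision(facts, [BenchMember("presiding judge"), BenchMember("lay assessor"), BenchMember("lay assessor")])`.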
Related papers
- Fairness in Multi-Agent AI: A Unified Framework for Ethical and Equitable Autonomous Systems [0.0]
This paper introduces a novel framework where fairness is treated as a dynamic, emergent property of agent interactions.
The framework integrates fairness constraints, bias mitigation strategies, and incentive mechanisms to align autonomous agent behaviors with societal values.
arXiv Detail & Related papers (2025-02-11T04:42:00Z)
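The fairness framework above is described only at a high level. One standard way to make fairness an emergent property of incentives is to penalise each agent's reward by the group's utility spread; the variance penalty below is an illustrative assumption, not the paper's mechanism.

```python
import statistics

def shaped_rewards(raw_utilities: dict[str, float],
                   fairness_weight: float = 0.5) -> dict[str, float]:
    """Penalise every agent by the utility spread across the group, so
    self-interested agents are incentivised toward equitable outcomes."""
    spread = statistics.pvariance(raw_utilities.values())
    return {agent: u - fairness_weight * spread for agent, u in raw_utilities.items()}

# Example: agent "a" gains the most, but every agent pays for the inequality.
print(shaped_rewards({"a": 1.0, "b": 0.2, "c": 0.3}))
```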
- LegalAgentBench: Evaluating LLM Agents in Legal Domain [53.70993264644004]
LegalAgentBench is a benchmark specifically designed to evaluate LLM Agents in the Chinese legal domain.
LegalAgentBench includes 17 corpora from real-world legal scenarios and provides 37 tools for interacting with external knowledge.
arXiv Detail & Related papers (2024-12-23T04:02:46Z)
- Peer-induced Fairness: A Causal Approach for Algorithmic Fairness Auditing [0.0]
The European Union's Artificial Intelligence Act takes effect on 1 August 2024.
High-risk AI applications must adhere to stringent transparency and fairness standards.
We propose a novel framework that combines the strengths of counterfactual fairness and a peer comparison strategy.
arXiv Detail & Related papers (2024-08-05T15:35:34Z)
- TRACE: TRansformer-based Attribution using Contrastive Embeddings in LLMs [50.259001311894295]
We propose a novel TRansformer-based Attribution framework using Contrastive Embeddings called TRACE.
We show that TRACE significantly improves the ability to attribute sources accurately, making it a valuable tool for enhancing the reliability and trustworthiness of large language models.
arXiv Detail & Related papers (2024-07-06T07:19:30Z)
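The TRACE summary does not detail its training; its underlying idea, attributing a generated output to the nearest candidate source in a shared embedding space, can be sketched as follows. The `embed` encoder and the cosine scoring rule are assumptions for illustration.

```python
import math

def embed(text: str) -> list[float]:
    """Hypothetical stand-in for a (contrastively trained) text encoder."""
    raise NotImplementedError

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def attribute(output: str, sources: dict[str, str]) -> str:
    """Return the name of the source whose embedding lies closest to the output."""
    out_vec = embed(output)
    return max(sources, key=lambda name: cosine(out_vec, embed(sources[name])))
```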
- Enabling Discriminative Reasoning in LLMs for Legal Judgment Prediction [23.046342240176575]
We introduce the Ask-Discriminate-Predict (ADAPT) reasoning framework inspired by human reasoning.
ADAPT involves decomposing case facts, discriminating among potential charges, and predicting the final judgment.
Experiments conducted on two widely-used datasets demonstrate the superior performance of our framework in legal judgment prediction.
arXiv Detail & Related papers (2024-07-02T05:43:15Z)
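ADAPT's three stages map directly onto a prompt pipeline. The sketch below assumes a hypothetical `call_llm` client, and the prompt wording is invented for illustration rather than taken from the paper.

```python
def call_llm(prompt: str) -> str:
    raise NotImplementedError("hypothetical LLM client")

def adapt_predict(case_facts: str, candidate_charges: list[str]) -> str:
    # 1. Ask: decompose the raw facts into legally salient elements
    elements = call_llm(f"List the legally salient facts in:\n{case_facts}")
    # 2. Discriminate: narrow the candidate charges against those elements
    shortlist = call_llm(
        f"Facts:\n{elements}\nWhich of these charges plausibly apply, and why? "
        f"Candidates: {', '.join(candidate_charges)}")
    # 3. Predict: commit to a final judgment grounded in the discrimination step
    return call_llm(f"Facts:\n{elements}\nAnalysis:\n{shortlist}\nFinal judgment:")
```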
- AgentsCourt: Building Judicial Decision-Making Agents with Court Debate Simulation and Legal Knowledge Augmentation [19.733007669738008]
We propose a novel multi-agent framework, AgentsCourt, for judicial decision-making.
Our framework follows the classic court trial process, consisting of court debate simulation, legal resources retrieval and decision-making refinement.
To support this task, we construct a large-scale legal knowledge base, Legal-KB, with multi-resource legal knowledge.
arXiv Detail & Related papers (2024-03-05T13:30:02Z)
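The stages AgentsCourt names, debate simulation, legal resource retrieval, and decision refinement, compose naturally as a pipeline. In the sketch below, the word-overlap retriever is a toy stand-in for querying the paper's Legal-KB, and `call_llm` is again a hypothetical client.

```python
def call_llm(prompt: str) -> str:
    raise NotImplementedError("hypothetical LLM client")

def retrieve(query: str, knowledge_base: dict[str, str], k: int = 3) -> list[str]:
    """Toy retrieval: rank knowledge-base entries by word overlap with the query.
    A real system would query the paper's Legal-KB with a proper retriever."""
    words = set(query.lower().split())
    scored = sorted(knowledge_base.items(),
                    key=lambda kv: len(words & set(kv[1].lower().split())),
                    reverse=True)
    return [text for _, text in scored[:k]]

def agents_court(case: str, knowledge_base: dict[str, str]) -> str:
    # 1. court debate simulation between adversarial agents
    debate = call_llm(f"Simulate a court debate (prosecution vs. defence) on:\n{case}")
    # 2. legal resources retrieval conditioned on the debate
    statutes = retrieve(debate, knowledge_base)
    # 3. decision-making refinement against the retrieved materials
    draft = call_llm(f"Draft a judgment for:\n{case}\nDebate:\n{debate}")
    return call_llm(f"Refine this draft against the cited law:\n{draft}\n"
                    "Relevant materials:\n" + "\n".join(statutes))
```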
- Large Multimodal Agents: A Survey [78.81459893884737]
Large language models (LLMs) have achieved strong performance in powering text-based AI agents.
There is an emerging research trend focused on extending these LLM-powered AI agents into the multimodal domain.
This review aims to provide valuable insights and guidelines for future research in this rapidly evolving field.
arXiv Detail & Related papers (2024-02-23T06:04:23Z)
- AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents [74.16170899755281]
We introduce AgentBoard, a comprehensive benchmark and accompanying open-source evaluation framework tailored to the analytical evaluation of LLM agents.
AgentBoard offers a fine-grained progress rate metric that captures incremental advancements as well as a comprehensive evaluation toolkit.
This not only sheds light on the capabilities and limitations of LLM agents but also propels the interpretability of their performance to the forefront.
arXiv Detail & Related papers (2024-01-24T01:51:00Z)
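AgentBoard's headline metric is a fine-grained progress rate rather than a binary success flag. A plausible reading is the fraction of a task's annotated subgoals an agent completes; the subgoal representation below is an assumption for illustration.

```python
def progress_rate(completed: set[str], subgoals: list[str]) -> float:
    """Fraction of a task's annotated subgoals the agent has satisfied:
    1.0 means full success; intermediate values capture partial progress."""
    if not subgoals:
        return 0.0
    return sum(g in completed for g in subgoals) / len(subgoals)

# An agent that finds the key and opens the door but never reaches the goal
# still earns partial credit instead of a flat failure:
print(progress_rate({"find_key", "open_door"},
                    ["find_key", "open_door", "reach_goal"]))  # 0.666...
```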
- Rational Decision-Making Agent with Internalized Utility Judgment [91.80700126895927]
Large language models (LLMs) have demonstrated remarkable advancements, attracting significant efforts to develop them into agents capable of executing intricate multi-step decision-making tasks beyond traditional NLP applications.
This paper proposes RadAgent, which fosters the development of its rationality through an iterative framework involving Experience Exploration and Utility Learning.
Experimental results on the ToolBench dataset demonstrate RadAgent's superiority over baselines, achieving over 10% improvement in Pass Rate on diverse tasks.
arXiv Detail & Related papers (2023-08-24T03:11:45Z)
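RadAgent's iterative loop of Experience Exploration and Utility Learning can be sketched as sampling candidate experiences and fitting a utility from pairwise preferences. The Elo-style update and the random preference oracle below are assumptions for illustration, not the paper's exact method.

```python
import random

def elo_update(ratings: dict[str, float], winner: str, loser: str,
               k: float = 32.0) -> None:
    """Elo-style utility learning from one pairwise preference judgment."""
    expected = 1.0 / (1.0 + 10 ** ((ratings[loser] - ratings[winner]) / 400.0))
    ratings[winner] += k * (1.0 - expected)
    ratings[loser] -= k * (1.0 - expected)

def prefer(a: str, b: str) -> str:
    """Placeholder preference oracle; a real system would ask an LLM judge."""
    return random.choice([a, b])

def rad_loop(candidates: list[str], iterations: int = 100) -> str:
    ratings = {c: 1000.0 for c in candidates}    # initial utilities
    for _ in range(iterations):                  # experience exploration
        a, b = random.sample(candidates, 2)
        winner = prefer(a, b)
        loser = b if winner == a else a
        elo_update(ratings, winner, loser)       # utility learning
    return max(ratings, key=ratings.get)         # act on the learned utility
```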
- ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate [57.71597869337909]
We build a multi-agent referee team called ChatEval to autonomously discuss and evaluate the quality of generated responses from different models.
Our analysis shows that ChatEval transcends mere textual scoring, offering a human-mimicking evaluation process for reliable assessments.
arXiv Detail & Related papers (2023-08-14T15:13:04Z)
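A ChatEval-style referee team can be sketched as several persona-conditioned judges debating a response over a few turns and then scoring it. The personas, turn count, mean aggregation, and the assumption that the model returns a bare numeric score are all illustrative choices, with `call_llm` again a hypothetical client.

```python
def call_llm(prompt: str) -> str:
    raise NotImplementedError("hypothetical LLM client")

def chateval(question: str, answer: str,
             personas: tuple[str, ...] = ("strict critic", "domain expert",
                                          "general reader"),
             turns: int = 2) -> float:
    transcript: list[str] = []
    for _ in range(turns):  # multi-agent debate rounds
        for persona in personas:
            view = call_llm(f"As a {persona}, discuss the quality of this answer.\n"
                            f"Q: {question}\nA: {answer}\nDebate so far:\n"
                            + "\n".join(transcript))
            transcript.append(f"[{persona}] {view}")
    # each referee scores after the debate; aggregate by a simple mean
    # (assumes the model replies with a bare number, which is brittle in practice)
    scores = [float(call_llm(f"As a {p}, give a 1-10 score only.\nDebate:\n"
                             + "\n".join(transcript))) for p in personas]
    return sum(scores) / len(scores)
```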
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.