AIMSCheck: Leveraging LLMs for AI-Assisted Review of Modern Slavery Statements Across Jurisdictions
- URL: http://arxiv.org/abs/2506.01671v1
- Date: Mon, 02 Jun 2025 13:40:59 GMT
- Title: AIMSCheck: Leveraging LLMs for AI-Assisted Review of Modern Slavery Statements Across Jurisdictions
- Authors: Adriana Eufrosina Bora, Akshatha Arodi, Duoyi Zhang, Jordan Bannister, Mirko Bronzi, Arsene Fansi Tchango, Md Abul Bashar, Richi Nayak, Kerrie Mengersen,
- Abstract summary: First, we present AIMS.uk and AIMS.ca, newly annotated datasets from the UK and Canada. Second, we introduce AIMSCheck, an end-to-end framework for compliance validation. Our experiments show that models trained on an Australian dataset generalize well across UK and Canadian jurisdictions.
- Score: 1.3858903828439308
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern Slavery Acts mandate that corporations disclose their efforts to combat modern slavery, aiming to enhance transparency and strengthen practices for its eradication. However, verifying these statements remains challenging due to their complex, diversified language and the sheer number of statements that must be reviewed. The development of NLP tools to assist in this task is also difficult due to a scarcity of annotated data. Furthermore, as modern slavery transparency legislation has been introduced in several countries, the generalizability of such tools across legal jurisdictions must be studied. To address these challenges, we work with domain experts to make two key contributions. First, we present AIMS.uk and AIMS.ca, newly annotated datasets from the UK and Canada to enable cross-jurisdictional evaluation. Second, we introduce AIMSCheck, an end-to-end framework for compliance validation. AIMSCheck decomposes the compliance assessment task into three levels, enhancing interpretability and practical applicability. Our experiments show that models trained on an Australian dataset generalize well across UK and Canadian jurisdictions, demonstrating the potential for broader application in compliance monitoring. We release the benchmark datasets and AIMSCheck to the public to advance AI-adoption in compliance assessment and drive further research in this field.
Related papers
- Verifying International Agreements on AI: Six Layers of Verification for Rules on Large-Scale AI Development and Deployment [0.7364983833280243]
This report provides an in-depth overview of AI verification, intended for both policy professionals and technical researchers. We present novel conceptual frameworks, detailed implementation options, and key R&D challenges. We find that states could eventually verify compliance by using six largely independent verification approaches.
arXiv Detail & Related papers (2025-07-21T17:45:15Z)
- The AI Imperative: Scaling High-Quality Peer Review in Machine Learning [49.87236114682497]
We argue that AI-assisted peer review must become an urgent research and infrastructure priority. We propose specific roles for AI in enhancing factual verification, guiding reviewer performance, assisting authors in quality improvement, and supporting ACs in decision-making.
arXiv Detail & Related papers (2025-06-09T18:37:14Z)
- Tasks and Roles in Legal AI: Data Curation, Annotation, and Verification [4.099848175176399]
The application of AI tools to the legal field feels natural. However, legal documents differ from the web-based text that underlies most AI systems. We identify three areas of special relevance to practitioners: data curation, data annotation, and output verification.
arXiv Detail & Related papers (2025-04-02T04:34:58Z)
- AgentOrca: A Dual-System Framework to Evaluate Language Agents on Operational Routine and Constraint Adherence [54.317522790545304]
We present AgentOrca, a dual-system framework for evaluating language agents' compliance with operational constraints and routines. Our framework encodes action constraints and routines through both natural language prompts for agents and corresponding executable code serving as ground truth for automated verification. Our findings reveal notable performance gaps among state-of-the-art models, with large reasoning models like o1 demonstrating superior compliance while others show significantly lower performance.
arXiv Detail & Related papers (2025-03-11T17:53:02Z)
- A Law Reasoning Benchmark for LLM with Tree-Organized Structures including Factum Probandum, Evidence and Experiences [76.73731245899454]
We propose a transparent law reasoning schema enriched with hierarchical factum probandum, evidence, and implicit experience. Inspired by this schema, we introduce a challenging task that takes a textual case description and outputs a hierarchical structure justifying the final decision. This benchmark paves the way for transparent and accountable AI-assisted law reasoning in the "Intelligent Court".
arXiv Detail & Related papers (2025-03-02T10:26:54Z)
- AnnoCaseLaw: A Richly-Annotated Dataset For Benchmarking Explainable Legal Judgment Prediction [56.797874973414636]
AnnoCaseLaw is a first-of-its-kind dataset of 471 meticulously annotated U.S. Appeals Court negligence cases. Our dataset lays the groundwork for more human-aligned, explainable Legal Judgment Prediction models. Results demonstrate that LJP remains a formidable task, with application of legal precedent proving particularly difficult.
arXiv Detail & Related papers (2025-02-28T19:14:48Z)
- AIMS.au: A Dataset for the Analysis of Modern Slavery Countermeasures in Corporate Statements [3.0847285957317654]
We introduce a dataset composed of 5,731 modern slavery statements taken from the Australian Modern Slavery Register and annotated at the sentence level. We propose a machine learning methodology for the detection of sentences relevant to mandatory reporting requirements set by the Australian Modern Slavery Act.
arXiv Detail & Related papers (2025-02-10T20:30:32Z)
- The explanation dialogues: an expert focus study to understand requirements towards explanations within the GDPR [47.06917254695738]
We present the Explanation Dialogues, an expert focus study to uncover the expectations, reasoning, and understanding of legal experts and practitioners towards XAI. The study consists of an online questionnaire and follow-up interviews, and is centered around a use-case in the credit domain. We find that the presented explanations are hard to understand and lack information, and discuss issues that can arise from the different interests of the data controller and subject.
arXiv Detail & Related papers (2025-01-09T15:50:02Z)
- COMPL-AI Framework: A Technical Interpretation and LLM Benchmarking Suite for the EU Artificial Intelligence Act [40.233017376716305]
The EU's Artificial Intelligence Act (AI Act) is a significant step towards responsible AI development. However, it lacks a clear technical interpretation, making it difficult to assess models' compliance. This work presents COMPL-AI, a comprehensive framework consisting of the first technical interpretation of the Act.
arXiv Detail & Related papers (2024-10-10T14:23:51Z)
- A Survey on Large Language Models for Critical Societal Domains: Finance, Healthcare, and Law [65.87885628115946]
Large language models (LLMs) are revolutionizing the landscapes of finance, healthcare, and law.
We highlight the instrumental role of LLMs in enhancing diagnostic and treatment methodologies in healthcare, innovating financial analytics, and refining legal interpretation and compliance strategies.
We critically examine the ethics of LLM applications in these fields, pointing out the existing ethical concerns and the need for transparent, fair, and robust AI systems.
arXiv Detail & Related papers (2024-05-02T22:43:02Z)
- INACIA: Integrating Large Language Models in Brazilian Audit Courts: Opportunities and Challenges [7.366861473623427]
INACIA is a groundbreaking system designed to integrate Large Language Models (LLMs) into the operational framework of the Brazilian Federal Court of Accounts (TCU).
We demonstrate INACIA's potential in extracting relevant information from case documents, evaluating its legal plausibility, and formulating propositions for judicial decision-making.
arXiv Detail & Related papers (2024-01-10T17:13:28Z)
- An Uncommon Task: Participatory Design in Legal AI [64.54460979588075]
We examine a notable yet understudied AI design process in the legal domain that took place over a decade ago.
We show how an interactive simulation methodology allowed computer scientists and lawyers to become co-designers.
arXiv Detail & Related papers (2022-03-08T15:46:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.