Resource-Efficient LLM Application for Structured Transformation of Unstructured Financial Contracts
- URL: http://arxiv.org/abs/2510.23990v1
- Date: Tue, 28 Oct 2025 01:49:10 GMT
- Title: Resource-Efficient LLM Application for Structured Transformation of Unstructured Financial Contracts
- Authors: Maruf Ahmed Mridul, Oshani Seneviratne
- Abstract summary: We present an extension of the CDMizer framework for converting legal documents into machine-readable formats. We compare its performance with a benchmark developed by the International Swaps and Derivatives Association. This work underscores the potential of resource-efficient solutions to automate legal contract transformation.
- Score: 1.1565257196553245
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The transformation of unstructured legal contracts into standardized, machine-readable formats is essential for automating financial workflows. The Common Domain Model (CDM) provides a standardized framework for this purpose, but converting complex legal documents like Credit Support Annexes (CSAs) into CDM representations remains a significant challenge. In this paper, we present an extension of the CDMizer framework, a template-driven solution that ensures syntactic correctness and adherence to the CDM schema during contract-to-CDM conversion. We apply this extended framework to a real-world task, comparing its performance with a benchmark developed by the International Swaps and Derivatives Association (ISDA) for CSA clause extraction. Our results show that CDMizer, when integrated with a significantly smaller, open-source Large Language Model (LLM), achieves competitive performance in terms of accuracy and efficiency against larger, proprietary models. This work underscores the potential of resource-efficient solutions to automate legal contract transformation, offering a cost-effective and scalable approach that can meet the needs of financial institutions with constrained resources or strict data privacy requirements.
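The template-driven approach described in the abstract can be illustrated with a minimal sketch. All names here (the schema fragment, the prompt template, and the stand-in model call) are hypothetical, not CDMizer's actual API: the idea is that a schema-derived template constrains the LLM's output, which is then validated against the required CDM fields before acceptance.

```python
import json

# Illustrative CDM-like schema fragment for one CSA clause.
# The real CDM schema is far richer; this is a hypothetical simplification.
CLAUSE_SCHEMA = {
    "thresholdAmount": {"required": ["value", "currency"]},
}

PROMPT_TEMPLATE = (
    "Extract the '{clause}' clause from the CSA text below and return JSON "
    "with exactly these keys: {keys}.\n\nCSA text:\n{text}"
)

def build_prompt(clause: str, csa_text: str) -> str:
    """Fill the template so the model is steered toward schema-conformant output."""
    keys = CLAUSE_SCHEMA[clause]["required"]
    return PROMPT_TEMPLATE.format(clause=clause, keys=keys, text=csa_text)

def validate(clause: str, raw_output: str) -> dict:
    """Parse the model output and enforce the schema's required keys."""
    parsed = json.loads(raw_output)
    missing = [k for k in CLAUSE_SCHEMA[clause]["required"] if k not in parsed]
    if missing:
        raise ValueError(f"model output missing required keys: {missing}")
    return parsed

# Stand-in for an LLM call; a real pipeline would send `prompt` to a model.
def fake_llm(prompt: str) -> str:
    return '{"value": 500000, "currency": "USD"}'

prompt = build_prompt("thresholdAmount", "Threshold: USD 500,000 ...")
clause = validate("thresholdAmount", fake_llm(prompt))
print(clause["currency"])  # USD
```

The validation step is what gives the template-driven pipeline its syntactic-correctness guarantee: any output that fails to parse or omits a required field is rejected rather than silently passed downstream.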
Related papers
- Beyond Unimodal Shortcuts: MLLMs as Cross-Modal Reasoners for Grounded Named Entity Recognition [51.68340973140949]
Grounded Multimodal Named Entity Recognition (GMNER) aims to extract text-based entities, assign them semantic categories, and ground them to corresponding visual regions. MLLMs exhibit modality bias, including visual bias and textual bias, which stems from their tendency to take unimodal shortcuts. We propose Modality-aware Consistency Reasoning (MCR), which enforces structured cross-modal reasoning.
arXiv Detail & Related papers (2026-02-04T12:12:49Z) - From Completion to Editing: Unlocking Context-Aware Code Infilling via Search-and-Replace Instruction Tuning [81.97788535387286]
We propose a framework that internalizes the agentic verification-and-editing mechanism into a unified, single-pass inference process. With minimal data, SRI-Coder enables Chat models to surpass the completion performance of their Base counterparts. Unlike FIM-style tuning, SRI preserves general coding competencies and maintains inference latency comparable to standard FIM.
arXiv Detail & Related papers (2026-01-19T20:33:53Z) - DeepRule: An Integrated Framework for Automated Business Rule Generation via Deep Predictive Modeling and Hybrid Search Optimization [12.68443002994035]
DeepRule is an integrated framework for automated business rule generation in retail assortment and pricing optimization. We design a hybrid knowledge fusion engine employing large language models (LLMs) for deep semantic parsing of unstructured text. We validate the framework in real retail environments, achieving higher profits versus systematic B2C baselines while ensuring operational feasibility.
arXiv Detail & Related papers (2025-12-03T09:40:33Z) - Adapformer: Adaptive Channel Management for Multivariate Time Series Forecasting [49.40321003932633]
Adapformer is an advanced Transformer-based framework that merges the benefits of CI and CD methodologies through effective channel management. Adapformer achieves superior performance over existing models, enhancing both predictive accuracy and computational efficiency.
arXiv Detail & Related papers (2025-11-18T16:24:05Z) - Automatic Building Code Review: A Case Study [6.530899637501737]
Building officials face labor-intensive, error-prone, and costly manual reviews of design documents as projects increase in size and complexity. This study introduces a novel agent-driven framework that integrates BIM-based data extraction with automated verification.
arXiv Detail & Related papers (2025-10-03T00:30:14Z) - Agentic AI for Financial Crime Compliance [0.0]
This paper presents the design and deployment of an agentic AI system for financial crime compliance (FCC) in digitally native financial platforms. The contribution includes a reference architecture, a real-world prototype, and insights into how agentic AI can reconfigure under regulatory constraints.
arXiv Detail & Related papers (2025-09-16T14:53:51Z) - Structured Agentic Workflows for Financial Time-Series Modeling with LLMs and Reflective Feedback [16.04516547661581]
Time-series data is central to decision-making in financial markets, yet building high-performing, interpretable, and auditable models remains a major challenge. TSAgent is a modular agentic framework designed to automate and enhance time-series modeling for financial applications.
arXiv Detail & Related papers (2025-08-19T15:14:49Z) - Federated In-Context Learning: Iterative Refinement for Improved Answer Quality [62.72381208029899]
In-context learning (ICL) enables language models to generate responses without modifying their parameters by leveraging examples provided in the input. We propose Federated In-Context Learning (Fed-ICL), a general framework that enhances ICL through an iterative, collaborative process. Fed-ICL progressively refines responses by leveraging multi-round interactions between clients and a central server, improving answer quality without the need to transmit model parameters.
arXiv Detail & Related papers (2025-06-09T05:33:28Z) - COALESCE: Economic and Security Dynamics of Skill-Based Task Outsourcing Among Team of Autonomous LLM Agents [0.0]
COALESCE is a novel framework designed to enable autonomous Large Language Model (LLM) agents to outsource specific subtasks to specialized, cost-effective third-party LLM agents. Comprehensive validation through 239 theoretical simulations demonstrates 41.8% cost reduction potential. Large-scale empirical validation across 240 real LLM tasks confirms 20.3% cost reduction with proper epsilon-greedy exploration.
arXiv Detail & Related papers (2025-06-02T17:22:47Z) - AI4Contracts: LLM & RAG-Powered Encoding of Financial Derivative Contracts [1.3060230641655135]
Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) are reshaping how AI systems extract and organize information from unstructured text. We introduce CDMizer, a template-driven, LLM- and RAG-based framework for structured text transformation.
arXiv Detail & Related papers (2025-06-01T16:05:00Z) - MSDA: Combining Pseudo-labeling and Self-Supervision for Unsupervised Domain Adaptation in ASR [59.83547898874152]
We introduce a sample-efficient, two-stage adaptation approach that integrates self-supervised learning with semi-supervised techniques. MSDA is designed to enhance the robustness and generalization of ASR models. We demonstrate that Meta PL can be applied effectively to ASR tasks, achieving state-of-the-art results.
arXiv Detail & Related papers (2025-05-30T14:46:05Z) - Leveraging Importance Sampling to Detach Alignment Modules from Large Language Models [48.15777554876988]
Traditional alignment methods often require retraining large pretrained models. We propose a novel Residual Alignment Model (RAM) that formalizes the alignment process as a type of importance sampling. We develop a resampling algorithm with iterative token-level decoding to address the common first-token latency issue in comparable methods.
arXiv Detail & Related papers (2025-05-26T08:53:02Z) - MAS-ZERO: Designing Multi-Agent Systems with Zero Supervision [76.42361936804313]
We introduce MAS-ZERO, the first self-evolved, inference-time framework for automatic MAS design. MAS-ZERO employs meta-level design to iteratively generate, evaluate, and refine MAS configurations tailored to each problem instance.
arXiv Detail & Related papers (2025-05-21T00:56:09Z) - Diffusion & Adversarial Schrödinger Bridges via Iterative Proportional Markovian Fitting [87.37278888311839]
The Iterative Markovian Fitting (IMF) procedure successfully solves the Schrödinger Bridge (SB) problem. We show a close connection between IMF and the Iterative Proportional Fitting (IPF) procedure. We refer to this combined approach as the Iterative Proportional Markovian Fitting (IPMF) procedure.
arXiv Detail & Related papers (2024-10-03T15:43:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.