Refining Financial Consumer Complaints through Multi-Scale Model Interaction
        - URL: http://arxiv.org/abs/2504.09903v1
 - Date: Mon, 14 Apr 2025 05:51:31 GMT
 - Authors: Bo-Wei Chen, An-Zi Yen, Chung-Chi Chen
 - Abstract summary: This paper explores the task of legal text refinement that transforms informal, conversational inputs into persuasive legal arguments. We introduce FinDR, a Chinese dataset of financial dispute records, annotated with official judgments on claim reasonableness. Experimental results demonstrate that Multi-Scale Model Interaction (MSMI) significantly outperforms single-pass prompting strategies.
 - License: http://creativecommons.org/licenses/by-nc-nd/4.0/
 - Abstract:   Legal writing demands clarity, formality, and domain-specific precision, qualities often lacking in documents authored by individuals without legal training. To bridge this gap, this paper explores the task of legal text refinement that transforms informal, conversational inputs into persuasive legal arguments. We introduce FinDR, a Chinese dataset of financial dispute records, annotated with official judgments on claim reasonableness. Our proposed method, Multi-Scale Model Interaction (MSMI), leverages a lightweight classifier to evaluate outputs and guide iterative refinement by Large Language Models (LLMs). Experimental results demonstrate that MSMI significantly outperforms single-pass prompting strategies. Additionally, we validate the generalizability of MSMI on several short-text benchmarks, showing improved adversarial robustness. Our findings reveal the potential of multi-model collaboration for enhancing legal document generation and broader text refinement tasks.
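The abstract describes a classifier-in-the-loop refinement cycle: a lightweight classifier scores each draft, and an LLM rewrites until the score is acceptable. A minimal sketch of that loop, assuming the paper's exact stopping rule and interfaces (the names `refine_fn`, `score_fn`, `threshold` are illustrative stand-ins, not the authors' API):

```python
# Hypothetical sketch of MSMI-style classifier-guided iterative refinement.
# refine_fn stands in for an LLM rewriter; score_fn for the lightweight
# classifier. Neither name comes from the paper.

def msmi_refine(draft, refine_fn, score_fn, max_rounds=3, threshold=0.9):
    """Iteratively rewrite `draft` until the classifier score reaches threshold."""
    best, best_score = draft, score_fn(draft)
    for _ in range(max_rounds):
        if best_score >= threshold:
            break
        candidate = refine_fn(best)      # LLM proposes a refined version
        score = score_fn(candidate)      # classifier judges the candidate
        if score > best_score:           # keep only strict improvements
            best, best_score = candidate, score
    return best, best_score

# Toy stand-ins: "refinement" appends a formal closing; the "classifier"
# rewards longer, more formal text.
demo_refine = lambda t: t + " Respectfully submitted."
demo_score = lambda t: min(1.0, len(t) / 80)

text, score = msmi_refine("The bank charged me twice.", demo_refine, demo_score)
```

The key design point is that the cheap classifier, not the LLM itself, decides when refinement stops, which is what distinguishes this from single-pass prompting.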
 
       
      
        Related papers
        - Improving the Accuracy and Efficiency of Legal Document Tagging with   Large Language Models and Instruction Prompts [0.6554326244334866]
Legal-LLM is a novel approach that leverages the instruction-following capabilities of Large Language Models (LLMs) through fine-tuning.<n>We evaluate our method on two benchmark datasets, POSTURE50K and EURLEX57K, using micro-F1 and macro-F1 scores.
arXiv  Detail & Related papers  (2025-04-12T18:57:04Z) - Aplicação de Large Language Models na Análise e Síntese de Documentos Jurídicos: Uma Revisão de Literatura [Application of Large Language Models to the Analysis and Synthesis of Legal Documents: A Literature Review] [0.0]
Large Language Models (LLMs) have been increasingly used to optimize the analysis and synthesis of legal documents. This study aims to conduct a systematic literature review to identify the state of the art in prompt engineering applied to LLMs in the legal context.
arXiv  Detail & Related papers  (2025-04-01T12:34:00Z) - TaMPERing with Large Language Models: A Field Guide for using Generative AI in Public Administration Research [0.0]
The integration of Large Language Models (LLMs) into social science research presents transformative opportunities for advancing scientific inquiry. This manuscript introduces the TaMPER framework, a structured methodology organized around five critical decision points: Task, Model, Prompt, Evaluation, and Reporting.
arXiv  Detail & Related papers  (2025-03-30T21:38:11Z) - A Survey on Mechanistic Interpretability for Multi-Modal Foundation Models [74.48084001058672]
The rise of foundation models has transformed machine learning research. Multimodal foundation models (MMFMs), however, pose unique interpretability challenges beyond unimodal frameworks.
This survey explores two key aspects: (1) the adaptation of LLM interpretability methods to multimodal models and (2) understanding the mechanistic differences between unimodal language models and crossmodal systems.
arXiv  Detail & Related papers  (2025-02-22T20:55:26Z) - Multi-Agent Simulator Drives Language Models for Legal Intensive Interaction [37.856194200684364]
This paper introduces a Multi-agent Legal Simulation Driver (MASER) to scalably generate synthetic data by simulating interactive legal scenarios. MASER ensures the consistency of legal attributes between participants and introduces a supervisory mechanism to align participants' characters and behaviors.
arXiv  Detail & Related papers  (2025-02-08T15:05:24Z) - Calling a Spade a Heart: Gaslighting Multimodal Large Language Models via Negation [65.92001420372007]
This paper systematically evaluates state-of-the-art MLLMs across diverse benchmarks. We introduce the first benchmark GaslightingBench, specifically designed to evaluate the vulnerability of MLLMs to negation arguments.
arXiv  Detail & Related papers  (2025-01-31T10:37:48Z) - Chain-of-Thought Prompting for Demographic Inference with Large Multimodal Models [58.58594658683919]
Large multimodal models (LMMs) have shown transformative potential across various research tasks.
Our findings indicate LMMs possess advantages in zero-shot learning, interpretability, and handling uncurated 'in-the-wild' inputs.
We propose a Chain-of-Thought augmented prompting approach, which effectively mitigates the off-target prediction issue.
arXiv  Detail & Related papers  (2024-05-24T16:26:56Z) - Leveraging Large Language Models for Relevance Judgments in Legal Case Retrieval [16.29803062332164]
We propose a few-shot approach where large language models assist in generating expert-aligned relevance judgments. The proposed approach decomposes the judgment process into several stages, mimicking the workflow of human annotators. It also ensures interpretable data labeling, providing transparency and clarity in the relevance assessment process.
arXiv  Detail & Related papers  (2024-03-27T09:46:56Z) - FENICE: Factuality Evaluation of summarization based on Natural language Inference and Claim Extraction [85.26780391682894]
We propose Factuality Evaluation of summarization based on Natural language Inference and Claim Extraction (FENICE).
FENICE leverages an NLI-based alignment between information in the source document and a set of atomic facts, referred to as claims, extracted from the summary.
Our metric sets a new state of the art on AGGREFACT, the de-facto benchmark for factuality evaluation.
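The FENICE summary above describes scoring a summary by extracting atomic claims and aligning each against the source with NLI entailment. A minimal sketch of that scoring shape, assuming a claim list is already extracted; `nli_entail` and `toy_nli` are illustrative stand-ins, not the paper's actual components:

```python
# Hypothetical sketch of FENICE-style factuality scoring: each extracted
# claim is matched against every source sentence, and the claim's score is
# its best entailment; the metric averages over claims.

def fenice_score(source_sents, claims, nli_entail):
    """Average over claims of the best entailment score from any source sentence."""
    if not claims:
        return 0.0
    per_claim = [max(nli_entail(s, c) for s in source_sents) for c in claims]
    return sum(per_claim) / len(per_claim)

# Toy NLI: entailment approximated by word-overlap ratio. A real system
# would use a trained NLI model here.
def toy_nli(premise, hypothesis):
    strip = lambda t: set(t.lower().replace(".", "").split())
    p, h = strip(premise), strip(hypothesis)
    return len(p & h) / len(h)

score = fenice_score(
    ["The court dismissed the claim.", "The bank refunded the fee."],
    ["The claim was dismissed.", "The fee was refunded."],
    toy_nli,
)
```

Aligning at the level of atomic claims, rather than the whole summary, is what lets the metric localize which statement is unsupported.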
arXiv  Detail & Related papers  (2024-03-04T17:57:18Z) - Exploring Precision and Recall to assess the quality and diversity of LLMs [82.21278402856079]
We introduce a novel evaluation framework for Large Language Models (LLMs) such as Llama-2 and Mistral.
This approach allows for a nuanced assessment of the quality and diversity of generated text without the need for aligned corpora.
arXiv  Detail & Related papers  (2024-02-16T13:53:26Z) - Can LLMs Produce Faithful Explanations For Fact-checking? Towards Faithful Explainable Fact-Checking via Multi-Agent Debate [75.10515686215177]
Large Language Models (LLMs) excel in text generation, but their capability for producing faithful explanations in fact-checking remains underexamined.
We propose the Multi-Agent Debate Refinement (MADR) framework, leveraging multiple LLMs as agents with diverse roles.
MADR ensures that the final explanation undergoes rigorous validation, significantly reducing the likelihood of unfaithful elements and aligning closely with the provided evidence.
arXiv  Detail & Related papers  (2024-02-12T04:32:33Z) - Enhancing Pre-Trained Language Models with Sentence Position Embeddings for Rhetorical Roles Recognition in Legal Opinions [0.16385815610837165]
The size of legal opinions continues to grow, making it increasingly challenging to develop a model that can accurately predict the rhetorical roles of legal opinions.
We propose a novel model architecture for automatically predicting rhetorical roles using pre-trained language models (PLMs) enhanced with knowledge of sentence position information.
Based on an annotated corpus from the LegalEval@SemEval2023 competition, we demonstrate that our approach requires fewer parameters, resulting in lower computational costs.
arXiv  Detail & Related papers  (2023-10-08T20:33:55Z) - Improving Factuality and Reasoning in Language Models through Multiagent Debate [95.10641301155232]
We present a complementary approach to improve language responses where multiple language model instances propose and debate their individual responses and reasoning processes over multiple rounds to arrive at a common final answer.
Our findings indicate that this approach significantly enhances mathematical and strategic reasoning across a number of tasks.
Our approach may be directly applied to existing black-box models and uses identical procedure and prompts for all tasks we investigate.
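The multiagent-debate summary above describes agents proposing answers, revising after seeing each other's responses over multiple rounds, and converging on a common answer. A minimal sketch under those assumptions; the agent callables and majority-vote aggregation are illustrative stand-ins for LLM calls, not the paper's exact protocol:

```python
# Minimal sketch of multi-round multiagent debate: each agent answers
# independently, then revises after seeing its peers' answers; the final
# answer is taken by majority vote.

from collections import Counter

def debate(agents, question, rounds=2):
    """agents: callables (question, peer_answers) -> answer string."""
    answers = [a(question, []) for a in agents]  # round 0: independent answers
    for _ in range(rounds):
        answers = [
            a(question, answers[:i] + answers[i + 1:])  # each agent sees peers only
            for i, a in enumerate(agents)
        ]
    return Counter(answers).most_common(1)[0][0]  # majority vote

# Toy agents: two answer correctly; one guesses, but defers to the peer
# majority once it sees other answers.
solid = lambda q, peers: "4"
flaky = lambda q, peers: Counter(peers).most_common(1)[0][0] if peers else "5"

final = debate([solid, solid, flaky], "What is 2 + 2?")
```

Because agents only exchange answers, this procedure is black-box: it needs no access to model weights, matching the paper's claim that it applies directly to existing black-box models.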
arXiv  Detail & Related papers  (2023-05-23T17:55:11Z) - Evaluating and Improving Factuality in Multimodal Abstractive Summarization [91.46015013816083]
We propose CLIPBERTScore to leverage the robustness and strong factuality detection performance between image-summary and document-summary.
We show that this simple combination of two metrics in the zero-shot achieves higher correlations than existing factuality metrics for document summarization.
Our analysis demonstrates the robustness and high correlation of CLIPBERTScore and its components on four factuality metric-evaluation benchmarks.
arXiv  Detail & Related papers  (2022-11-04T16:50:40Z) 
        This list is automatically generated from the titles and abstracts of the papers in this site.
       
     
           This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.