Smart Audit System Empowered by LLM
- URL: http://arxiv.org/abs/2410.07677v1
- Date: Thu, 10 Oct 2024 07:36:15 GMT
- Title: Smart Audit System Empowered by LLM
- Authors: Xu Yao, Xiaoxu Wu, Xi Li, Huan Xu, Chenlei Li, Ping Huang, Si Li, Xiaoning Ma, Jiulong Shan,
- Abstract summary: We propose a smart audit system empowered by large language models (LLMs)
Our approach introduces three innovations: a dynamic risk assessment model that streamlines audit procedures; a manufacturing compliance copilot that enhances data processing, retrieval, and evaluation; and a Re-act framework commonality analysis agent that provides real-time, customized analysis.
These enhancements elevate audit efficiency and effectiveness, with testing scenarios demonstrating an improvement of over 24%.
- Score: 25.2545519709246
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Manufacturing quality audits are pivotal for ensuring high product standards in mass production environments. Traditional auditing processes, however, are labor-intensive and reliant on human expertise, posing challenges in maintaining transparency, accountability, and continuous improvement across complex global supply chains. To address these challenges, we propose a smart audit system empowered by large language models (LLMs). Our approach introduces three innovations: a dynamic risk assessment model that streamlines audit procedures and optimizes resource allocation; a manufacturing compliance copilot that enhances data processing, retrieval, and evaluation for a self-evolving manufacturing knowledge base; and a Re-act framework commonality analysis agent that provides real-time, customized analysis to empower engineers with insights for supplier improvement. These enhancements elevate audit efficiency and effectiveness, with testing scenarios demonstrating an improvement of over 24%.
Related papers
- The Dual-use Dilemma in LLMs: Do Empowering Ethical Capacities Make a Degraded Utility? [54.18519360412294]
Large Language Models (LLMs) must balance between rejecting harmful requests for safety and accommodating legitimate ones for utility.
This paper presents a Direct Preference Optimization (DPO) based alignment framework that achieves better overall performance.
Our resulting model, LibraChem, outperforms leading LLMs including Claude-3, GPT-4o, and LLaMA-3 by margins of 13.44%, 7.16%, and 7.10% respectively.
arXiv Detail & Related papers (2025-01-20T06:35:01Z) - Addressing Quality Challenges in Deep Learning: The Role of MLOps and Domain Knowledge [5.190998244098203]
Deep learning (DL) systems present unique challenges in software engineering, especially concerning quality attributes like correctness and resource efficiency.
This experience paper explores the role of MLOps practices in creating transparent and reproducible experimentation environments.
We report on experiences addressing the quality challenges by embedding domain knowledge into the design of a DL model and its integration within a larger system.
arXiv Detail & Related papers (2025-01-14T19:37:08Z) - The Lessons of Developing Process Reward Models in Mathematical Reasoning [62.165534879284735]
Process Reward Models (PRMs) aim to identify and mitigate intermediate errors in the reasoning processes.
We develop a consensus filtering mechanism that effectively integrates Monte Carlo (MC) estimation with Large Language Models (LLMs)
We release a new state-of-the-art PRM that outperforms existing open-source alternatives.
arXiv Detail & Related papers (2025-01-13T13:10:16Z) - Powering LLM Regulation through Data: Bridging the Gap from Compute Thresholds to Customer Experiences [0.0]
This paper argues that current regulatory approaches, which focus on compute-level thresholds and generalized model evaluations, are insufficient to ensure the safety and effectiveness of specific LLM-based user experiences.
We propose a shift towards a certification process centered on actual user-facing experiences and the curation of high-quality datasets for evaluation.
arXiv Detail & Related papers (2025-01-12T16:20:40Z) - On the Adversarial Robustness of Instruction-Tuned Large Language Models for Code [4.286327408435937]
We assess the impact of diverse input challenges on the functionality and correctness of generated code using rigorous metrics and established benchmarks.
Open-source models demonstrate an increased susceptibility to input perturbations, resulting in declines in functional correctness ranging from 12% to 34%.
In contrast, commercial models demonstrate relatively greater resilience, with performance degradation ranging from 3% to 24%.
arXiv Detail & Related papers (2024-11-29T07:00:47Z) - Large Language Models for Manufacturing [41.12098478080648]
Large Language Models (LLMs) have the potential to transform manufacturing industry, offering new opportunities to optimize processes, improve efficiency, and drive innovation.
This paper focuses on the integration of LLMs into the manufacturing domain, focusing on their potential to automate and enhance various aspects of manufacturing.
arXiv Detail & Related papers (2024-10-28T18:13:47Z) - Trustworthiness in Retrieval-Augmented Generation Systems: A Survey [59.26328612791924]
Retrieval-Augmented Generation (RAG) has quickly grown into a pivotal paradigm in the development of Large Language Models (LLMs)
We propose a unified framework that assesses the trustworthiness of RAG systems across six key dimensions: factuality, robustness, fairness, transparency, accountability, and privacy.
arXiv Detail & Related papers (2024-09-16T09:06:44Z) - Agent-Driven Automatic Software Improvement [55.2480439325792]
This research proposal aims to explore innovative solutions by focusing on the deployment of agents powered by Large Language Models (LLMs)
The iterative nature of agents, which allows for continuous learning and adaptation, can help surpass common challenges in code generation.
We aim to use the iterative feedback in these systems to further fine-tune the LLMs underlying the agents, becoming better aligned to the task of automated software improvement.
arXiv Detail & Related papers (2024-06-24T15:45:22Z) - AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents [74.16170899755281]
We introduce AgentBoard, a pioneering comprehensive benchmark and accompanied open-source evaluation framework tailored to analytical evaluation of LLM agents.
AgentBoard offers a fine-grained progress rate metric that captures incremental advancements as well as a comprehensive evaluation toolkit.
This not only sheds light on the capabilities and limitations of LLM agents but also propels the interpretability of their performance to the forefront.
arXiv Detail & Related papers (2024-01-24T01:51:00Z) - QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights.
We demonstrate that leveraging its insights, for example, improves the absolute performance of the Llama 2 model by up to 15% points relative.
arXiv Detail & Related papers (2023-11-06T00:21:44Z) - Trustworthy Artificial Intelligence and Process Mining: Challenges and
Opportunities [0.8602553195689513]
We show that process mining can provide a useful framework for gaining fact-based visibility to AI compliance process execution.
We provide for an automated approach to analyze, remediate and monitor uncertainty in AI regulatory compliance processes.
arXiv Detail & Related papers (2021-10-06T12:50:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.