Adaptable and Precise: Enterprise-Scenario LLM Function-Calling Capability Training Pipeline
- URL: http://arxiv.org/abs/2412.15660v1
- Date: Fri, 20 Dec 2024 08:20:21 GMT
- Title: Adaptable and Precise: Enterprise-Scenario LLM Function-Calling Capability Training Pipeline
- Authors: Guancheng Zeng, Wentao Ding, Beining Xu, Chi Zhang, Wenqiang Han, Gang Li, Jingjing Mo, Pengxu Qiu, Xinran Tao, Wang Tao, Haowen Hu,
- Abstract summary: We propose a training pipeline for function-calling capabilities tailored to real-world business scenarios.
This pipeline includes the synthesis and augmentation of scenario-specific function-calling data, model fine-tuning, and performance evaluation and analysis.
Our fine-tuned model demonstrated outstanding performance in evaluations and practical applications, surpassing GPT-4 and GPT-4o in accuracy on the test set.
- Score: 7.487352346469893
- Abstract: Enterprises possess a vast array of API assets scattered across various functions, forming the backbone of existing business processes. By leveraging these APIs as functional tools, enterprises can design diverse, scenario-specific agent applications, driven by on-premise function-calling models as the core engine. However, generic models often fail to meet enterprise requirements in terms of computational efficiency, output accuracy, and stability, necessitating scenario-specific adaptation. In this paper, we propose a training pipeline for function-calling capabilities tailored to real-world business scenarios. This pipeline includes the synthesis and augmentation of scenario-specific function-calling data, model fine-tuning, and performance evaluation and analysis. Using this pipeline, we generated 1,260 fully AI-generated samples and 1,035 augmented manually labeled samples in a digital HR agent scenario. The Qwen2.5-Coder-7B-Instruct model was employed as the base model and fine-tuned using the LoRA method on four GPUs with 24GB VRAM. Our fine-tuned model demonstrated outstanding performance in evaluations and practical applications, surpassing GPT-4 and GPT-4o in accuracy on the test set. These results validate the reliability of the proposed pipeline for training scenario-specific function-calling models.
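The abstract reports test-set accuracy but does not say how a function call is scored. As a minimal sketch, assuming exact match on the function name and arguments (an assumption, not the authors' stated metric), accuracy could be computed as below. The tool names `query_leave_balance` and `submit_leave` and the JSON layout with `name` and `arguments` keys are hypothetical, chosen only to illustrate an HR-style scenario.

```python
import json


def parse_call(raw):
    """Parse a JSON-encoded function call into (name, arguments).

    Returns None when the text is not valid JSON or lacks a "name" key.
    The {"name": ..., "arguments": {...}} layout is an assumed format.
    """
    try:
        call = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return None
    if not isinstance(call, dict) or "name" not in call:
        return None
    return call["name"], call.get("arguments", {})


def call_accuracy(predictions, references):
    """Fraction of predictions whose name and arguments match the reference.

    Argument dictionaries are compared by content, so key order is ignored.
    A malformed prediction always counts as a miss.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    hits = 0
    for pred_raw, ref_raw in zip(predictions, references):
        pred = parse_call(pred_raw)
        if pred is not None and pred == parse_call(ref_raw):
            hits += 1
    return hits / len(references)


if __name__ == "__main__":
    # Hypothetical model outputs and gold labels.
    predictions = [
        '{"name": "query_leave_balance", "arguments": {"employee_id": "E1", "year": 2024}}',
        '{"name": "submit_leave", "arguments": {"days": 2}}',
        "not a function call",
    ]
    references = [
        '{"name": "query_leave_balance", "arguments": {"year": 2024, "employee_id": "E1"}}',
        '{"name": "submit_leave", "arguments": {"days": 3}}',
        '{"name": "submit_leave", "arguments": {"days": 1}}',
    ]
    print(f"accuracy = {call_accuracy(predictions, references):.3f}")
```

Exact match is deliberately strict: a call with the right function but one wrong argument value counts as a miss, which reflects that an agent would execute the wrong action in that case.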
Related papers
- SeBS-Flow: Benchmarking Serverless Cloud Function Workflows [51.4200085836966]
We propose the first serverless workflow benchmarking suite SeBS-Flow.
SeBS-Flow includes six real-world application benchmarks and four microbenchmarks representing different computational patterns.
We conduct comprehensive evaluations on three major cloud platforms, assessing performance, cost, scalability, and runtime deviations.
arXiv Detail & Related papers (2024-10-04T14:52:18Z)
- ToolACE: Winning the Points of LLM Function Calling [139.07157814653638]
ToolACE is an automatic agentic pipeline designed to generate accurate, complex, and diverse tool-learning data.
We demonstrate that models trained on our synthesized data, even with only 8B parameters, achieve state-of-the-art performance on the Berkeley Function-Calling Leaderboard.
arXiv Detail & Related papers (2024-09-02T03:19:56Z)
- APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets [99.8988504388011]
APIGen is an automated data generation pipeline designed to synthesize verifiable high-quality datasets for function-calling applications.
We leverage APIGen and collect 3,673 executable APIs across 21 different categories to generate diverse function-calling datasets.
We release a dataset containing 60,000 high-quality entries, aiming to advance the field of function-calling agent domains.
arXiv Detail & Related papers (2024-06-26T17:49:11Z)
- Model-driven realization of IDTA submodel specifications: The good, the bad, the incompatible? [49.60138105915326]
Asset Administration Shells are trending in Industry 4.0.
In February 2024, the Industrial Digital Twin Association announced 84 and released 18 AAS submodel specifications.
We present a model-driven approach, which transforms extracted information from IDTA specifications into an intermediary meta-model and, from there, generates API code and tests.
arXiv Detail & Related papers (2024-06-20T16:33:46Z)
- Event Stream GPT: A Data Pre-processing and Modeling Library for Generative, Pre-trained Transformers over Continuous-time Sequences of Complex Events [2.9330609943398525]
Event Stream GPT (ESGPT) is an open-source library designed to streamline the end-to-end process for building GPTs for continuous-time event sequences.
ESGPT allows users to build flexible, foundation-model scale input datasets by specifying only a minimal configuration file.
arXiv Detail & Related papers (2023-06-20T14:01:29Z)
- Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures [68.91874045918112]
adapter-ALBERT is an efficient model optimization for maximal data reuse across different tasks.
We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z)
- SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines.
This approach, however, does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production-grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
- PipeSim: Trace-driven Simulation of Large-Scale AI Operations Platforms [4.060731229044571]
We present a trace-driven simulation-based experimentation and analytics environment for large-scale AI systems.
Analytics data from a production-grade AI platform developed at IBM are used to build a comprehensive simulation model.
We implement the model in a standalone, discrete event simulator, and provide a toolkit for running experiments.
arXiv Detail & Related papers (2020-06-22T19:55:37Z)