Enhancing Function-Calling Capabilities in LLMs: Strategies for Prompt Formats, Data Integration, and Multilingual Translation
- URL: http://arxiv.org/abs/2412.01130v2
- Date: Wed, 04 Dec 2024 03:34:42 GMT
- Title: Enhancing Function-Calling Capabilities in LLMs: Strategies for Prompt Formats, Data Integration, and Multilingual Translation
- Authors: Yi-Chang Chen, Po-Chun Hsu, Chan-Jan Hsu, Da-shan Shiu
- Abstract summary: Large language models (LLMs) have significantly advanced autonomous agents, particularly in function calling.
This research delves into enhancing the function-calling capabilities of LLMs by exploring different approaches.
- Score: 15.259077785780667
- License:
- Abstract: Large language models (LLMs) have significantly advanced autonomous agents, particularly in zero-shot tool usage, also known as function calling. This research delves into enhancing the function-calling capabilities of LLMs by exploring different approaches, including prompt formats for integrating function descriptions, blending function-calling and instruction-following data, introducing a novel Decision Token for conditional prompts, leveraging chain-of-thought reasoning, and overcoming multilingual challenges with a translation pipeline. Our key findings and contributions are as follows: (1) Instruction-following data improves both function-calling accuracy and relevance detection. (2) The use of the newly proposed Decision Token, combined with synthetic non-function-call data, enhances relevance detection. (3) A tailored translation pipeline effectively overcomes multilingual limitations, demonstrating significant improvements in Traditional Chinese. These insights highlight the potential for improved function-calling capabilities and multilingual applications in LLMs.
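Two of the prompt-side ideas above, injecting function descriptions into the prompt and prefixing generation with a Decision Token that commits the model to calling or not calling a function, can be illustrated with a minimal sketch. The tag names, JSON schema, and helper below are assumptions for illustration, not the paper's actual template.

```python
# Illustrative sketch only: the abstract does not give the exact template, so
# the tags (<|function_list|>, <|use_tool|>, <|no_tool|>) and the JSON schema
# are assumptions. The point is (1) function descriptions are injected into the
# prompt and (2) the model first emits a Decision Token committing to either
# "call a function" or "answer directly".
import json

functions = [{
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

def build_prompt(user_query: str) -> str:
    """Assemble a conditional prompt: available functions, then the query."""
    return (
        "<|function_list|>\n"
        + json.dumps(functions, ensure_ascii=False, indent=2)
        + f"\n<|user|>\n{user_query}\n<|assistant|>\n"
    )

# Hypothetical completions under this scheme:
#   tool-relevant query   -> "<|use_tool|>" followed by a JSON function call
#   tool-irrelevant query -> "<|no_tool|>"  followed by a normal answer
print(build_prompt("What's the weather in Taipei right now?"))
```

Under such a scheme, relevance detection reduces to predicting a single token before any function arguments are generated.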
Related papers
- Probing Large Language Models in Reasoning and Translating Complex Linguistic Puzzles [0.6144680854063939]
This paper investigates the utilization of Large Language Models (LLMs) for solving complex linguistic puzzles.
Using datasets from the Puzzling Machine Competition and various Linguistics Olympiads, we employ a comprehensive set of metrics to assess the performance of GPT-4 0603.
arXiv Detail & Related papers (2025-02-02T14:53:14Z)
- Empowering Large Language Models in Wireless Communication: A Novel Dataset and Fine-Tuning Framework [81.29965270493238]
We develop a specialized dataset aimed at enhancing the evaluation and fine-tuning of large language models (LLMs) for wireless communication applications.
The dataset comprises a diverse set of multi-hop questions, including true/false and multiple-choice types, spanning difficulty levels from easy to hard.
We introduce a Pointwise V-Information (PVI) based fine-tuning method, providing a detailed theoretical analysis and justification for its use in quantifying the information content of training data (a minimal PVI sketch follows this entry).
arXiv Detail & Related papers (2025-01-16T16:19:53Z)
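For context, pointwise V-information scores how much easier an input makes its target to predict. The sketch below shows one common way to compute it from the log-likelihoods of two models, one fine-tuned with the input visible and one with the input withheld; the helper name and example numbers are assumptions, and the paper's exact estimator may differ.

```python
# Hedged sketch of pointwise V-information (PVI) for one training example (x, y).
# Assumes two models: g, fine-tuned with the input x visible, and g_null,
# fine-tuned with the input withheld. Both arguments are natural-log
# likelihoods of the target y under the respective model.
import math

def pvi(logp_y_given_x: float, logp_y_given_null: float) -> float:
    """PVI(x -> y) = log2 p_g(y | x) - log2 p_g_null(y | null), in bits."""
    return (logp_y_given_x - logp_y_given_null) / math.log(2)

# Example: seeing x makes y far more likely, so the example is informative.
print(pvi(logp_y_given_x=-1.2, logp_y_given_null=-5.0))  # ~5.48 bits
```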
- Language Fusion for Parameter-Efficient Cross-lingual Transfer [21.96231169571248]
Fusion for Language Representations (FLARE) is a novel method that enhances representation quality and downstream performance for languages other than English.
FLARE integrates source and target language representations within low-rank (LoRA) adapters using lightweight linear transformations.
A series of experiments across representative cross-lingual natural language understanding tasks, including natural language inference, question answering, and sentiment analysis, demonstrates FLARE's effectiveness (an illustrative fusion sketch follows this entry).
arXiv Detail & Related papers (2025-01-12T18:02:29Z)
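One plausible reading of that description is a LoRA-style adapter whose low-rank bottleneck mixes the target-language hidden state with the corresponding source-language (e.g., English) one through a single linear map. The sketch below is an assumption-laden illustration of that idea, not the authors' implementation; all module names and shapes are made up.

```python
# Sketch (not the authors' code) of fusing source- and target-language hidden
# states inside a LoRA-style adapter: both states are projected to the
# low-rank space, combined by one lightweight linear map, and added back to
# the target representation residually.
import torch
import torch.nn as nn

class FusedLoRAAdapter(nn.Module):
    def __init__(self, hidden: int, rank: int = 8):
        super().__init__()
        self.down = nn.Linear(hidden, rank, bias=False)    # LoRA down-projection
        self.up = nn.Linear(rank, hidden, bias=False)      # LoRA up-projection
        self.fuse = nn.Linear(2 * rank, rank, bias=False)  # lightweight fusion map
        nn.init.zeros_(self.up.weight)  # adapter starts as a no-op update

    def forward(self, h_target: torch.Tensor, h_source: torch.Tensor) -> torch.Tensor:
        # Mix the two languages' representations in the low-rank bottleneck.
        z = torch.cat([self.down(h_target), self.down(h_source)], dim=-1)
        return h_target + self.up(self.fuse(z))

adapter = FusedLoRAAdapter(hidden=768)
h_tgt, h_src = torch.randn(1, 10, 768), torch.randn(1, 10, 768)
print(adapter(h_tgt, h_src).shape)  # torch.Size([1, 10, 768])
```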
- ADC: Enhancing Function Calling Via Adversarial Datasets and Code Line-Level Feedback [27.197208975799334]
Large Language Models (LLMs) have made significant strides in Natural Language Processing and coding, yet they struggle with robustness and accuracy in complex function calls.
This paper introduces ADC, an innovative approach that enhances LLMs' ability to follow function formats and match complex parameters.
arXiv Detail & Related papers (2024-12-23T18:07:18Z)
- Alopex: A Computational Framework for Enabling On-Device Function Calls with LLMs [31.961168273386757]
Alopex is a framework that enables precise on-device function calls using the Fox Large Language Models.
A data mixing strategy is used to mitigate catastrophic forgetting, combining function-call data with textbook datasets to enhance performance in various tasks (a toy mixing sketch follows this entry).
arXiv Detail & Related papers (2024-11-07T22:15:17Z)
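The data-mixing idea, interleaving function-call examples with general text so the model keeps its general abilities, can be sketched as a simple sampling routine. The 25% function-call ratio and all names below are assumptions for illustration, not values reported in the paper.

```python
# Toy sketch of mixing function-call data with general "textbook" data to
# mitigate catastrophic forgetting. Ratio and names are assumptions.
import random

def mix_datasets(function_call_data, textbook_data, fc_ratio=0.25, seed=0):
    """Build a shuffled training stream with roughly fc_ratio function-call examples."""
    rng = random.Random(seed)
    fc, tb = list(function_call_data), list(textbook_data)
    rng.shuffle(fc)
    rng.shuffle(tb)
    mixed = []
    while fc or tb:
        take_fc = fc and (not tb or rng.random() < fc_ratio)
        mixed.append(fc.pop() if take_fc else tb.pop())
    return mixed

print(mix_datasets(["call-1", "call-2"], ["text-1", "text-2", "text-3", "text-4"]))
```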
- Improving Small-Scale Large Language Models Function Calling for Reasoning Tasks [0.8425561594225592]
This study introduces a novel framework for training smaller language models in function calling.
It focuses on specific logical and mathematical reasoning tasks.
The approach aims to improve the performance of small-scale models on these tasks using function calling.
arXiv Detail & Related papers (2024-10-24T16:27:35Z)
- RA-BLIP: Multimodal Adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training [55.54020926284334]
Multimodal Large Language Models (MLLMs) have recently received substantial interest, reflecting their emerging potential as general-purpose models for various vision-language tasks.
Retrieval augmentation techniques have proven to be effective plugins for both LLMs and MLLMs.
In this study, we propose multimodal adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training (RA-BLIP), a novel retrieval-augmented framework for various MLLMs.
arXiv Detail & Related papers (2024-10-18T03:45:19Z)
- LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models [56.25156596019168]
This paper introduces the LMRL-Gym benchmark for evaluating multi-turn RL for large language models (LLMs).
Our benchmark consists of 8 different language tasks, which require multiple rounds of language interaction and cover a range of tasks in open-ended dialogue and text games.
arXiv Detail & Related papers (2023-11-30T03:59:31Z)
- Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD, a parallel and large-scale multilingual conversation dataset, for cross-lingual alignment pretraining.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z)
- Offline RL for Natural Language Generation with Implicit Language Q Learning [87.76695816348027]
Large language models can be inconsistent when it comes to completing user-specified tasks.
We propose a novel offline RL method, Implicit Language Q-Learning (ILQL), that combines the flexible utility framework of RL with the ability of supervised learning to leverage previously collected data.
In addition to empirically validating ILQL, we present a detailed empirical analysis of situations where offline RL can be useful in natural language generation settings.
arXiv Detail & Related papers (2022-06-05T18:38:42Z)
- A Framework of Meta Functional Learning for Regularising Knowledge Transfer [89.74127682599898]
This work proposes a novel framework of Meta Functional Learning (MFL) by meta-learning a generalisable functional model from data-rich tasks.
MFL computes meta-knowledge on functional regularisation that generalises to different learning tasks, so that functional training on limited labelled data promotes learning more discriminative functions.
arXiv Detail & Related papers (2022-03-28T15:24:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.