TinyAgent: Function Calling at the Edge
- URL: http://arxiv.org/abs/2409.00608v3
- Date: Fri, 25 Oct 2024 01:16:55 GMT
- Title: TinyAgent: Function Calling at the Edge
- Authors: Lutfi Eren Erdogan, Nicholas Lee, Siddharth Jha, Sehoon Kim, Ryan Tabrizi, Suhong Moon, Coleman Hooper, Gopala Anumanchipalli, Kurt Keutzer, Amir Gholami
- Abstract summary: We present an end-to-end framework for training and deploying task-specific small language model agents capable of function calling for driving agentic systems at the edge.
As a driving application, we demonstrate a local Siri-like system for Apple's MacBook that can execute user commands through text or voice input.
- Score: 32.174966522801746
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract:   Recent large language models (LLMs) have enabled the development of advanced agentic systems that can integrate various tools and APIs to fulfill user queries through function calling. However, the deployment of these LLMs on the edge has not been explored since they typically require cloud-based infrastructure due to their substantial model size and computational demands. To this end, we present TinyAgent, an end-to-end framework for training and deploying task-specific small language model agents capable of function calling for driving agentic systems at the edge. We first show how to enable accurate function calling for open-source models via the LLMCompiler framework. We then systematically curate a high-quality dataset for function calling, which we use to fine-tune two small language models, TinyAgent-1.1B and 7B. For efficient inference, we introduce a novel tool retrieval method to reduce the input prompt length and utilize quantization to further accelerate the inference speed. As a driving application, we demonstrate a local Siri-like system for Apple's MacBook that can execute user commands through text or voice input. Our results show that our models can achieve, and even surpass, the function-calling capabilities of larger models like GPT-4-Turbo, while being fully deployed at the edge. We open-source our dataset, models, and installable package and provide a demo video for our MacBook assistant agent. 
 
      
Related papers
        - Apple Intelligence Foundation Language Models: Tech Report 2025 [246.04717786298764]
 We introduce two foundation language models that power Apple Intelligence features across Apple devices and services.
Both models are trained on large-scale multilingual and multimodal datasets sourced via responsible web crawling.
A new Swift-centric Foundation Models framework exposes guided generation, constrained tool calling, and LoRA adapter fine-tuning.
 arXiv  Detail & Related papers  (2025-07-17T23:37:19Z)
- Small Models, Big Tasks: An Exploratory Empirical Study on Small Language Models for Function Calling [6.102559098873098]
 Function calling is a complex task with widespread applications in domains such as information retrieval, software engineering and automation.
Large Language Models (LLMs) can automate this process but are computationally expensive and impractical in resource-constrained settings.
Small Language Models (SLMs) can operate efficiently, offering faster response times and lower computational demands.
 arXiv  Detail & Related papers  (2025-04-27T15:26:51Z)
- CAMPHOR: Collaborative Agents for Multi-input Planning and High-Order Reasoning On Device [2.4100803794273005]
 We introduce an on-device Small Language Models (SLMs) framework designed to handle multiple user inputs and reason over personal context locally.
 CAMPHOR employs a hierarchical architecture where a high-order reasoning agent decomposes complex tasks and coordinates expert agents responsible for personal context retrieval, tool interaction, and dynamic plan generation.
By implementing parameter sharing across agents and leveraging prompt compression, we significantly reduce model size, latency, and memory usage.
 arXiv  Detail & Related papers  (2024-10-12T07:28:10Z)
- Granite-Function Calling Model: Introducing Function Calling Abilities via Multi-task Learning of Granular Tasks [35.97890508648945]
 We introduce the GRANITE-20B-FUNCTIONCALLING model under an Apache 2.0 license.
The model is trained using a multi-task training approach on seven fundamental tasks.
We show that GRANITE-20B-FUNCTIONCALLING has better generalizability on multiple tasks in seven different evaluation datasets.
 arXiv  Detail & Related papers  (2024-06-27T17:47:26Z)
- Small Agent Can Also Rock! Empowering Small Language Models as Hallucination Detector [114.88975874411142]
 Hallucination detection is a challenging task for large language models (LLMs).
We propose an autonomous LLM-based agent framework, called HaluAgent.
In HaluAgent, we integrate the LLM, multi-functional toolbox, and design a fine-grained three-stage detection framework.
 arXiv  Detail & Related papers  (2024-06-17T07:30:05Z)
- CAAP: Context-Aware Action Planning Prompting to Solve Computer Tasks with Front-End UI Only [21.054681757006385]
 We propose an agent that perceives its environment solely through screenshot images.
By leveraging the reasoning capability of the Large Language Models, we eliminate the need for large-scale human demonstration data.
The agent achieves an average success rate of 94.5% on MiniWoB++ and an average task score of 62.3 on WebShop.
 arXiv  Detail & Related papers  (2024-06-11T05:21:20Z)
- MIMIR: A Streamlined Platform for Personalized Agent Tuning in Domain Expertise [49.83486066403154]
 Mimir is a streamlined platform offering a customizable pipeline for personalized agent tuning.
Mimir supports the generation of general instruction-tuning datasets from the same input.
Mimir integrates these features into a cohesive end-to-end platform, facilitating everything from the uploading of personalized files to one-click agent fine-tuning.
 arXiv  Detail & Related papers  (2024-04-03T23:42:38Z)
- DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model [90.71963723884944]
 Text-to-image (T2I) generative models have attracted significant attention and found extensive applications within and beyond academic research.
We introduce DiffAgent, an agent designed to screen the accurate selection in seconds via API calls.
Our evaluations reveal that DiffAgent not only excels in identifying the appropriate T2I API but also underscores the effectiveness of the SFTA training framework.
 arXiv  Detail & Related papers  (2024-03-31T06:28:15Z)
- ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models [74.64651681052628]
 We introduce ModelScope-Agent, a customizable agent framework for real-world applications based on open-source LLMs as controllers.
It provides a user-friendly system library, with customizable engine design to support model training on multiple open-source LLMs.
A comprehensive framework is proposed, spanning tool-use data collection, tool retrieval, tool registration, memory control, customized model training, and evaluation.
 arXiv  Detail & Related papers  (2023-09-02T16:50:30Z)
- Recommender AI Agent: Integrating Large Language Models for Interactive Recommendations [53.76682562935373]
 We introduce an efficient framework called InteRecAgent, which employs LLMs as the brain and recommender models as tools.
InteRecAgent achieves satisfying performance as a conversational recommender system, outperforming general-purpose LLMs.
 arXiv  Detail & Related papers  (2023-08-31T07:36:44Z)
- Prompt2Model: Generating Deployable Models from Natural Language Instructions [74.19816829003729]
 Large language models (LLMs) enable system builders to create competent NLP systems through prompting.
In other ways, LLMs are a step backward from traditional special-purpose NLP models.
We propose Prompt2Model, a general-purpose method that takes a natural language task description like the prompts provided to LLMs.
 arXiv  Detail & Related papers  (2023-08-23T17:28:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
       
     