ContextAgent: Context-Aware Proactive LLM Agents with Open-World Sensory Perceptions
- URL: http://arxiv.org/abs/2505.14668v1
- Date: Tue, 20 May 2025 17:55:25 GMT
- Title: ContextAgent: Context-Aware Proactive LLM Agents with Open-World Sensory Perceptions
- Authors: Bufang Yang, Lilin Xu, Liekang Zeng, Kaiwei Liu, Siyang Jiang, Wenrui Lu, Hongkai Chen, Xiaofan Jiang, Guoliang Xing, Zhenyu Yan
- Abstract summary: We introduce ContextAgent, the first context-aware proactive agent. ContextAgent first extracts multi-dimensional contexts from massive sensory perceptions on wearables. It then leverages the sensory contexts and the persona contexts from historical data to predict the necessity for proactive services.
- Score: 4.664491157185575
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in Large Language Models (LLMs) have propelled intelligent agents from reactive responses to proactive support. While promising, existing proactive agents either rely exclusively on observations from enclosed environments (e.g., desktop UIs) with direct LLM inference or employ rule-based proactive notifications, leading to suboptimal user intent understanding and limited functionality for proactive service. In this paper, we introduce ContextAgent, the first context-aware proactive agent that incorporates extensive sensory contexts to enhance the proactive capabilities of LLM agents. ContextAgent first extracts multi-dimensional contexts from massive sensory perceptions on wearables (e.g., video and audio) to understand user intentions. ContextAgent then leverages the sensory contexts and the persona contexts from historical data to predict the necessity for proactive services. When proactive assistance is needed, ContextAgent further automatically calls the necessary tools to assist users unobtrusively. To evaluate this new task, we curate ContextAgentBench, the first benchmark for evaluating context-aware proactive LLM agents, covering 1,000 samples across nine daily scenarios and twenty tools. Experiments on ContextAgentBench show that ContextAgent outperforms baselines by achieving up to 8.5% and 6.0% higher accuracy in proactive predictions and tool calling, respectively. We hope our research can inspire the development of more advanced, human-centric, proactive AI assistants.
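The abstract describes a three-stage pipeline: extract sensory contexts from wearable perceptions, combine them with persona contexts to decide whether proactive help is warranted, and, if so, call tools. Below is a minimal sketch of how such a loop could be wired together; all names (Perception, extract_sensory_contexts, the prompts, the llm and tools interfaces) are illustrative assumptions and do not reproduce the paper's actual implementation.

```python
# Hypothetical sketch of a ContextAgent-style proactive loop.
# Names and prompts are assumptions for illustration only.
from dataclasses import dataclass, field

@dataclass
class Perception:
    video_caption: str     # e.g. output of a wearable-camera captioner
    audio_transcript: str  # e.g. ASR transcript from the wearable microphone

@dataclass
class PersonaContext:
    history: list[str] = field(default_factory=list)  # past activities / preferences

def extract_sensory_contexts(p: Perception) -> dict:
    """Turn raw perceptions into multi-dimensional context fields."""
    return {"scene": p.video_caption, "speech": p.audio_transcript}

def needs_proactive_service(llm, sensory: dict, persona: PersonaContext) -> bool:
    """Ask the LLM whether proactive assistance is warranted right now."""
    prompt = (
        "Sensory context: " + str(sensory) + "\n"
        "Persona context: " + "; ".join(persona.history) + "\n"
        "Should the assistant proactively help? Answer yes or no."
    )
    return llm(prompt).strip().lower().startswith("yes")

def run_step(llm, tools: dict, perception: Perception, persona: PersonaContext):
    sensory = extract_sensory_contexts(perception)
    if not needs_proactive_service(llm, sensory, persona):
        return None  # stay silent; no assistance needed
    # Let the LLM pick a tool and an argument from the available set.
    choice = llm("Pick one tool from " + str(list(tools)) +
                 " and an argument for context " + str(sensory) +
                 ". Reply as 'tool: argument'.")
    name, _, arg = choice.partition(":")
    tool = tools.get(name.strip())
    return tool(arg.strip()) if tool else None
```

A caller would supply an `llm` callable (any chat-completion wrapper returning a string) and a `tools` dict mapping tool names to functions, e.g. `{"calendar": add_event}`; both are placeholders for whatever the deployed system actually uses.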
Related papers
- AgentXploit: End-to-End Redteaming of Black-Box AI Agents [54.29555239363013]
We propose a generic black-box fuzzing framework, AgentXploit, to automatically discover and exploit indirect prompt injection vulnerabilities. We evaluate AgentXploit on two public benchmarks, AgentDojo and VWA-adv, where it achieves 71% and 70% success rates against agents based on o3-mini and GPT-4o. We apply our attacks in real-world environments, successfully misleading agents to navigate to arbitrary URLs, including malicious sites.
arXiv Detail & Related papers (2025-05-09T07:40:17Z) - AgentDAM: Privacy Leakage Evaluation for Autonomous Web Agents [75.85554113398626]
We introduce a new benchmark, AgentDAM, that measures whether AI web-navigation agents follow the privacy principle of "data minimization". Our benchmark simulates realistic web interaction scenarios end-to-end and is adaptable to all existing web navigation agents.
arXiv Detail & Related papers (2025-03-12T19:30:31Z) - AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents [52.13695464678006]
This study enhances an LLM-based web agent by simply refining its observation and action space.
AgentOccam surpasses the previous state-of-the-art and concurrent work by 9.8 (+29.4%) and 5.9 (+15.8%) absolute points, respectively.
arXiv Detail & Related papers (2024-10-17T17:50:38Z) - Proactive Agent: Shifting LLM Agents from Reactive Responses to Active Assistance [95.03771007780976]
We tackle the challenge of developing proactive agents capable of anticipating and initiating tasks without explicit human instructions. First, we collect real-world human activities to generate proactive task predictions. These predictions are labeled by human annotators as either accepted or rejected. The labeled data is used to train a reward model that simulates human judgment (a minimal sketch of this last step appears after the related-papers list below).
arXiv Detail & Related papers (2024-10-16T08:24:09Z) - Agent-as-a-Judge: Evaluate Agents with Agents [61.33974108405561]
We introduce the Agent-as-a-Judge framework, wherein agentic systems are used to evaluate agentic systems.
This is an organic extension of the LLM-as-a-Judge framework, incorporating agentic features that enable intermediate feedback for the entire task-solving process.
We present DevAI, a new benchmark of 55 realistic automated AI development tasks.
arXiv Detail & Related papers (2024-10-14T17:57:02Z) - ProductAgent: Benchmarking Conversational Product Search Agent with Asking Clarification Questions [68.81939215223818]
ProductAgent is a conversational information seeking agent equipped with abilities of strategic clarification question generation and dynamic product retrieval.
We develop the agent with strategies for product feature summarization, query generation, and product retrieval.
Experiments show that ProductAgent interacts positively with the user and enhances retrieval performance with increasing dialogue turns.
arXiv Detail & Related papers (2024-07-01T03:50:23Z) - CAAP: Context-Aware Action Planning Prompting to Solve Computer Tasks with Front-End UI Only [21.054681757006385]
We propose an agent that perceives its environment solely through screenshot images. By leveraging the reasoning capability of Large Language Models, we eliminate the need for large-scale human demonstration data. The agent achieves an average success rate of 94.5% on MiniWoB++ and an average task score of 62.3 on WebShop.
arXiv Detail & Related papers (2024-06-11T05:21:20Z) - Exploring Large Language Model based Intelligent Agents: Definitions, Methods, and Prospects [32.91556128291915]
This paper surveys current research to provide an in-depth overview of intelligent agents within single and multi-agent systems.
It covers their definitions, research frameworks, and foundational components such as their composition, cognitive and planning methods, tool utilization, and responses to environmental feedback.
We conclude by envisioning prospects for LLM-based agents, considering the evolving landscape of AI and natural language processing.
arXiv Detail & Related papers (2024-01-07T09:08:24Z) - KwaiAgents: Generalized Information-seeking Agent System with Large Language Models [33.59597020276034]
Humans excel in critical thinking, planning, reflection, and harnessing available tools to interact with and interpret the world.
Recent advancements in large language models (LLMs) suggest that machines might also possess the aforementioned human-like capabilities.
We introduce KwaiAgents, a generalized information-seeking agent system based on LLMs.
arXiv Detail & Related papers (2023-12-08T08:11:11Z) - Improving Knowledge Extraction from LLMs for Task Learning through Agent Analysis [4.055489363682198]
Large language models (LLMs) offer significant promise as a knowledge source for task learning.
Prompt engineering has been shown to be effective for eliciting knowledge from an LLM, but alone it is insufficient for acquiring relevant, situationally grounded knowledge for an embodied agent learning novel tasks.
We describe a cognitive-agent approach, STARS, that extends and complements prompt engineering, mitigating its limitations and thus enabling an agent to acquire new task knowledge matched to its native language capabilities, embodiment, environment, and user preferences.
arXiv Detail & Related papers (2023-06-11T20:50:14Z)
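As noted in the Proactive Agent entry above, that work trains a reward model on proactive task predictions labeled accepted or rejected by human annotators. The sketch below illustrates that step with a simple text classifier standing in for the reward model; the toy data and the TF-IDF plus logistic-regression choice are assumptions for illustration, not the paper's actual setup.

```python
# Hypothetical reward-model sketch: fit a classifier on (prediction, label)
# pairs so it approximates human judgments of proactive task proposals.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-ins for annotated proactive task predictions.
predictions = [
    "Remind the user to leave for the 3pm meeting",
    "Suggest a restaurant while the user is focused on coding",
]
labels = [1, 0]  # 1 = accepted by annotators, 0 = rejected

reward_model = make_pipeline(TfidfVectorizer(), LogisticRegression())
reward_model.fit(predictions, labels)

# Score a candidate proactive task; higher means the simulated human judge
# is more likely to accept the agent's initiative.
score = reward_model.predict_proba(["Remind the user about tomorrow's flight"])[0, 1]
print(f"acceptance score: {score:.2f}")
```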