AUTONODE: A Neuro-Graphic Self-Learnable Engine for Cognitive GUI Automation
- URL: http://arxiv.org/abs/2403.10171v2
- Date: Mon, 27 May 2024 05:03:09 GMT
- Title: AUTONODE: A Neuro-Graphic Self-Learnable Engine for Cognitive GUI Automation
- Authors: Arkajit Datta, Tushar Verma, Rajat Chawla, Mukunda N. S, Ishaan Bhola,
- Abstract summary: Autonomous User-interface Transformation through Online Neuro-graphic Operations and Deep Exploration.
Our engine empowers agents to comprehend and implement complex, adapting to dynamic web environments with unparalleled efficiency.
The versatility and efficacy of AUTONODE are demonstrated through a series of experiments, highlighting its proficiency in managing a diverse array of web-based tasks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent advancements within the domain of Large Language Models (LLMs), there has been a notable emergence of agents capable of addressing Robotic Process Automation (RPA) challenges through enhanced cognitive capabilities and sophisticated reasoning. This development heralds a new era of scalability and human-like adaptability in goal attainment. In this context, we introduce AUTONODE (Autonomous User-interface Transformation through Online Neuro-graphic Operations and Deep Exploration). AUTONODE employs advanced neuro-graphical techniques to facilitate autonomous navigation and task execution on web interfaces, thereby obviating the necessity for predefined scripts or manual intervention. Our engine empowers agents to comprehend and implement complex workflows, adapting to dynamic web environments with unparalleled efficiency. Our methodology synergizes cognitive functionalities with robotic automation, endowing AUTONODE with the ability to learn from experience. We have integrated an exploratory module, DoRA (Discovery and mapping Operation for graph Retrieval Agent), which is instrumental in constructing a knowledge graph that the engine utilizes to optimize its actions and achieve objectives with minimal supervision. The versatility and efficacy of AUTONODE are demonstrated through a series of experiments, highlighting its proficiency in managing a diverse array of web-based tasks, ranging from data extraction to transaction processing.
Related papers
- Neuro-LIFT: A Neuromorphic, LLM-based Interactive Framework for Autonomous Drone FlighT at the Edge [9.461346539158475]
We present Neuro-LIFT, a real-time neuromorphic navigation framework implemented on a Parrot Bebop quadrotor2.
Our framework translates human speech into high-level planning commands which are then autonomously executed using event-based neuromorphic vision and physics-driven planning.
Our framework demonstrates its capabilities in navigating in a dynamic environment, avoiding obstacles, and adapting to human instructions in real-time.
arXiv Detail & Related papers (2025-01-31T16:17:03Z) - AIOpsLab: A Holistic Framework to Evaluate AI Agents for Enabling Autonomous Clouds [12.464941027105306]
AI for IT Operations (AIOps) aims to automate complex operational tasks, such as fault localization and root cause analysis, to reduce human workload and minimize customer impact.
Recent advances in Large Language Models (LLMs) and AI agents are revolutionizing AIOps by enabling end-to-end and multitask automation.
We present AIOPSLAB, a framework that deploys microservice cloud environments, injects faults, generates workloads, and exports telemetry data but also orchestrates these components and provides interfaces for interacting with and evaluating agents.
arXiv Detail & Related papers (2025-01-12T04:17:39Z) - AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials [53.376263056033046]
We propose a scalable data synthesis pipeline that generates high-quality GUI agent trajectories by leveraging web tutorials.
Our method automatically gathers tutorial-like texts from the internet, transforms them into task goals with step-by-step instructions, and employs a visual-language model agent.
A VLM-based evaluator ensures the correctness of the generated trajectories.
arXiv Detail & Related papers (2024-12-12T18:59:27Z) - Collaborative Instance Navigation: Leveraging Agent Self-Dialogue to Minimize User Input [54.81155589931697]
We propose a new task, Collaborative Instance Navigation (CoIN), with dynamic agent-human interaction during navigation.
To address CoIN, we propose a novel method, Agent-user Interaction with UncerTainty Awareness (AIUTA)
AIUTA achieves competitive performance in instance navigation against state-of-the-art methods, demonstrating great flexibility in handling user inputs.
arXiv Detail & Related papers (2024-12-02T08:16:38Z) - Imperative Learning: A Self-supervised Neuro-Symbolic Learning Framework for Robot Autonomy [31.818923556912495]
We introduce a new self-supervised neuro-symbolic (NeSy) computational framework, imperative learning (IL) for robot autonomy.
We formulate IL as a special bilevel optimization (BLO) which enables reciprocal learning over the three modules.
We show that IL can significantly enhance robot autonomy capabilities and we anticipate that it will catalyze further research across diverse domains.
arXiv Detail & Related papers (2024-06-23T12:02:17Z) - An Interactive Agent Foundation Model [49.77861810045509]
We propose an Interactive Agent Foundation Model that uses a novel multi-task agent training paradigm for training AI agents.
Our training paradigm unifies diverse pre-training strategies, including visual masked auto-encoders, language modeling, and next-action prediction.
We demonstrate the performance of our framework across three separate domains -- Robotics, Gaming AI, and Healthcare.
arXiv Detail & Related papers (2024-02-08T18:58:02Z) - Incremental procedural and sensorimotor learning in cognitive humanoid
robots [52.77024349608834]
This work presents a cognitive agent that can learn procedures incrementally.
We show the cognitive functions required in each substage and how adding new functions helps address tasks previously unsolved by the agent.
Results show that this approach is capable of solving complex tasks incrementally.
arXiv Detail & Related papers (2023-04-30T22:51:31Z) - Deep Active Learning for Computer Vision: Past and Future [50.19394935978135]
Despite its indispensable role for developing AI models, research on active learning is not as intensive as other research directions.
By addressing data automation challenges and coping with automated machine learning systems, active learning will facilitate democratization of AI technologies.
arXiv Detail & Related papers (2022-11-27T13:07:14Z) - Backprop-Free Reinforcement Learning with Active Neural Generative
Coding [84.11376568625353]
We propose a computational framework for learning action-driven generative models without backpropagation of errors (backprop) in dynamic environments.
We develop an intelligent agent that operates even with sparse rewards, drawing inspiration from the cognitive theory of planning as inference.
The robust performance of our agent offers promising evidence that a backprop-free approach for neural inference and learning can drive goal-directed behavior.
arXiv Detail & Related papers (2021-07-10T19:02:27Z) - Modular approach to data preprocessing in ALOHA and application to a
smart industry use case [0.0]
The paper addresses a modular approach, integrated into the ALOHA tool flow, to support the data preprocessing and transformation pipeline.
To demonstrate the effectiveness of the approach, we present some experimental results related to a keyword spotting use case.
arXiv Detail & Related papers (2021-02-02T06:48:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.