AI Online Filters to Real World Image Recognition
- URL: http://arxiv.org/abs/2002.08242v1
- Date: Tue, 11 Feb 2020 08:23:14 GMT
- Title: AI Online Filters to Real World Image Recognition
- Authors: Hai Xiao, Jin Shang and Mengyuan Huang
- Abstract summary: We study a novel approach that adds reinforcement controls onto image recognition reflex models to attain better overall performance.
Following a common infrastructure with environment sensing and AI-based modeling of self-adaptive agents, we implement multiple types of AI control agents.
- Score: 4.874719076317905
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep artificial neural networks trained on labeled data sets are widely
used in numerous vision and robotics applications today. In terms of AI, these
are called reflex models, referring to the fact that they do not self-evolve or
actively adapt to environmental changes. As demand for intelligent robot
control expands to many high-level tasks, reinforcement learning and state-based
models play an increasingly important role. Herein, in the computer vision
and robotics domain, we study a novel approach that adds reinforcement controls
onto image recognition reflex models to attain better overall performance,
specifically across a wider range of environments than the task reflex models
alone are expected to handle. Following a common infrastructure with environment
sensing and AI-based modeling of self-adaptive agents, we implement multiple
types of AI control agents. Finally, we provide comparative results of these
agents against a baseline, together with an insightful analysis of their benefit
for improving overall image recognition performance in the real world.
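The abstract gives no implementation details, but the control loop it describes (sense the environment, pick an online filter, score the frozen recognition model, learn from that reward) can be sketched roughly as follows. This is a minimal illustrative sketch, not the authors' code: the sensed statistics, the candidate filters, the confidence-based reward, and the tabular contextual-bandit learner are all assumptions made only for illustration.

```python
# Minimal sketch (assumptions, not the authors' code): an RL "filter" agent
# senses simple image statistics, chooses an online preprocessing filter, and
# is rewarded by the confidence of a frozen downstream recognition model.
import numpy as np

ACTIONS = ["identity", "gamma_correct", "mean_denoise"]  # candidate online filters

def sense(img):
    """Environment sensing: coarse brightness bucket and a crude noise/texture bucket."""
    brightness = int(img.mean() * 4)
    noise = int(min(np.std(np.diff(img, axis=0)), 0.24) * 12)
    return brightness, noise

def apply_filter(img, action):
    if action == "gamma_correct":
        return np.clip(img, 1e-6, 1.0) ** 0.5                     # brighten dark frames
    if action == "mean_denoise":
        padded = np.pad(img, 1, mode="edge")                      # 3x3 mean filter
        return np.array([[padded[i:i + 3, j:j + 3].mean()
                          for j in range(img.shape[1])] for i in range(img.shape[0])])
    return img

def recognizer_confidence(img):
    """Placeholder for the frozen reflex model's top-class confidence."""
    return float(np.clip(img.mean() * (1.0 - np.std(np.diff(img, axis=0))), 0.0, 1.0))

Q = {}                                   # tabular values: (sensed state, action) -> estimate
eps, lr = 0.1, 0.2
rng = np.random.default_rng(0)

for episode in range(500):
    img = rng.random((32, 32)) * rng.uniform(0.2, 1.0)            # simulated environment frame
    state = sense(img)
    if rng.random() < eps:                                        # epsilon-greedy exploration
        a = int(rng.integers(len(ACTIONS)))
    else:
        a = int(np.argmax([Q.get((state, i), 0.0) for i in range(len(ACTIONS))]))
    reward = recognizer_confidence(apply_filter(img, ACTIONS[a]))
    Q[(state, a)] = Q.get((state, a), 0.0) + lr * (reward - Q.get((state, a), 0.0))

best = {s: ACTIONS[int(np.argmax([Q.get((s, i), 0.0) for i in range(len(ACTIONS))]))]
        for s in {k[0] for k in Q}}
print(best)                              # learned filter choice per sensed environment state
```

In the paper's setting, the reward would come from an actual trained recognition network rather than the toy confidence function used here.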
Related papers
- Personalized Artificial General Intelligence (AGI) via Neuroscience-Inspired Continuous Learning Systems [3.764721243654025]
Current approaches largely depend on expanding model parameters, which improves task-specific performance but falls short in enabling continuous, adaptable, and generalized learning.
This paper reviews the state of continual learning and neuroscience-inspired AI, and proposes a novel architecture for Personalized AGI that integrates brain-like learning mechanisms for edge deployment.
Building on these insights, we outline an AI architecture that features complementary fast-and-slow learning modules, synaptic self-optimization, and memory-efficient model updates to support on-device lifelong adaptation.
arXiv Detail & Related papers (2025-04-27T16:10:17Z)
- Research and Design on Intelligent Recognition of Unordered Targets for Robots Based on Reinforcement Learning [6.3630131513288966]
This study proposes an AI-based method using reinforcement learning for intelligent robots to recognize disordered targets.
The enhanced target images are input into a deep reinforcement learning model for training, ultimately enabling the AI-based intelligent robot to efficiently recognize disordered targets.
Experimental results show that the proposed method not only significantly improves the quality of target images but also enables the AI-based intelligent robot to complete the recognition task for disordered targets with higher efficiency and accuracy.
arXiv Detail & Related papers (2025-03-10T13:53:22Z)
- Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics [50.191655141020505]
This work advances model-based reinforcement learning by addressing the challenges of long-horizon prediction, error accumulation, and sim-to-real transfer.
By providing a scalable and robust framework, the introduced methods pave the way for adaptive and efficient robotic systems in real-world applications.
arXiv Detail & Related papers (2025-01-17T10:39:09Z)
- GRAPPA: Generalizing and Adapting Robot Policies via Online Agentic Guidance [15.774237279917594]
We propose an agentic framework for robot self-guidance and self-improvement.
Our framework iteratively grounds a base robot policy to relevant objects in the environment.
We demonstrate that our approach can effectively guide manipulation policies to achieve significantly higher success rates.
arXiv Detail & Related papers (2024-10-09T02:00:37Z)
- A Survey on Vision-Language-Action Models for Embodied AI [71.16123093739932]
Vision-language-action models (VLAs) have become a foundational element in robot learning.
Various methods have been proposed to enhance traits such as versatility, dexterity, and generalizability.
VLAs serve as high-level task planners capable of decomposing long-horizon tasks into executable subtasks.
arXiv Detail & Related papers (2024-05-23T01:43:54Z)
- An Interactive Agent Foundation Model [49.77861810045509]
We propose an Interactive Agent Foundation Model that uses a novel multi-task agent training paradigm for training AI agents.
Our training paradigm unifies diverse pre-training strategies, including visual masked auto-encoders, language modeling, and next-action prediction.
We demonstrate the performance of our framework across three separate domains -- Robotics, Gaming AI, and Healthcare.
arXiv Detail & Related papers (2024-02-08T18:58:02Z)
- Agent AI: Surveying the Horizons of Multimodal Interaction [83.18367129924997]
"Agent AI" is a class of interactive systems that can perceive visual stimuli, language inputs, and other environmentally-grounded data.
We envision a future where people can easily create any virtual reality or simulated scene and interact with agents embodied within the virtual environment.
arXiv Detail & Related papers (2024-01-07T19:11:18Z)
- Neural architecture impact on identifying temporally extended Reinforcement Learning tasks [0.0]
We present attention-based architectures in the reinforcement learning (RL) domain, capable of performing well on the OpenAI Gym Atari 2600 game suite.
In attention-based models, extracting the attention map and overlaying it onto the input images allows direct observation of the information the agent uses to select actions (see the overlay sketch after this list).
In addition, motivated by recent developments in attention-based video classification models using the Vision Transformer, we also propose a Vision Transformer-based architecture for the image-based RL domain.
arXiv Detail & Related papers (2023-10-04T21:09:19Z)
- Human-oriented Representation Learning for Robotic Manipulation [64.59499047836637]
Humans inherently possess generalizable visual representations that empower them to efficiently explore and interact with the environments in manipulation tasks.
We formalize this idea through the lens of human-oriented multi-task fine-tuning on top of pre-trained visual encoders.
Our Task Fusion Decoder consistently improves the representation of three state-of-the-art visual encoders for downstream manipulation policy-learning.
arXiv Detail & Related papers (2023-10-04T17:59:38Z)
- ArK: Augmented Reality with Knowledge Interactive Emergent Ability [115.72679420999535]
We develop an infinite agent that learns to transfer knowledge memory from general foundation models to novel domains.
The heart of our approach is an emerging mechanism, dubbed Augmented Reality with Knowledge Inference Interaction (ArK).
We show that our ArK approach, combined with large foundation models, significantly improves the quality of generated 2D/3D scenes.
arXiv Detail & Related papers (2023-05-01T17:57:01Z)
- Masked World Models for Visual Control [90.13638482124567]
We introduce a visual model-based RL framework that decouples visual representation learning and dynamics learning.
We demonstrate that our approach achieves state-of-the-art performance on a variety of visual robotic tasks.
arXiv Detail & Related papers (2022-06-28T18:42:27Z)
- Low Dimensional State Representation Learning with Robotics Priors in Continuous Action Spaces [8.692025477306212]
Reinforcement learning algorithms have proven to be capable of solving complicated robotics tasks in an end-to-end fashion.
We propose a framework that combines learning a low-dimensional state representation, from the high-dimensional observations of the robot's raw sensory readings, with learning the optimal policy (see the structural sketch after this list).
arXiv Detail & Related papers (2021-07-04T15:42:01Z)
- The AI Arena: A Framework for Distributed Multi-Agent Reinforcement Learning [0.3437656066916039]
We introduce the AI Arena: a scalable framework with flexible abstractions for distributed multi-agent reinforcement learning.
We show performance gains due to a distributed multi-agent learning approach over commonly-used RL techniques in several different learning environments.
arXiv Detail & Related papers (2021-03-09T22:16:19Z)
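For the attention-map overlay mentioned in the "Neural architecture impact on identifying temporally extended Reinforcement Learning tasks" entry above, a rough illustration is given below. It is not code from that paper: the synthetic frame, the 7x7 map, the nearest-neighbour upsampling, and the red-blue colouring are placeholder assumptions, shown only to make the overlay idea concrete.

```python
# Illustrative only (not code from the cited paper): upsample a coarse
# attention map to the frame resolution and alpha-blend it over a grayscale
# frame, so the regions the agent attends to can be inspected visually.
import numpy as np

def overlay_attention(frame, attn, alpha=0.5):
    """frame: HxW grayscale in [0,1]; attn: hxw attention weights (any scale).
    Assumes the frame size is an integer multiple of the attention map size."""
    span = attn.max() - attn.min()
    attn = (attn - attn.min()) / (span + 1e-8)                    # normalize to [0,1]
    ry, rx = frame.shape[0] // attn.shape[0], frame.shape[1] // attn.shape[1]
    attn_up = np.kron(attn, np.ones((ry, rx)))                    # nearest-neighbour upsample
    rgb = np.stack([frame, frame, frame], axis=-1)                # grayscale -> RGB
    heat = np.stack([attn_up, np.zeros_like(attn_up), 1.0 - attn_up], axis=-1)  # red = high
    return (1 - alpha) * rgb + alpha * heat                       # blended visualization

frame = np.random.default_rng(1).random((84, 84))                 # stand-in for an Atari frame
attn = np.random.default_rng(2).random((7, 7))                    # stand-in for an attention map
vis = overlay_attention(frame, attn)
print(vis.shape)                                                  # (84, 84, 3)
```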
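Similarly, the structural sketch referenced in the "Low Dimensional State Representation Learning with Robotics Priors in Continuous Action Spaces" entry only illustrates the shape of the idea under assumed dimensions (1024-dimensional observations, a 5-dimensional state, 2-dimensional continuous actions) with randomly initialized weights; the paper's robotics priors and training losses are not reproduced here.

```python
# Generic illustration (not the paper's architecture or losses): an encoder
# compresses a high-dimensional observation into a low-dimensional state, and
# a small policy head maps that state to a continuous action.
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, STATE_DIM, ACT_DIM = 1024, 5, 2           # assumed sizes for illustration

# randomly initialized weights stand in for parameters learned jointly
W_enc = rng.normal(scale=0.05, size=(STATE_DIM, OBS_DIM))
W_pi = rng.normal(scale=0.5, size=(ACT_DIM, STATE_DIM))

def encode(obs):
    """High-dimensional raw sensory reading -> low-dimensional state."""
    return np.tanh(W_enc @ obs)

def policy(state):
    """Low-dimensional state -> continuous action in [-1, 1]."""
    return np.tanh(W_pi @ state)

obs = rng.random(OBS_DIM)                          # e.g. flattened raw sensor readings
state = encode(obs)
action = policy(state)
print(state.shape, action.shape)                   # (5,) (2,)
```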
This list is automatically generated from the titles and abstracts of the papers in this site.