Human Machine Co-adaption Interface via Cooperation Markov Decision Process System
- URL: http://arxiv.org/abs/2305.02058v1
- Date: Wed, 3 May 2023 12:00:53 GMT
- Title: Human Machine Co-adaption Interface via Cooperation Markov Decision Process System
- Authors: Kairui Guo, Adrian Cheng, Yaqi Li, Jun Li, Rob Duffield, Steven W. Su
- Abstract summary: This paper introduces co-adaptation techniques via model-based reinforcement learning.
In this study, we treat the full process of robot-assisted rehabilitation as a co-adaptive or mutual learning process.
- Score: 8.68491060014975
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper aims to develop a new human-machine interface that improves
rehabilitation performance from the perspective of both the user (patient) and
the machine (robot) by introducing co-adaptation techniques via model-based
reinforcement learning. Previous studies focus mainly on robot assistance, i.e.,
improving the control strategy so as to fulfill the objective of
Assist-As-Needed. In this study, we treat the full process of robot-assisted
rehabilitation as a co-adaptive or mutual learning process and emphasize the
adaptation of the user to the machine. To this end, we propose a Co-adaptive
MDPs (CaMDPs) model to quantify the learning rates based on cooperative
multi-agent reinforcement learning (MARL) in the high abstraction layer of the
system. We propose several approaches to cooperatively adjust the Policy
Improvement step between the two agents within the framework of Policy
Iteration. Based on the proposed co-adaptive MDPs, a simulation study indicates
that the non-stationarity problem can be mitigated using the various proposed
Policy Improvement approaches.
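To make the idea concrete, below is a minimal, illustrative sketch (not the paper's exact formulation) of coordinated Policy Improvement for two agents that share a small tabular joint-action MDP. Only one agent updates its policy per sweep, which is one simple way to keep the partner stationary during each improvement step; the state/action sizes, random dynamics, and the alternating schedule are all assumptions for illustration.

```python
# Illustrative two-agent alternating policy iteration on a random tabular MDP.
# This is a generic sketch, not the CaMDP model from the paper.
import numpy as np

n_states, n_a1, n_a2, gamma = 4, 2, 2, 0.9
rng = np.random.default_rng(0)

# Joint dynamics P[s, a1, a2, s'] (rows sum to 1) and a shared cooperative reward R[s, a1, a2].
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_a1, n_a2))
R = rng.uniform(0.0, 1.0, size=(n_states, n_a1, n_a2))

pi1 = np.zeros(n_states, dtype=int)  # deterministic policy of agent 1 (e.g. robot)
pi2 = np.zeros(n_states, dtype=int)  # deterministic policy of agent 2 (e.g. user)
idx = np.arange(n_states)

def evaluate(pi1, pi2):
    """Exact policy evaluation of the joint policy via a linear solve."""
    P_pi = P[idx, pi1, pi2]   # (S, S') transitions under the joint policy
    r_pi = R[idx, pi1, pi2]   # (S,) one-step rewards
    return np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)

def improve_agent1(V, pi2):
    """Greedy improvement for agent 1 while agent 2's policy is held fixed."""
    a1 = np.arange(n_a1)
    Q1 = R[idx[:, None], a1[None, :], pi2[:, None]] \
         + gamma * (P[idx[:, None], a1[None, :], pi2[:, None]] @ V)
    return Q1.argmax(axis=1)

def improve_agent2(V, pi1):
    """Greedy improvement for agent 2 while agent 1's policy is held fixed."""
    a2 = np.arange(n_a2)
    Q2 = R[idx[:, None], pi1[:, None], a2[None, :]] \
         + gamma * (P[idx[:, None], pi1[:, None], a2[None, :]] @ V)
    return Q2.argmax(axis=1)

# Alternating sweeps: only one agent changes its policy at a time,
# so each agent always improves against a stationary partner.
for sweep in range(20):
    V = evaluate(pi1, pi2)
    if sweep % 2 == 0:
        pi1 = improve_agent1(V, pi2)
    else:
        pi2 = improve_agent2(V, pi1)

print("final joint value per state:", evaluate(pi1, pi2))
```

Coordinating when and how much each agent updates (rather than letting both improve greedily at once) is the kind of knob the abstract refers to as adjusting the Policy Improvement step between the two agents.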
Related papers
- Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning [51.52387511006586]
We propose Hierarchical Opponent modeling and Planning (HOP), a novel multi-agent decision-making algorithm.
HOP is hierarchically composed of two modules: an opponent modeling module that infers others' goals and learns corresponding goal-conditioned policies, and a planning module that uses these inferences to guide the agent's own decision-making.
HOP exhibits superior few-shot adaptation capabilities when interacting with various unseen agents, and excels in self-play scenarios.
arXiv Detail & Related papers (2024-06-12T08:48:06Z)
- Joint Demonstration and Preference Learning Improves Policy Alignment with Human Feedback [58.049113055986375]
We develop a single-stage approach named Alignment with Integrated Human Feedback (AIHF) to train reward models and the policy.
The proposed approach admits a suite of efficient algorithms, which can easily reduce to, and leverage, popular alignment algorithms.
We demonstrate the efficiency of the proposed solutions with extensive experiments involving alignment problems in LLMs and robotic control problems in MuJoCo.
arXiv Detail & Related papers (2024-06-11T01:20:53Z)
- Large Language Model-based Human-Agent Collaboration for Complex Task Solving [94.3914058341565]
We introduce the problem of Large Language Model (LLM)-based human-agent collaboration for complex task solving.
We propose a Reinforcement Learning-based Human-Agent Collaboration method, ReHAC.
This approach includes a policy model designed to determine the most opportune stages for human intervention within the task-solving process.
arXiv Detail & Related papers (2024-02-20T11:03:36Z)
- Machine Learning Insides OptVerse AI Solver: Design Principles and Applications [74.67495900436728]
We present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI solver.
We showcase our methods for generating complex SAT and MILP instances using generative models that mirror the multifaceted structures of real-world problems.
We detail the incorporation of state-of-the-art parameter tuning algorithms which markedly elevate solver performance.
arXiv Detail & Related papers (2024-01-11T15:02:15Z)
- Let's reward step by step: Step-Level reward model as the Navigators for Reasoning [64.27898739929734]
Process-Supervised Reward Model (PRM) furnishes LLMs with step-by-step feedback during the training phase.
We propose a greedy search algorithm that employs step-level feedback from the PRM to optimize the reasoning pathways explored by LLMs (a minimal generic sketch of this kind of PRM-guided step-level search appears after this list).
To explore the versatility of our approach, we develop a novel method to automatically generate a step-level reward dataset for coding tasks and observe similar performance improvements on code generation tasks.
arXiv Detail & Related papers (2023-10-16T05:21:50Z)
- Self-Adaptive Large Language Model (LLM)-Based Multiagent Systems [0.0]
We propose the integration of large language models (LLMs) into multiagent systems.
We anchor our methodology on the MAPE-K model, which is renowned for its robust support in monitoring, analyzing, planning, and executing system adaptations.
arXiv Detail & Related papers (2023-07-12T14:26:46Z)
- Learning Reward Machines in Cooperative Multi-Agent Tasks [75.79805204646428]
This paper presents a novel approach to Multi-Agent Reinforcement Learning (MARL) that combines cooperative task decomposition with the learning of reward machines (RMs) encoding the structure of the sub-tasks.
The proposed method helps deal with the non-Markovian nature of the rewards in partially observable environments.
arXiv Detail & Related papers (2023-03-24T15:12:28Z)
- Multi-Task Model Personalization for Federated Supervised SVM in Heterogeneous Networks [10.169907307499916]
Federated systems enable collaborative training on highly heterogeneous data through model personalization.
To accelerate the learning procedure for diverse participants in a multi-task federated setting, more efficient and robust methods need to be developed.
In this paper, we design an efficient iterative distributed method based on the alternating direction method of multipliers (ADMM) for support vector machines (SVMs).
The proposed method utilizes efficient computations and model exchange in a network of heterogeneous nodes and allows personalization of the learning model in the presence of non-i.i.d. data.
arXiv Detail & Related papers (2023-03-17T21:36:01Z)
- Hierarchical Reinforcement Learning with Opponent Modeling for Distributed Multi-agent Cooperation [13.670618752160594]
Deep reinforcement learning (DRL) provides a promising approach for multi-agent cooperation through the interaction of the agents and environments.
Traditional DRL solutions suffer from the high dimensionality induced by multiple agents with continuous action spaces during policy search.
We propose a hierarchical reinforcement learning approach with high-level decision-making and low-level individual control for efficient policy search.
arXiv Detail & Related papers (2022-06-25T19:09:29Z)
- A Surrogate-Assisted Controller for Expensive Evolutionary Reinforcement Learning [14.128178683323108]
In this work, we propose Surrogate-assisted Controller (SC), a novel and efficient module that can be integrated into existing frameworks.
The key challenge is to prevent the optimization process from being misled by the possible false minima introduced by the surrogate.
Experiments on six continuous control tasks from the OpenAI Gym platform show that SC can not only significantly reduce the cost of fitness evaluations, but also boost the performance of the original hybrid frameworks.
arXiv Detail & Related papers (2022-01-01T06:42:51Z)
- Towards Better Adaptive Systems by Combining MAPE, Control Theory, and Machine Learning [16.998805882711864]
Two established approaches to engineering adaptive systems are architecture-based adaptation, which uses a Monitor-Analyze-Plan-Execute (MAPE) loop, and control-based adaptation, which relies on principles of control theory (CT) to realize adaptation.
We are concerned with the question of how these approaches relate to one another and whether combining them and supporting them with machine learning can produce better adaptive systems.
We motivate the combined use of different adaptation approaches using a scenario of a cloud-based enterprise system and illustrate the analysis when combining the different approaches.
arXiv Detail & Related papers (2021-03-19T15:00:08Z)
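As referenced in the step-level reward model entry above, here is a minimal, generic sketch of greedy search over reasoning steps guided by a process reward model (PRM). The functions generate_candidate_steps and prm_score are hypothetical placeholders standing in for an LLM step sampler and a trained PRM; nothing here is taken from that paper's implementation.

```python
# Generic sketch of PRM-guided step-level greedy search (assumed interfaces).
from typing import Callable, List

def prm_greedy_search(
    question: str,
    generate_candidate_steps: Callable[[str, List[str], int], List[str]],  # hypothetical LLM sampler
    prm_score: Callable[[str, List[str], str], float],                     # hypothetical step-level reward model
    n_candidates: int = 8,
    max_steps: int = 10,
    stop_token: str = "<answer>",
) -> List[str]:
    """Greedily extend a reasoning chain one step at a time.

    At each step, sample several candidate next steps, score each with the
    step-level reward model, and keep only the highest-scoring one.
    """
    chain: List[str] = []
    for _ in range(max_steps):
        candidates = generate_candidate_steps(question, chain, n_candidates)
        if not candidates:
            break
        best = max(candidates, key=lambda step: prm_score(question, chain, step))
        chain.append(best)
        if stop_token in best:
            break
    return chain
```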