Integrating Human Expertise in Continuous Spaces: A Novel Interactive
Bayesian Optimization Framework with Preference Expected Improvement
- URL: http://arxiv.org/abs/2401.12662v1
- Date: Tue, 23 Jan 2024 11:14:59 GMT
- Title: Integrating Human Expertise in Continuous Spaces: A Novel Interactive
Bayesian Optimization Framework with Preference Expected Improvement
- Authors: Nikolaus Feith, Elmar Rueckert
- Abstract summary: Interactive Machine Learning (IML) seeks to integrate human expertise into machine learning processes.
We propose a novel framework based on Bayesian Optimization (BO)
BO enables collaboration between machine learning algorithms and humans.
- Score: 0.5148939336441986
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Interactive Machine Learning (IML) seeks to integrate human expertise into
machine learning processes. However, most existing algorithms cannot be applied
to Realworld Scenarios because their state spaces and/or action spaces are
limited to discrete values. Furthermore, the interaction of all existing
methods is restricted to deciding between multiple proposals. We therefore
propose a novel framework based on Bayesian Optimization (BO). Interactive
Bayesian Optimization (IBO) enables collaboration between machine learning
algorithms and humans. This framework captures user preferences and provides an
interface for users to shape the strategy by hand. Additionally, we've
incorporated a new acquisition function, Preference Expected Improvement (PEI),
to refine the system's efficiency using a probabilistic model of the user
preferences. Our approach is geared towards ensuring that machines can benefit
from human expertise, aiming for a more aligned and effective learning process.
In the course of this work, we applied our method to simulations and in a real
world task using a Franka Panda robot to show human-robot collaboration.
Related papers
- Improving User Experience in Preference-Based Optimization of Reward Functions for Assistive Robots [5.523009758632668]
We show that CMA-ES-IG prioritizes the user's experience of the preference learning process.
We show that users find our algorithm more intuitive than previous approaches across both physical and social robot tasks.
arXiv Detail & Related papers (2024-11-17T21:52:58Z) - Reinforced In-Context Black-Box Optimization [64.25546325063272]
RIBBO is a method to reinforce-learn a BBO algorithm from offline data in an end-to-end fashion.
RIBBO employs expressive sequence models to learn the optimization histories produced by multiple behavior algorithms and tasks.
Central to our method is to augment the optimization histories with textitregret-to-go tokens, which are designed to represent the performance of an algorithm based on cumulative regret over the future part of the histories.
arXiv Detail & Related papers (2024-02-27T11:32:14Z) - Enhanced Bayesian Optimization via Preferential Modeling of Abstract
Properties [49.351577714596544]
We propose a human-AI collaborative Bayesian framework to incorporate expert preferences about unmeasured abstract properties into surrogate modeling.
We provide an efficient strategy that can also handle any incorrect/misleading expert bias in preferential judgments.
arXiv Detail & Related papers (2024-02-27T09:23:13Z) - RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar but potentially even more practical than those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over the potential suboptimal human expert.
arXiv Detail & Related papers (2023-11-21T21:05:21Z) - Learning Regions of Interest for Bayesian Optimization with Adaptive
Level-Set Estimation [84.0621253654014]
We propose a framework, called BALLET, which adaptively filters for a high-confidence region of interest.
We show theoretically that BALLET can efficiently shrink the search space, and can exhibit a tighter regret bound than standard BO.
arXiv Detail & Related papers (2023-07-25T09:45:47Z) - Optimal Cost-Preference Trade-off Planning with Multiple Temporal Tasks [3.655021726150368]
We introduce a novel notion of preference that provides a generalized framework to express preferences over individual tasks as well as their relations.
We perform an optimal trade-off (Pareto) analysis between behaviors that adhere to the user's preference and the ones that are resource optimal.
arXiv Detail & Related papers (2023-06-22T21:56:49Z) - Intuitive and Efficient Human-robot Collaboration via Real-time
Approximate Bayesian Inference [4.310882094628194]
Collaborative robots and end-to-end AI, promises flexible automation of human tasks in factories and warehouses.
Humans and cobots will collaborate helping each other.
For these collaborations to be effective and safe, robots need to model, predict and exploit human's intents.
arXiv Detail & Related papers (2022-05-17T23:04:44Z) - Model Predictive Control for Fluid Human-to-Robot Handovers [50.72520769938633]
Planning motions that take human comfort into account is not a part of the human-robot handover process.
We propose to generate smooth motions via an efficient model-predictive control framework.
We conduct human-to-robot handover experiments on a diverse set of objects with several users.
arXiv Detail & Related papers (2022-03-31T23:08:20Z) - Affordance Learning from Play for Sample-Efficient Policy Learning [30.701546777177555]
We use a self-supervised visual affordance model from human teleoperated play data to enable efficient policy learning and motion planning.
We combine model-based planning with model-free deep reinforcement learning to learn policies that favor the same object regions favored by people.
We find that our policies train 4x faster than the baselines and generalize better to novel objects because our visual affordance model can anticipate their affordance regions.
arXiv Detail & Related papers (2022-03-01T11:00:35Z) - Optimizing Interactive Systems via Data-Driven Objectives [70.3578528542663]
We propose an approach that infers the objective directly from observed user interactions.
These inferences can be made regardless of prior knowledge and across different types of user behavior.
We introduce Interactive System (ISO), a novel algorithm that uses these inferred objectives for optimization.
arXiv Detail & Related papers (2020-06-19T20:49:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.