Human Preference-Based Learning for High-dimensional Optimization of
Exoskeleton Walking Gaits
- URL: http://arxiv.org/abs/2003.06495v2
- Date: Sat, 8 Aug 2020 16:56:22 GMT
- Title: Human Preference-Based Learning for High-dimensional Optimization of
Exoskeleton Walking Gaits
- Authors: Maegan Tucker, Myra Cheng, Ellen Novoseller, Richard Cheng, Yisong
Yue, Joel W. Burdick, and Aaron D. Ames
- Abstract summary: This work presents LineCoSpar, a human-in-the-loop preference-based framework to learn user preferences in high dimensions.
In simulations and human trials, we empirically verify that LineCoSpar is a sample-efficient approach for high-dimensional preference optimization.
This result has implications for exoskeleton gait synthesis, an active field with applications to clinical use and patient rehabilitation.
- Score: 55.59198568303196
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optimizing lower-body exoskeleton walking gaits for user comfort requires
understanding users' preferences over a high-dimensional gait parameter space.
However, existing preference-based learning methods have only explored
low-dimensional domains due to computational limitations. To learn user
preferences in high dimensions, this work presents LineCoSpar, a
human-in-the-loop preference-based framework that enables optimization over
many parameters by iteratively exploring one-dimensional subspaces.
Additionally, this work identifies gait attributes that characterize broader
preferences across users. In simulations and human trials, we empirically
verify that LineCoSpar is a sample-efficient approach for high-dimensional
preference optimization. Our analysis of the experimental data reveals a
correspondence between human preferences and objective measures of dynamicity,
while also highlighting differences in the utility functions underlying
individual users' gait preferences. This result has implications for
exoskeleton gait synthesis, an active field with applications to clinical use
and patient rehabilitation.
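The core loop the abstract describes (iteratively exploring one-dimensional subspaces and updating from pairwise user queries) can be sketched as follows. This is a minimal, hypothetical illustration, not the authors' implementation: the actual LineCoSpar method fits a Gaussian-process preference model and draws candidates from its posterior, whereas this sketch greedily keeps whichever gait the user prefers; `random_line`, `linecospar_sketch`, and `query_user` are illustrative names.

```python
import numpy as np

def random_line(center, bounds, n_points=10):
    """Sample n_points gaits along a random 1D direction through `center`,
    clipped to the box `bounds` (an array of shape (dim, 2))."""
    direction = np.random.randn(len(center))
    direction /= np.linalg.norm(direction)
    ts = np.linspace(-1.0, 1.0, n_points)[:, None]
    return np.clip(center + ts * direction, bounds[:, 0], bounds[:, 1])

def linecospar_sketch(query_user, bounds, n_iters=30):
    """query_user(a, b) -> True iff the user prefers gait `a` over gait `b`.
    Each iteration searches a random one-dimensional subspace through the
    incumbent gait, so per-iteration cost does not grow with dimension."""
    best = bounds.mean(axis=1)          # start at the center of the gait box
    for _ in range(n_iters):
        for candidate in random_line(best, bounds):
            if query_user(candidate, best):
                best = candidate        # keep whichever gait the user prefers
    return best
```

In an exoskeleton setting, `query_user` would wrap a physical trial in which the user walks with both candidate gaits and reports which felt more comfortable.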
Related papers
- MotionRL: Align Text-to-Motion Generation to Human Preferences with Multi-Reward Reinforcement Learning [99.09906827676748]
We introduce MotionRL, the first approach to utilize Multi-Reward Reinforcement Learning (RL) for optimizing text-to-motion generation tasks.
Our novel approach uses reinforcement learning to fine-tune the motion generator based on human preferences, leveraging prior knowledge of the human perception model.
In addition, MotionRL introduces a novel multi-objective optimization strategy to approximate optimality between text adherence, motion quality, and human preferences.
arXiv Detail & Related papers (2024-10-09T03:27:14Z)
- Aligning Large Language Models with Self-generated Preference Data [72.99676237703099]
We propose a new framework that boosts the alignment of large language models (LLMs) with human preferences.
Our key idea is leveraging the human prior knowledge within the small (seed) data.
We introduce a noise-aware preference learning algorithm to mitigate the risk of low-quality examples within the generated preference data.
arXiv Detail & Related papers (2024-06-06T18:01:02Z)
- Adaptive Preference Scaling for Reinforcement Learning with Human Feedback [103.36048042664768]
Reinforcement learning from human feedback (RLHF) is a prevalent approach to align AI systems with human values.
We propose a novel adaptive preference loss, underpinned by distributionally robust optimization (DRO).
Our method is versatile and can be readily adapted to various preference optimization frameworks.
arXiv Detail & Related papers (2024-06-04T20:33:22Z)
- Enhanced Bayesian Optimization via Preferential Modeling of Abstract Properties [49.351577714596544]
We propose a human-AI collaborative Bayesian framework to incorporate expert preferences about unmeasured abstract properties into surrogate modeling.
We provide an efficient strategy that can also handle any incorrect/misleading expert bias in preferential judgments.
arXiv Detail & Related papers (2024-02-27T09:23:13Z)
- Cost-Sensitive Best Subset Selection for Logistic Regression: A Mixed-Integer Conic Optimization Perspective [3.1468618177952785]
A key challenge in machine learning is to design interpretable models that can reduce their inputs to the best subset for making transparent predictions.
We propose a certifiably optimal feature selection procedure for logistic regression from a mixed-integer conic optimization perspective.
This allows us to systematically evaluate different and optimal cardinality- and budget-constrained feature selection procedures.
arXiv Detail & Related papers (2023-10-09T07:13:40Z)
- Good practices for Bayesian Optimization of high dimensional structured spaces [15.488642552157131]
We study the effect of different search space design choices for performing Bayesian Optimization in high dimensional structured datasets.
We evaluate new methods to automatically define the optimization bounds in the latent space.
We provide recommendations for practitioners.
arXiv Detail & Related papers (2020-12-31T07:00:39Z)
- ROIAL: Region of Interest Active Learning for Characterizing Exoskeleton Gait Preference Landscapes [64.87637128500889]
The Region of Interest Active Learning (ROIAL) framework actively learns each user's underlying utility function over a region of interest.
ROIAL learns from ordinal and preference feedback, which are more reliable feedback mechanisms than absolute numerical scores.
Results demonstrate the feasibility of recovering gait utility landscapes from limited human trials; a minimal sketch of this kind of preference fitting appears after this list.
arXiv Detail & Related papers (2020-11-09T22:45:58Z)
- Projective Preferential Bayesian Optimization [12.431251769382888]
We propose a new type of Bayesian optimization for learning user preferences in high-dimensional spaces.
We show that our framework is able to find a global minimum of a high-dimensional black-box function.
arXiv Detail & Related papers (2020-02-08T08:29:23Z)
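As referenced in the ROIAL entry above, the preference-fitting step shared by these methods can be illustrated with a Bradley-Terry model: fit per-option utilities so that the probability of preferring option i over option j is a logistic function of their utility difference. This is a hypothetical sketch; ROIAL and LineCoSpar instead place a Gaussian-process prior over the whole gait space rather than learning independent per-item utilities, and `fit_utilities` is an illustrative name.

```python
import numpy as np

def fit_utilities(n_items, comparisons, lr=0.1, n_steps=500):
    """Fit utilities u with P(i beats j) = sigmoid(u[i] - u[j]) by gradient
    ascent on the Bradley-Terry log-likelihood.
    `comparisons` is a list of (winner, loser) index pairs."""
    u = np.zeros(n_items)
    for _ in range(n_steps):
        grad = np.zeros(n_items)
        for winner, loser in comparisons:
            # 1 - sigmoid(u[winner] - u[loser]): gradient of log sigmoid
            p = 1.0 / (1.0 + np.exp(u[winner] - u[loser]))
            grad[winner] += p
            grad[loser] -= p
        u += lr * grad
        u -= u.mean()   # utilities are identifiable only up to a constant
    return u

# Example: item 0 wins twice against item 1 and once against item 2.
print(fit_utilities(3, [(0, 1), (0, 1), (0, 2), (2, 1)]))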
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.