Inferring Preferences from Demonstrations in Multi-objective Reinforcement Learning
- URL: http://arxiv.org/abs/2409.20258v1
- Date: Mon, 30 Sep 2024 12:49:10 GMT
- Title: Inferring Preferences from Demonstrations in Multi-objective Reinforcement Learning
- Authors: Junlin Lu, Patrick Mannion, Karl Mason
- Abstract summary: This research proposes a dynamic weight-based preference inference algorithm.
It can infer the preferences of agents acting in multi-objective decision-making problems from demonstrations.
- Score: 2.9845592719739127
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Many decision-making problems feature multiple objectives where it is not always possible to know the preferences of a human or agent decision-maker for different objectives. However, demonstrated behaviors from the decision-maker are often available. This research proposes a dynamic weight-based preference inference (DWPI) algorithm that can infer the preferences of agents acting in multi-objective decision-making problems from demonstrations. The proposed algorithm is evaluated on three multi-objective Markov decision processes: Deep Sea Treasure, Traffic, and Item Gathering, and is compared to two existing preference inference algorithms. Empirical results demonstrate significant improvements compared to the baseline algorithms, in terms of both time efficiency and inference accuracy. The DWPI algorithm maintains its performance when inferring preferences from sub-optimal demonstrations. Moreover, the DWPI algorithm does not necessitate any interactions with the user during inference - only demonstrations are required. We provide a correctness proof and complexity analysis of the algorithm and statistically evaluate the performance under different representations of demonstrations.
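The inference idea described in the abstract can be illustrated with a minimal, hypothetical sketch: assuming a finite set of candidate preference weights and, for each weight, the multi-objective return achieved by the corresponding optimal policy (all values here are invented for illustration; this is not the paper's exact DWPI procedure), a weight can be inferred by matching the demonstrated return vector to the nearest optimal return.

```python
# Hypothetical sketch of linear preference inference from a demonstration.
# Candidate weights and per-weight optimal return vectors are assumed given
# (e.g., precomputed by training a policy per weight); not the paper's DWPI.
import numpy as np

def infer_weights(demo_return, candidate_weights, optimal_returns):
    """Return the candidate weight whose optimal multi-objective return
    vector is closest (Euclidean distance) to the demonstrated return."""
    dists = [np.linalg.norm(demo_return - r) for r in optimal_returns]
    return candidate_weights[int(np.argmin(dists))]

# Toy example: two objectives, three candidate preference weights.
candidates = [np.array([0.9, 0.1]), np.array([0.5, 0.5]), np.array([0.1, 0.9])]
# Assumed optimal return vectors under each candidate weight (illustrative).
returns = [np.array([10.0, 1.0]), np.array([6.0, 6.0]), np.array([1.0, 10.0])]
demo = np.array([5.5, 6.2])  # observed demonstration return
print(infer_weights(demo, candidates, returns))  # closest candidate: [0.5 0.5]
```

This nearest-return matching is a crude stand-in for the learned mapping the paper describes, but it captures the core idea of recovering a scalarization weight from demonstrated behavior alone, without querying the user.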
Related papers
- Comparative Analysis of Demonstration Selection Algorithms for LLM In-Context Learning [18.58278188791548]
In-context learning can help Large Language Models (LLMs) adapt to new tasks without additional training.
Despite the many proposed demonstration selection algorithms, their efficiency and effectiveness remain unclear.
This lack of clarity makes it difficult to apply these algorithms in real-world scenarios.
arXiv Detail & Related papers (2024-10-30T15:11:58Z)
- Bidirectional Decoding: Improving Action Chunking via Closed-Loop Resampling [51.38330727868982]
Bidirectional Decoding (BID) is a test-time inference algorithm that bridges action chunking with closed-loop operations.
We show that BID boosts the performance of two state-of-the-art generative policies across seven simulation benchmarks and two real-world tasks.
arXiv Detail & Related papers (2024-08-30T15:39:34Z)
- Switchable Decision: Dynamic Neural Generation Networks [98.61113699324429]
We propose a switchable decision to accelerate inference by dynamically assigning resources for each data instance.
Our method benefits from less cost during inference while keeping the same accuracy.
arXiv Detail & Related papers (2024-05-07T17:44:54Z)
- Preference Inference from Demonstration in Multi-objective Multi-agent Decision Making [0.0]
We propose an algorithm to infer linear preference weights from either optimal or near-optimal demonstrations.
Empirical results demonstrate significant improvements compared to the baseline algorithms.
In future work, we plan to evaluate the algorithm's effectiveness in a multi-agent system.
arXiv Detail & Related papers (2023-04-27T12:19:28Z)
- Inferring Preferences from Demonstrations in Multi-objective Reinforcement Learning: A Dynamic Weight-based Approach [0.0]
In multi-objective decision-making, preference inference is the process of inferring the preferences of a decision-maker for different objectives.
This research proposes a Dynamic Weight-based Preference Inference algorithm that can infer the preferences of agents acting in multi-objective decision-making problems.
arXiv Detail & Related papers (2023-04-27T11:55:07Z)
- Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
arXiv Detail & Related papers (2023-04-12T14:51:47Z)
- Making Linear MDPs Practical via Contrastive Representation Learning [101.75885788118131]
It is common to address the curse of dimensionality in Markov decision processes (MDPs) by exploiting low-rank representations.
We consider an alternative definition of linear MDPs that automatically ensures normalization while allowing efficient representation learning.
We demonstrate superior performance over existing state-of-the-art model-based and model-free algorithms on several benchmarks.
arXiv Detail & Related papers (2022-07-14T18:18:02Z)
- Uncertainty-Aware Search Framework for Multi-Objective Bayesian Optimization [40.40632890861706]
We consider the problem of multi-objective (MO) blackbox optimization using expensive function evaluations.
We propose a novel uncertainty-aware search framework referred to as USeMO to efficiently select the sequence of inputs for evaluation.
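As a loose, hypothetical illustration of an uncertainty-aware selection step for multi-objective Bayesian optimization (not the paper's exact USeMO procedure), the sketch below scores each candidate input with an optimistic upper-confidence acquisition per objective, restricts attention to the Pareto-optimal candidates, and picks the one with the largest uncertainty hyper-rectangle volume. All surrogate means and standard deviations are invented for illustration.

```python
# Hedged sketch: uncertainty-aware candidate selection for multi-objective
# Bayesian optimization. Surrogate means/stds per objective are assumed to
# come from a fitted model (here they are made-up numbers).
import numpy as np

def pareto_mask(points):
    """Boolean mask of non-dominated rows, assuming maximization."""
    n = len(points)
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        for j in range(n):
            if i != j and np.all(points[j] >= points[i]) and np.any(points[j] > points[i]):
                mask[i] = False
                break
    return mask

def select_next(means, stds, beta=2.0):
    ucb = means + beta * stds              # optimistic acquisition per objective
    front = np.where(pareto_mask(ucb))[0]  # Pareto-optimal candidates only
    volumes = np.prod(2 * beta * stds[front], axis=1)  # uncertainty box volume
    return front[int(np.argmax(volumes))]

# Three candidates, two objectives (illustrative surrogate estimates).
means = np.array([[1.0, 2.0], [2.0, 1.0], [0.5, 0.5]])
stds  = np.array([[0.3, 0.1], [0.1, 0.4], [0.05, 0.05]])
print(select_next(means, stds))  # selects candidate 1 (largest uncertainty box on the front)
```

Restricting the choice to the acquisition Pareto front and then breaking ties by uncertainty is one simple way to balance exploitation across objectives with exploration, which is the spirit of the framework described above.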
arXiv Detail & Related papers (2022-04-12T16:50:48Z)
- An Efficient Multi-Indicator and Many-Objective Optimization Algorithm based on Two-Archive [7.7415390727490445]
This paper proposes an indicator-based multi-objective optimization algorithm based on two-archive (SRA3).
It can efficiently select good individuals in environment selection based on indicator performance, and uses an adaptive parameter strategy for parental selection without setting additional parameters.
Experiments on the DTLZ and WFG problems show that SRA3 has good convergence and diversity while maintaining high efficiency.
arXiv Detail & Related papers (2022-01-14T13:09:50Z)
- A survey on multi-objective hyperparameter optimization algorithms for Machine Learning [62.997667081978825]
This article presents a systematic survey of the literature published between 2014 and 2020 on multi-objective HPO algorithms.
We distinguish between metaheuristic-based algorithms, metamodel-based algorithms, and approaches using a mixture of both.
We also discuss the quality metrics used to compare multi-objective HPO procedures and present future research directions.
arXiv Detail & Related papers (2021-11-23T10:22:30Z)
- Extreme Algorithm Selection With Dyadic Feature Representation [78.13985819417974]
We propose the setting of extreme algorithm selection (XAS) where we consider fixed sets of thousands of candidate algorithms.
We assess the applicability of state-of-the-art AS techniques to the XAS setting and propose approaches leveraging a dyadic feature representation.
arXiv Detail & Related papers (2020-01-29T09:40:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.