Feature Acquisition using Monte Carlo Tree Search
- URL: http://arxiv.org/abs/2212.11360v1
- Date: Wed, 21 Dec 2022 20:53:44 GMT
- Title: Feature Acquisition using Monte Carlo Tree Search
- Authors: Sungsoo Lim, Diego Klabjan, Mark Shapiro
- Abstract summary: Feature acquisition algorithms address the problem of acquiring informative features while balancing the costs of acquisition to improve the learning performance of ML models.
Previous approaches have focused on calculating the expected utility values of features to determine the acquisition sequences.
In comparison to previous approaches, we focus on 1) formulating the feature acquisition problem as an MDP and applying Monte Carlo Tree Search, 2) calculating the intermediate rewards for each acquisition step based on model improvements and acquisition costs, and 3) simultaneously optimizing model improvement and acquisition costs with multi-objective Monte Carlo Tree Search.
- Score: 18.76745359031975
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Feature acquisition algorithms address the problem of acquiring informative features while balancing the costs of acquisition to improve the learning performance of ML models. Previous approaches have focused on calculating the expected utility values of features to determine the acquisition sequences. Other approaches formulated the problem as a Markov Decision Process (MDP) and applied reinforcement learning based algorithms. In comparison to previous approaches, we focus on 1) formulating the feature acquisition problem as an MDP and applying Monte Carlo Tree Search, 2) calculating the intermediate rewards for each acquisition step based on model improvements and acquisition costs, and 3) simultaneously optimizing model improvement and acquisition costs with multi-objective Monte Carlo Tree Search. With Proximal Policy Optimization and Deep Q-Network algorithms as benchmarks, we show the effectiveness of our proposed approach through an experimental study.
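For concreteness, a minimal sketch of the MDP view described in the abstract, assuming a scikit-learn-style classifier, a hypothetical per-feature cost vector, and a flat Monte Carlo search in place of the paper's full (multi-objective) MCTS: the state is the set of acquired features, an action acquires one more feature (or stops), and the intermediate reward is the validation-accuracy improvement minus the cost of the newly acquired feature.

```python
# Sketch of the feature-acquisition MDP; the classifier, dataset, cost vector,
# and flat Monte Carlo search are assumptions, not the paper's implementation.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, random_state=0)
cost = np.full(X.shape[1], 0.01)    # hypothetical per-feature acquisition costs
LAMBDA = 1.0                        # trade-off weight between accuracy and cost

def accuracy(acquired):
    """Validation accuracy of a model restricted to the acquired features."""
    if not acquired:
        return max(np.mean(y_va == c) for c in np.unique(y_tr))  # majority class
    idx = sorted(acquired)
    clf = LogisticRegression(max_iter=2000).fit(X_tr[:, idx], y_tr)
    return clf.score(X_va[:, idx], y_va)

def intermediate_reward(acquired, feature):
    """Model improvement from adding `feature`, minus its acquisition cost."""
    return accuracy(acquired | {feature}) - accuracy(acquired) - LAMBDA * cost[feature]

def rollout(acquired, remaining, depth=2, rng=np.random.default_rng(0)):
    """Random rollout: keep acquiring random features and sum the rewards."""
    total, acquired = 0.0, set(acquired)
    for _ in range(min(depth, len(remaining))):
        f = rng.choice(sorted(remaining - acquired))
        total += intermediate_reward(acquired, f)
        acquired.add(f)
    return total

def flat_mc_step(acquired, remaining, n_sims=3):
    """Pick the next feature by averaging rollout returns (flat Monte Carlo)."""
    best_f, best_v = None, -np.inf
    for f in remaining:
        v = np.mean([intermediate_reward(acquired, f)
                     + rollout(acquired | {f}, remaining - {f})
                     for _ in range(n_sims)])
        if v > best_v:
            best_f, best_v = f, v
    return best_f, best_v

acquired, remaining = set(), set(range(X.shape[1]))
for _ in range(3):                  # acquire up to three features, or stop early
    f, v = flat_mc_step(acquired, remaining)
    if v <= 0:                      # stop when the expected return turns negative
        break
    acquired.add(f); remaining.discard(f)
print("acquired features:", sorted(acquired), "accuracy:", accuracy(acquired))
```

A full UCT implementation would additionally maintain a search tree with visit counts and an exploration bonus, and the multi-objective variant would keep accuracy and cost as a reward vector instead of folding cost in with the fixed weight LAMBDA used here.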
Related papers
- LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning [56.273799410256075]
The framework combines Monte Carlo Tree Search (MCTS) with iterative Self-Refine to optimize the reasoning path.
The framework has been tested on general and advanced benchmarks, showing superior performance in terms of search efficiency and problem-solving capability.
arXiv Detail & Related papers (2024-10-03T18:12:29Z)
- Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning [55.96599486604344]
We introduce an approach aimed at enhancing the reasoning capabilities of Large Language Models (LLMs) through an iterative preference learning process.
We use Monte Carlo Tree Search (MCTS) to iteratively collect preference data, utilizing its look-ahead ability to break down instance-level rewards into more granular step-level signals.
The proposed algorithm employs Direct Preference Optimization (DPO) to update the LLM policy using this newly generated step-level preference data.
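For reference, a minimal sketch of the standard DPO objective that such step-level preference pairs would be fed into; the placeholder log-probabilities stand in for that paper's actual models and data pipeline.

```python
# Standard DPO loss on a batch of (chosen, rejected) pairs; the step-level
# preference data described above would supply these log-probabilities.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """-log sigmoid(beta * ((log pi - log pi_ref)_chosen - (log pi - log pi_ref)_rejected))."""
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Toy usage with placeholder log-probabilities for a batch of 4 pairs.
torch.manual_seed(0)
loss = dpo_loss(torch.randn(4), torch.randn(4), torch.randn(4), torch.randn(4))
print(float(loss))
```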
arXiv Detail & Related papers (2024-05-01T11:10:24Z)
- $\mathbf{(N,K)}$-Puzzle: A Cost-Efficient Testbed for Benchmarking Reinforcement Learning Algorithms in Generative Language Model [50.636423457653066]
We present a generalized version of the 24-Puzzle: the $(N,K)$-Puzzle, which challenges language models to reach a target value $K$ with $N$ integers.
We evaluate the effectiveness of established RL algorithms such as Proximal Policy Optimization (PPO), alongside novel approaches like Identity Policy Optimization (IPO) and Direct Policy Optimization (DPO).
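As an illustration only, a hypothetical checker for one puzzle instance; the rules assumed here (use each of the $N$ integers exactly once with the four basic operations, as in the 24-Puzzle) are an assumption rather than that paper's exact specification.

```python
# Hypothetical (N, K)-Puzzle answer checker: verifies that an expression uses
# each given integer exactly once and evaluates to the target K.
import ast
import re

def check_puzzle_answer(expression: str, numbers: list[int], target: float) -> bool:
    # Only digits, arithmetic operators, parentheses, dots, and spaces allowed.
    if not re.fullmatch(r"[\d+\-*/(). ]+", expression):
        return False
    used = sorted(int(tok) for tok in re.findall(r"\d+", expression))
    if used != sorted(numbers):                 # each integer used exactly once
        return False
    try:
        value = eval(compile(ast.parse(expression, mode="eval"), "<expr>", "eval"),
                     {"__builtins__": {}}, {})
    except (SyntaxError, ZeroDivisionError):
        return False
    return abs(value - target) < 1e-6

print(check_puzzle_answer("(10 - 4) * (6 - 2)", [10, 4, 6, 2], 24))  # True
print(check_puzzle_answer("10 + 4 + 6 + 2", [10, 4, 6, 2], 24))      # False (22)
```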
arXiv Detail & Related papers (2024-03-11T22:24:14Z)
- Let's reward step by step: Step-Level reward model as the Navigators for Reasoning [64.27898739929734]
Process-Supervised Reward Model (PRM) furnishes LLMs with step-by-step feedback during the training phase.
We propose a greedy search algorithm that employs the step-level feedback from PRM to optimize the reasoning pathways explored by LLMs.
To explore the versatility of our approach, we develop a novel method to automatically generate a step-level reward dataset for coding tasks and observe similar performance improvements on code generation tasks.
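A minimal sketch of such PRM-guided greedy search, with placeholder callables standing in for the LLM sampler and the trained step-level reward model.

```python
# Greedy step-level search guided by a process reward model (PRM).
# `generate_candidates` and `prm_score` are placeholder callables, not the
# cited paper's models.
from typing import Callable, List

def greedy_prm_search(question: str,
                      generate_candidates: Callable[[str, List[str]], List[str]],
                      prm_score: Callable[[str, List[str], str], float],
                      max_steps: int = 8) -> List[str]:
    """At every step, keep only the candidate step the PRM scores highest."""
    steps: List[str] = []
    for _ in range(max_steps):
        candidates = generate_candidates(question, steps)
        if not candidates:
            break
        best = max(candidates, key=lambda s: prm_score(question, steps, s))
        steps.append(best)
        if best.strip().lower().startswith("answer:"):   # assumed stop convention
            break
    return steps

# Toy usage with stub functions in place of an LLM and a PRM.
stub_gen = lambda q, steps: (["Step: add the numbers", "Answer: 4"]
                             if len(steps) else ["Step: read the question"])
stub_prm = lambda q, steps, s: 1.0 if s.startswith("Answer") else 0.5
print(greedy_prm_search("What is 2 + 2?", stub_gen, stub_prm))
```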
arXiv Detail & Related papers (2023-10-16T05:21:50Z)
- Improving Sample Efficiency of Model-Free Algorithms for Zero-Sum Markov Games [66.2085181793014]
We show that a model-free stage-based Q-learning algorithm can achieve the same optimal dependence on the horizon $H$ as model-based algorithms.
Our algorithm features a key novel design of updating the reference value functions as the pair of optimistic and pessimistic value functions.
arXiv Detail & Related papers (2023-08-17T08:34:58Z)
- Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
arXiv Detail & Related papers (2023-04-12T14:51:47Z)
- Costly Features Classification using Monte Carlo Tree Search [5.188762991286163]
We consider the problem of costly feature classification, where we sequentially select a subset of features to balance classification error against feature cost.
In this paper, we first cast the task as an MDP and use the Advantage Actor-Critic algorithm to solve it.
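A brief sketch of the ingredients named above: a cost-aware terminal reward and the one-step advantage estimate used by Advantage Actor-Critic; the reward shaping and constants are assumptions, not the cited paper's exact formulation.

```python
# Cost-aware reward and one-step advantage for an A2C-style costly-feature MDP.
import torch

def costly_feature_reward(correct: bool, feature_costs_paid: float,
                          lam: float = 0.1) -> float:
    """Terminal reward: classification outcome minus accumulated feature cost."""
    return (1.0 if correct else 0.0) - lam * feature_costs_paid

def a2c_advantage(reward: torch.Tensor, value: torch.Tensor,
                  next_value: torch.Tensor, done: torch.Tensor,
                  gamma: float = 0.99) -> torch.Tensor:
    """One-step advantage A = r + gamma * V(s') * (1 - done) - V(s)."""
    return reward + gamma * next_value * (1.0 - done) - value

# Toy batch of two transitions; the second ends an episode with a correct label.
adv = a2c_advantage(reward=torch.tensor([0.0, costly_feature_reward(True, 1.0)]),
                    value=torch.tensor([0.4, 0.6]),
                    next_value=torch.tensor([0.5, 0.0]),
                    done=torch.tensor([0.0, 1.0]))
print(adv)
```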
arXiv Detail & Related papers (2021-02-14T05:18:33Z)
- Landscape-Aware Fixed-Budget Performance Regression and Algorithm Selection for Modular CMA-ES Variants [1.0965065178451106]
We show that it is possible to achieve high-quality performance predictions with off-the-shelf supervised learning approaches.
We test this approach on a portfolio of very similar algorithms chosen from the family of modular CMA-ES algorithms.
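A minimal sketch of this kind of landscape-aware, fixed-budget performance regression with an off-the-shelf model; the synthetic data below is a placeholder for ELA-style landscape features and a modular CMA-ES portfolio, not that paper's benchmark setup.

```python
# Per-algorithm performance regression from landscape features, then selection
# of the predicted-best portfolio member. All data here is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n_problems, n_features, n_algorithms = 200, 10, 5
landscape = rng.normal(size=(n_problems, n_features))       # stand-in ELA features
performance = rng.normal(size=(n_problems, n_algorithms))   # fixed-budget losses

# One regressor per portfolio member, trained on the first 150 problems.
models = [RandomForestRegressor(n_estimators=100, random_state=0)
          .fit(landscape[:150], performance[:150, a]) for a in range(n_algorithms)]

preds = np.column_stack([m.predict(landscape[150:]) for m in models])
chosen = preds.argmin(axis=1)        # lower predicted loss is better (assumption)
print("selected algorithm per test problem:", chosen[:10])
```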
arXiv Detail & Related papers (2020-06-17T13:34:57Z)
- Solve Traveling Salesman Problem by Monte Carlo Tree Search and Deep Neural Network [8.19063619210761]
We present a self-learning approach that combines deep reinforcement learning and Monte Carlo tree search to solve the traveling salesman problem.
Experimental results show that the proposed method performs favorably against other methods in small-to-medium problem settings, and achieves performance comparable to the state of the art in large problem settings.
arXiv Detail & Related papers (2020-05-14T11:36:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.