Feature Acquisition using Monte Carlo Tree Search
- URL: http://arxiv.org/abs/2212.11360v1
- Date: Wed, 21 Dec 2022 20:53:44 GMT
- Title: Feature Acquisition using Monte Carlo Tree Search
- Authors: Sungsoo Lim, Diego Klabjan, Mark Shapiro
- Abstract summary: Feature acquisition algorithms address the problem of acquiring informative features while balancing the costs of acquisition to improve the learning performance of ML models.
Previous approaches have focused on calculating the expected utility values of features to determine the acquisition sequences.
In comparison to previous approaches, we focus on 1) formulating the feature acquisition problem as an MDP and applying Monte Carlo Tree Search, 2) calculating the intermediate rewards for each acquisition step based on model improvements and acquisition costs, and 3) simultaneously optimizing model improvement and acquisition costs with multi-objective Monte Carlo Tree Search.
- Score: 18.76745359031975
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Feature acquisition algorithms address the problem of acquiring informative features while balancing the costs of acquisition to improve the learning performance of ML models. Previous approaches have focused on calculating the expected utility values of features to determine the acquisition sequences. Other approaches formulated the problem as a Markov Decision Process (MDP) and applied reinforcement learning based algorithms. In comparison to previous approaches, we focus on 1) formulating the feature acquisition problem as an MDP and applying Monte Carlo Tree Search, 2) calculating the intermediate rewards for each acquisition step based on model improvements and acquisition costs, and 3) simultaneously optimizing model improvement and acquisition costs with multi-objective Monte Carlo Tree Search. With Proximal Policy Optimization and Deep Q-Network algorithms as benchmarks, we show the effectiveness of our proposed approach through an experimental study.
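For concreteness, a minimal sketch of the MDP view described in the abstract, assuming a scikit-learn-style classifier, a hypothetical per-feature cost vector, and a flat Monte Carlo search in place of the paper's full (multi-objective) MCTS: the state is the set of acquired features, an action acquires one more feature (or stops), and the intermediate reward is the validation-accuracy improvement minus the cost of the newly acquired feature.

```python
# Sketch of the feature-acquisition MDP; the classifier, dataset, cost vector,
# and flat Monte Carlo search are assumptions, not the paper's implementation.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, random_state=0)
cost = np.full(X.shape[1], 0.01)    # hypothetical per-feature acquisition costs
LAMBDA = 1.0                        # trade-off weight between accuracy and cost

def accuracy(acquired):
    """Validation accuracy of a model restricted to the acquired features."""
    if not acquired:
        return max(np.mean(y_va == c) for c in np.unique(y_tr))  # majority class
    idx = sorted(acquired)
    clf = LogisticRegression(max_iter=2000).fit(X_tr[:, idx], y_tr)
    return clf.score(X_va[:, idx], y_va)

def intermediate_reward(acquired, feature):
    """Model improvement from adding `feature`, minus its acquisition cost."""
    return accuracy(acquired | {feature}) - accuracy(acquired) - LAMBDA * cost[feature]

def rollout(acquired, remaining, depth=2, rng=np.random.default_rng(0)):
    """Random rollout: keep acquiring random features and sum the rewards."""
    total, acquired = 0.0, set(acquired)
    for _ in range(min(depth, len(remaining))):
        f = rng.choice(sorted(remaining - acquired))
        total += intermediate_reward(acquired, f)
        acquired.add(f)
    return total

def flat_mc_step(acquired, remaining, n_sims=3):
    """Pick the next feature by averaging rollout returns (flat Monte Carlo)."""
    best_f, best_v = None, -np.inf
    for f in remaining:
        v = np.mean([intermediate_reward(acquired, f)
                     + rollout(acquired | {f}, remaining - {f})
                     for _ in range(n_sims)])
        if v > best_v:
            best_f, best_v = f, v
    return best_f, best_v

acquired, remaining = set(), set(range(X.shape[1]))
for _ in range(3):                  # acquire up to three features, or stop early
    f, v = flat_mc_step(acquired, remaining)
    if v <= 0:                      # stop when the expected return turns negative
        break
    acquired.add(f); remaining.discard(f)
print("acquired features:", sorted(acquired), "accuracy:", accuracy(acquired))
```

A full UCT implementation would additionally maintain a search tree with visit counts and an exploration bonus, and the multi-objective variant would keep accuracy and cost as a reward vector instead of folding cost in with the fixed weight LAMBDA used here.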
Related papers
- LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning [56.273799410256075]
The framework combines Monte Carlo Tree Search (MCTS) with iterative Self-Refine to optimize the reasoning path.
The framework has been tested on general and advanced benchmarks, showing superior performance in terms of search efficiency and problem-solving capability.
arXiv Detail & Related papers (2024-10-03T18:12:29Z)
- Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning [55.96599486604344]
We introduce an approach aimed at enhancing the reasoning capabilities of Large Language Models (LLMs) through an iterative preference learning process.
We use Monte Carlo Tree Search (MCTS) to iteratively collect preference data, utilizing its look-ahead ability to break down instance-level rewards into more granular step-level signals.
The proposed algorithm employs Direct Preference Optimization (DPO) to update the LLM policy using this newly generated step-level preference data.
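For reference, a minimal sketch of the standard DPO objective that such step-level preference pairs would be fed into; the placeholder log-probabilities stand in for that paper's actual models and data pipeline.

```python
# Standard DPO loss on a batch of (chosen, rejected) pairs; the step-level
# preference data described above would supply these log-probabilities.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """-log sigmoid(beta * ((log pi - log pi_ref)_chosen - (log pi - log pi_ref)_rejected))."""
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Toy usage with placeholder log-probabilities for a batch of 4 pairs.
torch.manual_seed(0)
loss = dpo_loss(torch.randn(4), torch.randn(4), torch.randn(4), torch.randn(4))
print(float(loss))
```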
arXiv Detail & Related papers (2024-05-01T11:10:24Z)
- $\mathbf{(N,K)}$-Puzzle: A Cost-Efficient Testbed for Benchmarking Reinforcement Learning Algorithms in Generative Language Model [50.636423457653066]
We present a generalized version of the 24-Puzzle: the $(N,K)$-Puzzle, which challenges language models to reach a target value $K$ with $N$ integers.
We evaluate the effectiveness of established RL algorithms such as Proximal Policy Optimization (PPO), alongside novel approaches like Identity Policy Optimization (IPO) and Direct Policy Optimization (DPO).
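As an illustration only, a hypothetical checker for one puzzle instance; the rules assumed here (use each of the $N$ integers exactly once with the four basic operations, as in the 24-Puzzle) are an assumption rather than that paper's exact specification.

```python
# Hypothetical (N, K)-Puzzle answer checker: verifies that an expression uses
# each given integer exactly once and evaluates to the target K.
import ast
import re

def check_puzzle_answer(expression: str, numbers: list[int], target: float) -> bool:
    # Only digits, arithmetic operators, parentheses, dots, and spaces allowed.
    if not re.fullmatch(r"[\d+\-*/(). ]+", expression):
        return False
    used = sorted(int(tok) for tok in re.findall(r"\d+", expression))
    if used != sorted(numbers):                 # each integer used exactly once
        return False
    try:
        value = eval(compile(ast.parse(expression, mode="eval"), "<expr>", "eval"),
                     {"__builtins__": {}}, {})
    except (SyntaxError, ZeroDivisionError):
        return False
    return abs(value - target) < 1e-6

print(check_puzzle_answer("(10 - 4) * (6 - 2)", [10, 4, 6, 2], 24))  # True
print(check_puzzle_answer("10 + 4 + 6 + 2", [10, 4, 6, 2], 24))      # False (22)
```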
arXiv Detail & Related papers (2024-03-11T22:24:14Z)
- Let's reward step by step: Step-Level reward model as the Navigators for Reasoning [64.27898739929734]
Process-Supervised Reward Model (PRM) furnishes LLMs with step-by-step feedback during the training phase.
We propose a greedy search algorithm that employs the step-level feedback from PRM to optimize the reasoning pathways explored by LLMs.
To explore the versatility of our approach, we develop a novel method to automatically generate a step-level reward dataset for coding tasks and observe similar performance improvements on code generation tasks.
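A minimal sketch of such PRM-guided greedy search, with placeholder callables standing in for the LLM sampler and the trained step-level reward model.

```python
# Greedy step-level search guided by a process reward model (PRM).
# `generate_candidates` and `prm_score` are placeholder callables, not the
# cited paper's models.
from typing import Callable, List

def greedy_prm_search(question: str,
                      generate_candidates: Callable[[str, List[str]], List[str]],
                      prm_score: Callable[[str, List[str], str], float],
                      max_steps: int = 8) -> List[str]:
    """At every step, keep only the candidate step the PRM scores highest."""
    steps: List[str] = []
    for _ in range(max_steps):
        candidates = generate_candidates(question, steps)
        if not candidates:
            break
        best = max(candidates, key=lambda s: prm_score(question, steps, s))
        steps.append(best)
        if best.strip().lower().startswith("answer:"):   # assumed stop convention
            break
    return steps

# Toy usage with stub functions in place of an LLM and a PRM.
stub_gen = lambda q, steps: (["Step: add the numbers", "Answer: 4"]
                             if len(steps) else ["Step: read the question"])
stub_prm = lambda q, steps, s: 1.0 if s.startswith("Answer") else 0.5
print(greedy_prm_search("What is 2 + 2?", stub_gen, stub_prm))
```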
arXiv Detail & Related papers (2023-10-16T05:21:50Z)
- Improving Sample Efficiency of Model-Free Algorithms for Zero-Sum Markov Games [66.2085181793014]
We show that a model-free stage-based Q-learning algorithm can achieve the same optimal dependence on the horizon $H$ as model-based algorithms.
Our algorithm features a key novel design of updating the reference value functions as the pair of optimistic and pessimistic value functions.
arXiv Detail & Related papers (2023-08-17T08:34:58Z)
- Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
arXiv Detail & Related papers (2023-04-12T14:51:47Z)
- Costly Features Classification using Monte Carlo Tree Search [5.188762991286163]
We consider the problem of costly feature classification, where we sequentially select a subset of features to balance classification error against feature cost.
In this paper, we first cast the task as an MDP and use the Advantage Actor-Critic algorithm to solve it.
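A brief sketch of the ingredients named above: a cost-aware terminal reward and the one-step advantage estimate used by Advantage Actor-Critic; the reward shaping and constants are assumptions, not the cited paper's exact formulation.

```python
# Cost-aware reward and one-step advantage for an A2C-style costly-feature MDP.
import torch

def costly_feature_reward(correct: bool, feature_costs_paid: float,
                          lam: float = 0.1) -> float:
    """Terminal reward: classification outcome minus accumulated feature cost."""
    return (1.0 if correct else 0.0) - lam * feature_costs_paid

def a2c_advantage(reward: torch.Tensor, value: torch.Tensor,
                  next_value: torch.Tensor, done: torch.Tensor,
                  gamma: float = 0.99) -> torch.Tensor:
    """One-step advantage A = r + gamma * V(s') * (1 - done) - V(s)."""
    return reward + gamma * next_value * (1.0 - done) - value

# Toy batch of two transitions; the second ends an episode with a correct label.
adv = a2c_advantage(reward=torch.tensor([0.0, costly_feature_reward(True, 1.0)]),
                    value=torch.tensor([0.4, 0.6]),
                    next_value=torch.tensor([0.5, 0.0]),
                    done=torch.tensor([0.0, 1.0]))
print(adv)
```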
arXiv Detail & Related papers (2021-02-14T05:18:33Z)
- Landscape-Aware Fixed-Budget Performance Regression and Algorithm Selection for Modular CMA-ES Variants [1.0965065178451106]
We show that it is possible to achieve high-quality performance predictions with off-the-shelf supervised learning approaches.
We test this approach on a portfolio of very similar algorithms chosen from the family of modular CMA-ES algorithms.
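A minimal sketch of this kind of landscape-aware, fixed-budget performance regression with an off-the-shelf model; the synthetic data below is a placeholder for ELA-style landscape features and a modular CMA-ES portfolio, not that paper's benchmark setup.

```python
# Per-algorithm performance regression from landscape features, then selection
# of the predicted-best portfolio member. All data here is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n_problems, n_features, n_algorithms = 200, 10, 5
landscape = rng.normal(size=(n_problems, n_features))       # stand-in ELA features
performance = rng.normal(size=(n_problems, n_algorithms))   # fixed-budget losses

# One regressor per portfolio member, trained on the first 150 problems.
models = [RandomForestRegressor(n_estimators=100, random_state=0)
          .fit(landscape[:150], performance[:150, a]) for a in range(n_algorithms)]

preds = np.column_stack([m.predict(landscape[150:]) for m in models])
chosen = preds.argmin(axis=1)        # lower predicted loss is better (assumption)
print("selected algorithm per test problem:", chosen[:10])
```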
arXiv Detail & Related papers (2020-06-17T13:34:57Z)
- Solve Traveling Salesman Problem by Monte Carlo Tree Search and Deep Neural Network [8.19063619210761]
We present a self-learning approach that combines deep reinforcement learning and Monte Carlo tree search to solve the traveling salesman problem.
Experimental results show that the proposed method performs favorably against other methods in small-to-medium problem settings, and achieves performance comparable to the state of the art in large problem settings.
arXiv Detail & Related papers (2020-05-14T11:36:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.