Related papers: Greedy Selection under Independent Increments: A Toy Model Analysis

Related papers

A Principled Approach to Randomized Selection under Uncertainty: Applications to Peer Review and Grant Funding [68.43987626137512]
We propose a principled framework for randomized decision-making based on interval estimates of the quality of each item.<n>We introduce MERIT, an optimization-based method that maximizes the worst-case expected number of top candidates selected.<n>We prove that MERIT satisfies desirable axiomatic properties not guaranteed by existing approaches.
arXiv Detail & Related papers (2025-06-23T19:59:30Z)
Scalable Best-of-N Selection for Large Language Models via Self-Certainty [65.31658824274894]
Best-of-N selection is a key technique for improving the reasoning performance of Large Language Models.<n>We propose self-certainty, a novel and efficient metric to estimate response quality without requiring external reward models.<n>Our findings establish self-certainty as a practical and efficient way for improving LLM reasoning capabilities.
arXiv Detail & Related papers (2025-02-25T19:08:07Z)
Option-ID Based Elimination For Multiple Choice Questions [12.30777266124562]
Multiple choice questions (MCQs) are a popular and important task for evaluating large language models (LLMs)<n>This paper proposes a novel option-ID based PoE ($textPoE_textID$)
arXiv Detail & Related papers (2025-01-25T11:06:37Z)
An incremental preference elicitation-based approach to learning potentially non-monotonic preferences in multi-criteria sorting [53.36437745983783]
We first construct a max-margin optimization-based model to model potentially non-monotonic preferences. We devise information amount measurement methods and question selection strategies to pinpoint the most informative alternative in each iteration. Two incremental preference elicitation-based algorithms are developed to learn potentially non-monotonic preferences.
arXiv Detail & Related papers (2024-09-04T14:36:20Z)
Diversified Batch Selection for Training Acceleration [68.67164304377732]
A prevalent research line, known as online batch selection, explores selecting informative subsets during the training process. vanilla reference-model-free methods involve independently scoring and selecting data in a sample-wise manner. We propose Diversified Batch Selection (DivBS), which is reference-model-free and can efficiently select diverse and representative samples.
arXiv Detail & Related papers (2024-06-07T12:12:20Z)
Learning to Select and Rank from Choice-Based Feedback: A Simple Nested Approach [10.293894471295205]
We study a ranking and selection problem of learning from choice-based feedback with dynamic assortments.<n>We present novel and simple algorithms for both learning goals.
arXiv Detail & Related papers (2023-07-13T05:05:30Z)
Finding Counterfactually Optimal Action Sequences in Continuous State Spaces [22.84932480886562]
We describe a sequence of discrete actions and continuous states using finite horizon Markov decision processes. We then develop a search method based on the $A*$ continuity of the environment's dynamics. Our method is very efficient in practice, and it has the potential to offer interesting insights for sequential decision making tasks.
arXiv Detail & Related papers (2023-06-06T18:00:29Z)
Factorization of Multi-Agent Sampling-Based Motion Planning [72.42734061131569]
Modern robotics often involves multiple embodied agents operating within a shared environment. Standard sampling-based algorithms can be used to search for solutions in the robots' joint space. We integrate the concept of factorization into sampling-based algorithms, which requires only minimal modifications to existing methods. We present a general implementation of a factorized SBA, derive an analytical gain in terms of sample complexity for PRM*, and showcase empirical results for RRG.
arXiv Detail & Related papers (2023-04-01T15:50:18Z)
Leveraging Importance Weights in Subset Selection [45.54597544672441]
We present a subset selection algorithm designed to work with arbitrary model families in a practical batch setting. Our algorithm, IWeS, selects examples by importance sampling where the sampling probability assigned to each example is based on the entropy of models trained on previously selected batches.
arXiv Detail & Related papers (2023-01-28T02:07:31Z)
Probabilistic Bilevel Coreset Selection [24.874967723659022]
We propose a continuous probabilistic bilevel formulation of coreset selection by learning a probablistic weight for each training sample. We develop an efficient solver to the bilevel optimization problem via unbiased policy gradient without trouble of implicit differentiation.
arXiv Detail & Related papers (2023-01-24T09:37:00Z)
Non-Stochastic CDF Estimation Using Threshold Queries [3.6576781735746513]
We tackle the problem of estimating an empirical distribution in a setting with two challenges. First, the algorithm does not directly observe the data; instead, it only asks a limited number of threshold queries about each sample. Second, the data are not assumed to be independent and identically distributed; instead, we allow for an arbitrary process generating the samples.
arXiv Detail & Related papers (2023-01-13T18:00:57Z)
High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability. Existing methods for non-smooth convex optimization have complexity bounds with dependence on confidence level. We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z)
Adaptive Sampling for Best Policy Identification in Markov Decision Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision (MDPs) when the learner has access to a generative model. The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z)

This list is automatically generated from the titles and abstracts of the papers in this site.