Robust Information Selection for Hypothesis Testing with Misclassification Penalties
- URL: http://arxiv.org/abs/2502.14738v1
- Date: Thu, 20 Feb 2025 17:05:27 GMT
- Title: Robust Information Selection for Hypothesis Testing with Misclassification Penalties
- Authors: Jayanth Bhargav, Shreyas Sundaram, Mahsa Ghasemi,
- Abstract summary: We study the problem of robust information selection for a Bayesian hypothesis testing / classification task.
The goal is to identify the true state of the world from a finite set of hypotheses based on observations from selected information sources.
We introduce a novel misclassification penalty framework, which enables non-uniform treatment of different misclassification events.
- Score: 3.3444620077119436
- License:
- Abstract: We study the problem of robust information selection for a Bayesian hypothesis testing / classification task, where the goal is to identify the true state of the world from a finite set of hypotheses based on observations from the selected information sources. We introduce a novel misclassification penalty framework, which enables non-uniform treatment of different misclassification events. Extending the classical subset selection framework, we study the problem of selecting a subset of sources that minimize the maximum penalty of misclassification under a limited budget, despite deletions or failures of a subset of the selected sources. We characterize the curvature properties of the objective function and propose an efficient greedy algorithm with performance guarantees. Next, we highlight certain limitations of optimizing for the maximum penalty metric and propose a submodular surrogate metric to guide the selection of the information set. We propose a greedy algorithm with near-optimality guarantees for optimizing the surrogate metric. Finally, we empirically demonstrate the performance of our proposed algorithms in several instances of the information set selection problem.
Related papers
- An incremental preference elicitation-based approach to learning potentially non-monotonic preferences in multi-criteria sorting [53.36437745983783]
We first construct a max-margin optimization-based model to model potentially non-monotonic preferences.
We devise information amount measurement methods and question selection strategies to pinpoint the most informative alternative in each iteration.
Two incremental preference elicitation-based algorithms are developed to learn potentially non-monotonic preferences.
arXiv Detail & Related papers (2024-09-04T14:36:20Z) - Submodular Information Selection for Hypothesis Testing with Misclassification Penalties [3.3444620077119436]
We study the problem of selecting an optimal subset of information sources for a hypothesis testing/classification task.
We propose a misclassification penalty framework, which enables nonuniform treatment of different misclassification errors.
We prove that this metric is submodular and establish near-optimal guarantees for the greedy algorithms for both the information set selection problems.
arXiv Detail & Related papers (2024-05-17T17:31:02Z) - Generalizing Bayesian Optimization with Decision-theoretic Entropies [102.82152945324381]
We consider a generalization of Shannon entropy from work in statistical decision theory.
We first show that special cases of this entropy lead to popular acquisition functions used in BO procedures.
We then show how alternative choices for the loss yield a flexible family of acquisition functions.
arXiv Detail & Related papers (2022-10-04T04:43:58Z) - Bilevel Optimization for Feature Selection in the Data-Driven Newsvendor
Problem [8.281391209717105]
We study the feature-based news vendor problem, in which a decision-maker has access to historical data.
In this setting, we investigate feature selection, aiming to derive sparse, explainable models with improved out-of-sample performance.
We present a mixed integer linear program reformulation for the bilevel program, which can be solved to optimality with standard optimization solvers.
arXiv Detail & Related papers (2022-09-12T08:52:26Z) - Bi-objective Ranking and Selection Using Stochastic Kriging [0.0]
We consider bi-objective ranking and selection problems in which the two objective outcomes have been observed with uncertainty.
We propose a novel Bayesian bi-objective ranking and selection method that sequentially allocates extra samples to competitive solutions.
Experimental results show that the proposed method outperforms the standard allocation method, as well as a well-known state-of-the-art algorithm.
arXiv Detail & Related papers (2022-09-05T23:51:07Z) - Stop Overcomplicating Selective Classification: Use Max-Logit [2.3677503557659705]
We tackle the problem of Selective Classification where the goal is to achieve the best performance on the desired coverages of the dataset.
Recent state-of-the-art selective methods come with architectural changes either via introducing a separate selection head or an extra abstention logit.
We present surprising results for Selective Classification by confirming that the superior performance of state-of-the-art methods is owed to training a more generalizable classifier.
arXiv Detail & Related papers (2022-06-17T22:23:11Z) - Fast Feature Selection with Fairness Constraints [49.142308856826396]
We study the fundamental problem of selecting optimal features for model construction.
This problem is computationally challenging on large datasets, even with the use of greedy algorithm variants.
We extend the adaptive query model, recently proposed for the greedy forward selection for submodular functions, to the faster paradigm of Orthogonal Matching Pursuit for non-submodular functions.
The proposed algorithm achieves exponentially fast parallel run time in the adaptive query model, scaling much better than prior work.
arXiv Detail & Related papers (2022-02-28T12:26:47Z) - A Discriminative Technique for Multiple-Source Adaptation [55.5865665284915]
We present a new discriminative technique for the multiple-source adaptation, MSA, problem.
Our solution only requires conditional probabilities that can easily be accurately estimated from unlabeled data from the source domains.
Our experiments with real-world applications further demonstrate that our new discriminative MSA algorithm outperforms the previous generative solution.
arXiv Detail & Related papers (2020-08-25T14:06:15Z) - Extreme Algorithm Selection With Dyadic Feature Representation [78.13985819417974]
We propose the setting of extreme algorithm selection (XAS) where we consider fixed sets of thousands of candidate algorithms.
We assess the applicability of state-of-the-art AS techniques to the XAS setting and propose approaches leveraging a dyadic feature representation.
arXiv Detail & Related papers (2020-01-29T09:40:58Z) - Outlier Detection Ensemble with Embedded Feature Selection [42.8338013000469]
We propose an outlier detection ensemble framework with embedded feature selection (ODEFS)
For each random sub-sampling based learning component, ODEFS unifies feature selection and outlier detection into a pairwise ranking formulation.
We adopt the thresholded self-paced learning to simultaneously optimize feature selection and example selection.
arXiv Detail & Related papers (2020-01-15T13:14:10Z) - Optimal Clustering from Noisy Binary Feedback [75.17453757892152]
We study the problem of clustering a set of items from binary user feedback.
We devise an algorithm with a minimal cluster recovery error rate.
For adaptive selection, we develop an algorithm inspired by the derivation of the information-theoretical error lower bounds.
arXiv Detail & Related papers (2019-10-14T09:18:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.