Two-stage Risk Control with Application to Ranked Retrieval
- URL: http://arxiv.org/abs/2404.17769v3
- Date: Sat, 01 Feb 2025 11:49:02 GMT
- Title: Two-stage Risk Control with Application to Ranked Retrieval
- Authors: Yunpeng Xu, Mufang Ying, Wenge Guo, Zhi Wei
- Abstract summary: We develop two-stage risk control methods based on the proposed learn-then-test (LTT) and conformal risk control (CRC) frameworks.
We provide theoretical guarantees for our proposed methods and design novel loss functions tailored for ranked retrieval tasks.
The effectiveness of our approach is validated through experiments on two large-scale, widely-used datasets.
- Score: 1.8481458455172357
- Abstract: Practical machine learning systems often operate in multiple sequential stages, as seen in ranking and recommendation systems, which typically include a retrieval phase followed by a ranking phase. Effectively assessing prediction uncertainty and ensuring effective risk control in such systems pose significant challenges due to their inherent complexity. To address these challenges, we developed two-stage risk control methods based on the recently proposed learn-then-test (LTT) and conformal risk control (CRC) frameworks. Unlike the methods in prior work that address multiple risks, our approach leverages the sequential nature of the problem, resulting in reduced computational burden. We provide theoretical guarantees for our proposed methods and design novel loss functions tailored for ranked retrieval tasks. The effectiveness of our approach is validated through experiments on two large-scale, widely-used datasets: MSLR-Web and Yahoo LTRC.
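The conformal risk control (CRC) framework the abstract builds on calibrates a threshold so that the expected loss stays below a target level. The following is a minimal sketch of that single-stage calibration rule, assuming a bounded loss that is non-increasing in the threshold; the function name, array layout, and test values are illustrative and not taken from the paper.

```python
import numpy as np

def crc_threshold(cal_losses, lambdas, alpha, B=1.0):
    """Pick the smallest lambda whose adjusted empirical risk is <= alpha.

    cal_losses: array of shape (n, len(lambdas)); cal_losses[i, j] is the
    bounded loss of calibration example i at threshold lambdas[j], assumed
    non-increasing in lambda. B is an upper bound on the loss, and lambdas
    is sorted in ascending order.
    """
    n = cal_losses.shape[0]
    mean_risk = cal_losses.mean(axis=0)
    # CRC adjustment: (n/(n+1)) * empirical risk + B/(n+1) <= alpha
    adjusted = (n / (n + 1)) * mean_risk + B / (n + 1)
    feasible = np.where(adjusted <= alpha)[0]
    if feasible.size == 0:
        raise ValueError("no lambda satisfies the risk bound")
    return lambdas[feasible[0]]  # smallest feasible threshold
```

A two-stage extension, as the paper proposes, would calibrate one threshold per stage (retrieval, then ranking) while exploiting the sequential structure to avoid searching the full product space of threshold pairs.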
Related papers
- Last-Iterate Global Convergence of Policy Gradients for Constrained Reinforcement Learning [62.81324245896717]
We introduce an exploration-agnostic algorithm, called C-PG, which exhibits global last-iterate convergence guarantees under (weak) gradient domination assumptions.
We numerically validate our algorithms on constrained control problems, and compare them with state-of-the-art baselines.
arXiv Detail & Related papers (2024-07-15T14:54:57Z) - Conformal Risk Control for Ordinal Classification [2.0189665663352936]
We seek to control the conformal risk in expectation for ordinal classification tasks, which have broad applications to many real problems.
We propose two types of loss functions specially designed for ordinal classification tasks, and develop corresponding algorithms to determine the prediction set for each case.
We demonstrate the effectiveness of our proposed methods, and analyze the difference between the two types of risks on three different datasets.
arXiv Detail & Related papers (2024-05-01T09:55:31Z) - Benchmarking Actor-Critic Deep Reinforcement Learning Algorithms for Robotics Control with Action Constraints [9.293472255463454]
This study presents a benchmark for evaluating action-constrained reinforcement learning (RL) algorithms.
We evaluate existing algorithms and their novel variants across multiple robotics control environments.
arXiv Detail & Related papers (2023-04-18T05:45:09Z) - Learning Disturbances Online for Risk-Aware Control: Risk-Aware Flight with Less Than One Minute of Data [33.7789991023177]
Recent advances in safety-critical risk-aware control are predicated on a priori knowledge of disturbances a system might face.
This paper proposes a method to efficiently learn these disturbances in a risk-aware online context.
arXiv Detail & Related papers (2022-12-12T21:40:23Z) - Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z) - Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers.
We then present the pointwise feasibility conditions of the resulting safety controller.
We use these conditions to devise an event-triggered online data collection strategy.
arXiv Detail & Related papers (2022-08-23T05:02:09Z) - Deep Learning for Systemic Risk Measures [3.274367403737527]
The aim of this paper is to study a new methodological framework for systemic risk measures.
Under this new framework, systemic risk measures can be interpreted as the minimal amount of cash that secures the aggregated system.
Deep learning is increasingly receiving attention in financial modelings and risk management.
arXiv Detail & Related papers (2022-07-02T05:01:19Z) - TOPS: Transition-based VOlatility-controlled Policy Search and its Global Convergence [9.607937067646617]
This paper proposes Transition-based VOlatility-controlled Policy Search (TOPS), a novel algorithm that solves risk-averse problems by learning from (possibly non-consecutive) transitions instead of only consecutive trajectories.
Both theoretical analysis and experimental results demonstrate state-of-the-art performance among risk-averse policy search methods.
arXiv Detail & Related papers (2022-01-24T18:29:23Z) - Supervised Advantage Actor-Critic for Recommender Systems [76.7066594130961]
We propose a negative sampling strategy for training the RL component and combine it with supervised sequential learning.
Based on sampled (negative) actions (items), we can calculate the "advantage" of a positive action over the average case.
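The "advantage" described above can be estimated as the Q-value of the positive (observed) action minus the mean Q-value of the sampled negative actions. A minimal sketch, with an illustrative function name not taken from the paper:

```python
import numpy as np

def advantage_over_negatives(q_positive, q_negatives):
    """Advantage of the observed positive action over the average case:
    its Q-value minus the mean Q-value of sampled negative actions."""
    return q_positive - np.mean(q_negatives)

# Example: a positive item scoring 2.0 against negatives scoring [1.0, 0.0, 2.0]
adv = advantage_over_negatives(2.0, np.array([1.0, 0.0, 2.0]))
```

This advantage then weights the policy-gradient update for the positive action, favoring items that score well above the sampled average.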
We instantiate SNQN and SA2C with four state-of-the-art sequential recommendation models and conduct experiments on two real-world datasets.
arXiv Detail & Related papers (2021-11-05T12:51:15Z) - A Regret Minimization Approach to Iterative Learning Control [61.37088759497583]
We propose a new performance metric, planning regret, which replaces the standard uncertainty assumptions with worst case regret.
We provide theoretical and empirical evidence that the proposed algorithm outperforms existing methods on several benchmarks.
arXiv Detail & Related papers (2021-02-26T13:48:49Z) - Towards Safe Policy Improvement for Non-Stationary MDPs [48.9966576179679]
Many real-world problems of interest exhibit non-stationarity, and when stakes are high, the cost associated with a false stationarity assumption may be unacceptable.
We take the first steps towards ensuring safety, with high confidence, for smoothly-varying non-stationary decision problems.
Our proposed method extends a type of safe algorithm, called a Seldonian algorithm, through a synthesis of model-free reinforcement learning with time-series analysis.
arXiv Detail & Related papers (2020-10-23T20:13:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.