Related papers: Cross-replication Reliability -- An Empirical Approach to Interpreting Inter-rater Reliability

Related papers

RPO: Retrieval Preference Optimization for Robust Retrieval-Augmented Generation [33.85528514353727]
We introduce Retrieval Preference Optimization (RPO) to adaptively leverage multi-source knowledge based on retrieval relevance. RPO is the only RAG-dedicated alignment approach that quantifies the awareness of retrieval relevance in training. Experiments on four datasets demonstrate that RPO outperforms RAG by 4-10% in accuracy without any extra component.
arXiv Detail & Related papers (2025-01-23T14:58:56Z)
Top-K Pairwise Ranking: Bridging the Gap Among Ranking-Based Measures for Multi-Label Classification [120.37051160567277]
This paper proposes a novel measure named Top-K Pairwise Ranking (TKPR). A series of analyses show that TKPR is compatible with existing ranking-based measures. We also establish a sharp generalization bound for the proposed framework based on a novel technique named data-dependent contraction.
arXiv Detail & Related papers (2024-07-09T09:36:37Z)
Provable Risk-Sensitive Distributional Reinforcement Learning with General Function Approximation [54.61816424792866]
We introduce a general framework on Risk-Sensitive Distributional Reinforcement Learning (RS-DisRL), with static Lipschitz Risk Measures (LRM) and general function approximation. We design two innovative meta-algorithms: RS-DisRL-M, a model-based strategy for model-based function approximation, and RS-DisRL-V, a model-free approach for general value function approximation.
arXiv Detail & Related papers (2024-02-28T08:43:18Z)
Bounding data reconstruction attacks with the hypothesis testing interpretation of differential privacy [78.32404878825845]
Reconstruction Robustness (ReRo) was recently proposed as an upper bound on the success of data reconstruction attacks against machine learning models. Previous research has demonstrated that differential privacy (DP) mechanisms also provide ReRo, but so far, only Monte Carlo estimates of a tight ReRo bound have been shown.
arXiv Detail & Related papers (2023-07-08T08:02:47Z)
Provably Efficient Iterated CVaR Reinforcement Learning with Function Approximation and Human Feedback [57.6775169085215]
Risk-sensitive reinforcement learning aims to optimize policies that balance the expected reward and risk. We present a novel framework that employs an Iterated Conditional Value-at-Risk (CVaR) objective under both linear and general function approximations. We propose provably sample-efficient algorithms for this Iterated CVaR RL and provide rigorous theoretical analysis.
arXiv Detail & Related papers (2023-07-06T08:14:54Z)
On Pitfalls of RemOve-And-Retrain: Data Processing Inequality Perspective [5.8010446129208155]
This study scrutinizes the dependability of the RemOve-And-Retrain (ROAR) procedure, which is prevalently employed for gauging the performance of feature importance estimates. The insights gleaned from our theoretical foundation and empirical investigations reveal that attributions containing lesser information about the decision function may yield superior results in ROAR benchmarks.
arXiv Detail & Related papers (2023-04-26T21:43:42Z)
Factual Consistency Oriented Speech Recognition [23.754107608608106]
The proposed framework optimizes the ASR model to maximize an expected factual consistency score between ASR hypotheses and ground-truth transcriptions. It is shown that training the ASR models with the proposed framework improves the speech summarization quality as measured by the factual consistency of meeting conversation summaries.
arXiv Detail & Related papers (2023-02-24T00:01:41Z)
Improved Policy Evaluation for Randomized Trials of Algorithmic Resource Allocation [54.72195809248172]
We present a new estimator leveraging our proposed novel concept, which involves retrospective reshuffling of participants across experimental arms at the end of an RCT. We prove theoretically that such an estimator is more accurate than common estimators based on sample means.
arXiv Detail & Related papers (2023-02-06T05:17:22Z)
Interpretable Research Replication Prediction via Variational Contextual Consistency Sentence Masking [14.50690911709558]
Research Replication Prediction (RRP) is the task of predicting whether a published research result can be replicated or not. In this work, we propose the Variational Contextual Consistency Sentence Masking (VCCSM) method to automatically extract key sentences. Results of our experiments on RRP and European Convention of Human Rights (ECHR) datasets demonstrate that VCCSM is able to improve model interpretability for long document classification tasks.
arXiv Detail & Related papers (2022-03-28T03:27:13Z)
k-Rater Reliability: The Correct Unit of Reliability for Aggregated Human Annotations [2.538209532048867]
We propose that k-rater reliability (kRR) be used as the correct data reliability for aggregated datasets. We present empirical, analytical, and bootstrap-based methods for computing kRR on WordSim-353.
arXiv Detail & Related papers (2022-03-24T08:05:06Z)
Distributionally Robust Multi-Output Regression Ranking [3.9318191265352196]
We introduce a new listwise learning-to-rank model called Distributionally Robust Multi-output Regression Ranking (DRMRR). DRMRR uses a Distributionally Robust Optimization framework to minimize a multi-output loss function under the most adverse distributions in the neighborhood of the empirical data distribution. Our experiments were conducted on two real-world applications: medical document retrieval and drug response prediction.
arXiv Detail & Related papers (2021-09-27T05:19:27Z)
Federated Distributionally Robust Optimization for Phase Configuration of RISs [106.4688072667105]
We study the problem of robust reconfigurable intelligent surface (RIS)-aided downlink communication over heterogeneous RIS types in a supervised learning setting. By modeling downlink communication over heterogeneous RIS designs as different workers that learn how to optimize phase configurations in a distributed manner, we solve this distributed learning problem. Our proposed algorithm requires fewer communication rounds to achieve the same worst-case distribution test accuracy compared to competitive baselines.
arXiv Detail & Related papers (2021-08-20T07:07:45Z)

This list is automatically generated from the titles and abstracts of the papers on this site.