Artificial Replay: A Meta-Algorithm for Harnessing Historical Data in Bandits
- URL: http://arxiv.org/abs/2210.00025v3
- Date: Wed, 09 Oct 2024 21:48:01 GMT
- Title: Artificial Replay: A Meta-Algorithm for Harnessing Historical Data in Bandits
- Authors: Siddhartha Banerjee, Sean R. Sinclair, Milind Tambe, Lily Xu, Christina Lee Yu
- Abstract summary: We propose Artificial-Replay, a meta-algorithm for incorporating historical data into arbitrary base bandit algorithms.
We show that Artificial-Replay uses only a fraction of the historical data compared to a full warm-start approach.
- Score: 34.42192958753171
- Abstract: Most real-world deployments of bandit algorithms exist somewhere in between the offline and online setup, where some historical data is available upfront and additional data is collected dynamically online. How best to incorporate historical data to "warm start" bandit algorithms is an open question: naively initializing reward estimates using all historical samples can suffer from spurious data and imbalanced data coverage, leading to computation and storage issues, particularly for continuous action spaces. To address these challenges, we propose Artificial-Replay, a meta-algorithm for incorporating historical data into any base bandit algorithm. We show that Artificial-Replay uses only a fraction of the historical data compared to a full warm-start approach, while still achieving identical regret for base algorithms that satisfy independence of irrelevant data (IIData), a novel and broadly applicable property that we introduce. We complement these theoretical results with experiments on (i) K-armed bandits and (ii) continuous combinatorial bandits, on which we model green security domains using real poaching data. Our results show the practical benefits of Artificial-Replay in reducing computation and space complexity, including for base algorithms that do not satisfy IIData.
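The abstract's core idea, as described, is to feed the base algorithm unused historical samples lazily: whenever the base algorithm proposes an arm for which historical data remains, that sample is "replayed" in place of a real pull, so only history the algorithm would actually have requested is ever consumed. A minimal sketch of this pattern for K-armed bandits, using UCB1 as the base algorithm (the class and function names here are illustrative, not the paper's API):

```python
import math

class UCB1:
    """Minimal UCB1 base bandit (illustrative, not the paper's code)."""
    def __init__(self, n_arms):
        self.n_arms = n_arms
        self.counts = [0] * n_arms   # samples seen per arm (replayed + online)
        self.sums = [0.0] * n_arms   # reward totals per arm
        self.t = 0

    def select_arm(self):
        # Try each arm once before applying the confidence-bound rule.
        for a in range(self.n_arms):
            if self.counts[a] == 0:
                return a
        return max(
            range(self.n_arms),
            key=lambda a: self.sums[a] / self.counts[a]
            + math.sqrt(2 * math.log(self.t) / self.counts[a]),
        )

    def update(self, arm, reward):
        self.t += 1
        self.counts[arm] += 1
        self.sums[arm] += reward

def artificial_replay(base, history, pull, horizon):
    """Run `base` for `horizon` *online* pulls, replaying history lazily.

    history: dict mapping arm -> list of unused historical rewards.
    pull:    callable arm -> reward, the real environment interaction.
    """
    hist = {a: list(rs) for a, rs in history.items()}
    online = 0
    while online < horizon:
        arm = base.select_arm()
        if hist.get(arm):
            # An unused historical sample exists for the proposed arm:
            # feed it to the base algorithm instead of acting online.
            base.update(arm, hist[arm].pop())
        else:
            base.update(arm, pull(arm))  # genuine online pull
            online += 1
    return base
```

Note how the meta-algorithm never touches historical samples for arms the base algorithm does not propose, which is what lets it ignore spurious or over-represented history while matching the full warm-start's regret for IIData base algorithms.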
Related papers
- Neural Dueling Bandits [58.90189511247936]
We use a neural network to estimate the reward function using preference feedback for the previously selected arms.
We then extend our theoretical results to contextual bandit problems with binary feedback, which is in itself a non-trivial contribution.
arXiv Detail & Related papers (2024-07-24T09:23:22Z)
- Fact Checking Beyond Training Set [64.88575826304024]
We show that the retriever-reader suffers from performance deterioration when it is trained on labeled data from one domain and used in another domain.
We propose an adversarial algorithm to make the retriever component robust against distribution shift.
We then construct eight fact checking scenarios from these datasets, and compare our model to a set of strong baseline models.
arXiv Detail & Related papers (2024-03-27T15:15:14Z)
- Enhancing Consistency and Mitigating Bias: A Data Replay Approach for Incremental Learning [100.7407460674153]
Deep learning systems are prone to catastrophic forgetting when learning from a sequence of tasks.
To mitigate the problem, a line of methods propose to replay the data of experienced tasks when learning new tasks.
However, storing and replaying raw data is often impractical due to memory constraints or data privacy issues.
As a replacement, data-free data replay methods are proposed by inverting samples from the classification model.
arXiv Detail & Related papers (2024-01-12T12:51:12Z)
- Performance Evaluation and Comparison of a New Regression Algorithm [4.125187280299247]
We compare the performance of a newly proposed regression algorithm against four conventional machine learning algorithms.
The reader is free to replicate our results since we have provided the source code in a GitHub repository.
arXiv Detail & Related papers (2023-06-15T13:01:16Z)
- Data pruning and neural scaling laws: fundamental limitations of score-based algorithms [9.68145635795782]
We show theoretically and empirically why score-based data pruning algorithms fail in the high compression regime.
We present calibration protocols that enhance the performance of existing pruning algorithms in this high compression regime.
arXiv Detail & Related papers (2023-02-14T10:38:40Z)
- Shuffled linear regression through graduated convex relaxation [12.614901374282868]
The shuffled linear regression problem aims to recover linear relationships in datasets where the correspondence between input and output is unknown.
This problem arises in a wide range of applications including survey data.
We propose a novel optimization algorithm for shuffled linear regression based on a posterior-maximizing objective function.
arXiv Detail & Related papers (2022-09-30T17:33:48Z)
- Toeplitz Least Squares Problems, Fast Algorithms and Big Data [1.3535770763481905]
Two recent algorithms have applied randomized numerical linear algebra techniques to fitting an autoregressive model to big time-series data.
We investigate and compare the quality of these two approximation algorithms on large-scale synthetic and real-world data.
While both algorithms display comparable results for synthetic datasets, the LSAR algorithm appears to be more robust when applied to real-world time series data.
arXiv Detail & Related papers (2021-12-24T08:32:09Z)
- Continual Learning for Fake Audio Detection [62.54860236190694]
This paper proposes Detecting Fake Without Forgetting, a continual-learning-based method that lets the model learn new spoofing attacks incrementally.
Experiments are conducted on the ASVspoof 2019 dataset.
arXiv Detail & Related papers (2021-04-15T07:57:05Z)
- Bandits with Partially Observable Confounded Data [74.04376842070624]
We show that this problem is closely related to a variant of the bandit problem with side information.
We construct a linear bandit algorithm that takes advantage of the projected information, and prove regret bounds.
Our results indicate that confounded offline data can significantly improve online learning algorithms.
arXiv Detail & Related papers (2020-06-11T18:48:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.