Evolution of Information in Interactive Decision Making: A Case Study for Multi-Armed Bandits
- URL: http://arxiv.org/abs/2503.00273v1
- Date: Sat, 01 Mar 2025 01:01:41 GMT
- Title: Evolution of Information in Interactive Decision Making: A Case Study for Multi-Armed Bandits
- Authors: Yuzhou Gu, Yanjun Han, Jian Qian
- Abstract summary: We study the evolution of information in interactive decision making through the lens of a multi-armed bandit problem. We show that optimal success probability and mutual information can be decoupled, where achieving optimal learning does not necessarily require maximizing information gain.
- Score: 16.584981298202223
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the evolution of information in interactive decision making through the lens of a stochastic multi-armed bandit problem. Focusing on a fundamental example where a unique optimal arm outperforms the rest by a fixed margin, we characterize the optimal success probability and mutual information over time. Our findings reveal distinct growth phases in mutual information -- initially linear, transitioning to quadratic, and finally returning to linear -- highlighting curious behavioral differences between interactive and non-interactive environments. In particular, we show that optimal success probability and mutual information can be decoupled, where achieving optimal learning does not necessarily require maximizing information gain. These findings shed new light on the intricate interplay between information and learning in interactive decision making.
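The setting described in the abstract (a unique optimal arm beating the rest by a fixed margin) can be made concrete with a small simulation. The sketch below is purely illustrative and is not the paper's information-theoretic analysis: it runs a simple explore-then-commit strategy on a Bernoulli bandit whose arm 0 has mean exceeding the others by a margin `delta`, and estimates the probability of committing to the optimal arm; all function names and parameters are hypothetical.

```python
import random

def simulate_bandit(n_arms=2, delta=0.2, horizon=200, n_runs=2000, seed=0):
    """Explore-then-commit on a Bernoulli bandit where arm 0 outperforms
    the rest by a fixed margin `delta`. Returns the empirical probability
    of committing to the optimal arm after the exploration phase.
    (Illustrative sketch only -- not the paper's method.)"""
    rng = random.Random(seed)
    base = 0.5 - delta  # mean reward of every suboptimal arm
    explore = horizon // 2  # pulls spent on uniform exploration
    correct = 0
    for _ in range(n_runs):
        counts = [0] * n_arms
        sums = [0] * n_arms
        for t in range(explore):
            a = t % n_arms  # round-robin exploration
            p = base + (delta if a == 0 else 0.0)
            sums[a] += 1 if rng.random() < p else 0
            counts[a] += 1
        means = [s / c for s, c in zip(sums, counts)]
        best = max(range(n_arms), key=lambda a: means[a])
        correct += (best == 0)
    return correct / n_runs
```

With the assumed defaults (margin 0.2, 100 exploration pulls split over 2 arms), the empirical success probability comes out well above chance, which is the kind of success-probability curve the abstract characterizes over time.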
Related papers
- InterFormer: Towards Effective Heterogeneous Interaction Learning for Click-Through Rate Prediction [72.50606292994341]
We propose a novel module named InterFormer to learn heterogeneous information interaction in an interleaving style. Our proposed InterFormer achieves state-of-the-art performance on three public datasets and a large-scale industrial dataset.
arXiv Detail & Related papers (2024-11-15T00:20:36Z)
- Multimodal Fusion with LLMs for Engagement Prediction in Natural Conversation [70.52558242336988]
We focus on predicting engagement in dyadic interactions by scrutinizing verbal and non-verbal cues, aiming to detect signs of disinterest or confusion.
In this work, we collect a dataset featuring 34 participants engaged in casual dyadic conversations, each providing self-reported engagement ratings at the end of each conversation.
We introduce a novel fusion strategy using Large Language Models (LLMs) to integrate multiple behavior modalities into a multimodal transcript.
arXiv Detail & Related papers (2024-09-13T18:28:12Z)
- Generative Intrinsic Optimization: Intrinsic Control with Model Learning [5.439020425819001]
The future sequence represents the outcome of executing actions in the environment.
Explicit outcomes may vary across state, return, or trajectory, serving different purposes such as credit assignment or imitation learning.
We propose a policy scheme that seamlessly incorporates the mutual information, ensuring convergence to the optimal policy.
arXiv Detail & Related papers (2023-10-12T07:50:37Z)
- Learning Unseen Modality Interaction [54.23533023883659]
Multimodal learning assumes all modality combinations of interest are available during training to learn cross-modal correspondences.
We pose the problem of unseen modality interaction and introduce a first solution.
It exploits a module that projects the multidimensional features of different modalities into a common space with rich information preserved.
arXiv Detail & Related papers (2023-06-22T10:53:10Z)
- Mutual Information Estimation via $f$-Divergence and Data Derangements [6.43826005042477]
We propose a novel class of discriminative mutual information estimators based on the variational representation of the $f$-divergence.
The proposed estimator is flexible since it exhibits an excellent bias/variance trade-off.
arXiv Detail & Related papers (2023-05-31T16:54:25Z)
- Selective Inference for Sparse Multitask Regression with Applications in Neuroimaging [2.611153304251067]
We propose a framework for selective inference to address a common multi-task problem in neuroimaging.
Our framework offers a new conditional procedure for inference, based on a refinement of the selection event that yields a tractable selection-adjusted likelihood.
We demonstrate through simulations that multi-task learning with selective inference can more accurately recover true signals than single-task methods.
arXiv Detail & Related papers (2022-05-27T20:21:20Z)
- Learning Neural Causal Models with Active Interventions [83.44636110899742]
We introduce an active intervention-targeting mechanism which enables a quick identification of the underlying causal structure of the data-generating process.
Our method significantly reduces the required number of interactions compared with random intervention targeting.
We demonstrate superior performance on multiple benchmarks from simulated to real-world data.
arXiv Detail & Related papers (2021-09-06T13:10:37Z)
- Adversarial Mutual Information for Text Generation [62.974883143784616]
We propose Adversarial Mutual Information (AMI): a text generation framework.
AMI is formed as a novel saddle point (min-max) optimization aiming to identify joint interactions between the source and target.
We show that AMI has potential to lead to a tighter lower bound of maximum mutual information.
arXiv Detail & Related papers (2020-06-30T19:11:51Z)
- Mutual Information Maximization for Effective Lip Reading [99.11600901751673]
We propose to introduce mutual information constraints at both the local feature level and the global sequence level.
By combining these two advantages together, the proposed method is expected to be both discriminative and robust for effective lip reading.
arXiv Detail & Related papers (2020-03-13T18:47:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.