Blessing from Human-AI Interaction: Super Reinforcement Learning in
Confounded Environments
- URL: http://arxiv.org/abs/2209.15448v2
- Date: Sat, 21 Oct 2023 01:58:39 GMT
- Title: Blessing from Human-AI Interaction: Super Reinforcement Learning in
Confounded Environments
- Authors: Jiayi Wang, Zhengling Qi, Chengchun Shi
- Abstract summary: We introduce the paradigm of super reinforcement learning, which takes advantage of human-AI interaction for data-driven sequential decision making.
In the decision process with unmeasured confounding, the actions taken by past agents can offer valuable insights into undisclosed information.
We develop several super-policy learning algorithms and systematically study their theoretical properties.
- Score: 19.944163846660498
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As AI becomes more prevalent throughout society, effective methods of
integrating humans and AI systems that leverage their respective strengths and
mitigate risk have become an important priority. In this paper, we introduce
the paradigm of super reinforcement learning that takes advantage of Human-AI
interaction for data-driven sequential decision making. This approach utilizes
the observed action, either from AI or humans, as input for achieving a
stronger oracle in policy learning for the decision maker (humans or AI). In
the decision process with unmeasured confounding, the actions taken by past
agents can offer valuable insights into undisclosed information. By including
this information for the policy search in a novel and legitimate manner, the
proposed super reinforcement learning will yield a super-policy that is
guaranteed to outperform both the standard optimal policy and the behavior one
(e.g., past agents' actions). We call this stronger oracle a blessing from
human-AI interaction. Furthermore, to address the issue of unmeasured
confounding in finding super-policies using the batch data, a number of
nonparametric and causal identifications are established. Building upon
these novel identification results, we develop several super-policy learning
algorithms and systematically study their theoretical properties such as
finite-sample regret guarantee. Finally, we illustrate the effectiveness of our
proposal through extensive simulations and real-world applications.
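
The claim that a super-policy can outperform both the standard optimal policy and the behavior policy can be illustrated with a small worked example. The sketch below is not the paper's algorithm: it is a hypothetical one-step toy model in which the confounder distribution, the past agent's accuracy (`P_MATCH`), and the reward function are all assumed, and values are computed exactly with full knowledge of the model rather than identified from confounded batch data. It only shows why conditioning on the observed past action enlarges the policy class.

```python
from itertools import product

# Hypothetical toy model (assumed for illustration, not taken from the paper).
P_U = {0: 0.5, 1: 0.5}  # distribution of the unmeasured confounder U
P_MATCH = 0.2           # assumed P(past agent's action A' equals U)


def p_behavior(a_prime, u):
    """Probability that the past agent, who sees U, takes action a_prime."""
    return P_MATCH if a_prime == u else 1.0 - P_MATCH


def reward(a, u):
    """Reward is 1 when the chosen action matches the hidden confounder."""
    return 1.0 if a == u else 0.0


def value(policy):
    """Exact expected reward when the decision maker plays policy(a_prime)."""
    return sum(
        P_U[u] * p_behavior(a_prime, u) * reward(policy(a_prime), u)
        for u, a_prime in product((0, 1), repeat=2)
    )


# Behavior policy: repeat whatever the past agent did.
behavior_value = value(lambda a_prime: a_prime)

# Standard policy: cannot see U or A', so it is a constant action here.
standard_value = max(value(lambda a_prime, a=a: a) for a in (0, 1))

# Super-policy: any map from the observed past action A' to a new action.
# This class contains both the constant policies and the behavior policy.
super_maps = [dict(zip((0, 1), acts)) for acts in product((0, 1), repeat=2)]
super_value = max(value(lambda a_prime, m=m: m[a_prime]) for m in super_maps)

print(f"behavior={behavior_value:.2f} "
      f"standard={standard_value:.2f} super={super_value:.2f}")
```

Under these assumed numbers the past agent is informative but usually wrong, so the best super-policy does the opposite of the observed action and attains a value of 0.8, against 0.5 for the best standard policy and 0.2 for the behavior policy. Because the super-policy class contains the other two as special cases, its optimum can never be lower than either of them in this toy setting.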
Related papers
- Combining AI Control Systems and Human Decision Support via Robustness and Criticality [53.10194953873209]
We extend a methodology for adversarial explanations (AE) to state-of-the-art reinforcement learning frameworks.
We show that the learned AI control system demonstrates robustness against adversarial tampering.
In a training / learning framework, this technology can improve both the AI's decisions and explanations through human interaction.
arXiv Detail & Related papers (2024-07-03T15:38:57Z)
- Attaining Human's Desirable Outcomes in Human-AI Interaction via Structural Causal Games [34.34801907296059]
In human-AI interaction, a prominent goal is to attain the human's desirable outcome with the assistance of AI agents.
We employ a theoretical framework called structural causal game (SCG) to formalize the human-AI interactive process.
We introduce a strategy referred to as pre-policy intervention on the SCG to steer AI agents towards attaining the human's desirable outcome.
arXiv Detail & Related papers (2024-05-26T14:42:49Z)
- Human-AI Safety: A Descendant of Generative AI and Control Systems Safety [6.100304850888953]
We argue that meaningful safety assurances for advanced AI technologies require reasoning about how the feedback loop formed by AI outputs and human behavior may drive the interaction towards different outcomes.
We propose a concrete technical roadmap towards next-generation human-centered AI safety.
arXiv Detail & Related papers (2024-05-16T03:52:00Z)
- Optimising Human-AI Collaboration by Learning Convincing Explanations [62.81395661556852]
We propose a method for a collaborative system that remains safe by having a human make the decisions.
Ardent enables efficient and effective decision-making by adapting to individual preferences for explanations.
arXiv Detail & Related papers (2023-11-13T16:00:16Z)
- Exploration with Principles for Diverse AI Supervision [88.61687950039662]
Training large transformers using next-token prediction has given rise to groundbreaking advancements in AI.
While this generative AI approach has produced impressive results, it heavily leans on human supervision.
This strong reliance on human oversight poses a significant hurdle to the advancement of AI innovation.
We propose a novel paradigm termed Exploratory AI (EAI) aimed at autonomously generating high-quality training data.
arXiv Detail & Related papers (2023-10-13T07:03:39Z)
- Learning to Make Adherence-Aware Advice [8.419688203654948]
This paper presents a sequential decision-making model that takes into account the human's adherence level.
We provide learning algorithms that learn the optimal advice policy and make advice only at critical time stamps.
arXiv Detail & Related papers (2023-10-01T23:15:55Z)
- From DDMs to DNNs: Using process data and models of decision-making to improve human-AI interactions [1.1510009152620668]
We argue that artificial intelligence (AI) research would benefit from a stronger focus on insights about how decisions emerge over time.
First, we introduce a well-established computational framework that assumes decisions emerge from the noisy accumulation of evidence.
Next, we discuss to what extent current approaches in multi-agent AI do or do not incorporate process data and models of decision making.
arXiv Detail & Related papers (2023-08-29T11:27:22Z)
- Fairness in AI and Its Long-Term Implications on Society [68.8204255655161]
We take a closer look at AI fairness and analyze how lack of AI fairness can lead to deepening of biases over time.
We discuss how biased models can lead to more negative real-world outcomes for certain groups.
If the issues persist, they could be reinforced by interactions with other risks and have severe implications on society in the form of social unrest.
arXiv Detail & Related papers (2023-04-16T11:22:59Z)
- Human-Centric Multimodal Machine Learning: Recent Advances and Testbed on AI-based Recruitment [66.91538273487379]
There is a certain consensus about the need to develop AI applications with a Human-Centric approach.
Human-Centric Machine Learning needs to be developed based on four main requirements: (i) utility and social good; (ii) privacy and data ownership; (iii) transparency and accountability; and (iv) fairness in AI-driven decision-making processes.
We study how current multimodal algorithms based on heterogeneous sources of information are affected by sensitive elements and inner biases in the data.
arXiv Detail & Related papers (2023-02-13T16:44:44Z)
- Learning Complementary Policies for Human-AI Teams [22.13683008398939]
We propose a framework for a novel form of human-AI collaboration for selecting an advantageous course of action.
Our solution aims to exploit the human-AI complementarity to maximize decision rewards.
arXiv Detail & Related papers (2023-02-06T17:22:18Z)
- Flexible Attention-Based Multi-Policy Fusion for Efficient Deep Reinforcement Learning [78.31888150539258]
Reinforcement learning (RL) agents have long sought to approach the efficiency of human learning.
Prior studies in RL have incorporated external knowledge policies to help agents improve sample efficiency.
We present Knowledge-Grounded RL (KGRL), an RL paradigm fusing multiple knowledge policies and aiming for human-like efficiency and flexibility.
arXiv Detail & Related papers (2022-10-07T17:56:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.