When to Show a Suggestion? Integrating Human Feedback in AI-Assisted Programming
- URL: http://arxiv.org/abs/2306.04930v3
- Date: Mon, 22 Apr 2024 04:12:44 GMT
- Title: When to Show a Suggestion? Integrating Human Feedback in AI-Assisted Programming
- Authors: Hussein Mozannar, Gagan Bansal, Adam Fourney, Eric Horvitz
- Abstract summary: We pursue mechanisms for leveraging signals about programmers' acceptance and rejection of code suggestions to guide recommendations.
We introduce a utility-theoretic framework to drive decisions about suggestions to display versus withhold.
Conditional suggestion display from human feedback (CDHF) relies on a cascade of models that provide the likelihood that recommended code will be accepted.
- Score: 28.254978977288868
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: AI-powered code-recommendation systems, such as Copilot and CodeWhisperer, provide code suggestions inside a programmer's environment (e.g., an IDE) with the aim of improving productivity. We pursue mechanisms for leveraging signals about programmers' acceptance and rejection of code suggestions to guide recommendations. We harness data drawn from interactions with GitHub Copilot, a system used by millions of programmers, to develop interventions that can save time for programmers. We introduce a utility-theoretic framework to drive decisions about suggestions to display versus withhold. The approach, conditional suggestion display from human feedback (CDHF), relies on a cascade of models that provide the likelihood that recommended code will be accepted. These likelihoods are used to selectively hide suggestions, reducing both latency and programmer verification time. Using data from 535 programmers, we perform a retrospective evaluation of CDHF and show that we can avoid displaying a significant fraction of suggestions that would have been rejected. We further demonstrate the importance of incorporating the programmer's latent unobserved state in decisions about when to display suggestions through an ablation study. Finally, we showcase how using suggestion acceptance as a reward signal for guiding the display of suggestions can lead to suggestions of reduced quality, indicating an unexpected pitfall.
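The display decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-stage split (a context-only model that can skip generation, then a model that also sees the suggestion), the names `DisplayPolicy` and `decide`, and all numeric utilities are hypothetical assumptions chosen for illustration. The sketch shows a suggestion only when its expected time saved outweighs its expected verification cost.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Context = Dict[str, float]  # hypothetical feature dictionary


@dataclass
class DisplayPolicy:
    # Hypothetical utilities in seconds; these are NOT values from the paper.
    time_saved_if_accepted: float = 10.0
    time_lost_if_rejected: float = 2.0
    # Hypothetical stage-1 cutoff below which generation is skipped entirely.
    stage1_skip_below: float = 0.05

    def threshold(self) -> float:
        # Show when p * gain - (1 - p) * loss > 0, i.e. p > loss / (gain + loss).
        gain, loss = self.time_saved_if_accepted, self.time_lost_if_rejected
        return loss / (gain + loss)


def decide(
    context: Context,
    stage1: Callable[[Context], float],
    generate: Callable[[Context], str],
    stage2: Callable[[Context, str], float],
    policy: DisplayPolicy,
) -> Optional[str]:
    """Return the suggestion to display, or None to withhold it."""
    # Stage 1: cheap acceptance estimate from context alone.
    # A very low estimate skips generation, which avoids its latency.
    if stage1(context) < policy.stage1_skip_below:
        return None
    suggestion = generate(context)
    # Stage 2: acceptance estimate that also sees the suggestion.
    p_accept = stage2(context, suggestion)
    # Display only if the expected utility of showing is positive.
    return suggestion if p_accept > policy.threshold() else None
```

With the assumed utilities, the break-even acceptance probability is 2 / (10 + 2), roughly 0.17; in practice the acceptance models would be trained on logged accept and reject signals, and the utilities would need to be estimated rather than fixed by hand.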
Related papers
- Beyond Thumbs Up/Down: Untangling Challenges of Fine-Grained Feedback for Text-to-Image Generation [67.88747330066049]
Fine-grained feedback captures nuanced distinctions in image quality and prompt-alignment.
We show that demonstrating its superiority to coarse-grained feedback is not automatic.
We identify key challenges in eliciting and utilizing fine-grained feedback.
arXiv Detail & Related papers (2024-06-24T17:19:34Z)
- De-fine: Decomposing and Refining Visual Programs with Auto-Feedback [75.62712247421146]
De-fine is a training-free framework that decomposes complex tasks into simpler subtasks and refines programs through auto-feedback.
Our experiments across various visual tasks show that De-fine creates more robust programs.
arXiv Detail & Related papers (2023-11-21T06:24:09Z)
- MISSRec: Pre-training and Transferring Multi-modal Interest-aware Sequence Representation for Recommendation [61.45986275328629]
We propose MISSRec, a multi-modal pre-training and transfer learning framework for sequential recommendation.
On the user side, we design a Transformer-based encoder-decoder model, where the contextual encoder learns to capture the sequence-level multi-modal user interests.
On the candidate item side, we adopt a dynamic fusion module to produce user-adaptive item representation.
arXiv Detail & Related papers (2023-08-22T04:06:56Z)
- Reading Between the Lines: Modeling User Behavior and Costs in AI-Assisted Programming [28.254978977288868]
We studied GitHub Copilot, a code-recommendation system used by millions of programmers daily.
We developed CUPS, a taxonomy of common programmer activities when interacting with Copilot.
Our insights reveal how programmers interact with Copilot and motivate new interface designs and metrics.
arXiv Detail & Related papers (2022-10-25T20:01:15Z)
- Breaking Feedback Loops in Recommender Systems with Causal Inference [99.22185950608838]
Recent work has shown that feedback loops may compromise recommendation quality and homogenize user behavior.
We propose the Causal Adjustment for Feedback Loops (CAFL), an algorithm that provably breaks feedback loops using causal inference.
We show that CAFL improves recommendation quality when compared to prior correction methods.
arXiv Detail & Related papers (2022-07-04T17:58:39Z)
- ELIXIR: Learning from User Feedback on Explanations to Improve Recommender Models [26.11434743591804]
We devise a human-in-the-loop framework, called ELIXIR, where user feedback on explanations is leveraged for pairwise learning of user preferences.
ELIXIR leverages feedback on pairs of recommendations and explanations to learn user-specific latent preference vectors.
Our framework is instantiated using generalized graph recommendation via Random Walk with Restart.
arXiv Detail & Related papers (2021-02-15T13:43:49Z)
- Adversarial Counterfactual Learning and Evaluation for Recommender System [33.44276155380476]
We show in theory that applying supervised learning to detect user preferences may end up with inconsistent results in the absence of exposure information.
We propose a principled solution by introducing a minimax empirical risk formulation.
arXiv Detail & Related papers (2020-11-08T00:40:51Z)
- Self-Supervised Reinforcement Learning for Recommender Systems [77.38665506495553]
We propose self-supervised reinforcement learning for sequential recommendation tasks.
Our approach augments standard recommendation models with two output layers: one for self-supervised learning and the other for RL.
Based on this approach, we propose two frameworks, namely Self-Supervised Q-learning (SQN) and Self-Supervised Actor-Critic (SAC).
arXiv Detail & Related papers (2020-06-10T11:18:57Z)
- Fairness-Aware Explainable Recommendation over Knowledge Graphs [73.81994676695346]
We analyze different groups of users according to their level of activity, and find that bias exists in recommendation performance between different groups.
We show that inactive users may be more susceptible to receiving unsatisfactory recommendations, due to insufficient training data for the inactive users.
We propose a fairness constrained approach via re-ranking to mitigate this problem in the context of explainable recommendation over knowledge graphs.
arXiv Detail & Related papers (2020-06-03T05:04:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.