Interpretable Directed Diversity: Leveraging Model Explanations for Iterative Crowd Ideation
- URL: http://arxiv.org/abs/2109.10149v1
- Date: Tue, 21 Sep 2021 13:01:05 GMT
- Title: Interpretable Directed Diversity: Leveraging Model Explanations for Iterative Crowd Ideation
- Authors: Yunlong Wang, Priyadarshini Venkatesh, Brian Y. Lim
- Abstract summary: We propose Interpretable Directed Diversity to automatically predict ideation quality and diversity scores.
These explanations provide multi-faceted feedback as users iteratively improve their ideation.
Users appreciated that explanation feedback helped focus their efforts and provided directions for improvement.
- Score: 7.341493082311333
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Feedback can help crowdworkers to improve their ideations. However, current
feedback methods require human assessment from facilitators or peers. This is
not scalable to large crowds. We propose Interpretable Directed Diversity to
automatically predict ideation quality and diversity scores, and provide AI
explanations - Attribution, Contrastive Attribution, and Counterfactual
Suggestions - for deeper feedback on why ideations were scored (low), and how
to get higher scores. These explanations provide multi-faceted feedback as
users iteratively improve their ideation. We conducted think-aloud and
controlled user studies to understand how various explanations are used, and
evaluated whether explanations improve ideation diversity and quality. Users
appreciated that explanation feedback helped focus their efforts and provided
directions for improvement. This resulted in explanations improving diversity
compared to no feedback or feedback with predictions only. Hence, our approach
opens opportunities for explainable AI towards scalable and rich feedback for
iterative crowd ideation.
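As a rough illustration of the pipeline the abstract describes (an automatic diversity score plus Attribution feedback on individual words), here is a minimal sketch under assumed details: a placeholder hashing-based embedder stands in for the paper's learned encoder, diversity is taken as mean cosine distance from prior ideations, and attribution is computed by leave-one-out occlusion. This does not reproduce the authors' models or prompts.
```python
import zlib

import numpy as np


def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy bag-of-words embedding; a real system would use a trained sentence encoder."""
    vec = np.zeros(dim)
    for word in text.lower().split():
        # Deterministic per-word random vector (placeholder for learned embeddings).
        rng = np.random.default_rng(zlib.crc32(word.encode("utf-8")))
        vec += rng.standard_normal(dim)
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def diversity_score(candidate: str, prior_ideas: list[str]) -> float:
    """Mean cosine distance from prior ideations; higher means more diverse."""
    c = embed(candidate)
    return float(np.mean([1.0 - c @ embed(p) for p in prior_ideas]))


def word_attributions(candidate: str, prior_ideas: list[str]) -> dict[str, float]:
    """Attribution via leave-one-out occlusion: drop in score when a word is removed."""
    base = diversity_score(candidate, prior_ideas)
    words = candidate.split()
    attributions = {}
    for i, word in enumerate(words):
        ablated = " ".join(words[:i] + words[i + 1:])
        attributions[word] = base - diversity_score(ablated, prior_ideas)
    return attributions


if __name__ == "__main__":
    prior = ["remind users to drink water", "send a daily hydration alert"]
    idea = "gamify walking breaks with small team challenges"
    print(f"diversity = {diversity_score(idea, prior):.3f}")
    for word, contribution in sorted(
        word_attributions(idea, prior).items(), key=lambda kv: kv[1], reverse=True
    ):
        print(f"{word:12s} {contribution:+.3f}")
```
Contrastive Attribution and Counterfactual Suggestions could be layered on the same scorer, for example by comparing a low-scoring ideation's attributions against a higher-scoring reference, or by searching for word substitutions that raise the predicted score.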
Related papers
- Beyond Thumbs Up/Down: Untangling Challenges of Fine-Grained Feedback for Text-to-Image Generation [67.88747330066049]
Fine-grained feedback captures nuanced distinctions in image quality and prompt-alignment.
We show that demonstrating its superiority over coarse-grained feedback is not automatic.
We identify key challenges in eliciting and utilizing fine-grained feedback.
arXiv Detail & Related papers (2024-06-24T17:19:34Z)
- Aligning Large Language Models from Self-Reference AI Feedback with one General Principle [61.105703857868775]
We propose a self-reference-based AI feedback framework that enables a 13B Llama2-Chat to provide high-quality feedback.
Specifically, we allow the AI to first respond to the user's instructions, then generate criticism of other answers based on its own response as a reference.
Finally, we determine which answer better fits human preferences according to the criticism.
arXiv Detail & Related papers (2024-06-17T03:51:46Z)
- Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs [57.16442740983528]
In ad-hoc retrieval, evaluation relies heavily on user actions, including implicit feedback.
The role of user feedback in annotators' assessment of turns in a conversational setting has been little studied.
We focus on how the evaluation of task-oriented dialogue systems (TDSs) is affected by considering user feedback, explicit or implicit, as provided through the follow-up utterance of a turn being evaluated.
arXiv Detail & Related papers (2024-04-19T16:45:50Z)
- Evaluating the Utility of Model Explanations for Model Development [54.23538543168767]
We evaluate whether explanations can improve human decision-making in practical scenarios of machine learning model development.
To our surprise, we did not find evidence of significant improvement on tasks when users were provided with any of the saliency maps.
These findings suggest caution about the usefulness of saliency-based explanations and their potential to be misunderstood.
arXiv Detail & Related papers (2023-12-10T23:13:23Z)
- Counterfactuals of Counterfactuals: a back-translation-inspired approach to analyse counterfactual editors [3.4253416336476246]
We focus on the analysis of counterfactual, contrastive explanations.
We propose a new back-translation-inspired evaluation methodology.
We show that by iteratively feeding the counterfactual to the explainer we can obtain valuable insights into the behaviour of both the predictor and the explainer models.
arXiv Detail & Related papers (2023-05-26T16:04:28Z)
- Continually Improving Extractive QA via Human Feedback [59.49549491725224]
We study continually improving an extractive question answering (QA) system via human user feedback.
We conduct experiments involving thousands of user interactions under diverse setups to broaden the understanding of learning from feedback over time.
arXiv Detail & Related papers (2023-05-21T14:35:32Z)
- Helpful, Misleading or Confusing: How Humans Perceive Fundamental Building Blocks of Artificial Intelligence Explanations [11.667611038005552]
We take a step back from sophisticated predictive algorithms and look into explainability of simple decision-making models.
We aim to assess how people perceive the comprehensibility of their different representations.
This allows us to capture how diverse stakeholders judge intelligibility of fundamental concepts that more elaborate artificial intelligence explanations are built from.
arXiv Detail & Related papers (2023-03-02T03:15:35Z)
- Selective Explanations: Leveraging Human Input to Align Explainable AI [40.33998268146951]
We propose a general framework for generating selective explanations by leveraging human input on a small sample.
As a showcase, we use a decision-support task to explore selective explanations based on what the decision-maker would consider relevant to the decision task.
Our experiments demonstrate the promise of selective explanations in reducing over-reliance on AI.
arXiv Detail & Related papers (2023-01-23T19:00:02Z)
- Human Evaluation of Spoken vs. Visual Explanations for Open-Domain QA [22.76153284711981]
We study whether explanations help users correctly decide when to accept or reject an ODQA system's answer.
Our results show that explanations derived from retrieved evidence passages can outperform strong baselines (calibrated confidence) across modalities.
We show common failure cases of current explanations, emphasize end-to-end evaluation of explanations, and caution against evaluating them in proxy modalities that are different from deployment.
arXiv Detail & Related papers (2020-12-30T08:19:02Z)
- Evaluating Explanations: How much do explanations from the teacher aid students? [103.05037537415811]
We formalize the value of explanations using a student-teacher paradigm that measures the extent to which explanations improve student models in learning.
Unlike many prior proposals to evaluate explanations, our approach cannot be easily gamed, enabling principled, scalable, and automatic evaluation of attributions.
arXiv Detail & Related papers (2020-12-01T23:40:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.