Can counterfactual explanations of AI systems’ predictions skew lay users’ causal intuitions about the world? If so, can we correct for that?

Marko Tešić, Ulrike Hahn
Birkbeck, University of London
{m.tesic, u.hahn}@bbk.ac.uk
arXiv:2205.06241v1 [cs.AI] 12 May 2022
Abstract
Counterfactual (CF) explanations have been employed as one of the modes of explainability in explainable AI—both to increase the transparency of AI systems and to provide recourse. Cognitive science and psychology, however, have pointed out that people regularly use CFs to express causal relationships. Most AI systems are only able to capture associations or correlations in data, so interpreting them as causal would not be justified. In this paper, we present two experiments (total N = 364) exploring the effects of CF explanations of AI systems’ predictions on lay people’s causal beliefs about the real world. In Experiment 1 we found that providing CF explanations of an AI system’s predictions does indeed (unjustifiably) affect people’s causal beliefs regarding factors/features the AI uses and that people are more likely to view them as causal factors in the real world. Inspired by the literature on misinformation and health warning messaging, Experiment 2 tested whether we can correct for the unjustified change in causal beliefs. We found that pointing out that AI systems capture correlations and not necessarily causal relationships can attenuate the effects of CF explanations on people’s causal beliefs.
1 Introduction
Interest in automatically-generated explanations for predictive AI systems has grown considerably in recent years [DARPA, 2016; Doshi-Velez and Kim, 2017; Gunning and Aha, 2019; Montavon et al., 2018; Rieger et al., 2018; Samek et al., 2017].
It is argued that explanations provide transparency for what are often black-box procedures, and transparency is viewed as critical for fostering the acceptance of AI systems in real-world practice [Bansal et al., 2014; Chen et al., 2014; Fallon and Blaha, 2018; Hayes and Shah, 2017; Mercado et al., 2016; Wachter et al., 2017].
Explainable AI (XAI) has emerged as a field to address this need for AI systems’ predictions to be followed by explanations of these predictions.
These include approaches such as feature attribution methods [Lundberg and Lee, 2017; Ribeiro et al., 2016], saliency maps [Simonyan et al., 2013], example-based methods [Koh and Liang, 2017], and counterfactuals [Karimi et al., 2020a; Poyiadzi et al., 2020; Wachter et al., 2017].
In this paper we focus on counterfactual (CF) explanations of specific predictions of AI systems.
CFs address questions such as ‘Why A rather than B?’; for example, ‘Why did the AI system deny the loan rather than approve it?’.
CF explanations not only provide us with an insight into why an AI system made a certain prediction (‘deny the loan’), but also what a user can do in order to flip the prediction (‘offer a loan’).
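To make this concrete, the following minimal R sketch (with entirely hypothetical data, model, and variable names, not the procedure of any particular system) shows how a CF explanation can be read off a simple predictive model by searching for a change to the input that flips the prediction:

# Toy loan model on simulated data; all names and numbers are hypothetical.
set.seed(1)
n <- 500
income  <- rnorm(n, 30, 8)                                  # in £k
debt    <- rnorm(n, 10, 4)
approve <- rbinom(n, 1, plogis(0.3 * income - 0.5 * debt - 2))
fit <- glm(approve ~ income + debt, family = binomial)

applicant <- data.frame(income = 22, debt = 14)
predict(fit, newdata = applicant, type = "response")        # low probability: 'deny the loan'

# A CF explanation reports a change that flips the model's prediction,
# e.g. 'if income had been 35, the loan would have been approved':
counterfactual <- within(applicant, income <- 35)
predict(fit, newdata = counterfactual, type = "response")   # high probability: 'approve'

Note that nothing in this procedure requires the model to have captured a causal relationship between income and loan decisions; the CF is a statement about the model, not about the world.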
For example, taking painkillers can have side effects such as fatigue.
People would judge a runner who sprained an ankle and took painkiller A, which has fatigue as one of its side effects, to have caused poor performance and loss of the race when they are aware of an alternative painkiller B without side effects.
However, when painkiller B also leads to side effects, people judge painkiller A to have less causal impact on the race outcome: even if the runner had taken B, she still would have had side effects [McCloy and Byrne, 2002].
AI systems are typically predictive in nature and capture associations and correlations in data, not the causal processes that generated the data.
More specifically, in most applications of AI systems we use data X and Y to estimate a function f, which in turn is used to generate predictions Ŷ for new instances.
No underlying theoretical causal model for function f is assumed.
Moreover, f is not expected to adequately capture the underlying (causal) processes or real-world mechanisms that generated the data used for training and estimation.
It is thus entirely possible that explanations for predictions Ŷ that comprise (changes in) features X have no clear causal connection (when, for example, X contains heavily engineered features) or have an anti-causal relationship, where Y is a cause of some X. Furthermore, due to regularization it is possible that some of the really causal X are left out or their impact on estimating Ŷ is reduced [Del Giudice, 2021].
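As a minimal illustration of the anti-causal case (a sketch only, with hypothetical data and variable names loosely mirroring the salary scenario used in the experiments below), a feature that is an effect of the outcome can nonetheless be an excellent predictor, so a CF built on it is not causal advice:

# Hypothetical simulation: 'penthouse' is an effect of salary, not a cause.
set.seed(2)
n <- 1000
education <- rbinom(n, 1, 0.5)                          # a genuine cause of salary
salary    <- 20 + 15 * education + rnorm(n, 0, 5)       # in £k
penthouse <- rbinom(n, 1, plogis(0.4 * (salary - 30)))  # an effect of salary

high_salary <- as.numeric(salary >= 30)
fit <- glm(high_salary ~ education + penthouse, family = binomial)
summary(fit)  # penthouse comes out as a strong predictor of high_salary

# The CF 'if Tom had rented a penthouse, the model would have predicted >= £30k'
# can be true of the model while renting a penthouse does nothing to salary.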
One should then be careful when using AI systems and explanations of their predictions not to misinterpret AI systems in a causal manner and to be wary of their limits [Dillon et al., 2021; Molnar et al., 2020].
This, however, may be easier said than done, particularly in the case of CF explanations of AI systems’ predictions.
If people naturally associate CFs with causal reasoning, they may be especially prone to slipping into causal interpretations of AI systems’ results when they are presented with CF explanations.
The first of our two experiments investigated whether lay people are more likely to form causal beliefs about the factors/features AI systems are using when the systems’ predictions are accompanied by CF explanations. The second experiment explored a possible means to prevent lay people from forming inadvertent causal beliefs due to CF explanations.
2 Experiment 1
The aim of this experiment was to explore lay people’s causal beliefs after having received a prediction made by an AI system, which is then supplemented with a CF explanation.
More specifically, we hypothesise that people will erroneously hold beliefs that the features an AI has used to make predictions are more causal when a CF explanation of the AI system’s prediction is provided, compared to when the prediction of an AI system is presented without a CF explanation, and compared to a baseline (where no AI system or its predictions are mentioned).
As AI systems are predictive in nature, one might argue that the above hypothesised effect may be due to lay people conflating the prediction/predictive power of AI systems with causation.
The second hypothesis is aimed at testing this possibility.
More specifically, we hypothesize that knowing an AI system is using certain feature A to predict label B and knowing what the prediction is will change people’s expectation as to how good a predictor feature A is with respect to B, compared to the baseline.
Crucially, however, we hypothesize that additionally knowing an explanation for that prediction will not further change people’s expectation as to how good a predictor A is.
This finding would imply that any change in causal beliefs would be due to the presence of a CF explanation of AI predictions and cannot be accounted for by a change in expectation of how good the features the AI system uses are in predicting the label.
All participants were native English speakers residing in the UK or Ireland whose approval ratings were 95% or higher.
They gave informed consent and were paid at a rate of £6.24 an hour for partaking in the study, which took on average 8.1 min to complete.
Design Participants were randomly assigned to one of three between-participant groups: the Control/baseline group, where participants were asked about their intuitions regarding how certain factors/features influence salary without mentioning AI systems or explanations of AI systems (N = 30); the AI Prediction group, where participants were told about the AI system and the features it uses as well as what the prediction is (N = 31); and the AI Explanation group, where they were told about the AI system, the features, and what the prediction is, and received a CF explanation of the prediction for each feature (N = 32).
The experiment had three dependent variables: Expectation, Confidence, and Action.
The Expectation dependent variable measured people’s beliefs regarding how well the features predict the label. The Confidence dependent variable measured how confident participants were in their expectation estimates.
The main reason for including the Confidence dependent variable was to disentangle people’s beliefs about the predictive power of the features from their confidence in how predictive they believe the features are. Lastly, the Action dependent variable measured people’s causal beliefs about the real world in terms of their willingness to act or recommend a certain action to be done in the real world.
All participants provided answers for each dependent variable.
Materials To test the hypotheses we used salary as a domain; it is reasonable to expect that most participants will have some familiarity regarding factors/features affecting salary and that they would already have developed certain intuitions about these factors.
We chose 9 factors/features that are to various extents intuitively related to higher/lower salary.
These were: education level, the sector the employee works in (private or public), the number of hours of sleep, whether or not the employee owns a smart watch, whether or not the employee owns an office plant, whether or not the employee gets expensive haircuts, whether or not the employee wears expensive clothes, whether or not the employee goes skiing multiple times a year, and whether or not the employee rents a penthouse apartment.
We aimed to have a range of factors/features whereby some are intuitively causing higher/lower salary (e.g. education level, sector), some are intuitively not relevant to salary (e.g. office plant, smart watch), and some are potential consequences or effects of higher salary rather than causes of higher salary (e.g. expensive clothes, expensive haircuts, renting penthouse apartments).
Namely, we hoped that for some factors/features, such as education level, both Expectation estimates and Action estimates would be high (i.e. education level is a good predictor of salary, and to increase their salary one might consider getting a higher degree); some factors/features would have both Expectation and Action estimates very low (e.g. whether or not someone has an office plant does not seem to be related to salary, and buying an office plant to increase salary would seem like a futile endeavour); lastly, some factors/features such as expensive clothes and renting a penthouse apartment would have higher Expectation estimates but lower Action estimates (that someone is renting a penthouse apartment may be an indicator that they have a high salary, but one would not presumably rent a penthouse apartment because they believe that would increase their salary).
The features/factors were not chosen from a specific data set, but were devised for the purposes of the experiment.
All collected participant data and materials from both experiments are available via OSF.
Procedure After providing informed consent and basic demographic information, participants were shown a welcome message.
Participants then answered two preliminary questions: ‘How familiar are you with the factors that may affect salary?’ and ‘How familiar are you with AI technology, e.g. AI systems that are able to make predictions?’.
Answers to both questions were on a 7-point Likert scale from ‘1 - Not at all familiar’ to ‘7 - Extremely familiar’.
The main motivation for including these questions was to check whether any differences among the three groups in the subsequent Expectation or Action estimates were due to differences in familiarity with the domain (salary) or familiarity with AI technology.
Following these two preliminary questions, participants saw a preamble for the specific group they were assigned to, i.e. Control, AI Prediction, AI Explanation (square brackets indicate which text was presented to which group):
Your good friend Tom is looking to increase his salary.
He’s asked you for advice on how to best achieve that.
[all three groups] There are a range of factors that are related to a higher salary.
You will now consider some of these factors.
[only the Control group] In your search for ways to help your friend you have found an AI system that can predict whether people’s yearly salaries are higher than/equal to £30k (≥ £30k) or lower than £30k (< £30k).
[AI Prediction and AI Explanation groups] The AI system now provides you with explanations with respect to each factor as to why it predicts that Tom’s salary is lower than £30k (< £30k).
The cutoff £30k was used as that figure was close to the median salary in the UK in 2020.
After participants read the preamble for the group they were assigned to, they proceeded to answer the three questions (Expectation, Confidence, Action) regarding 9 factors.
The order of factors/features was randomized for each participant.
Before answering the three questions, participants in the AI Prediction and AI Explanation groups were reminded of the AI system’s prediction and in the AI Explanation group people were additionally told the CF explanation for that factor.
[AI Prediction and AI Explanation groups] Factor: Education level [all three groups] Explanation: If Tom had had an advanced degree (e.g. masters), the AI system would have predicted that his salary was higher than/equal to £30k (≥ £30k).
Q. Would you expect that employees who have an advanced degree (e.g. masters) also have a higher salary?
[Expectation question, same for all three groups] Please rate your answer from 0 (No, not at all) to 100 (Yes, absolutely).
Q. How confident are you in your response?
[Confidence question, same for all three groups] Q. Assuming Tom has the resources (time, money, etc.), would you recommend he starts an advanced degree (e.g. masters) with the hope of increasing his salary?
[Action question, same for all three groups] Please rate your answer from 0 (not at all) to 100 (totally).
Participants’ responses to the three questions were elicited using a slider from 0 to 100 with 1 point increments.
The three questions followed the same format for all other factors.
The Action questions sometimes had a short caveat (‘Assuming Tom has the resources . . . ’), as shown above, to guard against participants drifting into a cost-benefit analysis, which could deter them from providing causal estimates regarding the factor in question.
The format of the CF explanations was the same for each factor, namely ‘If Tom had [had/worked/owned etc. the factor/feature], the AI system would have predicted that his salary was higher than/equal to £30k (≥ £30k)’.
Given that this formulation of the CF explanation implies a positive impact of the factor/feature on salary, we expected that participants’ Action estimates in the AI Explanation group would be higher than in the other two groups.
Lastly, participants received debriefing information and were invited to provide feedback.
2.2 Results
Familiarity with the factors affecting salary and AI systems We first analyzed the participants’ estimates regarding how familiar they are with factors affecting salary.
We performed a one-way ANOVA for each familiarity category (i.e. salary and AI systems) with group (Control, AI Prediction, AI Explanation) as a three-level independent variable.
We found no significant effect of group on either familiarity with factors affecting salary, F (2, 90) = 0.66, p = .52, or familiarity with AI systems, F (2, 90) = 1.96, p = .15.
Mean familiarity ratings indicated that participants were more familiar with factors affecting salary (M = 3.9) than AI systems (M = 2.8), which is expected.
These results suggest that any potential significant differences between the groups in the further analyses cannot be accounted for by the participants’ familiarity with the domain (salary) or AI systems.
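For concreteness, these familiarity checks correspond to one-way ANOVAs of the following form in R (a sketch only; the data frame and column names are hypothetical):

# 'dat' is assumed to have one row per participant, with columns 'group'
# (Control / AI Prediction / AI Explanation), 'fam_salary', and 'fam_ai'.
summary(aov(fam_salary ~ group, data = dat))  # familiarity with factors affecting salary
summary(aov(fam_ai ~ group, data = dat))      # familiarity with AI systems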
Main analyses Participants’ estimates for each dependent variable are shown in Figure 1.
To test the effect of group on each dependent variable we initially built three linear mixed-effects models with a random intercept for each participant.
However, as the distributions of participants’ estimates were highly skewed (especially for the Expectation and Action dependent variables) and as the residuals of the linear mixed-effects models were clearly non-normally distributed (see Appendix B), to test for the overall effect of group we resorted to non-parametric tests.
Participants’ estimates were significantly affected by the group they were assigned to (Control, AI Prediction, AI Explanation): Expectation estimates, H(2) = 11.9, p = .003; Confidence estimates, H(2) = 16.1, p < .001; and Action estimates, H(2) = 27, p < .001.
We performed post-hoc pairwise comparisons between the three groups using Wilcoxon Rank Sum test (for more details see Table 1 in Appendix A).
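A minimal sketch of these non-parametric tests in R, assuming a long-format data frame with hypothetical column names (one row per participant and factor):

# 'long' is assumed to have columns 'estimate' (0-100), 'group', and 'participant'.
kruskal.test(estimate ~ group, data = long)    # overall Kruskal-Wallis H test
pairwise.wilcox.test(long$estimate, long$group,
                     p.adjust.method = "BH")   # pairwise Wilcoxon rank-sum, FDR-corrected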
We found that participants’ Action estimates were not significantly different between the Control and AI Prediction groups (p = .74), but that there was a significant difference between AI Prediction and AI Explanation (p < .001) as well as between Control and AI Explanation (p < .001).
From Figure 1 we can also see that participants’ Action estimates were higher in AI Explanation group compared to the two other groups.
Figure 2 suggests that this effect held across the features/factors and not just overall.
These results provide support for our main hypothesis, namely that providing CF explanations would affect people’s beliefs about how causal the features are in the real world.
Post-hoc pairwise comparisons with respect to the participants’ Expectation estimates showed significant differences between Control and AI Prediction (p = .03) as well as Control and AI Explanation (p = .002); however, AI Prediction group’s and AI Explanation group’s estimates were not significantly different, p = .36.
These results support our second hypothesis: being aware that an AI system is using certain factors/features to make predictions and knowing what the prediction is affects people’s expectation as to how well these features/factors predict salary, but their expectations do not further change upon learning about the CF explanations of the features/factors.
This also implies that the results regarding participants’ Action estimates cannot be explained by the participants’ Expectation estimates, providing further support for the claim that CF explanations are affecting people’s causal beliefs.
Post-hoc pairwise comparisons on participants’ Confidence estimates showed significant differences between Control and both the AI Prediction (p = .02) and AI Explanation (p < .001) groups.
There was no significant difference between the AI Prediction and AI Explanation groups (p = .12).
Figure 1 shows a downward trend in estimates from the Control to the AI Explanation group.
We speculate that this might be because some of the features/factors the AI system uses are intuitively not relevant to salary, or they are effects rather than causes of higher/lower salary.
This may result in the reduction in people’s confidence in the AI system’s predictive accuracy.
It is important to note that participants’ Confidence estimates were clearly different from their Expectation estimates, suggesting that these two dependent variables were successfully disentangled in the experiment design.
3 Experiment 2
Experiment 1 suggested that providing lay users with CF explanations of AI systems’ predictions can (unjustifiably) affect their causal beliefs about the features/factors.
The aim of Experiment 2 was to explore whether we can correct the effects of CF explanations on people’s causal beliefs.
Inspired by the research on correcting misinformation [Irving et al., 2022] and the research on the impact of health warning messages [Hammond, 2011], we designed this experiment to explore whether providing participants with a note communicating that AI systems are capturing correlations in data rather than causal relationships might attenuate the effect of CFs on their causal beliefs.
We hypothesise that the AI Explanation group presented with the note will provide lower Action estimates than the AI Explanation group where the note was not present.
We do not have a specific hypothesis as to how introducing the note might affect participants’ Expectation, Confidence, or Action estimates in the other groups or how the AI Explanation groups’ Expectation and Confidence estimates might change due to the note.
The second aim of Experiment 2 is to provide a replication of Experiment 1 in groups that are not presented with the note.
Thus Experiment 2 will provide an additional test of the two hypotheses explored in Experiment 1.
3.1 Methods
Effect size calculations showed that the effect size of the Experiment 1 results was relatively small (η2 = .03), making Experiment 1 slightly underpowered.
Figure 1: Experiment 1 results for each group and each dependent variable.
Figure 2: Experiment 1 results for each factor/feature, each group, and each dependent variable.
For Experiment 2 we therefore increased the number of participants.
We aimed to have around 45 participants in each group.
Participants, Design, & Materials A total of 271 participants (Nfemale = 196, two participants identified as neither male nor female, Mage = 38.7, SD = 12.2) were recruited from Prolific Academic (www.prolific.ac).
All participants were native English speakers residing in the UK or Ireland whose approval ratings were 95% or higher.
They all gave informed consent and were paid at a rate of £6.24 an hour for partaking in the present study, which took on average 8.6 min to complete.
Participants were randomly assigned to one of 3 (Control, AI Prediction, or AI Explanation) × 2 (correction: No Note or Note) = 6 between-participant groups (Control & No Note N = 46, Control & Note N = 45, AI Prediction & No Note N = 44, AI Prediction & Note N = 46, AI Explanation & No Note N = 46, AI Explanation & Note N = 44).
Experiment 2 used the same three dependent variables as Experiment 1, namely Expectation, Confidence, and Action.
Materials in Experiment 2 were exactly the same as those in Experiment 1.
Procedure The Experiment 2 procedure was similar to the Experiment 1 procedure.
The only difference was that three of the six groups were additionally presented with a note regarding correlation, causation, and AI systems.
For groups with the note, that note was introduced in the preamble of each condition, presented on a separate page and participants were also reminded of the note before answering the questions related to the three dependent variables.
In the Control group, the note included general information about correlation and causation.
In the AI Prediction group the note read:
Important note
AI systems learn correlations in data.
Even though the factors the AI system uses are potentially correlated with higher salary, that does not mean that they are causing higher salary.
Here participants are told information regarding correlation and causation that is more relevant to the AI systems.
Specifically, they are told that AI systems capture relationships that are correlational and should not be interpreted as causal.
In the AI Explanation condition the note read:
Important note
AI systems learn correlations in data.
Even though the factors the AI system uses are potentially correlated with higher salary, that does not mean that they are causing higher salary.
Similarly, the explanations of the AI systems’ predictions are about the correlations the AI system has identified and not about which factors are actually causing higher salary.
In addition to being told that AI systems capture correlations, participants in this group were also told that the explanations of the AI system’s predictions are explanations of these correlations and not necessarily of causal relations.
3.2 Results
Familiarity with the factors affecting salary and AI systems Like in Experiment 1, we first analyzed the participants’ estimates regarding how familiar they are with factors affecting salary.
We found no significant effect of group (Control, AI Prediction, AI Explanation) on either familiarity with factors affecting salary, F (2, 265) = 0.99, p = .37, or familiarity with AI systems, F (2, 265) = 0.13, p = .88.
We found no significant effect of correction (No Note, Note) on either familiarity with factors affecting salary, F (1, 265) = 0.55, p = .46, or familiarity with AI systems, F (1, 265) = 0.06, p = .8.
Finally, we found no significant interaction effect between the two independent variables on either familiarity with factors affecting salary, F (2, 265) = 1.68, p = .19, or familiarity with AI systems, F (2, 265) = 1.19, p = .31.
Mean familiarity ratings indicated that participants were more familiar with factors affecting salary (M = 3.9) than AI systems (M = 2.9).
These results are very similar to those in Experiment 1.
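These familiarity checks correspond to two-way ANOVAs with an interaction term; a minimal R sketch, again with hypothetical data frame and column names:

# 'dat2' is assumed to have one row per participant, with columns 'group',
# 'correction' (No Note / Note), 'fam_salary', and 'fam_ai'.
summary(aov(fam_salary ~ group * correction, data = dat2))  # main effects and interaction
summary(aov(fam_ai ~ group * correction, data = dat2))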
Main analyses Participants’ estimates for each dependent variable are shown in Figure 3.
Similarly to Experiment 1, the distributions of participants’ estimates were skewed (especially for the Expectation and Action dependent variables) and the residuals of the linear mixed-effects models were clearly non-normally distributed (see Appendix B).
Participants’ estimates were significantly affected by group (Control & No Note, Control & Note, AI Prediction & No Note, AI Prediction & Note, AI Explanation & No Note, AI Explanation & Note): Expectation estimates, H(5) = 38.9, p < .001; Confidence estimates, H(5) = 33.8, p < .001; and Action estimates, H(5) = 67.7, p < .001.
From Figure 3 we can also see that participants’ Action estimates were significantly higher in AI Explanation & No Note group compared to the two other No Note groups.
Unlike in Experiment 1, the difference between the Control and AI Prediction conditions was also significant, p = .02 (for more details see Table 4 in Appendix A).
Pairwise comparisons for the dependent variable Expectation show significant differences only between Control and both the AI Prediction (p = .001) and AI Explanation (p < .001) groups.
No significant difference was found between AI Prediction and AI Explanation (p = .31).
This result provides support for our second hypothesis from Experiment 1 and suggests that even though there were significant differences between all three No Note groups on the Action dependent variable, the significant difference between AI Prediction and AI Explanation cannot be accounted for by differences in Expectation estimates.
Pairwise comparisons across all three dependent variables show that the only significant difference between the No Note and Note conditions was between AI Explanation & No Note and AI Explanation & Note for the Action dependent variable, p = .01 (see Tables 2, 3, and 4 in Appendix A for more details). Participants’ Action estimates in the AI Explanation & Note group were lower than those in the AI Explanation & No Note group and were not significantly different from those in the AI Prediction & No Note or AI Prediction & Note groups.
This implies that the effect of CF explanations on participants’ causal beliefs was attenuated and no different from that in groups where CF explanations of AI systems’ predictions were not shown.
Moreover, Figure 3 suggests that participants’ Action estimates in AI Explanation & Note were lower than those in AI Explanation & No Note for almost all features/factors.
Finally, participants’ Confidence estimates were again different from their Expectation estimates.
But, unlike in Experiment 1 where there was a downward trend in participants’ Confidence estimates, Experiment 2 found that AI Prediction groups’ estimates were lower than both Control groups’ and AI Explanation groups’ estimates, and that there was no significant difference between Control groups’ and AI Explanation groups’ estimates.
In Experiment 1 we speculated that Confidence estimates might be driven by some factors/features not being relevant to salary or in an anti-causal relationship to (i.e. effects of) salary.
However, the data from Experiment 2 do not seem to support this supposition.
4 Discussion
If one of the aims of explainable AI is to provide human users with information that will help them better understand how an AI system came to a prediction and how the system will behave in the future, then we need to communicate to that user as clearly as possible the predictive and associative (rather than the causal) nature of these systems so that the mental models humans create based on that information are more genuine and representative of the AI system’s nature.
Two experiments showed that participants’ causal estimates were significantly higher when they were presented with a CF explanation compared to both the baseline and when only the prediction was communicated.
We further found that this was not the case for people’s beliefs regarding how good the features/factors are in predicting salary and that there was no significant difference in expectation estimates between the group where only the prediction was presented and the group where both the prediction and a CF explanation were included.
This result suggests people’s expectation estimates cannot account for the differences in their causal beliefs and that these differences were in fact due to CF explanations alone.
We also found that one might be able to guard against the unwanted effect of CF explanations on causal beliefs.
Inspired by the work on misinformation and health warning messaging, we designed a note communicating to the participants the correlational rather than causal character of AI systems.
Figure 3: Experiment 2 results for each dependent variable.
Figure 4: Experiment 2 results for each factor/feature and each dependent variable.
In this study we have only briefly discussed the role of people’s confidence in their expectation estimates.
We found that confidence estimates are clearly different from the expectation ones.
However, we have not explored in further detail how confidence estimates may depend on whether people are just told about the AI system’s prediction or they are also told the CF explanation.
It may be interesting to explore how confidence estimates interact with people’s estimates of how accurate they believe the AI system is in predicting the label.
Finally, a wealth of research on explanation and explanatory goodness suggests that simpler explanations have a bigger impact on our (causal) beliefs [Lombrozo, 2007; Lagnado, 1994; Read and Marcus-Newhall, 1993; Thagard, 1978].
In our studies only one feature/factor was included in a CF explanation at a time, so our CF explanations were on the simpler side of the spectrum.
References
[Bansal et al., 2014] Aayush Bansal, Ali Farhadi, and Devi Parikh. Towards transparent systems: Semantic characterization of failure modes. In European Conference on Computer Vision, pages 366–381. Springer, 2014.
[Bates et al., 2014] Douglas Bates, Martin Mächler, Ben Bolker, and Steve Walker. Fitting linear mixed-effects models using lme4. arXiv preprint arXiv:1406.5823, 2014.
[Benjamini and Hochberg, 1995] Yoav Benjamini and Yosef Hochberg. Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Methodological), 57(1):289–300, 1995.
[Byrne, 2016] Ruth MJ Byrne. Counterfactual thought. Annual Review of Psychology, 67:135–157, 2016.
[Byrne, 2019] Ruth MJ Byrne. Counterfactuals in explainable artificial intelligence (XAI): Evidence from human reasoning. In IJCAI, pages 6276–6282, 2019.
[Chen et al., 2014] Jessie Y Chen, Katelyn Procci, Michael Boyce, Julia Wright, Andre Garcia, and Michael Barnes. Situation awareness-based agent transparency. Technical report, Army Research Lab, Aberdeen Proving Ground MD, Human Research and Engineering, 2014.
[DARPA, 2016] DARPA. Explainable artificial intelligence (XAI) program. 2016. Retrieved from https://www.darpa.mil/program/explainable-artificial-intelligence.
[Del Giudice, 2021] Marco Del Giudice. The prediction-explanation fallacy: A pervasive problem in scientific applications of machine learning. 2021.
[Dillon et al., 2021] Eleanor Dillon, Jacob LaRiviere, Scott Lundberg, Jonathan Roth, and Vasilis Syrgkanis. Be careful when interpreting predictive models in search of causal insights. Medium, 2021. Retrieved from https://medium.com/towards-data-science/be-careful-when-interpreting-predictive-models-in-search-of-causal-insights-e68626e664b6.
[Doshi-Velez and Kim, 2017] Finale Doshi-Velez and Been Kim. Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608, 2017.
[Fallon and Blaha, 2018] Corey K Fallon and Leslie M Blaha. Improving automation transparency: Addressing some of machine learning's unique challenges. In International Conference on Augmented Cognition, pages 245–254. Springer, 2018.
[Gunning and Aha, 2019] David Gunning and David W Aha. DARPA's explainable artificial intelligence (XAI) program. AI Magazine, 40(2):44–58, 2019.
[Hammond, 2011] David Hammond. Health warning messages on tobacco products: A review. Tobacco Control, 20(5):327–337, 2011.
[Hayes and Shah, 2017] Bradley Hayes and Julie A Shah. Improving robot controller transparency through autonomous policy explanation. In International Conference on Human-Robot Interaction (HRI), pages 303–312. IEEE, 2017.
[Irving et al., 2022] Dulcie Irving, Robbie WA Clark, Stephan Lewandowsky, and Peter J Allen. Correcting statistical misinformation about scientific findings in the media: Causation versus correlation. Journal of Experimental Psychology: Applied, 2022.
[Judd et al., 2017] Charles M. Judd, Jacob Westfall, and David A. Kenny. Experiments with more than one random factor: Designs, analytic models, and statistical power. Annual Review of Psychology, 68(1):601–625, 2017.
[Karimi et al., 2020a] Amir-Hossein Karimi, Gilles Barthe, Borja Balle, and Isabel Valera. Model-agnostic counterfactual explanations for consequential decisions. In International Conference on Artificial Intelligence and Statistics, pages 895–905. PMLR, 2020.
[Karimi et al., 2020b] Amir-Hossein Karimi, Gilles Barthe, Bernhard Schölkopf, and Isabel Valera. A survey of algorithmic recourse: Definitions, formulations, solutions, and prospects. arXiv preprint arXiv:2010.04050, 2020.
[Koh and Liang, 2017] Pang Wei Koh and Percy Liang. Understanding black-box predictions via influence functions. In International Conference on Machine Learning, pages 1885–1894. PMLR, 2017.
[Lagnado, 1994] David Lagnado. The psychology of explanation: A Bayesian approach. Unpublished Masters thesis, Schools of Psychology and Computer Science, University of Birmingham, UK, 1994.
[Lombrozo, 2007] Tania Lombrozo. Simplicity and probability in causal explanation. Cognitive Psychology, 55(3):232–257, 2007.
[Lundberg and Lee, 2017] Scott M Lundberg and Su-In Lee. A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, 30, 2017.
[McCloy and Byrne, 2002] Rachel McCloy and Ruth MJ Byrne. Semifactual "even if" thinking. Thinking & Reasoning, 8(1):41–67, 2002.
[Mercado et al., 2016] Joseph E Mercado, Michael A Rupp, Jessie YC Chen, Michael J Barnes, Daniel Barber, and Katelyn Procci. Intelligent agent transparency in human–agent teaming for multi-UxV management. Human Factors, 58(3):401–415, 2016.
[Molnar et al., 2020] Christoph Molnar, Gunnar König, Julia Herbinger, Timo Freiesleben, Susanne Dandl, Christian A Scholbeck, Giuseppe Casalicchio, Moritz Grosse-Wentrup, and Bernd Bischl. Pitfalls to avoid when interpreting machine learning models. 2020.
[Montavon et al., 2018] Grégoire Montavon, Wojciech Samek, and Klaus-Robert Müller. Methods for interpreting and understanding deep neural networks. Digital Signal Processing, 73:1–15, 2018.
[Poyiadzi et al., 2020] Rafael Poyiadzi, Kacper Sokol, Raul Santos-Rodriguez, Tijl De Bie, and Peter Flach. FACE: Feasible and actionable counterfactual explanations. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, pages 344–350, 2020.
[Read and Marcus-Newhall, 1993] Stephen J Read and Amy Marcus-Newhall. Explanatory coherence in social explanations: A parallel distributed processing account. Journal of Personality and Social Psychology, 65(3):429, 1993.
[Ribeiro et al., 2016] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. "Why should I trust you?" Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1135–1144, 2016.
[Rieger et al., 2018] Laura Rieger, Pattarawat Chormai, Grégoire Montavon, Lars Kai Hansen, and Klaus-Robert Müller. Structuring neural networks for more explainable predictions. In Explainable and Interpretable Models in Computer Vision and Machine Learning, pages 115–131. Springer, 2018.
[Samek et al., 2017] Wojciech Samek, Thomas Wiegand, and Klaus-Robert Müller. Explainable artificial intelligence: Understanding, visualizing and interpreting deep learning models. arXiv preprint arXiv:1708.08296, 2017.
[Simonyan et al., 2013] Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034, 2013.
[Singmann and Kellen, 2019] Henrik Singmann and David Kellen. An introduction to linear mixed modeling in experimental psychology. In New Methods in Cognitive Psychology, pages 4–31. Psychology Press, 2019.
[Thagard, 1978] Paul R Thagard. The best explanation: Criteria for theory choice. The Journal of Philosophy, 75(2):76–92, 1978.
[Wachter et al., 2017] Sandra Wachter, Brent Mittelstadt, and Chris Russell. Counterfactual explanations without opening the black box: Automated decisions and the GDPR. Harv. JL & Tech., 31:841, 2017.
A Pairwise comparisons
Table 1 shows post-hoc pairwise comparisons for each dependent variable in Experiment 1.
Tables 2, 3, and 4 include all post-hoc pairwise comparisons for each dependent variable in Experiment 2.
Table 1: Experiment 1: Pairwise Wilcoxon Rank Sum tests p-values for all three dependent variables. All p-values were corrected for multiple comparisons using Benjamini and Hochberg’s false discovery rate (FDR) procedure [Benjamini and Hochberg, 1995].

                                     Expectation   Confidence   Action
Control vs AI Prediction                 .03           .02        .74
Control vs AI Explanation                .002        < .001     < .001
AI Prediction vs AI Explanation          .36           .12      < .001
¡ .001 表 2: 実験 2: ペアワイズウィルコクソンランク Sum は依存変数期待に対する p-値をテストする。
0.63
All p-values were corrected for multiple comparisons using the false discovery rate method.
すべてのp値は偽発見率法を用いて複数の比較で補正された。
0.69
Control Control AI Prediction AI Prediction AI Explanation
制御 制御AI予測AIAI予測AI説明
0.70
& No Note & Note
ノート・ノート・ノート
0.48
& No Note & Note & No Note
注・なし 備考 注・なし
0.44
Control & Note AI Prediction & No Note AI Prediction & Note AI Explanation & No Note AI Explanation & Note
制御・ノート ai予測・ノーノート ai予測・ノート ai説明・ノーノート ai説明・ノート
0.54
.33 .001 ¡ .001 ¡ .001 ¡ .001
.33 .001 ¡ .001 ¡ .001 ¡ .001
0.38
.02 .004 ¡ .001 .01
.02 .004 ¡ .001 .01
0.35
.68 .31 .65
.68 .31 .65
0.34
.47 .88 .59
.47 .88 .59
0.37
Table 3: Experiment 2: Pairwise Wilcoxon Rank Sum tests p-values for dependent variable Confidence. All p-values were corrected for multiple comparisons using the false discovery rate method.

                            Control   AI Prediction   AI Prediction   AI Explanation   AI Explanation
                            & Note      & No Note        & Note         & No Note          & Note
Control & No Note             .08          .03            .001             .95              .31
Control & Note                             .77            .17              .08              .008
AI Prediction & No Note                                    .28              .03              .003
AI Prediction & Note                                                      < .001           < .001
AI Explanation & No Note                                                                     .28
Table 4: Experiment 2: Pairwise Wilcoxon Rank Sum tests p-values for dependent variable Action. All p-values were corrected for multiple comparisons using the false discovery rate method.

                            Control   AI Prediction   AI Prediction   AI Explanation   AI Explanation
                            & Note      & No Note        & Note         & No Note          & Note
Control & No Note             .63        < .001           .001           < .001            < .001
Control & Note                           < .001           .003           < .001             .002
AI Prediction & No Note                                    .39              .02              .63
AI Prediction & Note                                                       .001              .63
AI Explanation & No Note                                                                     .01
B Linear mixed effects models
B.1 Experiment 1
To estimate the effect of group on the three dependent variables we initially built linear mixed-effects models (LMM) using the “lme4” package in R [Bates et al., 2014].
The only fixed effect was group (with three levels: Control, AI Prediction, AI Explanation).
The only random effect was the intercept for participants.
There was no random slope for participants as the design was fully between-participants.
No random intercept for scenarios was used as the number of scenarios was low (i.e. 9) and including the scenarios as a random intercept could have led to a reduced power of the experiment [see Judd et al., 2017; Singmann and Kellen, 2019].
Further, a random slope for scenarios was not included as it led to a singular-fit model, implying that the variance of this random effect was (close to) zero.
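A minimal sketch of the initial model in lme4, assuming a long-format data frame with hypothetical column names:

library(lme4)
# 'long' is assumed to have columns 'estimate', 'group', and 'participant'.
m <- lmer(estimate ~ group + (1 | participant), data = long)
qqnorm(resid(m)); qqline(resid(m))  # residual checks of the kind shown in Figures 5-8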
Consequently, we resorted to the nonparametric statistical analyses outlined in the main text.
B.2 Experiment 2
We built a similar LMM for Experiment 2.
The only difference was that instead of only one fixed effect we now had two: condition (Control, AI Prediction, AI Explanation) and correction (No Note, Note).
The random effects structure was the same as in Experiment 1.
We again plotted the residuals and found that they were not normally distributed (see Figures 7 and 8).
We then performed the same non-parametric analyses as in Experiment 1.
Figure 5: Histograms of the LMM residuals for all three dependent variables in Experiment 1.
Figure 6: Quantile plots of the LMM residuals for all three dependent variables in Experiment 1.
Figure 7: Histograms of the LMM residuals for all three dependent variables in Experiment 2.
Figure 8: Quantile plots of the LMM residuals for all three dependent variables in Experiment 2.