Considerable advancements have been made in various NLP tasks based on the
impressive power of large pre-trained language models (LLMs). These results
have inspired efforts to understand the limits of LLMs so as to evaluate how
far we are from achieving human-level general natural language understanding.
In this work, we challenge the capability of LLMs with the new task of Ethical
Quandary Generative Question Answering. Ethical quandary questions are more
challenging to address because multiple conflicting answers may exist to a
single quandary. We propose a system, AiSocrates, that provides an answer with
a deliberative exchange of different perspectives to an ethical quandary, in
the approach of Socratic philosophy, instead of providing a closed answer like
an oracle. AiSocrates searches for different ethical principles applicable to
the ethical quandary and generates an answer conditioned on the chosen
principles through prompt-based few-shot learning. We also address safety
concerns by providing a human controllability option in choosing ethical
principles. We show that AiSocrates generates promising answers to ethical
quandary questions with multiple perspectives, 6.92% more often than answers
written by human philosophers by one measure, but the system still needs
improvement to match the coherence of human philosophers fully. We argue that
AiSocrates is a promising step toward developing an NLP system that
incorporates human values explicitly by prompt instructions. We are releasing
the code for research purposes.
1 Introduction
Large pre-trained language models (LLMs) have brought significant breakthroughs in artificial intelligence (AI), with impressive results approaching human level in various NLP tasks (Radford et al., 2019; Raffel et al., 2020; Brown et al., 2020).
∗This work was done when the author was studying at The Hong Kong University of Science and Technology.
Explorations of their limitations and capabilities have also been made, for instance, by studying their ability to answer open-ended, real-world questions (Tafjord and Clark, 2021; Gu et al., 2021; Jiang et al., 2021; Hendrycks et al., 2020).
Ethical quandary questions can be viewed as one of the most challenging forms of questions to address because they have no single definite answer.
Instead, a discussion with multiple perspectives (i.e., a manner of debate) is crucial (Talat et al., 2021; Hendrycks et al., 2020), and sophisticated logical reasoning is required to answer such questions.
In this work, we challenge the capability of LLMs to provide relevant and nuanced answers to ethical quandary questions in the style of a human ethicist — ETHICAL QUANDARY GENERATIVE QUESTION ANSWERING (GQA).
Previously, Jiang et al. (2021) proposed Delphi, a model that learns to reproduce human moral and ethical judgments.
However, it provides one simple definite answer to an ethical quandary question without the possibility of future discussion.
An ethical quandary is a moral dilemma that poses challenges to humans.
For example, Delphi answers “No” to the famous ethical dilemma trolley problem (Thomson, 1976): “Should we kill one person to save five people in danger of being hit by a trolley?”.
Although the oracle at Delphi only gave a prophetic, closed answer to the questions posed to it, there can be multiple perspectives on this problem depending on the underlying ethical principle.
From the deontological perspective, the answer would be “No” because killing is never acceptable.
From the utilitarian perspective, by contrast, the answer would be “Yes” because the principle dictates that the most appropriate action is the one that results in the greatest good for the greatest number of people.
As Talat et al. (2021) criticized, a one-sided normative ethical judgment answer cannot represent incommensurable and diverse ethical judgments.
Therefore, we aim to build a system that can deal with ethical quandary questions with different ethical principles and also with the possibility of explaining the reasons for its pronouncements.
Instead of handing over our ethical responsibility to the AI system by seeking a definite answer, we build an AI interlocutor with which we think through the ethical issues.1
We approach the ETHICAL QUANDARY GQA task with the aim of achieving an AI system that can enhance humans’ moral decision-making through the deliberative exchange of different perspectives on an ethical quandary, in the approach of Socratic philosophy.
The utilization of AI technology for human moral enhancement has been suggested by moral philosophers (Savulescu and Maslen, 2015; Giubilini and Savulescu, 2018; Lara and Deckers, 2020).
The AI system can serve as a helper that can aid humans in having reflective equilibrium by suggesting different aspects that individuals could not take into consideration due to personal biases and prejudices (Giubilini and Savulescu, 2018).
Nevertheless, when using AI technology to deal with ethical issues, it is essential to consider the safety and ethical implications.
Letting an AI system answer ethical questions without a human agent can be controversial because it is unclear who takes responsibility for the action or output of the system (Anderson and Anderson, 2007; Cave et al., 2018).
Therefore, we design our system to be controllable by humans – i.e., it allows a way for humans to provide explicit ethical principles to guide the system in generating answers.
In this way, the principle-provider (human) will be responsible for any potentially unsafe generated output. Through this setup, we also want to explore the potential for human agents to explicitly state desirable values and principles while using few-shot learning with LLMs.
1Here, our aim is not to generate the most “ethical” answers but to explore LLMs’ ability to provide distinct answers to a single quandary depending on varying ethical principles.
Table 1: An example ethical quandary, which consists of a Context and Question.
In this work, we explore the model’s ability to differently answer the quandary based on different ethical principles.
The main experimental result shows that AISOCRATES achieves promising performance in prompt-based few-shot principle-grounded answer generation for the task.
Our contribution is two-fold: First, we propose the ETHICAL QUANDARY GQA task, which does not entail a single definite answer for each ethical question; instead, distinct answers exist depending on underlying ethical principles.
Secondly, we introduce AISOCRATES, which uses the prompt-based few-shot learning approach with two-step prompting to answer ethical quandary questions with multiple ethical perspectives.
2 ETHICAL QUANDARY GQA
Task Setup We investigate a model’s ability to answer ethical quandaries with multiple perspectives based on different underlying ethical principles.
Given the ethical quandary question Q in context, the model is expected to generate a free-form text answer A in one or more paragraphs.
In this task, the ethical quandary question consists of context (c) and a question sentence (q), as illustrated in Table 1.
The context includes details of the situation (e.g., narrator details, a specific event, involved parties, a particular condition) from the perspective of a narrator in the form of text paragraphs.
An ethical quandary refers to the perplexity arising from a situation in which it is hard to decide what to do morally and, in a stricter sense, an ethical dilemma in which neither of the possible choices is unambiguously acceptable.
The principle selection can be done either by a human or model-based (automated).
Then, the final answer A is obtained by two consecutive generations with the previously selected principles <p1, p2>, so that the answer addresses the quandary from multiple perspectives.
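The task input/output structure described above can be captured in a minimal data model; the field names below are illustrative, not from the paper:

```python
from dataclasses import dataclass

@dataclass
class EthicalQuandary:
    context: str   # situation details from the narrator's perspective (c)
    question: str  # the question sentence (q)

@dataclass
class MultiPerspectiveAnswer:
    p1: str    # first selected ethical principle
    p2: str    # contrasting second principle
    text: str  # free-form answer A combining both perspectives
```

The two-principle structure makes explicit that a single quandary maps to multiple valid, principle-dependent answers.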
Dataset New York Times Ethicist Columns (NYTEthicist) is a set of weekly columns on ethical quandaries written by professional philosophers.
Each quandary is sent from a reader, describing a complex situation and a question arising from it.
A corresponding answer to the quandary is written by a philosopher based on an ethical principle(s) (not always explicitly mentioned) and usually provides multiple perspectives for the situation to address the question.
We collected 1,295 pairs of {quandary, answer from a philosopher} from the NYT website,2 using BeautifulSoup software.3
The statistics for lengths of text are given in Table 2.
3 Methodology
3.1 AISOCRATES
Principles in ethics are statements expressing reasons for or against an action.
Different principles focus on different aspects of the same situation to judge what is ethical and correct (Bass et al., 1999; Forsyth and Pope, 1984).
Thus, different ethical principles result in distinct and even contradictory answers to the same ethical quandary question.
Inspired by the characteristics of ethical quandaries, we propose AISOCRATES, which can explicitly i) retrieve or generate ethical principles that are relevant to the given ethical quandary and ii) generate an answer grounded in the selected principles.
3.2 Principle Provider
The principle pool for addressing ethical quandaries is composed of ethical principles from Western/Eastern ethical theories and rules-of-thumb (RoTs).
The ethical principles are theoretical and broadly described so as to apply to various contexts and situations.
In contrast, the RoTs, which can be considered informal principles, are based on practical experience and describe an approximate judgment on an action in a specific context or situation.
2https://www.nytimes.com/column/the-ethicist; We crawled data from 7 May 2006 to 2 November 2021. A column page from a given date contains two to three pairs of {quandary, answer from the philosopher}.
3https://www.crummy.com/software/BeautifulSoup/
For the RoTs, we utilize the SOCIALCHEMISTRY (Forbes et al., 2020) and MORAL-STORY (Emelin et al., 2020) datasets, which have short real-life contexts and corresponding annotated RoTs.
To illustrate with an example, for the situation “Running the blender at 5am will wake up my neighbors, but I need it for my breakfast,” one of the possible relevant ethical principles is “The morality of an action depends on the action’s outcome or result” (consequentialism), while one of the possible RoTs is “You have the right to prepare food when you need to,”4 which is more context-specific.
By combining both in the pool, we have access to the different granularities of ethical principles.
Choice of Principle Provider – Human or Model The principle(s) can be provided by a human or a model as illustrated in Figure 1.
In the case of the human provider, humans can freely provide relevant ethical principles or RoTs for answering the target ethical quandary question.
This option is a safety measure to ensure the existence of an agent for the machine generation (i.e., the person liable for any potential risk or harm derived from the model generation) and to provide human-in-the-loop moderation of model behavior.
For the model-based principle provider, we first form a principle candidates pool using both retrieval and generative techniques to maximize the recall of relevant principles.
This principle “candidates” pool is a set of the shortlisted relevant principles out of the principle pool mentioned at the beginning of this subsection.
Then, the two most relevant and contrasting principles are automatically selected from the pool, utilizing relevance scorers, which will be explained in detail later.5 First, the principle candidates pool is formed as follows:
• The retrieval method selects the top-10 RoTs by calculating the semantic textual similarity between the contexts from SOCIALCHEMISTRY and MORAL-STORY and the context of an ethical quandary test sample.
4The example is from Forbes et al. (2020).
5Disclaimer: It is important to note that this automatic selection should not be directly used in real applications due to ethical concerns. We advise researchers to only use the automated option for investigating the upper bound of LLMs’ ability in ETHICAL QUANDARY GQA.
• The generative method: with prompt-based few-shot learning, we generate principles by asking for relevant ethical principles to answer the question, using two-shot samples with the prompt “Context: {<ethical quandary>} Q: What are the ethical principles to consider in the situation of Context? \n A: This case illustrates several ethical principles. \n1.” We adopt the model generation pattern of answering with “This case illustrates several ethical principles” as part of the prompt so that the model is encouraged to perform the task.
Then the generated ethical principles are processed to be added into the pool of principle candidates.
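The post-processing of the generated numbered list into pool entries can be sketched as follows; the splitting convention is an assumption based on the prompt ending with “1.”:

```python
import re

def parse_generated_principles(completion):
    """Split a completion of the numbered-list prompt, e.g.
    'Honesty matters. 2. Do no harm. 3. Respect autonomy.',
    into individual candidate principles for the pool."""
    parts = re.split(r"\s*\d+\.\s*", completion)
    return [p.strip() for p in parts if p.strip()]
```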
Compared to the retrieval method, the generative method yields more context-specific ethical principles or RoTs, since the model takes the context as input, whereas the retrieval method returns the closest approximate principles by comparison with existing contexts.
Next, all the retrieved and generated principle candidates are then included in the principle candidates pool.
Out of these relevant candidates, we select two principles for LLM answer grounding to encourage the generation to contain multiple perspectives.
To select these two opposing principles, we conduct the following final step: the highest-ranking principle is selected by the voting result from a suite of off-the-shelf scorers – TF-IDF, Sentence-BERT, and the T0 model (Sanh et al., 2021) – which calculate the relevance score between the ethical quandary and the candidate principles.
The top-ranking principle is denoted as p1.
To obtain one more principle, we select the most semantically distinct principle from p1 out of the top-10 candidate principles and denote it as p2.
In this way, we expect to obtain principles that are relevant to the context yet contrast with each other.
Using the TF-IDF and Sentence-BERT scorers, we extract features of the candidate principles and the context of the testing sample and calculate the cosine-similarity between them.
Then, each of the principles is ranked in descending order of the similarity score.
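The two selection steps (relevance ranking for p1, semantic distinctness for p2) can be sketched with a hand-rolled TF-IDF scorer alone; the paper additionally votes with Sentence-BERT and T0, and all names below are illustrative:

```python
import math
from collections import Counter

def _tfidf_vectors(docs):
    """Build simple TF-IDF vectors (token -> weight) for a list of texts."""
    tokenized = [d.lower().split() for d in docs]
    df = Counter(t for doc in tokenized for t in set(doc))
    n = len(docs)
    return [{t: tf * math.log(n / df[t]) for t, tf in Counter(doc).items()}
            for doc in tokenized]

def _cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def select_principles(context, candidates):
    """p1: candidate most relevant to the quandary context;
    p2: remaining candidate least similar to p1 (a contrasting view)."""
    vecs = _tfidf_vectors([context] + candidates)
    ctx, cand = vecs[0], vecs[1:]
    i1 = max(range(len(cand)), key=lambda i: _cosine(ctx, cand[i]))
    i2 = min((i for i in range(len(cand)) if i != i1),
             key=lambda i: _cosine(cand[i1], cand[i]))
    return candidates[i1], candidates[i2]
```

Picking p2 as the candidate *least* similar to p1, rather than the second-most relevant one, is what encourages the two grounded answers to take genuinely different perspectives.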
Meanwhile, for the T0 model, which is known for its powerful zero-shot generalization ability on unseen NLP tasks by utilizing prompts, we provide the prompt “Context: {context} Principle: {principle} \n Is the principle relevant to the context?”
The smaller the number is, the more prioritized the principle is to be selected.
3.3 Principle-Grounded Answer Generator
Given relevant principles from the first step, we adopt prompt-based few-shot learning for principle-grounded answer generation.
The prompt-based few-shot learning teaches the model with only a few input-output pair samples as a natural language prompt concatenated with the input of the test sample.
Prompt-based learning does not involve any parameter tuning and is a good way to test the innate ability of pre-trained LLMs with minimal guidance about the task.
Each ethical quandary test sample has ethical quandary <Q> and two relevant contrasting ethical principles (<p1>, <p2>), which are provided from the previous procedure.
We have several elements in the input for obtaining the output A. Instead of simply concatenating the input and output, we manually craft prefix prompts utilizing templates to format the input for prompt learning.
Since there is more than one ethical principle, we propose multi-step prompting of the LLM to incorporate two principles in the final generated answer A addressing the ethical quandary, as illustrated in Figure 1.
A1 ∼ pLM(· | 2-Shot, PROMPT1),  (1)
where 2-Shot is prepared using the first template PROMPT1 with the corresponding human-written answers concatenated.
Then, we continue answer generation with the second step, prompting with the second ethical principle <p2> to obtain the second answer A2 using the second template PROMPT2 “On the other hand, according to the principle of <p2>”.
Given the two-shot training samples (2-Shot), the ethical quandary sample (PROMPT1), and the first generated answer (A1), the model generates the second answer A2 by sampling from
A2 ∼ pLM(· | 2-Shot, PROMPT1, A1, PROMPT2).  (2)
We expect this consecutive prompting to allow the model to incorporate the previous answer A1 when generating A2 so that the overall answer will be more coherent.
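The two-step prompting scheme can be sketched as below. The PROMPT2 opening phrase is quoted from the paper; the exact PROMPT1 template wording and question line are assumptions, and `generate` stands in for any LLM completion call (e.g., the Jurassic-1 API):

```python
def two_step_answer(few_shot, quandary, p1, p2, generate):
    """Produce a multi-perspective answer via two consecutive prompts."""
    # Step 1: ground the first answer A1 in principle p1.
    prompt1 = (f"{few_shot}Context: {quandary}\n"
               f"Q: How should the narrator act?\n"
               f"A: According to the principle of {p1},")
    a1 = generate(prompt1)
    # Step 2: continue the same generation so A2 can stay coherent with A1,
    # switching to the contrasting principle p2.
    prompt2 = f"{prompt1}{a1}\nOn the other hand, according to the principle of {p2},"
    a2 = generate(prompt2)
    # The final answer A combines both principle-grounded continuations.
    return (f"According to the principle of {p1},{a1}\n"
            f"On the other hand, according to the principle of {p2},{a2}")
```

Because the second prompt contains A1 verbatim, the second continuation can refer back to the first perspective instead of generating two disconnected answers.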
If irrelevant ethical principles are provided, the answer cannot be correctly generated – for example, the principle "You should avoid telling lies," is not applicable to the trolley problem.
Therefore, to ensure the provided principle is noise-free and context-relevant, principles for this experiment are carefully written and provided by a human.
We select five ethical quandaries available online.
An example can be found in Table 1.
For each ethical quandary test sample, two ethical principles (p1, p2) are manually selected and corresponding answers (Ap1, Ap2) are generated with the PROMPT1 described in Section 3.3.
In Table 3, the examples (p1, Ap1) and (p2, Ap2) answering an ethical quandary in Table 1 are illustrated.
Then, the two generated answers for each ethical quandary sample are separately evaluated by a professional philosopher, mainly with the following criteria:
Principle 2 (p2): Individual’s freedom should be considered as the top priority.
Model Generated Answer 2 (Ap2): The Value demands individual’s freedom as the top priority.
Especially when individuals are at the risk of serious harm, it is not acceptable for the government to force them to do something that may endanger their health and safety.
Governments should provide more welfare support and subsidies to poor citizens and avoid placing excessive restrictions on their personal freedom.
Table 3: Distinct answers generated by principle-guided model based on different ethical principles, obtained from preliminary exploration.
The answers are in response to the ethical quandary introduced in Table 1.
4.2 Analysis with Philosophical Reference
Based on the analysis with the philosophical reference, the model with few-shot prompt-based learning can generate different answers based on the ethical principles.
For instance, for an ethical principle (p1), “A smaller sacrifice is morally justifiable for the greater good,” the generated answer even includes numerical reasoning (“1 in 10000”) to quantify the verbal input “small risk.”
Another illustration of the model-generated answer’s consistency is the answer that sticks with its “freedom-first” input principle (p2) and repeats it in every sentence.
It is stated negatively, as the government should “avoid placing excessive restrictions on their personal freedom,” in the last sentence of (Ap2).
However, it is noted that the model doesn’t always achieve consistency.
Although it can answer based on the input, it sometimes fails when elaborating the rationale/logic based on the input.
The answer sometimes discusses divergent points of concern from those the ethical quandary question seeks to resolve, resulting in the question, the principle, and the answer becoming muddled.
The model-generated answers sometimes lack relevance and attention to detail.
Also, extra information (not necessarily factually wrong) and the re-asking and re-answering of the question make the generated output redundant in its paragraph organization.
Understanding the model’s ability in distinct answer generation alongside its weaknesses in consistency and logical generation, we investigate the model’s ability to answer ethical quandaries from multiple perspectives with our proposed method in the following experiment.
Although our method only needs a few samples for learning, we still create a full data split, so that fine-tuning small/medium-sized models could also be explored for ETHICAL QUANDARY GQA in the future.
We only take two samples from the train split for few-shot learning with our methodology.
We then obtain an answer for each of the 130 test samples from AISOCRATES, which is backed by one of the largest publicly available pre-trained LLMs, Jurassic-1 Jumbo (Lieber et al., 2021), with 178 billion parameters.6
We mainly evaluate the model performance with a human evaluation due to the one-to-many nature of generation tasks.
The automatic metric with a reference is often limited in evaluating the desired quality in generations.
Moreover, as explained earlier, the ethical quandary question has multiple valid answers depending on the relevant ethical principles.
This makes our evaluation more challenging with the automatic metrics.
Thus, we need human judgment in the performance evaluation.
For completeness, however, we also perform evaluation using the standard automatic metric (ROUGE) and show in Section 6 how there is a poor correlation with human judgment.
Evaluation Criteria We evaluate the model’s ETHICAL QUANDARY GQA ability with the two most relevant metrics for assessing the success of the Socratic answer generation:
• Multi-perspective: The ability to provide more than one point of view on the ethical quandary can be interpreted as the model’s potential to carry out a deliberative discussion on the issue.
It is important to ensure diverse ethical judgments through multiple perspectives when answering ethical quandary questions.
Moreover, it is unsafe for an AI system to provide single-sided authoritative normative judgments (Talat et al., 2021).
We evaluate whether the model answers the quandary question from different angles.
6Note that our methodology is model-agnostic, although we conducted the experiment with Jurassic-1 Jumbo.
AISOCRATES vs. NYT-Ethicist    Multi-perspective    Coherence
win                            25.38                20.00
tie                            44.62                22.31
loss                           18.46                50.00
none                           11.54                 7.69
Table 4: Win-tie-loss rates (%) for comparison between AISOCRATES (model-generated) and NYT-Ethicist (philosopher-written) answers on the evaluation criteria. Rates are with regard to the model performance against the human-written answers.
For instance, AISOCRATES wins 25.38%, ties 44.62%, and loses 18.46% of the time versus the NYT-Ethicist answer while 11.54% of the time neither of them is chosen to have multiple perspectives in the answer.
• Coherence: We test whether the generated output is logically composed and connected while sticking with the focus of the ethical quandary question throughout.
As the expected output is in the form of paragraphs rather than a simple sentence, this criterion is also important, as it is one of the core conditions for rational discourse writing.
To evaluate the two criteria, we ask the following questions of the annotators – “Which of the answers is addressing the ethical dilemma from multiple perspectives?” and “Which answer is more coherent?”
5.2 Experimental Results
Q1: Can AISOCRATES answer from multiple perspectives?
This is the aspect of most interest in ETHICAL QUANDARY GQA.
As reported in Table 4, AISOCRATES provides an answer with multiple perspectives 70% of the time (the sum of win and tie cases), which is 6.92% more often than the NYT-Ethicist answers.
Of course it is. Even though it will mean nothing, you’re showing your mother how much you care for her.
That’s love. And if that’s on her calendar for her birthday, then it’s a present she deserves.
But if you really believe Obama is a better president, then I think you should vote for Obama.
You can show your appreciation in another way.
Take your mother to the voting booth and give her a big hug afterward.
That’s worth two votes to your mother.
From a different perspective, if you were really interested in your mother’s well-being, you’d vote for the best candidate, not just the one who offers the biggest material rewards.
Romney will cut taxes for every rate payer, not just your mother.
This is a weird gift (and, frankly, kind of a dumb idea).
But whom you vote for, or if you vote at all, is always your democratic prerogative.
You can use whatever reasoning you want, including bad reasoning.
Welcome to America.
Table 5: An illustration of the AISOCRATES (model-generated) answer providing multiple perspectives on the ethical quandary question with understanding of the context while having a low Rouge-1 F1-score (22.35).
The two-step generation with two ethical principles might have achieved this result.
AISOCRATES cannot provide multiple perspectives 27.27% of the time.
For such cases, the principles selected in the principle-provider step convey similar meanings to each other.
To elaborate, two ethical principles can sometimes have overlapping ideologies: “You should avoid telling lies” and “A person is honorable and moral by being honest”.
Not surprisingly, the NYT-Ethicist answers, which are written by moral philosophers, are selected to be more coherent than those from AISOCRATES half of the time (50.00%).
About 42% of the time, the AISOCRATES answers are still considered coherent – more coherent (18.18%) or as coherent (21.21%) compared to the expert-written answer.
As pointed out in the preliminary analysis, the machine-generated answers show the weakness of the model in losing focus or containing irrelevant or redundant content while building the arguments, even though the answer starts by aligning with the provided principle.
There is no explicit guidance or learning for improving coherence in the current methodology of AISOCRATES, except for the provision of coherent examples in the few-shot samples and the consecutive two-step generation rather than two separate generations.
This points to potential research on enhancing the reasoning capability of the model.
6 Analysis and Discussion

6.1 Automatic Metric and Model Performance

Besides the human evaluation, we score the generated answers with the automatic metric ROUGE (Lin, 2004) (Table 6) in reference to expert-written NYT-Ethicist answers.
We mainly investigate F1-scores to understand how much content of the NYT-Ethicist answers is covered and how much distinct content exists in AISOCRATES’s answers. However, like other open-ended generation tasks (e.g., story generation), the reference-based metric cannot always serve as the absolute evaluation standard because of the task's one-to-many nature.
Moreover, given that an ethical quandary can be answered with different ethical principles, if the answers from AISOCRATES and NYT-Ethicist do not share the same underlying ethical principles, they would still have low n-gram overlap while containing multiple perspectives and being coherent.
To further understand the characteristics of generated answers, we analyze the generated answers with a low (< ROUGE-1 F1 − σ) and a high (> ROUGE-1 F1 + σ) ROUGE score.
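The ROUGE-1 F1 computation and the mean ± σ outlier split described above can be sketched as follows; this is a whitespace-tokenized re-implementation for illustration, not the official ROUGE scorer:

```python
from collections import Counter
from statistics import mean, pstdev

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a generated answer and a reference answer."""
    c = Counter(candidate.lower().split())
    r = Counter(reference.lower().split())
    overlap = sum((c & r).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(c.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)

def split_outliers(scores):
    """Return (low, high): scores below mean - sigma and above mean + sigma."""
    mu, sigma = mean(scores), pstdev(scores)
    low = [s for s in scores if s < mu - sigma]
    high = [s for s in scores if s > mu + sigma]
    return low, high
```

In practice one would use a standard ROUGE package with stemming; the sketch only makes the F1 definition and the σ-threshold selection concrete.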
ROUGE and Multi-Perspective We investigate the relationship between ROUGE scores and the ability to provide multiple perspectives by checking how often the AISOCRATES answers are evaluated to have multiple perspectives.
For both generated answers with a low ROUGE and a high ROUGE, 75% of the answers contain multiple perspectives.
This shows that ROUGE does not reflect the model’s ability to provide multiple perspectives.
We also observe that AISOCRATES answers with a low ROUGE can provide multiple perspectives while the corresponding NYT-Ethicist answer is single-sided.
For instance, in Table 5, the AISOCRATES provides a well-rounded answer with the perspectives “it is okay to do so out of love” and “you should vote for the candidate whom you believe to be better.”
It even suggests another way of showing appreciation to the narrator’s mom on her birthday.
In contrast, the NYT-Ethicist only suggests “it is okay to do so”.
This illustrates that a low ROUGE score (less overlap with the human-written answer) does not necessarily indicate poor performance of the model in answering the ethical quandary.
An ethical dilemma is a situation in which any choice involves violating some widely held moral principle.
Among various possible desired virtues for “ideal answer” to ethical quandary questions, we seek the virtue of providing multiple perspectives for AISOCRATES.
This is because we believe discussing the quandary from distinct perspectives is the most robust and safest way to deal with questions involving ethics; ethical judgment is dynamic (Bicchieri, 2005), and what is considered to be ethical can shift over time and across communities.
However, there may be some quandaries that are not dilemmas in a strict sense – cases where there is some initial puzzlement or doubt but where, upon further analysis, it turns out that there can be one viable answer.
Given that the ethical quandary test samples from NYT come from the general public, it may be that some of the questions offered to the NYT Ethicist merit only a single responsible answer from an ethicist, regardless of differences in the principles one embraces.
Strictly speaking, the virtues expected in answers to ethical quandary questions involve criteria beyond multiple perspectives and coherence. They may include, but are not limited to, understanding of the context, choice of relevant ethical concepts, deliberation from multiple perspectives, justification of the stances taken and, even more strictly, the style of writing (e.g., composition of paragraphs).
Other challenges left to address involve better justification and reasoning for ETHICAL QUANDARY GQA.
We have found that the generated answers sometimes seem to have a weak argument with no clear and sound backup.
However, improving reasoning and justification in the answers is not trivial because it involves sensible organization and presentation of ideas and the internal relevance of content.
The context of the ethical quandary is described in one or multiple paragraphs and hence is not simple.
We indirectly evaluated the model's understanding of the input ethical quandary questions with the "multi-perspective" and "coherence" criteria from the preliminary investigation (Section 4) and the main experiment (Section 5).
Although the generated answers are overall consistent with the context and question, it remains unclear if the model understood the context in depth, because some generated answers are generic and lack the details of what is under discussion.
7 Related Work

Machine Ethics and Ethical Question Answering
Machine or AI ethics is an important emerging area of research (Hendrycks et al., 2020; Prabhumoye et al., 2020; Schramowski et al., 2021).
One line of current work focuses on improving machine understanding of human values and morality (e.g., social norms, ethical judgment) through classification tasks (Forbes et al., 2020; Emelin et al., 2020; Lourie et al., 2021).
Delphi (Jiang et al., 2021) is a research prototype to emulate human moral judgments based on training with the large dataset Commonsense Norm Bank, which includes the works mentioned above and other social norm understanding tasks (Sap et al., 2019).
Delphi is trained to select “less contentious” choices in dealing with ethical questions or dilemmas.
However, Talat et al. (2021) criticized that the model prediction of Delphi is based on average human values or skewed ethical values (Western-centric), which is not necessarily the ideal approach and may be dangerous and misleading.
Meanwhile, Hendrycks et al. (2020) propose classifiers that explicitly provide the ethical perspective against which moral judgments are grounded (e.g., utilitarianism, deontology, etc.).
Their work focuses on clear-cut situations instead of ambiguous moral dilemmas.
Another line of work studies the reasoning capability with a “mental model” (Gu et al., 2021), but it again concludes with a yes-or-no judgment.
In contrast, we attempt to understand the models’ ability to provide an answer in a manner of debate with explanations.
This approach can be seen as a Socratic way of dealing with ethical issues in a deliberative manner, instead of acting as an oracle that gives moral answers based on specific theories, as traditional philosophers like Plato and Aristotle did (Pincoffs, 1971).
Besides the NLP and ML communities, AI system involvement in the human moral decision-making process in ethical quandary situations has been actively discussed among moral philosophers.
Some moral philosophers have suggested a direction where AI systems can be utilized positively and practically in such ethical quandary situations (Savulescu and Maslen, 2015; Giubilini and Savulescu, 2018; Lara and Deckers, 2020; Lara, 2021).
They suggest a direction where the AI system serves as a moral advisor that enhances an individual’s reflective process, so that humans can make better decisions with a broader perspective while retaining autonomy over their actions.
This aligns with our proposed vision of ethical quandary question answering through AISOCRATES, which focuses on providing multiple perspectives on an ethical quandary.
Prompt-based Few-shot Learning with LLMs
LLMs have shown their impressive ability as few-shot learners and enabled much simpler learning through prompt-based few-shot learning even in text-generation tasks (e.g., text summarization and machine translation) (Radford et al., 2019; Brown et al., 2020; Petroni et al., 2019).
Explorations in various tasks have been made, including dialogue generation (Madotto et al., 2020), fact-checking (Lee et al., 2021), question answering (Jiang et al., 2020) and others (Reif et al., 2021; Schick and Schütze, 2020; Winata et al., 2021).
Prompt-based learning has several advantages, such as SOTA few-shot learning performance and no parameter tuning, but it also has disadvantages, such as the challenge of prompt engineering and its instability (Liu et al., 2021; Schick and Schütze, 2020).
The weaknesses of prompt-based learning, including potential unsafe or unethical content generation, are addressed in the literature, such as Kumar and Talukdar (2021); Shin et al. (2020); Zhao et al. (2021).
So far, no prior work has shown that few-shot learning can incorporate explicit human values at the instruction level.
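The principle-conditioned few-shot prompting used in this work can be sketched as plain template assembly. The field labels and shot format below are illustrative assumptions, not the system's exact prompt:

```python
def build_answer_prompt(shots, question, principles):
    """Assemble a few-shot prompt that conditions the generated answer
    on explicitly stated ethical principles (the answer-generation step)."""
    blocks = []
    for ex in shots:  # each shot is a solved example, including its principle
        blocks.append(
            f"Question: {ex['question']}\n"
            f"Principle: {ex['principle']}\n"
            f"Answer: {ex['answer']}"
        )
    # The target question lists the chosen principles, which may come
    # from the model-based principle provider or from a human.
    joined = "; ".join(principles)
    blocks.append(f"Question: {question}\nPrinciple: {joined}\nAnswer:")
    return "\n\n".join(blocks)

shots = [{"question": "Should I lie to spare a friend's feelings?",
          "principle": "You should avoid telling lies",
          "answer": "On one hand ... on the other hand ..."}]
prompt = build_answer_prompt(shots, "Should I wear a mask?",
                             ["Act in the best interest of others",
                              "Individual freedom is the top priority"])
```

Because the principles appear verbatim in the prompt, a human can override the automatically selected ones, which is how the controllability option described earlier would operate.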
8 Conclusion

In conclusion, we proposed AISOCRATES for the task of ETHICAL QUANDARY GQA, which answers ethical quandary questions from multiple perspectives based on different ethical principles.
According to a philosophical analysis, the model generates distinct answers based on ethical principles in paragraphs, although it sometimes lacks consistency within a generation.
Moreover, the fully automatic pipeline (with a model-based principle provider) is studied to understand the upper bound for research purposes, while the design choice of the human-intervened model exists to guarantee human agency over the generation.
The main experimental result shows that the fully automatic AISOCRATES provided multiple-perspective answers 6.92% more often than answers written by philosophers.
Furthermore, the results illustrate that the answers generated from our system still lack coherence and safety compared with philosopher-written answers, which highlights the need for more advanced methods for ETHICAL QUANDARY GQA.
Moreover, our assumption throughout the work was that the principles provided in our methodology are ethical principles or rules of thumb that can lead to ethical, non-controversial, and non-harmful advice while providing multiple perspectives.
When “controversial/harmful” principles are intentionally provided to the system, there is a risk of generating correspondingly harmful answers.
Thus, the deployment of the system for an actual application should be thoroughly reviewed.
When we maintain human agency over such a system, humans need to be held responsible for their input.
Ethical Consideration

Since the task of this work involves the topic of machine ethics and machine answers to human ethical quandaries, we pay extra attention to ethical responsibility and the impact of this work.
We want to clarify that the experiment with the fully automatic mode (with automatic principle selection) was an attempt to understand the model’s upper bound for research purposes, and this mode was not intended to be deployed in an actual application without human agency (i.e., a principle provider).
It is worth highlighting that AISOCRATES should not be considered as an oracle providing a definite answer but as a tool for providing multiple perspectives on ethical quandary questions.
At the same time, humans still hold autonomy in their actions.
From the preliminary analysis of model-generated answers, we found that generated answers can contain practical medical information, redundant content, or an evaluation of the ethical quandary from a legal perspective.
Although this setup assumed the existence of a human agent (principle provider), there are still ethical considerations we need to take into account when the final generation involves advice on such sensitive topics.
Factually wrong advice on sensitive topics such as medical or legal issues is not acceptable, because it can result in severe impacts such as harm to real users’ physical or mental health or unlawful decisions made by the users.
A potential application of our methodology, which handles such safety concerns and preserves human agency, is to provide multiple angles on the ethical quandary, allowing the narrator to view their dilemma from different points of view.
References

Michael Anderson and Susan Leigh Anderson. 2007. Machine ethics: Creating an ethical intelligent agent. AI Magazine, 28(4):15–15.

Kenneth Bass, Tim Barnett, and Gene Brown. 1999. Individual difference variables, ethical judgments, and ethical behavioral intentions. Business Ethics Quarterly, 9(2):183–205.

Cristina Bicchieri. 2005. The grammar of society: The nature and dynamics of social norms. Cambridge University Press.

Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. 2020. Language models are few-shot learners. Advances in Neural Information Processing Systems, 33:1877–1901.

Stephen Cave, Rune Nyrup, Karina Vold, and Adrian Weller. 2018. Motivations and risks of machine ethics. Proceedings of the IEEE, 107(3):562–574.

Denis Emelin, Ronan Le Bras, Jena D Hwang, Maxwell Forbes, and Yejin Choi. 2020. Moral stories: Situated reasoning about norms, intents, actions, and their consequences. arXiv preprint.

In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 13470–13479.

Andrea Madotto, Zihan Liu, Zhaojiang Lin, and Pascale Fung. 2020. Language models as few-shot learner for task-oriented dialogue systems. arXiv preprint arXiv:2008.06239.

Fabio Petroni, Tim Rocktäschel, Sebastian Riedel, Patrick Lewis, Anton Bakhtin, Yuxiang Wu, and Alexander Miller. 2019. Language models as knowledge bases? In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3982–3992.

Shrimai Prabhumoye, Brendon Boldt, Ruslan Salakhutdinov, and Alan W Black. 2020. Case study: Deontological ethics in NLP. arXiv preprint arXiv:2010.04658.

Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever, et al. 2019. Language models are unsupervised multitask learners. OpenAI blog, 1(8):9.

Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J Liu. 2020. Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research, 21:1–67.

Emily Reif, Daphne Ippolito, Ann Yuan, Andy Coenen, Chris Callison-Burch, and Jason Wei. 2021.

Victor Sanh, Albert Webson, Colin Raffel, Stephen H Bach, Lintang Sutawika, Zaid Alyafeai, Antoine Chaffin, Arnaud Stiegler, Teven Le Scao, Arun Raja, et al. 2021. Multitask prompted training enables zero-shot task generalization. arXiv preprint arXiv:2110.08207.

Maarten Sap, Saadia Gabriel, Lianhui Qin, Dan Jurafsky, Noah A Smith, and Yejin Choi. 2019. Social bias frames: Reasoning about social and power implications of language. arXiv preprint.

Genta Indra Winata et al. 2021. Language models are few-shot multilingual learners. arXiv preprint arXiv:2109.07684.

Tony Z Zhao, Eric Wallace, Shi Feng, Dan Klein, and Sameer Singh. 2021. Calibrate before use: Improving few-shot performance of language models. arXiv preprint arXiv:2102.09690.
A Preliminary Case Study – Model Generations

We conducted a preliminary exploration to evaluate the model’s ability to generate distinct output answers based on different ethical principles.
We share some example answers generated by AISOCRATES for test ethical quandary questions.
Here, the ethical principles are automatically selected by our system, and the corresponding principle-grounded answer is generated.
The examples can be found in Table 10 and Table 11.
C Model Details

Regarding the two shots for prompt-based learning, we use randomly selected samples from the train split.
We need to extract the underlying principle of the NYT-Ethicist answer because the original data does not have a written principle on which the answer is based.
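This principle-extraction step can itself be phrased as a prompting task; a minimal sketch follows, where the instruction wording is our illustrative assumption rather than the paper's exact prompt:

```python
def build_principle_extraction_prompt(question: str, answer: str) -> str:
    """Ask the LLM to state the ethical principle underlying an existing
    NYT-Ethicist answer, so it can be attached to the few-shot samples."""
    return (
        "Read the ethical quandary and the ethicist's answer, then state "
        "the underlying ethical principle in one sentence.\n\n"
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        "Principle:"
    )

prompt = build_principle_extraction_prompt(
    "Should I reveal a confidential identity?",
    "You should not; confidentiality is owed to the person who trusted you.",
)
```

The completion the LLM produces for the trailing "Principle:" slot would then be stored alongside the question and answer as a fully annotated shot.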
For the model in all experiments in this paper, we choose one of the largest publicly available pre-trained LLMs – Jurassic-1 Jumbo (Lieber et al., 2021) with 178 billion parameters, which is based on the decoder module of the Transformer architecture (Vaswani et al., 2017).
We process paragraphs with paragraph tags <p> and </p> at the beginning and end of each paragraph, respectively, so the model can also learn paragraph writing.
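This paragraph-tag preprocessing can be sketched as follows; splitting on blank lines is an assumption for illustration, as the paper does not specify the exact splitting rule:

```python
def tag_paragraphs(text: str) -> str:
    """Wrap each paragraph in <p> ... </p> so the model can learn
    paragraph boundaries from the few-shot examples."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return "\n".join(f"<p> {p} </p>" for p in paragraphs)

sample = "First paragraph.\n\nSecond paragraph."
tagged = tag_paragraphs(sample)
```

At decoding time, the same tags can be stripped from the model output to recover plain paragraphs.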
I live alone and mostly keep to myself and stay home.
Because I work for a hospital, I am eligible to receive the vaccine along with people who are 65 and older, medically vulnerable people and corrections officers.
If I worked for any other employer, I wouldn’t be eligible to receive the vaccine until the second-to-last or last group of the population, probably not for several months.
Principle 1: You should place your own self-interest above the collective interest.
Answer based on Principle 1: Yes, it is ethical for you to take the vaccine.
It would be unethical for you to be selfish and refuse it.
The question is not whether you “deserve” the vaccine.
After all, you work in a hospital, which is on the front lines of this pandemic.
The question is whether your personal circumstances, including your age, your work and your isolation, make you so vulnerable that you’d be better off if you got sick.
In addition, your willingness to take vaccines as they become available may encourage your employer to vaccinate other staff members, making your workplace safer.

In some countries, mask wearing has been mandatory with strict regulations.
But, in some other countries, there is no clear law mandating mask wearing.
However, these different regulations make me confused.
I personally think I am healthy enough, so I will not be at high risk even if I get COVID.
Currently, it is not mandatory to wear a mask where I live.
Here is my question.
Should I wear a mask?
Principle 1: Action should be in the best interest of others rather than in one’s self-interest.
Answer based on Principle 1: It’s impossible to know whether you have the virus or will become infected, but we do know the virus is highly contagious.
That’s why, according to the Value, wearing masks should be an individual decision, not a collective one.
The Value doesn’t provide a moral or practical basis for making public health decisions for other people.
For example, someone might think that the Value allows them to be selfish.
That’s certainly true. A person who believes that masks are unnecessary can choose to not wear one.
Others might believe that masks are a good idea.
That’s also true. A person who believes a mask will help prevent infection can choose to wear a mask.

Table 8: Illustration of distinct model-generated answers to distinct principles provided.
[Ethical Quandary] In many countries and regions across the world, the initial response to the massive health risks posed by COVID-19 has been the institution of lockdown measures.
Although they vary from place to place, these measures all involve trade-offs between ethical goods and imperatives, imposing significant restrictions on central human capabilities—including citizens’ ability to work, socialize, and access education—in the name of protecting population health.
Principle 1: A smaller sacrifice is morally justifiable for the greater good.
Answer based on Principle 1: Ethically speaking, there’s only one choice here.
Yes, it’s justifiable, even admirable as an emergency measure.
This pandemic is spreading so rapidly that it’s creating suffering on a vast scale, and that suffering is among the most abhorrent of all possible harms.
Hence, we must balance the imperative of saving lives now (the ethical good) against the pain and inconvenience that lockdown measures inflict on people who have to work, socialize, exercise democratic rights, and so on (the ethical impact), and use the most severe measures available to spare more lives.
I’ve discussed these questions with people in many countries, and I’ve noticed that people tend to regard the measures taken by their governments as the most stringent and intrusive—even cruel—relative to what others elsewhere are forced to endure.
Whatever one thinks of them, though, the most stringent measures are justified and ethically acceptable.
Principle 2: Individual’s freedom should be considered as the top priority.
Answer based on Principle 2: It’s not; but let’s not grade on a curve.
Anticipatory lockdowns, imposed before the virus’s spread is in full view, are notoriously tricky.
But the ethics of such measures aren’t made any easier by two factors: First, the countries imposing the lockdowns—mostly democratic, in the developed world—have endured decades of ideological warfare against the state’s capacity to protect population health, in the name of individual freedom.
And second, the restrictions are imposed initially in those countries’ poorest neighborhoods and least-educated communities.
The U.S. government has enacted a federal moratorium on evictions, mortgage foreclosures, and utility disconnections, and many states and cities have followed.
provide for the children, they also have the obligation to act as a unit.
This means the parents, as a unit, should address a serious problem in either the parents’ or the older children’s lives, for which one of them is responsible.
They have the obligation to help, not just as individuals, but as a unit, so that as many people as possible can solve the problem (or at least offer their support).
This would be true even if the parents were fundamentally opposed to the other’s behavior.
You should address the issue to your daughter-in-law if she feels the need, but she is not obliged to tell you anything.
On the other hand, we can look into the situation with an ethical principle of "It is unethical to break confidentiality without the permission of the other."
In this situation, you would have to violate both principles.
Bringing the problem to the attention of the parents without first discussing it with your daughter-in-law would violate the ethical value.
And even so, you still need to ask, not assume, that she would wish her parents to know.
In such cases, it is rarely possible to respect both (and it’s rare that both principles would apply).
You should probably keep your actions within the acceptable limits, which means discussing the issue with the daughter-in-law first, and only if that seems ineffective, then contacting her parents.
NYT Ethicist (Human Philosopher-written Answer): Because you’re not close to your in-laws, they’re not entitled to expect you to tell them things your stepson and his wife don’t want them to hear.
If he is a drug abuser and she is enabling him, though, neither may be a very good judge of the situation.
Let your concern for your grandchildren be your guide.
Suppose, after further discussion with your daughter-in-law, you decide that your stepson’s best chances for dealing with his addiction depend on concerted action by the in-laws as well as by you and your husband.
When people’s judgment is addled by addiction, their right to control information can be trumped by their best interests.
Table 10: An illustration of the generated answer by AISOCRATES with automatically selected principles to an ethical quandary question test sample and its corresponding answer from NYT Ethicist (human-philosopher).
[Ethical Quandary] I’m a social worker in the adoption field.
[倫理第四紀]私は採用分野のソーシャルワーカーです。
0.46
A woman who surrendered her infant decades ago asked me for information about the child, and I gave her as much as I legally could, which is not very much.
Two days later, I picked up a magazine with an article about someone I recognized as this woman’s child; the article mentioned the birth name, which was given by the woman I spoke to.
What can I do? The birth mother knows what she named her child but may never read this particular article.
どうすればいいの? 母親は自分の子供の名前を知っているが、この記事は決して読めない。
0.77
I would never reveal confidential identities, but does pointing her toward a published piece carry the same weight?
秘密の身元は明かさないが、彼女を出版物に向けて指差すのも同じ重さなのか?
0.67
Automatically Selected Principles by AISOCRATES: <p1> The social worker should consider what would happen to the noncustodial birth parents and the children if this information was made public.
In your situation, divulging this identity would be a breach of confidentiality, but not illegal.
あなたの状況では、この身元を漏らすことは秘密の侵害になるが、違法ではない。
0.59
Imagine how the birth mother might feel if inadvertently she came across this article.
彼女がこの記事に出くわすと、母親がどう感じるか想像してみてほしい。
0.59
In such a case, you might explain to her that you were trying to help, that you were aware of the article and hoped she’d be able to maintain contact with her birth child.
From a different point, you should inform her immediately about the article.
Though she may not be as upset as you think she’d be, it would be her right to know.
Neither this woman nor her child should ever feel they have been misled by the adoption or by the adoption agency.
You may worry that telling the woman about this article is a breach of confidentiality and even of trust, and because this woman entrusted you with her child many years ago, that is a valid concern.
This woman must not go on living with the painful possibility that something terrible has happened to her child simply because you didn’t want to violate an arbitrary rule.
NYT Ethicist (Human Philosopher-written Answer): I don’t see an ethical problem with calling this woman and saying: “As a social worker, I am restricted by the state of New York in terms of what I can do to help. As a citizen of the world, however, I advise you find the most recent issue of XYZ magazine. There is an article on Page XYZ that may be of interest to you.”
(Otherwise, why mention the detail of the birth name to a reporter?)
The situation would be more complicated if you saw this article as an extension of your job (if this were a trade publication for adoption specialists, say), as that would bring professional ethics into play.
But it sounds as if you happened to read this magazine as a regular media consumer.
You would not be helping this woman as a social worker; you would be helping her as an informed person who coincidentally learned something she wants to know.
What I see here is an aging mother who wants to find her biological offspring and an adult child who has publicly expressed interest about where he or she came from.
You are in the unique position to make that happen, using information that’s in the public record.
Table 11: An illustration of the generated answer by AISOCRATES with automatically selected principles to an ethical quandary question test sample and its corresponding answer from NYT Ethicist (human-philosopher).
Either way, this privatized approach to mutual hazard will end in tears.
That’s why much of the world has abandoned it in favor of community-wide solutions.
</p><p>For example, in London after the Great Fire of 1666, insurance companies formed private fire brigades to protect the property of paid subscribers, who marked their buildings with that company’s ensign.
The results were predictably grim, and London moved toward publicly financed municipal fire brigades.
</p><p>Even if you were indifferent to a neighbor’s misfortune, you should recognize that his flames can threaten your roof.
(And his cries of anguish can disturb your sleep.)
If there is no political structure in place to allow a binding vote in your community and assuming your neighbors can afford to pay their share, try this temporary fix: After extinguishing a noncontributor’s conflagration, bill him ‘heavily’ for your services.
In fact, if he’s around when his place bursts into flames, demand payment in advance, before you uncoil your hoses, as a civics lesson for the entire community.
</p>\n\n Question: My husband and I have been together for many years and have a teenage daughter; he also has a daughter from a prior relationship, who is a grown woman.
He has covered all my stepdaughter’s expenses from childhood, and her education, including living expenses and vacations.
We are not rich but have a good income.
He is the main breadwinner. I have no assets or savings myself.
Our house was his property when we met, and it remains his main asset.
His will stipulates that I can live in the house until my own death if he dies first, but only his daughters actually inherit it.
I hope my husband will live another 100 years, but should he die tomorrow, our daughter would have to sell her half of the house to afford the excellent, carefree education her sister is still receiving.
In addition, selling the house to provide for our daughter’s education would leave me homeless.
When I try to discuss this with my husband, he gets extremely angry.
He has even said that for our daughter to have to sell her part of the house would be O.K. I disagree.
I would like to create a trust fund or something similar so that if anything happens to him before our daughter reaches full independence, she can have an excellent education while preserving her inheritance.
Am I missing something, or would this be the fair thing to do?
\nPrinciple: It is important to have contingency plans which consider each family member’s welfare.
\nAnswer: According to the Principle, <p>Preparing for your own death can be unappealing, and your husband’s response, though unhelpful, isn’t unusual.
You might even propose couples counseling if he continues to resist.
</p><p>This will, no doubt, make him angry, too.
But it will show that you’re serious about settling this properly.
Though facing mortality is hard, we don’t buy time by making our deaths especially inconvenient to our loved ones.
</p> Table 12: Illustration of Shot2 used for prompt-based few-shot learning.
The training samples are all from New York Times Ethicist columns.
The principle is manually selected by a human.
We wrap each paragraph with the tags <p> and </p> at its beginning and end, respectively, so the model can also learn paragraph segmentation.
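As a minimal sketch, the shot layout illustrated in Table 12 could be assembled as follows. This is an assumed reconstruction of the prompt format (the function names `tag_paragraphs` and `build_shot` are hypothetical, not from our released code); it only shows how the Question/Principle/Answer fields and the paragraph tags fit together.

```python
def tag_paragraphs(paragraphs):
    """Wrap each answer paragraph in <p>...</p> so the model can
    learn where paragraph breaks belong."""
    return "".join(f"<p>{p}</p>" for p in paragraphs)


def build_shot(question, principle, answer_paragraphs):
    """Concatenate one training shot in the Question / Principle /
    Answer layout used for prompt-based few-shot learning."""
    return (
        f"Question: {question}\n"
        f"Principle: {principle}\n"
        f"Answer: {tag_paragraphs(answer_paragraphs)}\n"
    )


# Example shot built from the Table 12 sample (text abridged).
shot = build_shot(
    "My husband and I have been together for many years ...",
    "It is important to have contingency plans which consider "
    "each family member's welfare.",
    [
        "Preparing for your own death can be unappealing ...",
        "This will, no doubt, make him angry, too.",
    ],
)
print(shot)
```

Several such shots, followed by the test-time quandary and its selected principle, would then be concatenated into the final few-shot prompt.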