A Quantitative Account of Harm
- URL: http://arxiv.org/abs/2209.15111v1
- Date: Thu, 29 Sep 2022 21:48:38 GMT
- Title: A Quantitative Account of Harm
- Authors: Sander Beckers, Hana Chockler, Joseph Y. Halpern
- Abstract summary: We first present a quantitative definition of harm in a deterministic context involving a single individual.
We then consider the issues involved in dealing with uncertainty regarding the context.
We show that the "obvious" way of doing this can lead to counterintuitive or inappropriate answers.
- Score: 18.7822411439221
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In a companion paper (Beckers et al. 2022), we defined a qualitative notion
of harm: either harm is caused, or it is not. For practical applications, we
often need to quantify harm; for example, we may want to choose the least
harmful of a set of possible interventions. We first present a quantitative
definition of harm in a deterministic context involving a single individual,
then we consider the issues involved in dealing with uncertainty regarding the
context and going from a notion of harm for a single individual to a notion of
"societal harm", which involves aggregating the harm to individuals. We show
that the "obvious" way of doing this (just taking the expected harm for an
individual and then summing the expected harm over all individuals) can lead to
counterintuitive or inappropriate answers, and discuss alternatives, drawing on
work from the decision-theory literature.
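To make the "obvious" aggregation concrete, here is a minimal sketch (hypothetical harm values, contexts, and function names, none of them taken from the paper) that computes each individual's expected harm over a distribution of contexts and then sums across individuals, i.e. the naive recipe the abstract argues can give counterintuitive answers.

```python
# Illustrative sketch of the "obvious" aggregation the abstract critiques:
# average each individual's harm over uncertain contexts, then sum over
# individuals. All numbers and names are hypothetical.

# harm[individual][context] = harm to that individual if that context obtains
harm = {
    "alice": {"c1": 0.0, "c2": 10.0},
    "bob":   {"c1": 1.0, "c2": 1.0},
}

# probability of each context obtaining
context_prob = {"c1": 0.5, "c2": 0.5}

def expected_harm(individual: str) -> float:
    """Expected harm to one individual, averaged over contexts."""
    return sum(context_prob[c] * h for c, h in harm[individual].items())

def naive_societal_harm() -> float:
    """The 'obvious' aggregate: sum of individual expected harms."""
    return sum(expected_harm(i) for i in harm)

if __name__ == "__main__":
    for i in harm:
        print(f"expected harm to {i}: {expected_harm(i):.1f}")
    print(f"naive societal harm: {naive_societal_harm():.1f}")
    # Note: this aggregate scores "harm 5 for certain" the same as
    # "harm 10 with probability 1/2", and one large concentrated harm the
    # same as many tiny dispersed ones -- distinctions that a plain
    # expectation-and-sum cannot express, which is one way it can yield
    # counterintuitive answers.
```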
Related papers
- Towards Probing Speech-Specific Risks in Large Multimodal Models: A Taxonomy, Benchmark, and Insights [50.89022445197919]
We propose a speech-specific risk taxonomy, covering 8 risk categories under hostility (malicious sarcasm and threats), malicious imitation (age, gender, ethnicity), and stereotypical biases (age, gender, ethnicity).
Based on the taxonomy, we create a small-scale dataset for evaluating current LMMs' capability in detecting these categories of risk.
arXiv Detail & Related papers (2024-06-25T10:08:45Z) - Fairness-Accuracy Trade-Offs: A Causal Perspective [58.06306331390586]
We analyze the tension between fairness and accuracy through a causal lens for the first time.
We show that enforcing a causal constraint often reduces the disparity between demographic groups.
We introduce a new neural approach for causally-constrained fair learning.
arXiv Detail & Related papers (2024-05-24T11:19:52Z) - Beyond Behaviorist Representational Harms: A Plan for Measurement and Mitigation [1.7355698649527407]
This study focuses on an examination of current definitions of representational harms to discern what is included and what is not.
Our work highlights the unique vulnerabilities of large language models to perpetrating representational harms.
The overarching aim of this research is to establish a framework for broadening the definition of representational harms.
arXiv Detail & Related papers (2024-01-25T00:54:10Z) - A Causal Analysis of Harm [18.7822411439221]
There is a growing need for a legal and regulatory framework to address when and how autonomous systems harm someone.
This paper formally defines a qualitative notion of harm that uses causal models and is based on a well-known definition of actual causality.
We show that our definition is able to handle the examples from the literature, and illustrate its importance for reasoning about situations involving autonomous systems.
arXiv Detail & Related papers (2022-10-11T10:36:24Z) - Handling and Presenting Harmful Text [10.359716317114815]
Textual data can pose a risk of serious harm.
These harms can be categorised along several axes, including misinformation, hate speech, and racial stereotypes.
It is an unsolved problem in NLP as to how textual harms should be handled, presented, and discussed.
We provide practical advice and introduce HarmCheck, a resource for reflecting on research into textual harms.
arXiv Detail & Related papers (2022-04-29T17:34:12Z) - First do no harm: counterfactual objective functions for safe & ethical AI [0.03683202928838612]
We develop the first statistical definition of harm and a framework for factoring harm into algorithmic decisions.
Our results show that counterfactual reasoning is a key ingredient for safe and ethical AI.
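As a purely illustrative sketch of the general idea, the snippet below folds a counterfactual harm penalty into an action choice: per outcome state, it counts only the shortfall relative to a hypothetical default action, then trades that off against expected utility. All actions, utilities, probabilities, and the penalty weight are invented for illustration; this is not the cited paper's formal definition.

```python
# Hypothetical sketch of a harm-averse, counterfactual decision rule.
# Actions, utilities, probabilities, and the weight LAMBDA are made up.

ACTIONS = ["treat", "wait"]           # hypothetical actions
DEFAULT = "wait"                      # baseline the chosen action is compared to
LAMBDA = 2.0                          # weight on avoiding harm

# utility[action][state] and the state distribution are illustrative only
utility = {"treat": {"s1": 8.0, "s2": 2.0}, "wait": {"s1": 5.0, "s2": 5.0}}
state_prob = {"s1": 0.5, "s2": 0.5}

def expected_utility(action: str) -> float:
    return sum(state_prob[s] * u for s, u in utility[action].items())

def expected_harm(action: str) -> float:
    # Counterfactual shortfall: per state, how much worse off than under DEFAULT.
    return sum(
        state_prob[s] * max(0.0, utility[DEFAULT][s] - utility[action][s])
        for s in state_prob
    )

def choose_action() -> str:
    # Harm-averse objective: expected utility minus a weighted harm penalty.
    return max(ACTIONS, key=lambda a: expected_utility(a) - LAMBDA * expected_harm(a))

print(choose_action())  # both actions tie on expected utility; the penalty picks "wait"
```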
arXiv Detail & Related papers (2022-04-27T15:03:43Z) - Beyond Plain Toxic: Detection of Inappropriate Statements on Flammable Topics for the Russian Language [76.58220021791955]
We present two text collections labelled according to a binary notion of inappropriateness and a multinomial notion of sensitive topics.
To objectivise the notion of inappropriateness, we define it in a data-driven way through crowdsourcing.
arXiv Detail & Related papers (2022-03-04T15:59:06Z) - Detecting Inappropriate Messages on Sensitive Topics that Could Harm a Company's Reputation [64.22895450493729]
A calm discussion of turtles or fishing is less likely to fuel inappropriate, toxic dialogue than a discussion of politics or sexual minorities.
We define a set of sensitive topics that can yield inappropriate and toxic messages and describe the methodology of collecting and labeling a dataset for appropriateness.
arXiv Detail & Related papers (2021-03-09T10:50:30Z) - Overcoming Failures of Imagination in AI Infused System Development and Deployment [71.9309995623067]
NeurIPS 2020 requested that research paper submissions include impact statements on "potential nefarious uses and the consequences of failure".
We argue that frameworks of harms must be context-aware and consider a wider range of potential stakeholders, system affordances, and viable proxies for assessing harms in the widest sense.
arXiv Detail & Related papers (2020-11-26T18:09:52Z) - Aligning Faithful Interpretations with their Social Attribution [58.13152510843004]
We find that the requirement that model interpretations be faithful is vague and incomplete.
We identify that the problem is a misalignment between the causal chain of decisions (causal attribution) and the attribution of human behavior to the interpretation (social attribution).
arXiv Detail & Related papers (2020-06-01T16:45:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.