Considerations for Ethical Speech Recognition Datasets
- URL: http://arxiv.org/abs/2305.02081v1
- Date: Wed, 3 May 2023 12:38:14 GMT
- Title: Considerations for Ethical Speech Recognition Datasets
- Authors: Orestis Papakyriakopoulos, Alice Xiang
- Abstract summary: We use automatic speech recognition as a case study and examine the properties that ethical speech datasets should possess towards responsible AI applications.
We showcase diversity issues, inclusion practices, and necessary considerations that can improve trained models.
We argue for the legal & privacy protection of data subjects, targeted data sampling corresponding to user demographics & needs, and appropriate metadata that ensures explainability & accountability in cases of model failure.
- Score: 0.799536002595393
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Speech AI Technologies are largely trained on publicly available datasets or
by the massive web-crawling of speech. In both cases, data acquisition focuses
on minimizing collection effort, without necessarily taking the data subjects'
protection or user needs into consideration. This results in models that are
not robust when used on users who deviate from the dominant demographics in the
training set, discriminating against individuals with different dialects, accents,
speaking styles, and disfluencies. In this talk, we use automatic speech
recognition as a case study and examine the properties that ethical speech
datasets should possess towards responsible AI applications. We showcase
diversity issues, inclusion practices, and necessary considerations that can
improve trained models, while facilitating model explainability and protecting
users and data subjects. We argue for the legal & privacy protection of data
subjects, targeted data sampling corresponding to user demographics & needs,
appropriate metadata that ensures explainability & accountability in cases of
model failure, and sociotechnical & situated model design. We hope this
talk can inspire researchers & practitioners to design and use more
human-centric datasets in speech technologies and other domains, in ways that
empower and respect users, while improving machine learning models' robustness
and utility.
Related papers
- Exploiting Contextual Uncertainty of Visual Data for Efficient Training of Deep Models [0.65268245109828]
We introduce the notion of contextual diversity for active learning (CDAL).
We propose a data repair algorithm to curate contextually fair data to reduce model bias.
We are developing an image retrieval system for wildlife camera trap images and a reliable warning system for poor-quality rural roads.
arXiv Detail & Related papers (2024-11-04T09:43:33Z)
- Speech Emotion Recognition under Resource Constraints with Data Distillation [64.36799373890916]
Speech emotion recognition (SER) plays a crucial role in human-computer interaction.
The emergence of edge devices in the Internet of Things presents challenges in constructing intricate deep learning models.
We propose a data distillation framework to facilitate efficient development of SER models in IoT applications.
arXiv Detail & Related papers (2024-06-21T13:10:46Z)
- The Frontier of Data Erasure: Machine Unlearning for Large Language Models [56.26002631481726]
Large Language Models (LLMs) are foundational to AI advancements.
LLMs pose risks by potentially memorizing and disseminating sensitive, biased, or copyrighted information.
Machine unlearning emerges as a cutting-edge solution to mitigate these concerns.
arXiv Detail & Related papers (2024-03-23T09:26:15Z)
- Efficiency-oriented approaches for self-supervised speech representation learning [1.860144985630098]
Self-supervised learning enables the training of large neural models without the need for large, labeled datasets.
It has been generating breakthroughs in several fields, including computer vision, natural language processing, biology, and speech.
Despite current efforts, more work could be done to address high computational costs in self-supervised representation learning.
arXiv Detail & Related papers (2023-12-18T12:32:42Z)
- Augmented Datasheets for Speech Datasets and Ethical Decision-Making [2.7106766103546236]
Speech datasets are crucial for training Speech Language Technologies (SLT).
Lack of diversity of the underlying training data can lead to serious limitations in building equitable and robust SLT products.
There is often a lack of oversight on the underlying training data with regard to the ethics of such data collection.
arXiv Detail & Related papers (2023-05-08T12:49:04Z)
- Text is All You Need: Personalizing ASR Models using Controllable Speech Synthesis [17.172909510518814]
Adapting generic speech recognition models to specific individuals is a challenging problem due to the scarcity of personalized data.
Recent works have proposed boosting the amount of training data using personalized text-to-speech synthesis.
arXiv Detail & Related papers (2023-03-27T02:50:02Z)
- Human-Centric Multimodal Machine Learning: Recent Advances and Testbed on AI-based Recruitment [66.91538273487379]
There is a certain consensus about the need to develop AI applications with a Human-Centric approach.
Human-Centric Machine Learning needs to be developed based on four main requirements: (i) utility and social good; (ii) privacy and data ownership; (iii) transparency and accountability; and (iv) fairness in AI-driven decision-making processes.
We study how current multimodal algorithms based on heterogeneous sources of information are affected by sensitive elements and inner biases in the data.
arXiv Detail & Related papers (2023-02-13T16:44:44Z)
- SF-PATE: Scalable, Fair, and Private Aggregation of Teacher Ensembles [50.90773979394264]
This paper studies a model that protects the privacy of individuals' sensitive information while also allowing it to learn non-discriminatory predictors.
A key characteristic of the proposed model is that it enables the adoption of off-the-shelf and non-private fair models to create a privacy-preserving and fair model.
arXiv Detail & Related papers (2022-04-11T14:42:54Z)
- Attribute Inference Attack of Speech Emotion Recognition in Federated Learning Settings [56.93025161787725]
Federated learning (FL) is a distributed machine learning paradigm that coordinates clients to train a model collaboratively without sharing local data.
We propose an attribute inference attack framework that infers sensitive attribute information of the clients from shared gradients or model parameters.
We show that the attribute inference attack is achievable for SER systems trained using FL.
arXiv Detail & Related papers (2021-12-26T16:50:42Z)
- Differentially Private and Fair Deep Learning: A Lagrangian Dual Approach [54.32266555843765]
This paper studies a model that protects the privacy of individuals' sensitive information while also allowing it to learn non-discriminatory predictors.
The method relies on the notion of differential privacy and the use of Lagrangian duality to design neural networks that can accommodate fairness constraints.
arXiv Detail & Related papers (2020-09-26T10:50:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.