Tiny, always-on and fragile: Bias propagation through design choices in
on-device machine learning workflows
- URL: http://arxiv.org/abs/2201.07677v1
- Date: Wed, 19 Jan 2022 15:59:41 GMT
- Title: Tiny, always-on and fragile: Bias propagation through design choices in
on-device machine learning workflows
- Authors: Wiebke Toussaint, Akhil Mathur, Aaron Yi Ding, Fahim Kawsar
- Abstract summary: We study the propagation of bias through design choices in on-device machine learning development.
We identify complex and interacting technical design choices that can lead to disparate performance across user groups.
We leverage our insights to suggest strategies for developers to develop fairer on-device ML.
- Score: 8.690490406134339
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Billions of distributed, heterogeneous and resource constrained smart
consumer devices deploy on-device machine learning (ML) to deliver private,
fast and offline inference on personal data. On-device ML systems are highly
context dependent, and sensitive to user, usage, hardware and environmental
attributes. Despite this sensitivity and the propensity towards bias in ML,
bias in on-device ML has not been studied. This paper studies the propagation
of bias through design choices in on-device ML development workflows. We
position \emph{reliability bias}, which arises from disparate device failures
across demographic groups, as a source of unfairness in on-device ML settings
and quantify metrics to evaluate it. We then identify complex and interacting
technical design choices in the on-device ML workflow that can lead to
disparate performance across user groups, and thus \emph{reliability bias}.
Finally, we show with an empirical case study that seemingly innocuous design
choices such as the data sample rate, pre-processing parameters used to
construct input features and pruning hyperparameters propagate
\emph{reliability bias} through an audio keyword spotting development workflow.
We leverage our insights to suggest strategies for developers to develop fairer
on-device ML.
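The abstract describes quantifying metrics for reliability bias, i.e. disparate performance across demographic groups. As a minimal illustrative sketch (a hypothetical group-disparity measure, not necessarily the paper's exact definition), one could compare each group's accuracy against the overall accuracy in, say, a keyword-spotting evaluation:

```python
# Illustrative sketch of a reliability-bias-style metric: the largest
# absolute log-ratio between any group's accuracy and the overall
# accuracy. A value of 0.0 means all groups perform equally; larger
# values indicate greater disparity. This formulation is a hypothetical
# example, not the paper's exact definition.
import math


def reliability_bias(group_correct: dict, group_total: dict) -> float:
    """Max absolute log-ratio of per-group accuracy to overall accuracy."""
    overall = sum(group_correct.values()) / sum(group_total.values())
    return max(
        abs(math.log((group_correct[g] / group_total[g]) / overall))
        for g in group_total
    )


# Hypothetical keyword-spotting results for two demographic groups
correct = {"group_a": 95, "group_b": 80}
total = {"group_a": 100, "group_b": 100}
print(reliability_bias(correct, total))
```

Re-running such a metric after each design choice (changing the sample rate, feature pre-processing, or pruning level) would reveal whether that choice widened or narrowed the gap between groups.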
Related papers
- Advancing Anomaly Detection: Non-Semantic Financial Data Encoding with LLMs [49.57641083688934]
We introduce a novel approach to anomaly detection in financial data using Large Language Model (LLM) embeddings.
Our experiments demonstrate that LLMs contribute valuable information to anomaly detection as our models outperform the baselines.
arXiv Detail & Related papers (2024-06-05T20:19:09Z) - On-device Online Learning and Semantic Management of TinyML Systems [8.183732025472766]
This study aims to bridge the gap between prototyping single TinyML models and developing reliable TinyML systems in production.
We propose online learning to enable training on constrained devices, adapting local models towards the latest field conditions.
We present semantic management for the joint management of models and devices at scale.
arXiv Detail & Related papers (2024-05-13T10:03:34Z) - On The Fairness Impacts of Hardware Selection in Machine Learning [50.03224106965757]
This paper investigates the influence of hardware on the delicate balance between model performance and fairness.
We demonstrate that hardware choices can exacerbate existing disparities, attributing these discrepancies to variations in gradient flows and loss surfaces across different demographic groups.
arXiv Detail & Related papers (2023-12-06T20:24:17Z) - Active Inference on the Edge: A Design Study [5.815300670677979]
Active Inference (ACI) is a concept from neuroscience that describes how the brain constantly predicts and evaluates sensory information to decrease long-term surprise.
We show how our ACI agent was able to quickly and traceably solve an optimization problem while fulfilling requirements.
arXiv Detail & Related papers (2023-11-17T16:03:04Z) - Closing the loop: Autonomous experiments enabled by
machine-learning-based online data analysis in synchrotron beamline
environments [80.49514665620008]
Machine learning can be used to enhance research involving large or rapidly generated datasets.
In this study, we describe the incorporation of ML into a closed-loop workflow for X-ray reflectometry (XRR).
We present solutions that provide elementary data analysis in real time during the experiment without introducing additional software dependencies into the beamline control software environment.
arXiv Detail & Related papers (2023-06-20T21:21:19Z) - Task-Oriented Over-the-Air Computation for Multi-Device Edge AI [57.50247872182593]
6G networks supporting edge AI feature task-oriented techniques that focus on the effective and efficient execution of AI tasks.
A task-oriented over-the-air computation (AirComp) scheme is proposed in this paper for a multi-device split-inference system.
arXiv Detail & Related papers (2022-11-02T16:35:14Z) - Federated Split GANs [12.007429155505767]
We propose an alternative approach that trains ML models on users' devices themselves.
We focus on GANs (generative adversarial networks) and leverage their inherent privacy-preserving attribute.
Our system preserves data privacy, keeps training time short, and yields the same accuracy as model training on unconstrained devices.
arXiv Detail & Related papers (2022-07-04T23:53:47Z) - Task-Oriented Sensing, Computation, and Communication Integration for
Multi-Device Edge AI [108.08079323459822]
This paper studies a new multi-device edge artificial intelligence (AI) system, which jointly exploits AI-model split inference and integrated sensing and communication (ISAC).
We measure the inference accuracy by adopting an approximate but tractable metric, namely discriminant gain.
arXiv Detail & Related papers (2022-07-03T06:57:07Z) - Bridging the Gap Between Clean Data Training and Real-World Inference
for Spoken Language Understanding [76.89426311082927]
Existing models are trained on clean data, which causes a gap between clean-data training and real-world inference.
We propose a method from the perspective of domain adaptation, by which both high- and low-quality samples are embedded into a similar vector space.
Experiments on the widely used Snips dataset and a large-scale in-house dataset (10 million training examples) demonstrate that this method not only outperforms the baseline models on a real-world (noisy) corpus but also enhances robustness, producing high-quality results in noisy environments.
arXiv Detail & Related papers (2021-04-13T17:54:33Z) - Federated Learning-Based Risk-Aware Decision to Mitigate Fake Task
Impacts on Crowdsensing Platforms [9.925311092487851]
Mobile crowdsensing (MCS) leverages distributed and non-dedicated sensing concepts by utilizing sensors in a large number of mobile smart devices.
A malicious user submitting fake sensing tasks to an MCS platform may be attempting to consume resources from any number of participants' devices.
A novel approach comprising a number of independent detection devices and an aggregation entity is proposed to identify fake tasks.
arXiv Detail & Related papers (2021-01-04T22:43:24Z) - LiFT: A Scalable Framework for Measuring Fairness in ML Applications [18.54302159142362]
We present the LinkedIn Fairness Toolkit (LiFT), a framework for scalable computation of fairness metrics as part of large ML systems.
We discuss the challenges encountered in incorporating fairness tools in practice and the lessons learned during deployment at LinkedIn.
arXiv Detail & Related papers (2020-08-14T03:55:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.