Tiny, always-on and fragile: Bias propagation through design choices in
on-device machine learning workflows
- URL: http://arxiv.org/abs/2201.07677v1
- Date: Wed, 19 Jan 2022 15:59:41 GMT
- Title: Tiny, always-on and fragile: Bias propagation through design choices in
on-device machine learning workflows
- Authors: Wiebke Toussaint, Akhil Mathur, Aaron Yi Ding, Fahim Kawsar
- Abstract summary: We study the propagation of bias through design choices in on-device machine learning development.
We identify complex and interacting technical design choices that can lead to disparate performance across user groups.
We leverage our insights to suggest strategies for developers to develop fairer on-device ML.
- Score: 8.690490406134339
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Billions of distributed, heterogeneous and resource constrained smart
consumer devices deploy on-device machine learning (ML) to deliver private,
fast and offline inference on personal data. On-device ML systems are highly
context dependent, and sensitive to user, usage, hardware and environmental
attributes. Despite this sensitivity and the propensity towards bias in ML,
bias in on-device ML has not been studied. This paper studies the propagation
of bias through design choices in on-device ML development workflows. We
position \emph{reliability bias}, which arises from disparate device failures
across demographic groups, as a source of unfairness in on-device ML settings,
and define metrics to quantify it. We then identify complex and interacting
technical design choices in the on-device ML workflow that can lead to
disparate performance across user groups, and thus \emph{reliability bias}.
Finally, we show with an empirical case study that seemingly innocuous design
choices such as the data sample rate, pre-processing parameters used to
construct input features, and pruning hyperparameters propagate
\emph{reliability bias} through an audio keyword spotting development workflow.
We leverage our insights to suggest strategies that help developers build
fairer on-device ML.
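The abstract describes quantifying reliability bias as disparate performance across demographic groups. As an illustrative sketch only (not the paper's exact formulation), one plausible way to score such disparity is to aggregate the divergence of each group's accuracy from the overall accuracy; the group labels, inputs, and log-ratio aggregation below are assumptions for illustration:

```python
# Illustrative sketch of a group-disparity ("reliability bias") score.
# NOT the paper's exact metric: the log-ratio aggregation is an assumption.
from collections import defaultdict
import math

def group_accuracies(y_true, y_pred, groups):
    """Per-group accuracy and overall accuracy."""
    hits, counts = defaultdict(int), defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        counts[g] += 1
        hits[g] += int(t == p)
    per_group = {g: hits[g] / counts[g] for g in counts}
    overall = sum(hits.values()) / sum(counts.values())
    return per_group, overall

def reliability_bias_score(y_true, y_pred, groups):
    """Sum of absolute log-ratios of group accuracy to overall accuracy.
    Zero when every group performs equally; grows with disparity.
    Assumes every group (and the overall model) has nonzero accuracy."""
    per_group, overall = group_accuracies(y_true, y_pred, groups)
    return sum(abs(math.log(acc / overall)) for acc in per_group.values())
```

A developer sweeping design choices (sample rate, feature pre-processing, pruning level) could evaluate this score for each configuration and prefer ones that keep it low without sacrificing overall accuracy.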
Related papers
- A Systematic Review of Machine Learning Approaches for Detecting Deceptive Activities on Social Media: Methods, Challenges, and Biases [0.037693031068634524]
This systematic review evaluates studies that apply machine learning (ML) and deep learning (DL) models to detect fake news, spam, and fake accounts on social media.
arXiv Detail & Related papers (2024-10-26T23:55:50Z)
- Improved Diversity-Promoting Collaborative Metric Learning for Recommendation [127.08043409083687]
Collaborative Metric Learning (CML) has recently emerged as a popular method in recommendation systems.
This paper focuses on a challenging scenario where a user has multiple categories of interests.
We propose a novel method called Diversity-Promoting Collaborative Metric Learning (DPCML).
arXiv Detail & Related papers (2024-09-02T07:44:48Z)
- Advancing Anomaly Detection: Non-Semantic Financial Data Encoding with LLMs [49.57641083688934]
We introduce a novel approach to anomaly detection in financial data using Large Language Model (LLM) embeddings.
Our experiments demonstrate that LLMs contribute valuable information to anomaly detection as our models outperform the baselines.
arXiv Detail & Related papers (2024-06-05T20:19:09Z)
- On-device Online Learning and Semantic Management of TinyML Systems [8.183732025472766]
This study aims to bridge the gap between prototyping single TinyML models and developing reliable TinyML systems in production.
We propose online learning to enable training on constrained devices, adapting local models towards the latest field conditions.
We present semantic management for the joint management of models and devices at scale.
arXiv Detail & Related papers (2024-05-13T10:03:34Z)
- Active Inference on the Edge: A Design Study [5.815300670677979]
Active Inference (ACI) is a concept from neuroscience that describes how the brain constantly predicts and evaluates sensory information to decrease long-term surprise.
We show how our ACI agent was able to quickly and traceably solve an optimization problem while fulfilling requirements.
arXiv Detail & Related papers (2023-11-17T16:03:04Z)
- Closing the loop: Autonomous experiments enabled by machine-learning-based online data analysis in synchrotron beamline environments [80.49514665620008]
Machine learning can be used to enhance research involving large or rapidly generated datasets.
In this study, we describe the incorporation of ML into a closed-loop workflow for X-ray reflectometry (XRR).
We present solutions that provide elementary data analysis in real time during the experiment without introducing additional software dependencies into the beamline control software environment.
arXiv Detail & Related papers (2023-06-20T21:21:19Z)
- Task-Oriented Over-the-Air Computation for Multi-Device Edge AI [57.50247872182593]
6G networks supporting edge AI feature task-oriented techniques that focus on the effective and efficient execution of AI tasks.
A task-oriented over-the-air computation (AirComp) scheme is proposed in this paper for a multi-device split-inference system.
arXiv Detail & Related papers (2022-11-02T16:35:14Z)
- Federated Split GANs [12.007429155505767]
We propose an alternative approach that trains ML models on users' devices themselves.
We focus on GANs (generative adversarial networks) and leverage their inherent privacy-preserving attribute.
Our system preserves data privacy, keeps training time short, and yields the same accuracy as model training on unconstrained devices.
arXiv Detail & Related papers (2022-07-04T23:53:47Z)
- Task-Oriented Sensing, Computation, and Communication Integration for Multi-Device Edge AI [108.08079323459822]
This paper studies a new multi-device edge artificial intelligence (AI) system, which jointly exploits AI model split inference and integrated sensing and communication (ISAC).
We measure the inference accuracy by adopting an approximate but tractable metric, namely discriminant gain.
arXiv Detail & Related papers (2022-07-03T06:57:07Z)
- Bridging the Gap Between Clean Data Training and Real-World Inference for Spoken Language Understanding [76.89426311082927]
Existing models are trained on clean data, which causes a gap between clean-data training and real-world inference.
We propose a method from the perspective of domain adaptation, by which both high- and low-quality samples are embedded into a similar vector space.
Experiments on the widely used Snips dataset and a large-scale in-house dataset (10 million training examples) demonstrate that this method not only outperforms the baseline models on a real-world (noisy) corpus but also enhances robustness, producing high-quality results in noisy environments.
arXiv Detail & Related papers (2021-04-13T17:54:33Z)
- LiFT: A Scalable Framework for Measuring Fairness in ML Applications [18.54302159142362]
We present the LinkedIn Fairness Toolkit (LiFT), a framework for scalable computation of fairness metrics as part of large ML systems.
We discuss the challenges encountered in incorporating fairness tools in practice and the lessons learned during deployment at LinkedIn.
arXiv Detail & Related papers (2020-08-14T03:55:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.