Initializing Services in Interactive ML Systems for Diverse Users
- URL: http://arxiv.org/abs/2312.11846v1
- Date: Tue, 19 Dec 2023 04:26:12 GMT
- Title: Initializing Services in Interactive ML Systems for Diverse Users
- Authors: Avinandan Bose, Mihaela Curmei, Daniel L. Jiang, Jamie Morgenstern,
Sarah Dean, Lillian J. Ratliff, Maryam Fazel
- Abstract summary: We study ML systems that interactively learn from users across multiple subpopulations with heterogeneous data distributions.
We propose a randomized algorithm to adaptively select very few users to collect preference data from, while simultaneously initializing a set of services.
- Score: 29.445931639366325
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper studies ML systems that interactively learn from users across
multiple subpopulations with heterogeneous data distributions. The primary
objective is to provide specialized services for different user groups while
also predicting user preferences. Once the users select a service based on how
well the service anticipated their preference, the services subsequently adapt
and refine themselves based on the user data they accumulate, resulting in an
iterative, alternating minimization process between users and services
(learning dynamics). Employing such tailored approaches has two main
challenges: (i) Unknown user preferences: Typically, data on user preferences
are unavailable without interaction, and uniform data collection across a large
and diverse user base can be prohibitively expensive. (ii) Suboptimal Local
Solutions: The total loss (sum of loss functions across all users and all
services) landscape is not convex even if the individual losses on a single
service are convex, making it likely for the learning dynamics to get stuck in
local minima. The final outcome of the aforementioned learning dynamics is thus
strongly influenced by the initial set of services offered to users, and is not
guaranteed to be close to the globally optimal outcome. In this work, we
propose a randomized algorithm to adaptively select very few users to collect
preference data from, while simultaneously initializing a set of services. We
prove that under mild assumptions on the loss functions, the expected total
loss achieved by the algorithm right after initialization is within a factor of
the globally optimal total loss with complete user preference data, and this
factor scales only logarithmically in the number of services. Our theory is
complemented by experiments on real as well as semi-synthetic datasets.
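
A minimal, illustrative sketch of the setting described above (not the paper's actual algorithm). It assumes each user is summarized by a preference vector and each service by a parameter vector, with squared-distance losses, so the total loss reduces to a k-means-style objective; the adaptive selection of a few users is sketched with D²-sampling (k-means++-style seeding), one randomized scheme whose expected cost is within an O(log k) factor of optimal. The paper's actual losses, sampling rule, and constants may differ.

```python
"""Toy sketch: users and services as points in R^d, squared-distance losses."""
import numpy as np

rng = np.random.default_rng(0)


def total_loss(users, services):
    # Each user incurs the loss of the service they would choose (their best fit).
    dists = ((users[:, None, :] - services[None, :, :]) ** 2).sum(-1)
    return dists.min(axis=1).sum()


def adaptive_init(users, k):
    """D^2-sampling: seed the first service from a uniformly random user, then
    sample each further seed with probability proportional to the user's
    current loss. Preference data is only 'collected' from the k sampled users."""
    services = [users[rng.integers(len(users))]]
    for _ in range(k - 1):
        d2 = ((users[:, None, :] - np.asarray(services)[None, :, :]) ** 2).sum(-1).min(1)
        services.append(users[rng.choice(len(users), p=d2 / d2.sum())])
    return np.asarray(services)


def learning_dynamics(users, services, iters=20):
    """Alternating minimization: users pick their best service, then each
    service refits to the mean of the users it attracted."""
    services = services.copy()
    for _ in range(iters):
        choice = ((users[:, None, :] - services[None, :, :]) ** 2).sum(-1).argmin(1)
        for j in range(len(services)):
            members = users[choice == j]
            if len(members):
                services[j] = members.mean(axis=0)
    return services


# Heterogeneous population: three subpopulations with different preference centers.
users = np.concatenate(
    [rng.normal(loc=c, scale=0.3, size=(100, 2)) for c in ([0.0, 0.0], [4.0, 0.0], [0.0, 4.0])]
)

init = adaptive_init(users, k=3)
final = learning_dynamics(users, init)
print("total loss right after initialization:", total_loss(users, init))
print("total loss after the learning dynamics:", total_loss(users, final))
```

In this toy run the loss right after the adaptive initialization is already close to the loss reached once the alternating dynamics converge, which is the qualitative behaviour the abstract's guarantee describes.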
Related papers
- A CLIP-Powered Framework for Robust and Generalizable Data Selection [51.46695086779598]
Real-world datasets often contain redundant and noisy data, which negatively impacts training efficiency and model performance.
Data selection has shown promise in identifying the most representative samples from the entire dataset.
We propose a novel CLIP-powered data selection framework that leverages multimodal information for more robust and generalizable sample selection.
arXiv Detail & Related papers (2024-10-15T03:00:58Z)
- Learning from Streaming Data when Users Choose [3.2429724835345692]
In digital markets comprised of many competing services, each user chooses between multiple service providers according to their preferences, and the chosen service makes use of the user data to incrementally improve its model.
The service providers' models influence which service the user will choose at the next time step, and the user's choice, in return, influences the model update, leading to a feedback loop.
We develop a simple and efficient decentralized algorithm to minimize the overall user loss.
arXiv Detail & Related papers (2024-06-03T16:07:52Z)
- AAA: an Adaptive Mechanism for Locally Differential Private Mean Estimation [42.95927712062214]
Local differential privacy (LDP) is a strong privacy standard that has been adopted by popular software systems.
We propose the advanced adaptive additive (AAA) mechanism, which is a distribution-aware approach that addresses the average utility.
We provide rigorous privacy proofs, utility analyses, and extensive experiments comparing AAA with state-of-the-art mechanisms.
arXiv Detail & Related papers (2024-04-02T04:22:07Z)
- Strategic Usage in a Multi-Learner Setting [4.810166064205261]
Real-world systems often involve some pool of users choosing between a set of services.
We analyze a setting in which strategic users choose among several available services in order to pursue positive classifications.
We show that naive retraining can still lead to oscillation even if all users are observed at different times.
arXiv Detail & Related papers (2024-01-29T18:59:22Z)
- Mean Estimation with User-level Privacy under Data Heterogeneity [54.07947274508013]
Different users may possess vastly different numbers of data points.
It cannot be assumed that all users sample from the same underlying distribution.
We propose a simple model of heterogeneous user data that allows user data to differ in both distribution and quantity of data.
arXiv Detail & Related papers (2023-07-28T23:02:39Z)
- Sample-Efficient Personalization: Modeling User Parameters as Low Rank Plus Sparse Components [30.32486162748558]
Personalization of machine learning (ML) predictions for individual users/domains/enterprises is critical for practical recommendation systems.
We propose a novel meta-learning style approach that models network weights as a sum of low-rank and sparse components.
We show that AMHT-LRS solves the problem efficiently with nearly optimal sample complexity.
arXiv Detail & Related papers (2022-10-07T12:50:34Z)
- Straggler-Resilient Personalized Federated Learning [55.54344312542944]
Federated learning allows training models from samples distributed across a large network of clients while respecting privacy and communication restrictions.
We develop a novel algorithmic procedure with theoretical speedup guarantees that simultaneously handles two of these hurdles.
Our method relies on ideas from representation learning theory to find a global common representation using all clients' data and learn a user-specific set of parameters leading to a personalized solution for each client.
arXiv Detail & Related papers (2022-06-05T01:14:46Z)
- DRFLM: Distributionally Robust Federated Learning with Inter-client Noise via Local Mixup [58.894901088797376]
Federated learning has emerged as a promising approach for training a global model using data from multiple organizations without leaking their raw data.
We propose a general framework to solve the above two challenges simultaneously.
We provide comprehensive theoretical analysis including robustness analysis, convergence analysis, and generalization ability.
arXiv Detail & Related papers (2022-04-16T08:08:29Z)
- Low-Cost Algorithmic Recourse for Users With Uncertain Cost Functions [74.00030431081751]
We formalize the notion of user-specific cost functions and introduce a new method for identifying actionable recourses for users.
Our method satisfies up to 25.89 percentage points more users compared to strong baseline methods.
arXiv Detail & Related papers (2021-11-01T19:49:35Z)
- Multi-Center Federated Learning [62.57229809407692]
This paper proposes a novel multi-center aggregation mechanism for federated learning.
It learns multiple global models from the non-IID user data and simultaneously derives the optimal matching between users and centers.
Our experimental results on benchmark datasets show that our method outperforms several popular federated learning methods.
arXiv Detail & Related papers (2020-05-03T09:14:31Z)
- Personalized Federated Learning: A Meta-Learning Approach [28.281166755509886]
In Federated Learning, we aim to train models across multiple computing units (users).
In this paper, we study a personalized variant of federated learning in which our goal is to find an initial shared model that current or new users can easily adapt to their local dataset by performing one or a few steps of gradient descent with respect to their own data.
arXiv Detail & Related papers (2020-02-19T01:08:46Z)
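
The last entry above formulates personalization as finding a shared initialization that each user adapts with one or a few local gradient steps. The snippet below is a generic, first-order sketch of that idea on a hypothetical quadratic per-user loss; the loss, the one-client-per-round sampling, and the dropped second-order term are simplifications for illustration, not the authors' exact procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-user data: user i wants the model w close to a personal
# target t_i, with loss f_i(w) = 0.5 * ||w - t_i||^2 (assumed for illustration).
targets = rng.normal(size=(20, 5))      # 20 users, 5-dimensional model


def grad(w, t):
    return w - t                        # gradient of f_i at w


alpha, beta, rounds = 0.1, 0.05, 200    # local step size, meta step size, rounds
w = np.zeros(5)                         # shared initialization being learned

for _ in range(rounds):
    t = targets[rng.integers(len(targets))]   # sample one user per round
    w_adapted = w - alpha * grad(w, t)        # one local adaptation (inner) step
    # First-order meta-update: improve the initialization so the *adapted*
    # model fits the user's data (second-order term dropped for simplicity).
    w = w - beta * grad(w_adapted, t)

# A new user personalizes with a single gradient step from the learned init.
new_t = rng.normal(size=5)
personalized = w - alpha * grad(w, new_t)
print("loss before adaptation:", 0.5 * np.sum((w - new_t) ** 2))
print("loss after one step:   ", 0.5 * np.sum((personalized - new_t) ** 2))
```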
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.