Initializing Services in Interactive ML Systems for Diverse Users
- URL: http://arxiv.org/abs/2312.11846v1
- Date: Tue, 19 Dec 2023 04:26:12 GMT
- Title: Initializing Services in Interactive ML Systems for Diverse Users
- Authors: Avinandan Bose, Mihaela Curmei, Daniel L. Jiang, Jamie Morgenstern,
Sarah Dean, Lillian J. Ratliff, Maryam Fazel
- Abstract summary: We study ML systems that interactively learn from users across multiple subpopulations with heterogeneous data distributions.
We propose a randomized algorithm to adaptively select very few users to collect preference data from, while simultaneously initializing a set of services.
- Score: 29.445931639366325
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper studies ML systems that interactively learn from users across
multiple subpopulations with heterogeneous data distributions. The primary
objective is to provide specialized services for different user groups while
also predicting user preferences. Once the users select a service based on how
well the service anticipated their preference, the services subsequently adapt
and refine themselves based on the user data they accumulate, resulting in an
iterative, alternating minimization process between users and services
(learning dynamics). Employing such tailored approaches has two main
challenges: (i) Unknown user preferences: Typically, data on user preferences
are unavailable without interaction, and uniform data collection across a large
and diverse user base can be prohibitively expensive. (ii) Suboptimal Local
Solutions: The total loss (sum of loss functions across all users and all
services) landscape is not convex even if the individual losses on a single
service are convex, making it likely for the learning dynamics to get stuck in
local minima. The final outcome of the aforementioned learning dynamics is thus
strongly influenced by the initial set of services offered to users, and is not
guaranteed to be close to the globally optimal outcome. In this work, we
propose a randomized algorithm to adaptively select very few users to collect
preference data from, while simultaneously initializing a set of services. We
prove that under mild assumptions on the loss functions, the expected total
loss achieved by the algorithm right after initialization is within a factor of
the globally optimal total loss with complete user preference data, and this
factor scales only logarithmically in the number of services. Our theory is
complemented by experiments on real as well as semi-synthetic datasets.
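
A minimal, illustrative sketch of the setting described above (not the paper's actual algorithm). It assumes each user is summarized by a preference vector and each service by a parameter vector, with squared-distance losses, so the total loss reduces to a k-means-style objective; the adaptive selection of a few users is sketched with D²-sampling (k-means++-style seeding), one randomized scheme whose expected cost is within an O(log k) factor of optimal. The paper's actual losses, sampling rule, and constants may differ.

```python
"""Toy sketch: users and services as points in R^d, squared-distance losses."""
import numpy as np

rng = np.random.default_rng(0)


def total_loss(users, services):
    # Each user incurs the loss of the service they would choose (their best fit).
    dists = ((users[:, None, :] - services[None, :, :]) ** 2).sum(-1)
    return dists.min(axis=1).sum()


def adaptive_init(users, k):
    """D^2-sampling: seed the first service from a uniformly random user, then
    sample each further seed with probability proportional to the user's
    current loss. Preference data is only 'collected' from the k sampled users."""
    services = [users[rng.integers(len(users))]]
    for _ in range(k - 1):
        d2 = ((users[:, None, :] - np.asarray(services)[None, :, :]) ** 2).sum(-1).min(1)
        services.append(users[rng.choice(len(users), p=d2 / d2.sum())])
    return np.asarray(services)


def learning_dynamics(users, services, iters=20):
    """Alternating minimization: users pick their best service, then each
    service refits to the mean of the users it attracted."""
    services = services.copy()
    for _ in range(iters):
        choice = ((users[:, None, :] - services[None, :, :]) ** 2).sum(-1).argmin(1)
        for j in range(len(services)):
            members = users[choice == j]
            if len(members):
                services[j] = members.mean(axis=0)
    return services


# Heterogeneous population: three subpopulations with different preference centers.
users = np.concatenate(
    [rng.normal(loc=c, scale=0.3, size=(100, 2)) for c in ([0.0, 0.0], [4.0, 0.0], [0.0, 4.0])]
)

init = adaptive_init(users, k=3)
final = learning_dynamics(users, init)
print("total loss right after initialization:", total_loss(users, init))
print("total loss after the learning dynamics:", total_loss(users, final))
```

In this toy run the loss right after the adaptive initialization is already close to the loss reached once the alternating dynamics converge, which is the qualitative behaviour the abstract's guarantee describes.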
Related papers
- A CLIP-Powered Framework for Robust and Generalizable Data Selection [51.46695086779598]
Real-world datasets often contain redundant and noisy data, which negatively impacts training efficiency and model performance.
Data selection has shown promise in identifying the most representative samples from the entire dataset.
We propose a novel CLIP-powered data selection framework that leverages multimodal information for more robust and generalizable sample selection.
arXiv Detail & Related papers (2024-10-15T03:00:58Z)
- Learning from Streaming Data when Users Choose [3.2429724835345692]
In digital markets comprised of many competing services, each user chooses between multiple service providers according to their preferences, and the chosen service makes use of the user data to incrementally improve its model.
The service providers' models influence which service the user will choose at the next time step, and the user's choice, in return, influences the model update, leading to a feedback loop.
We develop a simple and efficient decentralized algorithm to minimize the overall user loss.
arXiv Detail & Related papers (2024-06-03T16:07:52Z)
- AAA: an Adaptive Mechanism for Locally Differential Private Mean Estimation [42.95927712062214]
Local differential privacy (LDP) is a strong privacy standard that has been adopted by popular software systems.
We propose the advanced adaptive additive (AAA) mechanism, which is a distribution-aware approach that addresses the average utility.
We provide rigorous privacy proofs, utility analyses, and extensive experiments comparing AAA with state-of-the-art mechanisms.
arXiv Detail & Related papers (2024-04-02T04:22:07Z)
- Strategic Usage in a Multi-Learner Setting [4.810166064205261]
Real-world systems often involve some pool of users choosing between a set of services.
We analyze a setting in which strategic users choose among several available services in order to pursue positive classifications.
We show that naive retraining can still lead to oscillation even if all users are observed at different times.
arXiv Detail & Related papers (2024-01-29T18:59:22Z)
- Mean Estimation with User-level Privacy under Data Heterogeneity [54.07947274508013]
Different users may possess vastly different numbers of data points.
It cannot be assumed that all users sample from the same underlying distribution.
We propose a simple model of heterogeneous user data that allows user data to differ in both distribution and quantity of data.
arXiv Detail & Related papers (2023-07-28T23:02:39Z)
- Sample-Efficient Personalization: Modeling User Parameters as Low Rank Plus Sparse Components [30.32486162748558]
Personalization of machine learning (ML) predictions for individual users/domains/enterprises is critical for practical recommendation systems.
We propose a novel meta-learning style approach that models network weights as a sum of low-rank and sparse components.
We show that AMHT-LRS solves the problem efficiently with nearly optimal sample complexity.
arXiv Detail & Related papers (2022-10-07T12:50:34Z)
- Straggler-Resilient Personalized Federated Learning [55.54344312542944]
Federated learning allows training models from samples distributed across a large network of clients while respecting privacy and communication restrictions.
We develop a novel algorithmic procedure with theoretical speedup guarantees that simultaneously handles two of these hurdles.
Our method relies on ideas from representation learning theory to find a global common representation using all clients' data and learn a user-specific set of parameters leading to a personalized solution for each client.
arXiv Detail & Related papers (2022-06-05T01:14:46Z)
- DRFLM: Distributionally Robust Federated Learning with Inter-client Noise via Local Mixup [58.894901088797376]
Federated learning has emerged as a promising approach for training a global model using data from multiple organizations without leaking their raw data.
We propose a general framework to solve the above two challenges simultaneously.
We provide comprehensive theoretical analysis including robustness analysis, convergence analysis, and generalization ability.
arXiv Detail & Related papers (2022-04-16T08:08:29Z)
- Low-Cost Algorithmic Recourse for Users With Uncertain Cost Functions [74.00030431081751]
We formalize the notion of user-specific cost functions and introduce a new method for identifying actionable recourses for users.
Our method satisfies up to 25.89 percentage points more users compared to strong baseline methods.
arXiv Detail & Related papers (2021-11-01T19:49:35Z)
- Multi-Center Federated Learning [62.57229809407692]
This paper proposes a novel multi-center aggregation mechanism for federated learning.
It learns multiple global models from the non-IID user data and simultaneously derives the optimal matching between users and centers.
Our experimental results on benchmark datasets show that our method outperforms several popular federated learning methods.
arXiv Detail & Related papers (2020-05-03T09:14:31Z)
- Personalized Federated Learning: A Meta-Learning Approach [28.281166755509886]
In Federated Learning, we aim to train models across multiple computing units (users).
In this paper, we study a personalized variant of federated learning in which our goal is to find an initial shared model that current or new users can easily adapt to their local dataset by performing one or a few steps of gradient descent with respect to their own data.
arXiv Detail & Related papers (2020-02-19T01:08:46Z)
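
The last entry above formulates personalization as finding a shared initialization that each user adapts with one or a few local gradient steps. The snippet below is a generic, first-order sketch of that idea on a hypothetical quadratic per-user loss; the loss, the one-client-per-round sampling, and the dropped second-order term are simplifications for illustration, not the authors' exact procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-user data: user i wants the model w close to a personal
# target t_i, with loss f_i(w) = 0.5 * ||w - t_i||^2 (assumed for illustration).
targets = rng.normal(size=(20, 5))      # 20 users, 5-dimensional model


def grad(w, t):
    return w - t                        # gradient of f_i at w


alpha, beta, rounds = 0.1, 0.05, 200    # local step size, meta step size, rounds
w = np.zeros(5)                         # shared initialization being learned

for _ in range(rounds):
    t = targets[rng.integers(len(targets))]   # sample one user per round
    w_adapted = w - alpha * grad(w, t)        # one local adaptation (inner) step
    # First-order meta-update: improve the initialization so the *adapted*
    # model fits the user's data (second-order term dropped for simplicity).
    w = w - beta * grad(w_adapted, t)

# A new user personalizes with a single gradient step from the learned init.
new_t = rng.normal(size=5)
personalized = w - alpha * grad(w, new_t)
print("loss before adaptation:", 0.5 * np.sum((w - new_t) ** 2))
print("loss after one step:   ", 0.5 * np.sum((personalized - new_t) ** 2))
```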
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.