Data Measurements for Decentralized Data Markets
- URL: http://arxiv.org/abs/2406.04257v1
- Date: Thu, 6 Jun 2024 17:03:51 GMT
- Title: Data Measurements for Decentralized Data Markets
- Authors: Charles Lu, Mohammad Mohammadi Amiri, Ramesh Raskar
- Abstract summary: Decentralized data markets can provide more equitable forms of data acquisition for machine learning.
We propose and benchmark federated data measurements to allow a data buyer to find sellers with relevant and diverse datasets.
- Score: 18.99870296998749
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Decentralized data markets can provide more equitable forms of data acquisition for machine learning. However, to realize practical marketplaces, efficient techniques for seller selection need to be developed. We propose and benchmark federated data measurements to allow a data buyer to find sellers with relevant and diverse datasets. Diversity and relevance measures enable a buyer to make relative comparisons between sellers without requiring intermediate brokers and training task-dependent models.
Related papers
- A Comprehensive Survey on Data Augmentation [55.355273602421384]
Data augmentation is a technique that generates high-quality artificial data by manipulating existing data samples.
Existing literature surveys each focus on only one specific data modality.
We propose a more enlightening taxonomy that encompasses data augmentation techniques for different common data modalities.
arXiv Detail & Related papers (2024-05-15T11:58:08Z)
- Data Acquisition via Experimental Design for Decentralized Data Markets [25.300193837833426]
Data markets provide a way to increase the supply of data, particularly in data-scarce domains such as healthcare.
A major challenge for a data buyer in such a market is selecting the most valuable data points from a data seller.
We propose a federated approach to the data selection problem that is inspired by linear experimental design.
arXiv Detail & Related papers (2024-03-20T18:05:52Z)
- A Bargaining-based Approach for Feature Trading in Vertical Federated Learning [54.51890573369637]
We propose a bargaining-based feature trading approach in Vertical Federated Learning (VFL) to encourage economically efficient transactions.
Our model incorporates performance gain-based pricing, taking into account the revenue-based optimization objectives of both parties.
arXiv Detail & Related papers (2024-02-23T10:21:07Z)
- Data Acquisition: A New Frontier in Data-centric AI [65.90972015426274]
We first present an investigation of current data marketplaces, revealing a lack of platforms offering detailed information about datasets.
We then introduce the DAM challenge, a benchmark to model the interaction between the data providers and acquirers.
Our evaluation of the submitted strategies underlines the need for effective data acquisition strategies in Machine Learning.
arXiv Detail & Related papers (2023-11-22T22:15:17Z)
- Addressing Budget Allocation and Revenue Allocation in Data Market Environments Using an Adaptive Sampling Algorithm [14.206050847214652]
We introduce a new algorithm to solve budget allocation and revenue allocation problems simultaneously in linear time.
The new algorithm employs an adaptive sampling process that selects data from those providers who are contributing the most to the model.
We provide theoretical guarantees for the algorithm that show the budget is used efficiently and the properties of revenue allocation are similar to Shapley's.
arXiv Detail & Related papers (2023-06-05T02:28:19Z)
- A Survey of Data Pricing for Data Marketplaces [77.3189288320768]
This paper attempts to comprehensively review the state-of-the-art on existing data pricing studies.
Our key contribution lies in a new taxonomy of data pricing studies that unifies different attributes determining data prices.
arXiv Detail & Related papers (2023-03-07T04:35:56Z)
- Fundamentals of Task-Agnostic Data Valuation [21.78555506720078]
We study valuing the data of a data owner/seller for a data seeker/buyer.
We focus on task-agnostic data valuation without any validation requirements.
arXiv Detail & Related papers (2022-08-25T22:07:07Z)
- Data Sharing Markets [95.13209326119153]
We study a setup where each agent can be both buyer and seller of data.
We consider two cases: bilateral data exchange (trading data with data) and unilateral data exchange (trading data with money).
arXiv Detail & Related papers (2021-07-19T06:00:34Z)
- OSOUM Framework for Trading Data Research [79.0383470835073]
We supply, to the best of our knowledge, the first open source simulation platform, Open SOUrce Market Simulator (OSOUM), to analyze trading markets and specifically data markets.
We describe and implement a specific data market model, consisting of two types of agents: sellers who own various datasets available for acquisition, and buyers searching for relevant and beneficial datasets for purchase.
Although commercial frameworks, intended for handling data markets, already exist, we provide a free and extensive end-to-end research tool for simulating possible behavior for both buyers and sellers participating in (data) markets.
arXiv Detail & Related papers (2021-02-18T09:20:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.