Data Pricing for Graph Neural Networks without Pre-purchased Inspection
- URL: http://arxiv.org/abs/2502.08284v1
- Date: Wed, 12 Feb 2025 10:42:04 GMT
- Title: Data Pricing for Graph Neural Networks without Pre-purchased Inspection
- Authors: Yiping Liu, Mengxiao Zhang, Jiamou Liu, Song Yang,
- Abstract summary: Model marketplaces leverage model trading mechanisms to properly incentive data owners to contribute their data.<n>We propose a novel mechanism, named Structural Importance based Model Trading (SIMT) mechanism, that assesses the data importance and compensates data owners accordingly.<n>SIMT consistently outperforms vanilla baselines by up to $40%$ in both MacroF1 and MicroF1.
- Score: 15.556650640576311
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning (ML) models have become essential tools in various scenarios. Their effectiveness, however, hinges on a substantial volume of data for satisfactory performance. Model marketplaces have thus emerged as crucial platforms bridging model consumers seeking ML solutions and data owners possessing valuable data. These marketplaces leverage model trading mechanisms to properly incentive data owners to contribute their data, and return a well performing ML model to the model consumers. However, existing model trading mechanisms often assume the data owners are willing to share their data before being paid, which is not reasonable in real world. Given that, we propose a novel mechanism, named Structural Importance based Model Trading (SIMT) mechanism, that assesses the data importance and compensates data owners accordingly without disclosing the data. Specifically, SIMT procures feature and label data from data owners according to their structural importance, and then trains a graph neural network for model consumers. Theoretically, SIMT ensures incentive compatible, individual rational and budget feasible. The experiments on five popular datasets validate that SIMT consistently outperforms vanilla baselines by up to $40\%$ in both MacroF1 and MicroF1.
Related papers
- DUPRE: Data Utility Prediction for Efficient Data Valuation [49.60564885180563]
Cooperative game theory-based data valuation, such as Data Shapley, requires evaluating the data utility and retraining the ML model for multiple data subsets.
Our framework, textttDUPRE, takes an alternative yet complementary approach that reduces the cost per subset evaluation by predicting data utilities instead of evaluating them by model retraining.
Specifically, given the evaluated data utilities of some data subsets, textttDUPRE fits a emphGaussian process (GP) regression model to predict the utility of every other data subset.
arXiv Detail & Related papers (2025-02-22T08:53:39Z) - DreamMask: Boosting Open-vocabulary Panoptic Segmentation with Synthetic Data [61.62554324594797]
We propose DreamMask, which explores how to generate training data in the open-vocabulary setting, and how to train the model with both real and synthetic data.
In general, DreamMask significantly simplifies the collection of large-scale training data, serving as a plug-and-play enhancement for existing methods.
For instance, when trained on COCO and tested on ADE20K, the model equipped with DreamMask outperforms the previous state-of-the-art by a substantial margin of 2.1% mIoU.
arXiv Detail & Related papers (2025-01-03T19:00:00Z) - Optimizing Sequential Recommendation Models with Scaling Laws and Approximate Entropy [104.48511402784763]
Performance Law for SR models aims to theoretically investigate and model the relationship between model performance and data quality.<n>We propose Approximate Entropy (ApEn) to assess data quality, presenting a more nuanced approach compared to traditional data quantity metrics.
arXiv Detail & Related papers (2024-11-30T10:56:30Z) - F-FOMAML: GNN-Enhanced Meta-Learning for Peak Period Demand Forecasting with Proxy Data [65.6499834212641]
We formulate the demand prediction as a meta-learning problem and develop the Feature-based First-Order Model-Agnostic Meta-Learning (F-FOMAML) algorithm.
By considering domain similarities through task-specific metadata, our model improved generalization, where the excess risk decreases as the number of training tasks increases.
Compared to existing state-of-the-art models, our method demonstrates a notable improvement in demand prediction accuracy, reducing the Mean Absolute Error by 26.24% on an internal vending machine dataset and by 1.04% on the publicly accessible JD.com dataset.
arXiv Detail & Related papers (2024-06-23T21:28:50Z) - IMFL-AIGC: Incentive Mechanism Design for Federated Learning Empowered by Artificial Intelligence Generated Content [15.620004060097155]
Federated learning (FL) has emerged as a promising paradigm that enables clients to collaboratively train a shared global model without uploading their local data.
We propose a data quality-aware incentive mechanism to encourage clients' participation.
Our proposed mechanism exhibits highest training accuracy and reduces up to 53.34% of the server's cost with real-world datasets.
arXiv Detail & Related papers (2024-06-12T07:47:22Z) - A Bargaining-based Approach for Feature Trading in Vertical Federated
Learning [54.51890573369637]
We propose a bargaining-based feature trading approach in Vertical Federated Learning (VFL) to encourage economically efficient transactions.
Our model incorporates performance gain-based pricing, taking into account the revenue-based optimization objectives of both parties.
arXiv Detail & Related papers (2024-02-23T10:21:07Z) - Fast Model Debias with Machine Unlearning [54.32026474971696]
Deep neural networks might behave in a biased manner in many real-world scenarios.
Existing debiasing methods suffer from high costs in bias labeling or model re-training.
We propose a fast model debiasing framework (FMD) which offers an efficient approach to identify, evaluate and remove biases.
arXiv Detail & Related papers (2023-10-19T08:10:57Z) - Synthetic Model Combination: An Instance-wise Approach to Unsupervised
Ensemble Learning [92.89846887298852]
Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data.
Give access to a set of expert models and their predictions alongside some limited information about the dataset used to train them.
arXiv Detail & Related papers (2022-10-11T10:20:31Z) - An Investigation of Smart Contract for Collaborative Machine Learning
Model Training [3.5679973993372642]
Collaborative machine learning (CML) has penetrated various fields in the era of big data.
As the training of ML models requires a massive amount of good quality data, it is necessary to eliminate concerns about data privacy.
Based on blockchain, smart contracts enable automatic execution of data preserving and validation.
arXiv Detail & Related papers (2022-09-12T04:25:01Z) - Dynamic and Context-Dependent Stock Price Prediction Using Attention
Modules and News Sentiment [0.0]
We propose a new modeling approach for financial time series data, the $alpha_t$-RIM (recurrent independent mechanism)
To model the data in such a dynamic manner, the $alpha_t$-RIM utilizes an exponentially smoothed recurrent neural network.
The results suggest that the $alpha_t$-RIM is capable of reflecting the causal structure between stock prices and news sentiment, as well as the seasonality and trends.
arXiv Detail & Related papers (2022-03-13T15:13:36Z) - A Marketplace for Trading AI Models based on Blockchain and Incentives
for IoT Data [24.847898465750667]
An emerging paradigm in Machine Learning (ML) is a federated approach where the learning model is delivered to a group of heterogeneous agents partially, allowing agents to train the model locally with their own data.
The problem of valuation of models, as well as the questions of incentives for collaborative training and trading of data/models, have received limited treatment in the literature.
In this paper, a new ecosystem of ML model trading over a trusted ML-based network is proposed. The buyer can acquire the model of interest from the ML market, and interested sellers spend local computations on their data to enhance that model's quality
arXiv Detail & Related papers (2021-12-06T08:52:42Z) - FL-Market: Trading Private Models in Federated Learning [19.11812014629085]
Existing model marketplaces assume that the broker can access data owners' private training data, which may not be realistic in practice.
We propose FL-Market, a locally private model marketplace that protects privacy not only against model buyers but also against the untrusted broker.
arXiv Detail & Related papers (2021-06-08T14:14:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.