SMAP: A Novel Heterogeneous Information Framework for Scenario-based
Optimal Model Assignment
- URL: http://arxiv.org/abs/2305.13634v1
- Date: Tue, 23 May 2023 03:01:26 GMT
- Title: SMAP: A Novel Heterogeneous Information Framework for Scenario-based
Optimal Model Assignment
- Authors: Zekun Qiu, Zhipu Xie, Zehua Ji, Yuhao Mao, Ke Cheng
- Abstract summary: A novel framework called Scenario and Model Associative percepts (SMAP) is developed.
SMAP can integrate various types of information to intelligently select a suitable dataset and allocate the optimal model for a specific scenario.
A novel memory mechanism named the mnemonic center is developed to store the matched heterogeneous information and prevent duplicate matching.
- Score: 5.834783927354705
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The increasing maturity of big data applications has led to a proliferation
of models targeting the same objectives within the same scenarios and datasets.
of models targeting the same objectives within the same scenarios and datasets.
However, selecting the most suitable model, one whose features satisfy the specific
requirements and constraints of the scenario at hand, remains a significant challenge.
Existing methods focus on crowdsourcing-based worker-task assignment and neglect
the scenario-dataset-model assignment
problem. To address this challenge, a new problem named the Scenario-based
Optimal Model Assignment (SOMA) problem is introduced and a novel framework
entitled Scenario and Model Associative percepts (SMAP) is developed. SMAP is a
heterogeneous information framework that can integrate various types of
information to intelligently select a suitable dataset and allocate the optimal
model for a specific scenario. To comprehensively evaluate models, a new score
function that utilizes multi-head attention mechanisms is proposed. Moreover, a
novel memory mechanism named the mnemonic center is developed to store the
matched heterogeneous information and prevent duplicate matching. Six popular
traffic scenarios are selected as study cases and extensive experiments are
conducted on a dataset to verify the effectiveness and efficiency of SMAP and
the score function.
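
To make the abstract's two core components more concrete, the sketch below shows, in hedged form, how a multi-head-attention score function and a duplicate-preventing "mnemonic center" cache could be wired together. This is an illustrative reconstruction based only on the abstract, not the authors' implementation: the embedding dimension, the PyTorch MultiheadAttention layer, the AttentionScoreFunction and MnemonicCenter names, and the (scenario, dataset, model) cache key are all assumptions.

# Minimal, hypothetical sketch of an attention-based score function plus a
# mnemonic-center cache; names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class AttentionScoreFunction(nn.Module):
    """Scores how well candidate dataset/model embeddings fit a scenario embedding."""

    def __init__(self, embed_dim: int = 64, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.head = nn.Linear(embed_dim, 1)

    def forward(self, scenario: torch.Tensor, candidates: torch.Tensor) -> torch.Tensor:
        # scenario:   (batch, 1, embed_dim) acts as the query
        # candidates: (batch, n, embed_dim) act as keys and values
        attended, _ = self.attn(scenario, candidates, candidates)
        return self.head(attended).squeeze(-1)  # (batch, 1) scalar fitness score

class MnemonicCenter:
    """Stores already-matched (scenario, dataset, model) triples so the same
    match is never computed twice."""

    def __init__(self):
        self._memory = {}

    def lookup(self, scenario_id: str, dataset_id: str, model_id: str):
        return self._memory.get((scenario_id, dataset_id, model_id))

    def store(self, scenario_id: str, dataset_id: str, model_id: str, score: float):
        self._memory[(scenario_id, dataset_id, model_id)] = score

if __name__ == "__main__":
    scorer = AttentionScoreFunction()
    center = MnemonicCenter()
    scenario = torch.randn(1, 1, 64)    # one scenario embedding (hypothetical)
    candidates = torch.randn(1, 5, 64)  # five candidate dataset/model embeddings
    key = ("scenario-0", "dataset-3", "model-1")
    if center.lookup(*key) is None:     # only score unmatched triples
        center.store(*key, scorer(scenario, candidates).item())
    print(center.lookup(*key))
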
Related papers
- Consensus-Driven Active Model Selection [29.150990754584978]
We propose a method for active model selection using predictions from candidate models to prioritize the labeling of test data points.
Our method, CODA, performs consensus-driven active model selection by modeling relationships between categories and data points.
We validate our approach by curating a collection of 26 benchmark tasks capturing a range of model selection scenarios.
arXiv Detail & Related papers (2025-07-31T17:56:28Z) - Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent [74.02034188307857]
Merging multiple expert models offers a promising approach for performing multi-task learning without accessing their original data.
We find existing methods inevitably discard task-specific information that, while causing conflicts, is crucial for performance.
Our approach consistently outperforms previous methods, achieving state-of-the-art results across diverse architectures and tasks in both vision and NLP domains.
arXiv Detail & Related papers (2025-01-02T12:45:21Z) - Exploring Query Efficient Data Generation towards Data-free Model Stealing in Hard Label Setting [38.755154033324374]
Data-free model stealing involves replicating the functionality of a target model into a substitute model without accessing the target model's structure, parameters, or training data.
This paper presents a new data-free model stealing approach called Query Efficient Data Generation (QEDG).
We introduce two distinct loss functions to ensure the generation of sufficient samples that closely and uniformly align with the target model's decision boundary.
arXiv Detail & Related papers (2024-12-18T03:03:15Z) - An incremental preference elicitation-based approach to learning potentially non-monotonic preferences in multi-criteria sorting [53.36437745983783]
We first construct a max-margin optimization-based model to capture potentially non-monotonic preferences.
We devise information amount measurement methods and question selection strategies to pinpoint the most informative alternative in each iteration.
Two incremental preference elicitation-based algorithms are developed to learn potentially non-monotonic preferences.
arXiv Detail & Related papers (2024-09-04T14:36:20Z) - SKADA-Bench: Benchmarking Unsupervised Domain Adaptation Methods with Realistic Validation [55.87169702896249]
Unsupervised Domain Adaptation (DA) consists of adapting a model trained on a labeled source domain to perform well on an unlabeled target domain with some data distribution shift.
We propose a framework to evaluate DA methods and present a fair evaluation of existing shallow algorithms, including reweighting, mapping, and subspace alignment.
Our benchmark highlights the importance of realistic validation and provides practical guidance for real-life applications.
arXiv Detail & Related papers (2024-07-16T12:52:29Z) - Take the essence and discard the dross: A Rethinking on Data Selection for Fine-Tuning Large Language Models [38.39395973523944]
We propose a three-stage scheme for data selection and review existing works according to this scheme.
We find that the more targeted method with data-specific and model-specific quality labels has higher efficiency.
arXiv Detail & Related papers (2024-06-20T08:58:58Z) - REFRESH: Responsible and Efficient Feature Reselection Guided by SHAP Values [17.489279048199304]
REFRESH is a method for reselecting features so that additional constraints desirable for model performance can be satisfied without having to train several new models.
REFRESH's underlying algorithm is a novel technique that uses SHAP values and correlation analysis to approximate the predictions of models without having to train them.
arXiv Detail & Related papers (2024-03-13T18:06:43Z) - Latent Semantic Consensus For Deterministic Geometric Model Fitting [109.44565542031384]
We propose an effective method called Latent Semantic Consensus (LSC).
LSC formulates the model fitting problem into two latent semantic spaces based on data points and model hypotheses.
LSC is able to provide consistent and reliable solutions within only a few milliseconds for general multi-structural model fitting.
arXiv Detail & Related papers (2024-03-11T05:35:38Z) - Budgeted Online Model Selection and Fine-Tuning via Federated Learning [26.823435733330705]
Online model selection involves selecting a model from a set of candidate models 'on the fly' to perform prediction on a stream of data.
The choice of candidate models hence has a crucial impact on performance.
The present paper proposes an online federated model selection framework where a group of learners (clients) interacts with a server with sufficient memory.
Using the proposed algorithm, clients and the server collaborate to fine-tune models to adapt them to a non-stationary environment.
arXiv Detail & Related papers (2024-01-19T04:02:49Z) - Causal Feature Selection via Transfer Entropy [59.999594949050596]
Causal discovery aims to identify causal relationships between features with observational data.
We introduce a new causal feature selection approach that relies on the forward and backward feature selection procedures.
We provide theoretical guarantees on the regression and classification errors for both the exact and the finite-sample cases.
arXiv Detail & Related papers (2023-10-17T08:04:45Z) - Estimating Task Completion Times for Network Rollouts using Statistical
Models within Partitioning-based Regression Methods [0.01841601464419306]
This paper proposes a data- and machine-learning-based forecasting solution for the telecommunications network-rollout planning problem.
A model built on historical milestone completion times needs to incorporate domain knowledge, handle noise, and yet remain interpretable to project managers.
This paper proposes partition-based regression models that incorporate data-driven statistical models within each partition, as a solution to the problem.
arXiv Detail & Related papers (2022-11-20T04:28:12Z) - Adaptive Sampling Strategies to Construct Equitable Training Datasets [0.7036032466145111]
In domains ranging from computer vision to natural language processing, machine learning models have been shown to exhibit stark disparities.
One factor contributing to these performance gaps is a lack of representation in the data the models are trained on.
We formalize the problem of creating equitable training datasets, and propose a statistical framework for addressing this problem.
arXiv Detail & Related papers (2022-01-31T19:19:30Z) - Event Data Association via Robust Model Fitting for Event-based Object Tracking [66.05728523166755]
We propose a novel Event Data Association (called EDA) approach to explicitly address the event association and fusion problem.
The proposed EDA seeks event trajectories that best fit the event data, in order to perform unifying data association and information fusion.
The experimental results show the effectiveness of EDA under challenging scenarios, such as high speed, motion blur, and high dynamic range conditions.
arXiv Detail & Related papers (2021-10-25T13:56:00Z) - Data Summarization via Bilevel Optimization [48.89977988203108]
A simple yet powerful approach is to operate on small subsets of data.
In this work, we propose a generic coreset framework that formulates the coreset selection as a cardinality-constrained bilevel optimization problem.
arXiv Detail & Related papers (2021-09-26T09:08:38Z) - Modeling the Second Player in Distributionally Robust Optimization [90.25995710696425]
We argue for the use of neural generative models to characterize the worst-case distribution.
This approach poses a number of implementation and optimization challenges.
We find that the proposed approach yields models that are more robust than comparable baselines.
arXiv Detail & Related papers (2021-03-18T14:26:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.