EMOFM: Ensemble MLP mOdel with Feature-based Mixers for Click-Through
Rate Prediction
- URL: http://arxiv.org/abs/2310.04482v2
- Date: Sun, 15 Oct 2023 10:49:13 GMT
- Title: EMOFM: Ensemble MLP mOdel with Feature-based Mixers for Click-Through
Rate Prediction
- Authors: Yujian Betterest Li, Kai Wu
- Abstract summary: A dataset contains millions of records and each field-wise feature in a record consists of hashed integers for privacy.
For this task, the keys of network-based methods might be type-wise feature extraction and information fusion across different fields.
We propose plug-in mixers for field/type-wise feature fusion and thus construct a field- and type-wise ensemble model, namely EMOFM.
- Score: 5.983194751474721
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Track one of CTI competition is on click-through rate (CTR) prediction. The
dataset contains millions of records and each field-wise feature in a record
consists of hashed integers for privacy. For this task, the keys of
network-based methods might be type-wise feature extraction and information
fusion across different fields. Multi-layer perceptrons (MLPs) are able to
extract field features but cannot efficiently fuse them. Motivated by
the natural fusion characteristic of cross attention and the efficiency of
transformer-based structures, we propose simple plug-in mixers for
field/type-wise feature fusion, and thus construct a field- and type-wise ensemble
model, namely EMOFM (Ensemble MLP mOdel with Feature-based Mixers). In the
experiments, the proposed model is evaluated on the dataset, the optimization
process is visualized and ablation studies are explored. It is shown that EMOFM
outperforms the compared baselines. Finally, we discuss future work. WARNING:
the comparison may not be entirely fair, since the proposed method is designed
for this dataset in particular while the compared methods are not. For example, EMOFM
explicitly takes different types of interactions into consideration while the
others do not. Still, we hope that the ideas behind our method can help other
developers, learners, researchers, and thinkers.
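The fusion idea the abstract describes, per-field MLP features mixed across fields by cross attention, can be sketched roughly as follows. The dimensions, weight matrices, residual connection, and stubbed per-field embeddings are illustrative assumptions, not the authors' exact architecture:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)

n_fields, d = 4, 8                           # four hashed-integer fields, 8-dim features
field_emb = rng.normal(size=(n_fields, d))   # stand-in for per-field MLP outputs

# Cross-attention mixer: every field attends to all fields, so information
# is fused across fields instead of staying separate per field.
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = field_emb @ Wq, field_emb @ Wk, field_emb @ Wv
attn = softmax(Q @ K.T / np.sqrt(d))         # (n_fields, n_fields) mixing weights
mixed = attn @ V                             # fused field representations

# A residual connection keeps the per-field signal alongside the fused one.
fused = field_emb + mixed
print(fused.shape)  # (4, 8)
```

A prediction head (e.g. another MLP over the flattened `fused` matrix) would then map the mixed representation to a click probability.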
Related papers
- Task-customized Masked AutoEncoder via Mixture of Cluster-conditional
Experts [104.9871176044644]
Masked Autoencoder(MAE) is a prevailing self-supervised learning method that achieves promising results in model pre-training.
We propose a novel MAE-based pre-training paradigm, Mixture of Cluster-conditional Experts (MoCE).
MoCE trains each expert only with semantically relevant images by using cluster-conditional gates.
arXiv Detail & Related papers (2024-02-08T03:46:32Z)
- Personalized Federated Learning under Mixture of Distributions [98.25444470990107]
We propose a novel approach to Personalized Federated Learning (PFL), which utilizes Gaussian mixture models (GMM) to fit the input data distributions across diverse clients.
FedGMM possesses an additional advantage of adapting to new clients with minimal overhead, and it also enables uncertainty quantification.
Empirical evaluations on synthetic and benchmark datasets demonstrate the superior performance of our method in both PFL classification and novel sample detection.
arXiv Detail & Related papers (2023-05-01T20:04:46Z)
- Learning from aggregated data with a maximum entropy model [73.63512438583375]
We show how a new model, similar to a logistic regression, may be learned from aggregated data only by approximating the unobserved feature distribution with a maximum entropy hypothesis.
We present empirical evidence on several public datasets that the model learned this way can achieve performances comparable to those of a logistic model trained with the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z)
- Learning with MISELBO: The Mixture Cookbook [62.75516608080322]
We present the first ever mixture of variational approximations for a normalizing flow-based hierarchical variational autoencoder (VAE) with VampPrior and a PixelCNN decoder network.
We explain this cooperative behavior by drawing a novel connection between VI and adaptive importance sampling.
We obtain state-of-the-art results among VAE architectures in terms of negative log-likelihood on the MNIST and FashionMNIST datasets.
arXiv Detail & Related papers (2022-09-30T15:01:35Z)
- Boosting Factorization Machines via Saliency-Guided Mixup [125.15872106335692]
We present MixFM, inspired by Mixup, to generate auxiliary training data to boost Factorization Machines (FMs).
We also put forward a novel Factorization Machine powered by Saliency-guided Mixup (denoted as SMFM).
arXiv Detail & Related papers (2022-06-17T09:49:00Z)
- Making a (Counterfactual) Difference One Rationale at a Time [5.97507595130844]
We investigate whether counterfactual data augmentation, without human assistance, can improve the performance of the selector.
Our results show that CDA produces rationales that better capture the signal of interest.
arXiv Detail & Related papers (2022-01-13T19:05:02Z)
- Data Fusion with Latent Map Gaussian Processes [0.0]
Multi-fidelity modeling and calibration are data fusion tasks that ubiquitously arise in engineering design.
We introduce a novel approach based on latent-map Gaussian processes (LMGPs) that enables efficient and accurate data fusion.
arXiv Detail & Related papers (2021-12-04T00:54:19Z)
- Noisy Feature Mixup [42.056684988818766]
We introduce Noisy Feature Mixup (NFM), an inexpensive yet effective method for data augmentation.
NFM includes mixup and manifold mixup as special cases, but it has additional advantages, including better smoothing of decision boundaries.
We show that residual networks and vision transformers trained with NFM have favorable trade-offs between predictive accuracy on clean data and robustness with respect to various types of data.
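The NFM summary above combines mixup-style interpolation with noise injection. A minimal sketch follows; the function name, noise scales, and application to raw input features (rather than hidden-layer features) are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_feature_mixup(x1, x2, y1, y2, alpha=1.0, add_std=0.1, mult_std=0.1):
    """Mix two examples convexly, then perturb the mixed features with
    additive and multiplicative noise. With both noise scales at zero,
    this reduces exactly to standard mixup."""
    lam = rng.beta(alpha, alpha)             # mixing coefficient
    x = lam * x1 + (1 - lam) * x2            # mixup interpolation of inputs
    y = lam * y1 + (1 - lam) * y2            # matching interpolation of labels
    x = x * (1 + mult_std * rng.normal(size=x.shape)) \
          + add_std * rng.normal(size=x.shape)
    return x, y

x_mix, y_mix = noisy_feature_mixup(np.ones(5), np.zeros(5), 1.0, 0.0)
print(x_mix.shape)  # (5,)
```

The noise terms are what smooth decision boundaries beyond what plain mixup achieves, per the summary above.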
arXiv Detail & Related papers (2021-10-05T17:13:51Z)
- Efficient Data-specific Model Search for Collaborative Filtering [56.60519991956558]
Collaborative filtering (CF) is a fundamental approach for recommender systems.
In this paper, motivated by the recent advances in automated machine learning (AutoML), we propose to design a data-specific CF model.
Key here is a new framework that unifies state-of-the-art (SOTA) CF methods and splits them into disjoint stages of input encoding, embedding function, interaction and prediction function.
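The four disjoint stages named above (input encoding, embedding, interaction, prediction) might be sketched as a composable model in which each stage can be searched independently. The class and the particular stage choices below are hypothetical illustrations, not the paper's actual search space:

```python
from dataclasses import dataclass
from typing import Callable
import numpy as np

@dataclass
class CFModel:
    """A CF model as a composition of four independently swappable stages."""
    encode: Callable    # input encoding (e.g. raw id, one-hot, multi-hot)
    embed: Callable     # embedding function over the encoded input
    interact: Callable  # interaction between user and item embeddings
    predict: Callable   # prediction function over the interaction output

    def score(self, user, item):
        u = self.embed(self.encode(user))
        i = self.embed(self.encode(item))
        return self.predict(self.interact(u, i))

rng = np.random.default_rng(0)
E = rng.normal(size=(10, 4))                  # shared embedding table

model = CFModel(
    encode=lambda idx: idx,                   # ids index the table directly
    embed=lambda idx: E[idx],
    interact=lambda u, i: u * i,              # element-wise product
    predict=lambda z: float(1 / (1 + np.exp(-z.sum()))),  # sigmoid of sum
)
print(0.0 < model.score(3, 7) < 1.0)  # True
```

An AutoML search would then pick each stage from a candidate set per dataset, rather than fixing one architecture for all data.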
arXiv Detail & Related papers (2021-06-14T14:30:32Z)
- VMLoc: Variational Fusion For Learning-Based Multimodal Camera Localization [46.607930208613574]
We propose an end-to-end framework, termed VMLoc, to fuse different sensor inputs into a common latent space.
Unlike previous multimodal variational works that directly adapt the objective function of the vanilla variational auto-encoder, we show how camera localization can be accurately estimated.
arXiv Detail & Related papers (2020-03-12T14:52:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the generated list (including all information) and is not responsible for any consequences.