AdaEnsemble: Learning Adaptively Sparse Structured Ensemble Network for
Click-Through Rate Prediction
- URL: http://arxiv.org/abs/2301.08353v1
- Date: Fri, 6 Jan 2023 12:08:15 GMT
- Title: AdaEnsemble: Learning Adaptively Sparse Structured Ensemble Network for
Click-Through Rate Prediction
- Authors: YaChen Yan, Liubo Li
- Abstract summary: We propose AdaEnsemble: a Sparsely-Gated Mixture-of-Experts architecture that can leverage the strengths of heterogeneous feature interaction experts.
AdaEnsemble can adaptively choose the feature interaction depth and exit from the corresponding SparseMoE stacking layer to compute the prediction.
We implement the proposed AdaEnsemble and evaluate its performance on real-world datasets.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning feature interactions is crucial to the success of large-scale
CTR prediction in recommender systems and ads ranking. Researchers and
practitioners have proposed a wide variety of neural network architectures for
searching over and modeling feature interactions. However, we observe that
different datasets favor different neural network architectures and feature
interaction types, suggesting that each feature interaction learning method may
have its own unique advantages. Inspired by this observation, we propose
AdaEnsemble: a Sparsely-Gated Mixture-of-Experts (SparseMoE) architecture that
can leverage the strengths of heterogeneous feature interaction experts and
adaptively learn the routing to a sparse combination of experts for each
example, allowing us to build a dynamic hierarchy of the feature interactions
of different types and orders. To further improve the prediction accuracy and
inference efficiency, we incorporate a dynamic early exiting mechanism for
feature interaction depth selection. AdaEnsemble can adaptively choose the
feature interaction depth and exit from the corresponding SparseMoE stacking
layer to compute the prediction. Therefore, our proposed architecture
inherits the advantages of the exponential combinations of sparsely gated
experts within SparseMoE layers and further dynamically selects the optimal
feature interaction depth without executing deeper layers. We implement the
proposed AdaEnsemble and evaluate its performance on real-world datasets.
Extensive experiment results demonstrate the efficiency and effectiveness of
AdaEnsemble over state-of-the-art models.
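The two mechanisms the abstract describes can be illustrated with a toy sketch: a SparseMoE layer that routes each example to its top-k experts, stacked with a confidence-based early exit that stands in for the paper's depth selector. This is not the authors' implementation; the random linear "experts", the gating weights, and the batch-confidence rule are all hypothetical stand-ins chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class SparseMoELayer:
    """Routes each example to its top-k experts and mixes their outputs."""
    def __init__(self, dim, n_experts=4, k=2):
        self.k = k
        # Heterogeneous experts stand in for different interaction types
        # (e.g. cross, FM-style, MLP); here each is just a random linear map.
        self.experts = [rng.normal(size=(dim, dim)) * 0.1 for _ in range(n_experts)]
        self.gate_w = rng.normal(size=(dim, n_experts)) * 0.1

    def forward(self, x):
        scores = x @ self.gate_w                        # (batch, n_experts)
        topk = np.argsort(scores, axis=1)[:, -self.k:]  # indices of top-k experts
        out = np.zeros_like(x)
        for i in range(x.shape[0]):
            sel = topk[i]
            w = softmax(scores[i, sel][None, :])[0]     # renormalize over selected
            for weight, e in zip(w, sel):
                out[i] += weight * (x[i] @ self.experts[e])
        return out

def adaptive_forward(x, layers, exit_w, threshold=0.9):
    """Stack SparseMoE layers; exit early once the per-layer prediction
    head is confident enough (a stand-in for the paper's depth selector)."""
    for depth, layer in enumerate(layers, start=1):
        x = layer.forward(x)
        p = 1.0 / (1.0 + np.exp(-(x @ exit_w)))  # sigmoid CTR head
        conf = np.abs(p - 0.5).mean() * 2        # crude batch confidence in [0, 1]
        if conf >= threshold:
            return p, depth                      # early exit: skip deeper layers
    return p, len(layers)

dim = 8
layers = [SparseMoELayer(dim) for _ in range(3)]
exit_w = rng.normal(size=dim) * 0.1
x = rng.normal(size=(4, dim))
preds, depth_used = adaptive_forward(x, layers, exit_w, threshold=0.99)
print(preds.shape, depth_used)  # (4,) predictions, depth_used in 1..3
```

With a low confidence threshold the stack exits after the first layer; raising it forces execution of deeper SparseMoE layers, which mirrors the accuracy/efficiency trade-off the abstract attributes to dynamic depth selection.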
Related papers
- GSSF: Generalized Structural Sparse Function for Deep Cross-modal Metric Learning [51.677086019209554]
We propose a Generalized Structural Sparse function to capture powerful relationships across modalities for pair-wise similarity learning.
The distance metric delicately encapsulates two formats of diagonal and block-diagonal terms.
Experiments on cross-modal and two extra uni-modal retrieval tasks have validated its superiority and flexibility.
arXiv Detail & Related papers (2024-10-20T03:45:50Z) - Adaptive Ensemble Learning: Boosting Model Performance through
Intelligent Feature Fusion in Deep Neural Networks [0.0]
We present an Adaptive Ensemble Learning framework that aims to boost the performance of deep neural networks.
The framework integrates ensemble learning strategies with deep learning architectures to create a more robust and adaptable model.
By leveraging intelligent feature fusion methods, the framework generates more discriminative and effective feature representations.
arXiv Detail & Related papers (2023-04-04T21:49:49Z) - xDeepInt: a hybrid architecture for modeling the vector-wise and
bit-wise feature interactions [0.0]
We propose a new model, xDeepInt, to balance the mixture of vector-wise and bit-wise feature interactions.
Our experiment results demonstrate the efficacy and effectiveness of xDeepInt over state-of-the-art models.
arXiv Detail & Related papers (2023-01-03T13:33:19Z) - HINNPerf: Hierarchical Interaction Neural Network for Performance
Prediction of Configurable Systems [22.380061796355616]
HINNPerf is a novel hierarchical interaction neural network for performance prediction.
HINNPerf employs the embedding method and hierarchic network blocks to model the complicated interplay between configuration options.
Empirical results on 10 real-world systems show that our method statistically significantly outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2022-04-08T08:52:23Z) - DHEN: A Deep and Hierarchical Ensemble Network for Large-Scale
Click-Through Rate Prediction [20.51885543358098]
We propose DHEN - a deep and hierarchical ensemble architecture that can leverage strengths of heterogeneous interaction modules and learn a hierarchy of the interactions under different orders.
Experiments on a large-scale dataset from CTR prediction tasks show a 0.27% improvement in the Normalized Entropy of the prediction and 1.2x higher training throughput than the state-of-the-art baseline.
arXiv Detail & Related papers (2022-03-11T21:19:31Z) - Pareto-wise Ranking Classifier for Multi-objective Evolutionary Neural
Architecture Search [15.454709248397208]
This study focuses on how to find feasible deep models under diverse design objectives.
We propose a classification-wise Pareto evolution approach for one-shot NAS, where an online classifier is trained to predict the dominance relationship between the candidate and constructed reference architectures.
We find a number of neural architectures with different model sizes ranging from 2M to 6M under diverse objectives and constraints.
arXiv Detail & Related papers (2021-09-14T13:28:07Z) - Redefining Neural Architecture Search of Heterogeneous Multi-Network
Models by Characterizing Variation Operators and Model Components [71.03032589756434]
We investigate the effect of different variation operators in a complex domain, that of multi-network heterogeneous neural models.
We characterize both the variation operators, according to their effect on the complexity and performance of the model; and the models, relying on diverse metrics which estimate the quality of the different parts composing it.
arXiv Detail & Related papers (2021-06-16T17:12:26Z) - FIVES: Feature Interaction Via Edge Search for Large-Scale Tabular Data [106.76845921324704]
We propose a novel method named Feature Interaction Via Edge Search (FIVES).
FIVES formulates the task of interactive feature generation as searching for edges on the defined feature graph.
In this paper, we present our theoretical evidence that motivates us to search for useful interactive features with increasing order.
arXiv Detail & Related papers (2020-07-29T03:33:18Z) - Towards Automated Neural Interaction Discovery for Click-Through Rate
Prediction [64.03526633651218]
Click-Through Rate (CTR) prediction is one of the most important machine learning tasks in recommender systems.
We propose an automated interaction architecture discovering framework for CTR prediction named AutoCTR.
arXiv Detail & Related papers (2020-06-29T04:33:01Z) - A Semi-Supervised Assessor of Neural Architectures [157.76189339451565]
We employ an auto-encoder to discover meaningful representations of neural architectures.
A graph convolutional neural network is introduced to predict the performance of architectures.
arXiv Detail & Related papers (2020-05-14T09:02:33Z) - A Dependency Syntactic Knowledge Augmented Interactive Architecture for
End-to-End Aspect-based Sentiment Analysis [73.74885246830611]
We propose a novel dependency syntactic knowledge augmented interactive architecture with multi-task learning for end-to-end ABSA.
This model is capable of fully exploiting the syntactic knowledge (dependency relations and types) by leveraging a well-designed Dependency Relation Embedded Graph Convolutional Network (DreGcn)
Extensive experimental results on three benchmark datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-04T14:59:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.