xDeepInt: a hybrid architecture for modeling the vector-wise and
bit-wise feature interactions
- URL: http://arxiv.org/abs/2301.01089v1
- Date: Tue, 3 Jan 2023 13:33:19 GMT
- Title: xDeepInt: a hybrid architecture for modeling the vector-wise and
bit-wise feature interactions
- Authors: YaChen Yan, Liubo Li
- Abstract summary: We propose a new model, xDeepInt, to balance the mixture of vector-wise and bit-wise feature interactions.
Our experiment results demonstrate the efficacy and effectiveness of xDeepInt over state-of-the-art models.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning feature interactions is the key to success for large-scale
CTR prediction and recommendation. In practice, handcrafted feature engineering
usually requires exhaustive searching. In order to reduce the high cost of
human effort in feature engineering, researchers have proposed several deep
neural network (DNN)-based approaches to learn the feature interactions in an
end-to-end fashion. However, existing methods either do not learn both
vector-wise interactions and bit-wise interactions simultaneously, or fail to
combine them in a controllable manner. In this paper, we propose a new model,
xDeepInt, based on a novel network architecture called polynomial interaction
network (PIN) which learns higher-order vector-wise interactions recursively.
By integrating a subspace-crossing mechanism, we enable xDeepInt to balance the
mixture of vector-wise and bit-wise feature interactions at a bounded order.
Based on the network architecture, we customize a combined optimization
strategy to conduct feature selection and interaction selection. We implement
the proposed model and evaluate the model performance on three real-world
datasets. Our experiment results demonstrate the efficacy and effectiveness of
xDeepInt over state-of-the-art models. We open-source the TensorFlow
implementation of xDeepInt: https://github.com/yanyachen/xDeepInt.
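To make the recursion above concrete, here is a minimal numpy sketch of one plausible reading of the abstract: a PIN layer of the residual Hadamard-product form X_l = X_{l-1} + X_{l-1} ∘ (W_l X_0), preceded by a subspace-crossing step that splits each field embedding into sub-embeddings. The function names, the exact layer form, and the toy dimensions are assumptions for illustration, not the authors' method; the reference TensorFlow implementation is the repository linked above.

```python
import numpy as np

def pin_layer(x_prev, x0, w):
    """One polynomial interaction network (PIN) layer (assumed form).

    x_prev: (fields, dim) output of the previous layer
    x0:     (fields, dim) raw field embeddings
    w:      (fields, fields) learnable field-mixing weights

    The Hadamard product with a linear mix of x0 raises the degree of the
    vector-wise interactions by one; the residual term retains all
    lower-order terms, so the depth bounds the interaction order.
    """
    return x_prev + x_prev * (w @ x0)

def subspace_crossing(x, h):
    """Assumed subspace-crossing step: split each dim-sized embedding into
    h sub-embeddings and treat them as separate fields, so the PIN
    recursion also mixes bits across embedding subspaces."""
    fields, dim = x.shape
    assert dim % h == 0, "embedding dim must be divisible by h"
    return x.reshape(fields * h, dim // h)

# Toy forward pass: 3 PIN layers -> interactions up to order 4.
rng = np.random.default_rng(0)
x0 = subspace_crossing(rng.normal(size=(16, 8)), h=4)  # 64 sub-fields, dim 2
x = x0
for _ in range(3):
    w = 0.1 * rng.normal(size=(x0.shape[0], x0.shape[0]))
    x = pin_layer(x, x0, w)
logit = x.sum()  # a real model would apply a learned output layer here
```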
Related papers
- Multilinear Operator Networks [60.7432588386185]
Polynomial Networks are a class of models that do not require activation functions.
We propose MONet, which relies solely on multilinear operators.
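As a hedged illustration of how a polynomial network can grow expressive power without activation functions (a generic multilinear block, not necessarily MONet's actual operator):

```python
import numpy as np

def multilinear_block(x, a, b):
    """Activation-free block: the elementwise product of two linear maps
    of x is degree-2 in x, so stacking such blocks raises the polynomial
    degree without ReLU-style nonlinearities."""
    return x + (a @ x) * (b @ x)

rng = np.random.default_rng(1)
x = rng.normal(size=4)
a, b = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
y = multilinear_block(x, a, b)  # polynomial features of x, no activation
```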
arXiv Detail & Related papers (2024-01-31T16:52:19Z)
- AdaEnsemble: Learning Adaptively Sparse Structured Ensemble Network for Click-Through Rate Prediction [0.0]
We propose AdaEnsemble: a Sparsely-Gated Mixture-of-Experts architecture that can leverage the strengths of heterogeneous feature interaction experts.
AdaEnsemble can adaptively choose the feature interaction depth and the corresponding SparseMoE stacking layer at which to exit and compute the prediction.
We implement the proposed AdaEnsemble and evaluate its performance on real-world datasets.
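A hedged sketch of the two mechanisms the summary names, a sparse top-k expert gate and a depth-adaptive early exit; the gating rule, the confidence head, and all names here are illustrative assumptions rather than AdaEnsemble's actual design:

```python
import numpy as np

def sparse_gate(x, w_gate, k=2):
    """Sparsely-gated mixture of experts: keep only the top-k expert
    scores, renormalize them with a softmax, and zero out the rest."""
    scores = w_gate @ x                      # one score per expert
    top = np.argsort(scores)[-k:]
    gates = np.zeros_like(scores)
    gates[top] = np.exp(scores[top]) / np.exp(scores[top]).sum()
    return gates

def adaptive_depth_forward(x, layers, exit_threshold=0.9):
    """Depth-adaptive stacking: after each layer, a (hypothetical)
    confidence head decides whether to exit and predict early."""
    depth = 0
    for depth, (layer, confidence) in enumerate(layers):
        x = layer(x)
        if confidence(x) > exit_threshold:
            break                            # exit at this depth
    return x, depth
```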
arXiv Detail & Related papers (2023-01-06T12:08:15Z)
- NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction [37.357949900603295]
We propose a neural architecture representation model that can be used to estimate attributes holistically.
Experimental results show that our proposed framework can predict the latency and accuracy attributes of both cell architectures and whole deep neural networks.
arXiv Detail & Related papers (2022-11-15T10:15:21Z)
- Sparse Interaction Additive Networks via Feature Interaction Detection and Sparse Selection [10.191597755296163]
We develop a tractable selection algorithm to efficiently identify the necessary feature combinations.
Our proposed Sparse Interaction Additive Networks (SIAN) construct a bridge from simple and interpretable models to fully connected neural networks.
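A minimal sketch of interaction screening in the spirit of this summary: score every feature pair with a cheap statistic and keep only the top candidates for their own additive subnetworks. The correlation-based score here is an illustrative stand-in, not necessarily SIAN's detection algorithm:

```python
import numpy as np
from itertools import combinations

def top_interactions(x, y, k=5):
    """Rank feature pairs by the absolute correlation between the
    product x_i * x_j and the target, a cheap screening statistic;
    only the top-k pairs get their own interaction subnetwork."""
    scores = {}
    for i, j in combinations(range(x.shape[1]), 2):
        prod = x[:, i] * x[:, j]
        scores[(i, j)] = abs(np.corrcoef(prod, y)[0, 1])
    return sorted(scores, key=scores.get, reverse=True)[:k]

rng = np.random.default_rng(3)
x = rng.normal(size=(500, 8))
y = x[:, 0] * x[:, 1] + 0.1 * rng.normal(size=500)  # true interaction (0, 1)
print(top_interactions(x, y, k=3))  # (0, 1) should rank first
```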
arXiv Detail & Related papers (2022-09-19T19:57:17Z)
- Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z)
- Enhanced DeepONet for Modeling Partial Differential Operators Considering Multiple Input Functions [5.819397109258169]
A deep operator network (DeepONet) was proposed to model general non-linear continuous operators for partial differential equations (PDEs).
The existing DeepONet can only accept one input function, which limits its applicability.
We propose a new high-level neural network structure, Enhanced DeepONet (EDeepONet), in which two input functions are represented by two branch sub-networks.
Our numerical results on two partial differential equation examples show that the proposed EDeepONet is about 7x-17x, roughly one order of magnitude, more accurate than a fully connected neural network.
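A hedged sketch of the two-branch structure: each input function is sampled at fixed sensor points and encoded by its own branch network, and the merged branch features are combined with a trunk encoding of the query point. Merging the branches by elementwise product, and all sizes and names, are assumptions for illustration:

```python
import numpy as np

def mlp(x, weights):
    """Tiny MLP: tanh hidden layers, linear output."""
    for w, b in weights[:-1]:
        x = np.tanh(w @ x + b)
    w, b = weights[-1]
    return w @ x + b

def edeeponet(f1_sensors, f2_sensors, y, branch1, branch2, trunk):
    """Two input functions, each sampled at fixed sensors, get their own
    branch net; the merged branch features are combined with the trunk
    features of the query point y by a dot product."""
    b = mlp(f1_sensors, branch1) * mlp(f2_sensors, branch2)
    t = mlp(np.atleast_1d(y), trunk)
    return float(b @ t)

def init(sizes, rng):
    """Random (weight, bias) pairs for the layer sizes given."""
    return [(0.5 * rng.normal(size=(m, n)), np.zeros(m))
            for n, m in zip(sizes, sizes[1:])]

rng = np.random.default_rng(4)
branch1 = init([32, 16, 8], rng)   # 32 sensor samples of f1 -> 8 features
branch2 = init([32, 16, 8], rng)
trunk = init([1, 16, 8], rng)      # scalar query point y -> 8 features
u = edeeponet(rng.normal(size=32), rng.normal(size=32), 0.3,
              branch1, branch2, trunk)
```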
arXiv Detail & Related papers (2022-02-17T23:58:23Z)
- Solving Mixed Integer Programs Using Neural Networks [57.683491412480635]
This paper applies learning to two key sub-tasks of a MIP solver: generating a high-quality joint variable assignment, and bounding the gap in objective value between that assignment and an optimal one.
Our approach constructs two corresponding neural network-based components, Neural Diving and Neural Branching, to use in a base MIP solver such as SCIP.
We evaluate our approach on six diverse real-world datasets, including two Google production datasets and MIPLIB, by training separate neural networks on each.
arXiv Detail & Related papers (2020-12-23T09:33:11Z)
- Deep Imitation Learning for Bimanual Robotic Manipulation [70.56142804957187]
We present a deep imitation learning framework for robotic bimanual manipulation.
A core challenge is to generalize the manipulation skills to objects in different locations.
We propose to (i) decompose the multi-modal dynamics into elemental movement primitives, (ii) parameterize each primitive using a recurrent graph neural network to capture interactions, and (iii) integrate a high-level planner that composes primitives sequentially and a low-level controller to combine primitive dynamics and inverse kinematics control.
arXiv Detail & Related papers (2020-10-11T01:40:03Z)
- Dynamic Graph: Learning Instance-aware Connectivity for Neural Networks [78.65792427542672]
Dynamic Graph Network (DG-Net) is a complete directed acyclic graph, where the nodes represent convolutional blocks and the edges represent connection paths.
Instead of using the same network path for every input, DG-Net aggregates features dynamically at each node, giving the network greater representational ability.
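A hedged sketch of instance-aware aggregation at a single node: incoming edge features are re-weighted by gates computed from the features themselves, so each sample effectively routes through its own weighted paths. The gating function is an illustrative assumption:

```python
import numpy as np

def dynamic_aggregate(inputs, gate_w):
    """Aggregate predecessor features with input-dependent weights: a
    data-driven gate replaces the fixed connectivity of a static graph,
    so each sample effectively uses its own path weights."""
    stacked = np.stack(inputs)                           # (num_edges, dim)
    logits = np.array([w @ x for w, x in zip(gate_w, inputs)])
    gates = np.exp(logits) / np.exp(logits).sum()        # softmax over edges
    return gates @ stacked                               # (dim,)

rng = np.random.default_rng(5)
feats = [rng.normal(size=16) for _ in range(3)]   # 3 incoming edges
gate_w = [rng.normal(size=16) for _ in range(3)]
node_out = dynamic_aggregate(feats, gate_w)
```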
arXiv Detail & Related papers (2020-10-02T16:50:26Z)
- Deep Learning with Functional Inputs [0.0]
We present a methodology for integrating functional data into feed-forward neural networks.
A by-product of the method is a set of dynamic functional weights that can be visualized during the optimization process.
The model is shown to perform well in a number of contexts including prediction of new data and recovery of the true underlying functional weights.
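A hedged sketch of the standard way a functional covariate enters a feed-forward network: the functional weight beta(t) is expanded in a fixed basis with learnable coefficients, and the unit's activation is a numerical integral of the input curve against beta. The cosine basis and rectangle-rule quadrature are assumptions for illustration:

```python
import numpy as np

def functional_neuron(curve, t, coef):
    """First-layer unit for a functional input: beta(t) is a linear
    combination of cosine basis functions with learnable coefficients,
    and the activation approximates the integral of curve(t) * beta(t)
    by a rectangle rule on the uniform grid t."""
    basis = np.cos(np.outer(np.arange(len(coef)), np.pi * t))  # (k, len(t))
    beta = coef @ basis                                        # beta on grid
    return np.sum(curve * beta) * (t[1] - t[0])

t = np.linspace(0.0, 1.0, 101)
curve = np.sin(2 * np.pi * t)          # one observed functional covariate
coef = np.array([0.5, -0.2, 0.1])      # learnable basis coefficients
h = functional_neuron(curve, t, coef)  # scalar fed into later dense layers
```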
arXiv Detail & Related papers (2020-06-17T01:23:00Z)
- Binarizing MobileNet via Evolution-based Searching [66.94247681870125]
We propose using evolutionary search to facilitate the construction and training scheme when binarizing MobileNet.
Inspired by one-shot architecture search frameworks, we adapt the idea of group convolution to design efficient 1-bit convolutional neural networks (CNNs).
Our objective is to come up with a tiny yet efficient binary neural architecture by exploring the best candidates of the group convolution.
arXiv Detail & Related papers (2020-05-13T13:25:51Z)