CTR-KAN: KAN for Adaptive High-Order Feature Interaction Modeling
- URL: http://arxiv.org/abs/2408.08713v4
- Date: Sat, 25 Jan 2025 03:14:35 GMT
- Title: CTR-KAN: KAN for Adaptive High-Order Feature Interaction Modeling
- Authors: Yunxiao Shi, Wujiang Xu, Haimin Zhang, Qiang Wu, Yongfeng Zhang, Min Xu
- Abstract summary: CTR-KAN is an adaptive framework for efficient high-order feature interaction modeling. It builds upon the Kolmogorov-Arnold Network (KAN) paradigm, addressing its limitations in CTR prediction tasks. CTR-KAN achieves state-of-the-art predictive accuracy with significantly lower computational costs.
- Score: 37.80127625183842
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Modeling high-order feature interactions is critical for click-through rate (CTR) prediction, yet traditional approaches often face challenges in balancing predictive accuracy and computational efficiency. These methods typically rely on pre-defined interaction orders, which limit flexibility and require extensive prior knowledge. Moreover, explicitly modeling high-order interactions can lead to significant computational overhead. To tackle these challenges, we propose CTR-KAN, an adaptive framework for efficient high-order feature interaction modeling. CTR-KAN builds upon the Kolmogorov-Arnold Network (KAN) paradigm, addressing its limitations in CTR prediction tasks. Specifically, we introduce key enhancements, including a lightweight architecture that reduces the computational complexity of KAN and supports embedding-based feature representations. Additionally, CTR-KAN integrates guided symbolic regression to effectively capture multiplicative relationships, a known challenge in standard KAN implementations. Extensive experiments demonstrate that CTR-KAN achieves state-of-the-art predictive accuracy with significantly lower computational costs. Its sparse network structure also facilitates feature pruning and enhances global interpretability, making CTR-KAN a powerful tool for efficient inference in real-world CTR prediction scenarios.
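To make the KAN idea in the abstract concrete, here is a minimal sketch of a single KAN-style layer. It is not CTR-KAN's architecture: the class name `KANLayerSketch`, the Gaussian radial basis (swapped in for the B-splines that standard KAN implementations typically use), and all sizes are assumptions chosen for illustration. Each edge (i, j) carries its own learnable univariate function, and each output is the sum of those edge functions over the inputs.

```python
import math
import random


def rbf_basis(x, centers, width):
    """Evaluate a fixed set of Gaussian bumps at the scalar x."""
    return [math.exp(-((x - c) / width) ** 2) for c in centers]


class KANLayerSketch:
    """Hypothetical KAN-style layer: out[j] = sum_i phi_ij(x[i])."""

    def __init__(self, in_dim, out_dim, n_basis=8, seed=0):
        rng = random.Random(seed)
        self.in_dim, self.out_dim = in_dim, out_dim
        # Basis centers spread evenly over [-1, 1] (an assumed input range).
        self.centers = [-1.0 + 2.0 * k / (n_basis - 1) for k in range(n_basis)]
        self.width = 2.0 / (n_basis - 1)
        # coef[i][j][k]: weight of basis function k in edge function phi_ij.
        self.coef = [[[rng.gauss(0.0, 0.1) for _ in range(n_basis)]
                      for _ in range(out_dim)] for _ in range(in_dim)]

    def edge(self, i, j, x_i):
        """Univariate edge function phi_ij evaluated at the scalar x_i."""
        basis = rbf_basis(x_i, self.centers, self.width)
        return sum(c * b for c, b in zip(self.coef[i][j], basis))

    def forward(self, x):
        """Sum the edge functions over inputs to get each output."""
        return [sum(self.edge(i, j, x[i]) for i in range(self.in_dim))
                for j in range(self.out_dim)]


layer = KANLayerSketch(in_dim=4, out_dim=2)
out = layer.forward([0.3, -0.5, 0.9, 0.0])
```

Because every output of such a layer is a sum of one-variable functions, a single layer is additively separable and cannot represent a product such as x1 * x2 directly. This is one way to read the abstract's remark that multiplicative relationships are a known challenge for standard KAN implementations.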
Related papers
- Scaled Supervision is an Implicit Lipschitz Regularizer [32.41225209639384]
In social media, recommender systems rely on the click-through rate (CTR) as the standard metric to evaluate user engagement.
We show that scaling supervision bandwidth can act as an implicit Lipschitz regularizer, stably optimizing existing CTR models to achieve better generalizability.
arXiv Detail & Related papers (2025-03-19T01:01:28Z) - Context-Preserving Tensorial Reconfiguration in Large Language Model Training [0.0]
Context-Preserving Tensorial Reconfiguration (CPTR) enables dynamic complexity of weight tensors through structured factorization and adaptive contraction.
Empirical evaluations demonstrate that CPTR improves coherence retention across extended sequences.
Performance comparisons reveal that CPTR-enhanced models exhibit greater computational efficiency and reduced memory consumption.
arXiv Detail & Related papers (2025-02-01T00:55:19Z) - Towards Generalizable Trajectory Prediction Using Dual-Level Representation Learning And Adaptive Prompting [107.4034346788744]
Existing vehicle trajectory prediction models struggle with generalizability, prediction uncertainties, and handling complex interactions.
We propose Perceiver with Register queries (PerReg+), a novel trajectory prediction framework that introduces: (1) Dual-Level Representation Learning via Self-Distillation (SD) and Masked Reconstruction (MR), capturing global context and fine-grained details; (2) Enhanced Multimodality using register-based queries and pretraining, eliminating the need for clustering and suppression; and (3) Adaptive Prompt Tuning during fine-tuning, freezing the main architecture and optimizing a small number of prompts for efficient adaptation.
arXiv Detail & Related papers (2025-01-08T20:11:09Z) - An accuracy improving method for advertising click through rate prediction based on enhanced xDeepFM model [0.0]
This paper proposes an improved CTR prediction model based on the xDeepFM architecture.
By integrating a multi-head attention mechanism, the model can simultaneously focus on different aspects of feature interactions.
Experimental results on the Criteo dataset demonstrate that the proposed model outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-11-21T03:21:29Z) - NeSHFS: Neighborhood Search with Heuristic-based Feature Selection for Click-Through Rate Prediction [1.3805049652130312]
Click-through-rate (CTR) prediction plays an important role in online advertising and ad recommender systems.
We propose a CTR algorithm named Neighborhood Search with Heuristic-based Feature Selection (NeSHFS) to enhance CTR prediction performance.
arXiv Detail & Related papers (2024-09-13T10:43:18Z) - ELASTIC: Efficient Linear Attention for Sequential Interest Compression [5.689306819772134]
State-of-the-art sequential recommendation models rely heavily on the transformer's attention mechanism.
We propose ELASTIC, an Efficient Linear Attention for SequenTial Interest Compression.
We conduct extensive experiments on various public datasets and compare it with several strong sequential recommenders.
arXiv Detail & Related papers (2024-08-18T06:41:46Z) - Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression [53.15502562048627]
Recent work has built the connection between self-supervised learning and the approximation of the top eigenspace of a graph Laplacian operator.
This work delves into a statistical analysis of augmentation-based pretraining.
arXiv Detail & Related papers (2023-06-01T15:18:55Z) - DELTA: Dynamic Embedding Learning with Truncated Conscious Attention for CTR Prediction [61.68415731896613]
Click-Through Rate (CTR) prediction is a pivotal task in product and content recommendation.
We propose a model that enables Dynamic Embedding Learning with Truncated Conscious Attention for CTR prediction.
arXiv Detail & Related papers (2023-05-03T12:34:45Z) - Directed Acyclic Graph Factorization Machines for CTR Prediction via Knowledge Distillation [65.62538699160085]
We propose a Directed Acyclic Graph Factorization Machine (KD-DAGFM) to learn the high-order feature interactions from existing complex interaction models for CTR prediction via Knowledge Distillation.
KD-DAGFM achieves the best performance with less than 21.5% of the FLOPs of the state-of-the-art method in both online and offline experiments.
arXiv Detail & Related papers (2022-11-21T03:09:42Z) - Contextual Squeeze-and-Excitation for Efficient Few-Shot Image Classification [57.36281142038042]
We present a new adaptive block called Contextual Squeeze-and-Excitation (CaSE) that adjusts a pretrained neural network on a new task to significantly improve performance.
We also present a new training protocol based on Coordinate-Descent called UpperCaSE that exploits meta-trained CaSE blocks and fine-tuning routines for efficient adaptation.
arXiv Detail & Related papers (2022-06-20T15:25:08Z) - CascadER: Cross-Modal Cascading for Knowledge Graph Link Prediction [22.96768147978534]
We propose a tiered ranking architecture CascadER to maintain the ranking accuracy of full ensembling while improving efficiency considerably.
CascadER uses LMs to rerank the outputs of more efficient base KGEs, relying on an adaptive subset selection scheme aimed at invoking the LMs minimally while maximizing accuracy gain over the KGE.
Our empirical analyses reveal that diversity of models across modalities and preservation of individual models' confidence signals help explain the effectiveness of CascadER.
arXiv Detail & Related papers (2022-05-16T22:55:45Z) - Dynamic Parameterized Network for CTR Prediction [6.749659219776502]
We proposed a novel plug-in operation, Dynamic Parameterized Operation (DPO), to learn both explicit and implicit interactions instance-wise.
We showed that the introduction of DPO into DNN modules and Attention modules can respectively benefit two main tasks in click-through rate (CTR) prediction.
Our Dynamic Parameterized Networks significantly outperform state-of-the-art methods in offline experiments on a public dataset and a real-world production dataset.
arXiv Detail & Related papers (2021-11-09T08:15:03Z) - CATRO: Channel Pruning via Class-Aware Trace Ratio Optimization [61.71504948770445]
We propose a novel channel pruning method via Class-Aware Trace Ratio Optimization (CATRO) to reduce the computational burden and accelerate the model inference.
We show that CATRO achieves higher accuracy with similar cost or lower cost with similar accuracy than other state-of-the-art channel pruning algorithms.
Because of its class-aware property, CATRO is suitable to prune efficient networks adaptively for various classification subtasks, enhancing handy deployment and usage of deep networks in real-world applications.
arXiv Detail & Related papers (2021-10-21T06:26:31Z) - AdnFM: An Attentive DenseNet based Factorization Machine for CTR Prediction [11.958336595818267]
We propose a novel model called Attentive DenseNet based Factorization Machines (AdnFM).
AdnFM can extract more comprehensive deep features by using all the hidden layers from a feed-forward neural network as implicit high-order features.
Experiments on two real-world datasets show that the proposed model can effectively improve the performance of Click-Through-Rate prediction.
arXiv Detail & Related papers (2020-12-20T01:00:39Z) - DAIS: Automatic Channel Pruning via Differentiable Annealing Indicator Search [55.164053971213576]
The convolutional neural network has achieved great success in fulfilling computer vision tasks despite its large computation overhead.
Structured (channel) pruning is usually applied to reduce the model redundancy while preserving the network structure.
Existing structured pruning methods require hand-crafted rules which may lead to tremendous pruning space.
arXiv Detail & Related papers (2020-11-04T07:43:01Z) - Towards Automated Neural Interaction Discovery for Click-Through Rate Prediction [64.03526633651218]
Click-Through Rate (CTR) prediction is one of the most important machine learning tasks in recommender systems.
We propose an automated interaction architecture discovering framework for CTR prediction named AutoCTR.
arXiv Detail & Related papers (2020-06-29T04:33:01Z) - SAC: Accelerating and Structuring Self-Attention via Sparse Adaptive Connection [51.376723069962]
We present a method for accelerating and structuring self-attentions: Sparse Adaptive Connection.
In SAC, we regard the input sequence as a graph and attention operations are performed between linked nodes.
We show that SAC is competitive with state-of-the-art models while significantly reducing memory cost.
arXiv Detail & Related papers (2020-03-22T07:58:44Z)