Related papers: INFNet: A Task-aware Information Flow Network for Large-Scale Recommendation Systems

URL: http://arxiv.org/abs/2508.11565v1
Date: Fri, 15 Aug 2025 16:18:32 GMT
Title: INFNet: A Task-aware Information Flow Network for Large-Scale Recommendation Systems
Authors: Kaiyuan Li, Dongdong Mao, Yongxiang Tang, Yanhua Cheng, Yanxiang Zeng, Chao Wang, Xialong Liu, Peng Jiang
Abstract summary: Information Flow Network (INFNet) is a task-aware architecture designed for large-scale recommendation scenarios. INFNet distinguishes features into three token types (categorical tokens, sequence tokens, and task tokens) and introduces a novel dual-flow design. INFNet has been successfully deployed in a commercial online advertising system, yielding significant gains of +1.587% in Revenue (REV) and +1.155% in Click-Through Rate (CTR).
Score: 8.283354901677692
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Feature interaction has long been a cornerstone of ranking models in large-scale recommender systems due to its proven effectiveness in capturing complex dependencies among features. However, existing feature interaction strategies face two critical challenges in industrial applications: (1) The vast number of categorical and sequential features makes exhaustive interaction computationally prohibitive, often resulting in optimization difficulties. (2) Real-world recommender systems typically involve multiple prediction objectives, yet most current approaches apply feature interaction modules prior to the multi-task learning layers. This late-fusion design overlooks task-specific feature dependencies and inherently limits the capacity of multi-task modeling. To address these limitations, we propose the Information Flow Network (INFNet), a task-aware architecture designed for large-scale recommendation scenarios. INFNet distinguishes features into three token types, categorical tokens, sequence tokens, and task tokens, and introduces a novel dual-flow design comprising heterogeneous and homogeneous alternating information blocks. For heterogeneous information flow, we employ a cross-attention mechanism with proxy that facilitates efficient cross-modal token interaction with balanced computational cost. For homogeneous flow, we design type-specific Proxy Gated Units (PGUs) to enable fine-grained intra-type feature processing. Extensive experiments on multiple offline benchmarks confirm that INFNet achieves state-of-the-art performance. Moreover, INFNet has been successfully deployed in a commercial online advertising system, yielding significant gains of +1.587% in Revenue (REV) and +1.155% in Click-Through Rate (CTR).

Related papers

Cross-Modal Attention Network with Dual Graph Learning in Multimodal Recommendation [12.802844514133255]
The paper presents the Cross-modal Recursive Attention Network with dual graph Embedding (CRANE). We design a core Recursive Cross-Modal Attention (RCA) mechanism that iteratively refines modality features based on cross-correlations in a joint latent space. For symmetric multimodal learning, we explicitly construct users' multimodal profiles by aggregating features of their interacted items.
arXiv Detail & Related papers (2026-01-16T10:09:39Z)
Action is All You Need: Dual-Flow Generative Ranking Network for Recommendation [25.30922374657862]
We propose a Dual-Flow Generative Ranking Network (DFGR) that employs a dual-flow mechanism to optimize interaction modeling. DFGR duplicates the original user behavior sequence into a real flow and a fake flow based on the authenticity of the action information. This design reduces computational overhead and improves both training efficiency and inference performance compared to Meta's HSTU-based model.
arXiv Detail & Related papers (2025-05-22T14:58:53Z)
Token Communication-Driven Multimodal Large Models in Resource-Constrained Multiuser Networks [7.137830911253685]
Multimodal large models pose challenges for deploying intelligent applications at the wireless edge. These constraints manifest as limited bandwidth, computational capacity, and stringent latency requirements. We propose a token communication paradigm that facilitates decentralized proliferations across user devices and edge infrastructure.
arXiv Detail & Related papers (2025-05-06T14:17:05Z)
Quadratic Interest Network for Multimodal Click-Through Rate Prediction [12.989347150912685]
Multimodal click-through rate (CTR) prediction is a key technique in industrial recommender systems. We propose a novel model for Task 2, named Quadratic Interest Network (QIN) for Multimodal CTR Prediction.
arXiv Detail & Related papers (2025-04-24T16:08:52Z)
MIETT: Multi-Instance Encrypted Traffic Transformer for Encrypted Traffic Classification [59.96233305733875]
Classifying traffic is essential for detecting security threats and optimizing network management. We propose a Multi-Instance Encrypted Traffic Transformer (MIETT) to capture both token-level and packet-level relationships. MIETT achieves results across five datasets, demonstrating its effectiveness in classifying encrypted traffic and understanding complex network behaviors.
arXiv Detail & Related papers (2024-12-19T12:52:53Z)
DeepInteraction++: Multi-Modality Interaction for Autonomous Driving [80.8837864849534]
We introduce a novel modality interaction strategy that allows individual per-modality representations to be learned and maintained throughout. DeepInteraction++ is a multi-modal interaction framework characterized by a multi-modal representational interaction encoder and a multi-modal predictive interaction decoder. Experiments demonstrate the superior performance of the proposed framework on both 3D object detection and end-to-end autonomous driving tasks.
arXiv Detail & Related papers (2024-08-09T14:04:21Z)
Bilateral Network with Residual U-blocks and Dual-Guided Attention for Real-time Semantic Segmentation [18.393208069320362]
We design a new fusion mechanism for two-branch architectures that is guided by attention computation. Specifically, we use our proposed Dual-Guided Attention (DGA) module to replace some multi-scale transformations. Experiments on the Cityscapes and CamVid datasets show the effectiveness of our method.
arXiv Detail & Related papers (2023-10-31T09:20:59Z)
Feature Decoupling-Recycling Network for Fast Interactive Segmentation [79.22497777645806]
Recent interactive segmentation methods iteratively take the source image, user guidance, and the previously predicted mask as input. We propose the Feature Decoupling-Recycling Network (FDRN), which decouples the modeling components based on their intrinsic discrepancies.
arXiv Detail & Related papers (2023-08-07T12:26:34Z)
Non-Separable Multi-Dimensional Network Flows for Visual Computing [62.50191141358778]
We propose a novel formalism for non-separable multi-dimensional network flows. Since the flow is defined on a per-dimension basis, the maximizing flow automatically chooses the best matching feature dimensions. As a proof of concept, we apply our formalism to the multi-object tracking problem and demonstrate that our approach outperforms scalar formulations on the MOT16 benchmark in terms of robustness to noise.
arXiv Detail & Related papers (2023-05-15T13:21:44Z)
HiNet: Novel Multi-Scenario & Multi-Task Learning with Hierarchical Information Extraction [50.40732146978222]
Multi-scenario & multi-task learning has been widely applied to many recommendation systems in industrial applications. We propose a Hierarchical information extraction Network (HiNet) for multi-scenario and multi-task recommendation. HiNet achieves a new state-of-the-art performance and significantly outperforms existing solutions.
arXiv Detail & Related papers (2023-03-10T17:24:41Z)
A Unified Object Motion and Affinity Model for Online Multi-Object Tracking [127.5229859255719]
We propose a novel MOT framework that unifies object motion and affinity model into a single network, named UMA. UMA integrates single object tracking and metric learning into a unified triplet network by means of multi-task learning. We equip our model with a task-specific attention module, which is used to boost task-aware feature learning.
arXiv Detail & Related papers (2020-03-25T09:36:43Z)

This list is automatically generated from the titles and abstracts of the papers on this site.