Infinite Recommendation Networks: A Data-Centric Approach
- URL: http://arxiv.org/abs/2206.02626v1
- Date: Fri, 3 Jun 2022 00:34:13 GMT
- Title: Infinite Recommendation Networks: A Data-Centric Approach
- Authors: Noveen Sachdeva, Mehak Preet Dhaliwal, Carole-Jean Wu, Julian McAuley
- Abstract summary: We leverage the Neural Tangent Kernel to train infinitely-wide neural networks to devise $\infty$-AE: an autoencoder with infinitely-wide bottleneck layers.
We also develop Distill-CF for synthesizing tiny, high-fidelity data summaries.
We observe 96-105% of $\infty$-AE's performance on the full dataset with as little as 0.1% of the original dataset size.
- Score: 8.044430277912936
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We leverage the Neural Tangent Kernel and its equivalence to training
infinitely-wide neural networks to devise $\infty$-AE: an autoencoder with
infinitely-wide bottleneck layers. The outcome is a highly expressive yet
simplistic recommendation model with a single hyper-parameter and a closed-form
solution. Leveraging $\infty$-AE's simplicity, we also develop Distill-CF for
synthesizing tiny, high-fidelity data summaries which distill the most
important knowledge from the extremely large and sparse user-item interaction
matrix for efficient and accurate subsequent data-usage like model training,
inference, architecture search, etc. This takes a data-centric approach to
recommendation, where we aim to improve the quality of logged user-feedback
data for subsequent modeling, independent of the learning algorithm. We
particularly utilize the concept of differentiable Gumbel-sampling to handle
the inherent data heterogeneity, sparsity, and semi-structuredness, while being
scalable to datasets with hundreds of millions of user-item interactions. Both
of our proposed approaches significantly outperform their respective
state-of-the-art and when used together, we observe 96-105% of $\infty$-AE's
performance on the full dataset with as little as 0.1% of the original dataset
size, leading us to explore the counter-intuitive question: Is more data what
you need for better recommendation?
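As a rough illustration of the two ingredients above, the sketch below (written for this summary, not taken from the authors' code) shows the closed-form kernel-regression solve that an infinitely-wide autoencoder reduces to under the NTK, and a standard Gumbel-softmax sampler of the kind a differentiable data-distillation objective can backpropagate through. The Gram matrices, the `reg` hyper-parameter name, and the function names are assumptions made for illustration.

```python
# Illustrative sketch only: assumes a precomputed NTK Gram matrix over users
# (e.g. from the neural-tangents library) and a binary user-item matrix X_train.
import jax
import jax.numpy as jnp

def infinite_ae_closed_form(K_train, K_test_train, X_train, reg=1.0):
    """Under the NTK, training the infinitely-wide autoencoder reduces to
    kernel ridge regression: one linear solve with a single hyper-parameter."""
    n = K_train.shape[0]
    alpha = jnp.linalg.solve(K_train + reg * jnp.eye(n), X_train)  # (n, items)
    return K_test_train @ alpha  # predicted interaction scores for held-out users

def gumbel_softmax(key, logits, tau=0.5):
    """Differentiable relaxation of categorical sampling (Gumbel-softmax),
    the kind of sampler a distillation objective can optimize end-to-end."""
    g = jax.random.gumbel(key, logits.shape)
    return jax.nn.softmax((logits + g) / tau, axis=-1)
```

Because gradients flow through both the solve and the sampler, a tiny synthetic summary can be optimized directly, which is roughly the data-centric step the abstract describes.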
Related papers
- Leveraging Variation Theory in Counterfactual Data Augmentation for Optimized Active Learning [19.962212551963383]
Active Learning (AL) allows models to learn interactively from user feedback.
This paper introduces a counterfactual data augmentation approach to AL.
arXiv Detail & Related papers (2024-08-07T14:55:04Z)
- Data curation via joint example selection further accelerates multimodal learning [3.329535792151987]
We show that jointly selecting batches of data is more effective for learning than selecting examples independently.
We derive a simple and tractable algorithm for selecting such batches, which significantly accelerates training beyond individually-prioritized data points.
arXiv Detail & Related papers (2024-06-25T16:52:37Z)
- Generative Expansion of Small Datasets: An Expansive Graph Approach [13.053285552524052]
We introduce an Expansive Synthesis model that generates large-scale, information-rich datasets from minimal samples.
An autoencoder with self-attention layers and optimal transport refines distributional consistency.
Results show comparable performance, demonstrating the model's potential to augment training data effectively.
arXiv Detail & Related papers (2024-06-25T02:59:02Z)
- Improved Distribution Matching for Dataset Condensation [91.55972945798531]
We propose a novel dataset condensation method based on distribution matching.
Our simple yet effective method outperforms most previous optimization-oriented methods while using far fewer computational resources.
arXiv Detail & Related papers (2023-07-19T04:07:33Z)
- HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
- Condensing Graphs via One-Step Gradient Matching [50.07587238142548]
We propose a one-step gradient matching scheme, which performs gradient matching for only one single step without training the network weights.
Our theoretical analysis shows this strategy can generate synthetic graphs that lead to lower classification loss on real graphs.
In particular, we are able to reduce the dataset size by 90% while approximating up to 98% of the original performance.
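To make the one-step idea concrete, here is a minimal sketch (illustrative only, not the paper's implementation): gradients of a task loss are computed once at a random initialization on a real and a synthetic batch, and their mismatch becomes the condensation objective; `loss_fn`, `params`, the batch tuples, and the cosine criterion are assumptions.

```python
import jax
import jax.numpy as jnp

def one_step_gradient_match(params, loss_fn, real_batch, syn_batch):
    """Match gradients at a single, untrained initialization: minimizing this
    over the synthetic data makes it induce the same first update as real data."""
    g_real = jax.grad(loss_fn)(params, *real_batch)
    g_syn = jax.grad(loss_fn)(params, *syn_batch)
    vr = jnp.concatenate([g.ravel() for g in jax.tree_util.tree_leaves(g_real)])
    vs = jnp.concatenate([g.ravel() for g in jax.tree_util.tree_leaves(g_syn)])
    # cosine distance between the two gradient directions
    return 1.0 - jnp.dot(vr, vs) / (jnp.linalg.norm(vr) * jnp.linalg.norm(vs) + 1e-8)
```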
arXiv Detail & Related papers (2022-06-15T18:20:01Z)
- Model Composition: Can Multiple Neural Networks Be Combined into a Single Network Using Only Unlabeled Data? [6.0945220518329855]
This paper investigates the idea of combining multiple trained neural networks using unlabeled data.
To this end, the proposed method makes use of generation, filtering, and aggregation of reliable pseudo-labels collected from unlabeled data.
Our method supports using an arbitrary number of input models with arbitrary architectures and categories.
arXiv Detail & Related papers (2021-10-20T04:17:25Z)
- Learning a Self-Expressive Network for Subspace Clustering [15.096251922264281]
We propose a novel framework for subspace clustering, termed Self-Expressive Network (SENet), which employs a properly designed neural network to learn a self-expressive representation of the data.
Our SENet can not only learn the self-expressive coefficients with desired properties on the training data, but also handle out-of-sample data.
In particular, SENet yields highly competitive performance on MNIST, Fashion MNIST and Extended MNIST and state-of-the-art performance on CIFAR-10.
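For readers unfamiliar with self-expressiveness, the sketch below evaluates the classic self-expression objective that SENet learns to produce with a network rather than solving for directly; the matrix form, the `lam` weight, and the L1 penalty are assumptions for illustration, not the paper's exact loss.

```python
import jax.numpy as jnp

def self_expression_objective(C, X, lam=0.1):
    # X: (n, d) data points as rows; C: (n, n) self-expressive coefficients.
    C = C - jnp.diag(jnp.diag(C))             # no point may reconstruct itself
    recon = jnp.sum((X - C @ X) ** 2)         # rebuild each row from the others
    return recon + lam * jnp.sum(jnp.abs(C))  # sparsity encourages subspace structure
```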
arXiv Detail & Related papers (2021-10-08T18:06:06Z)
- Solving Mixed Integer Programs Using Neural Networks [57.683491412480635]
This paper applies learning to the two key sub-tasks of a MIP solver: generating a high-quality joint variable assignment, and bounding the gap in objective value between that assignment and an optimal one.
Our approach constructs two corresponding neural network-based components, Neural Diving and Neural Branching, to use in a base MIP solver such as SCIP.
We evaluate our approach on six diverse real-world datasets, including two Google production datasets and MIPLIB, by training separate neural networks on each.
arXiv Detail & Related papers (2020-12-23T09:33:11Z)
- S^3-Rec: Self-Supervised Learning for Sequential Recommendation with Mutual Information Maximization [104.87483578308526]
We propose the model S3-Rec, which stands for Self-Supervised learning for Sequential Recommendation.
For our task, we devise four auxiliary self-supervised objectives to learn the correlations among attribute, item, subsequence, and sequence.
Extensive experiments conducted on six real-world datasets demonstrate the superiority of our proposed method over existing state-of-the-art methods.
arXiv Detail & Related papers (2020-08-18T11:44:10Z)
- AutoSimulate: (Quickly) Learning Synthetic Data Generation [70.82315853981838]
We propose an efficient alternative for optimal synthetic data generation based on a novel differentiable approximation of the objective.
We demonstrate that the proposed method finds the optimal data distribution faster (up to $50\times$), with significantly reduced training data generation (up to $30\times$) and better accuracy ($+8.7\%$) on real-world test datasets than previous methods.
arXiv Detail & Related papers (2020-08-16T11:36:11Z)