Infinite Recommendation Networks: A Data-Centric Approach
- URL: http://arxiv.org/abs/2206.02626v1
- Date: Fri, 3 Jun 2022 00:34:13 GMT
- Title: Infinite Recommendation Networks: A Data-Centric Approach
- Authors: Noveen Sachdeva, Mehak Preet Dhaliwal, Carole-Jean Wu, Julian McAuley
- Abstract summary: We leverage the Neural Tangent Kernel to train infinitely-wide neural networks to devise $\infty$-AE: an autoencoder with infinitely-wide bottleneck layers.
We also develop Distill-CF for synthesizing tiny, high-fidelity data summaries.
We observe 96-105% of $\infty$-AE's performance on the full dataset with as little as 0.1% of the original dataset size.
- Score: 8.044430277912936
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We leverage the Neural Tangent Kernel and its equivalence to training
infinitely-wide neural networks to devise $\infty$-AE: an autoencoder with
infinitely-wide bottleneck layers. The outcome is a highly expressive yet
simplistic recommendation model with a single hyper-parameter and a closed-form
solution. Leveraging $\infty$-AE's simplicity, we also develop Distill-CF for
synthesizing tiny, high-fidelity data summaries which distill the most
important knowledge from the extremely large and sparse user-item interaction
matrix for efficient and accurate subsequent data-usage like model training,
inference, architecture search, etc. This takes a data-centric approach to
recommendation, where we aim to improve the quality of logged user-feedback
data for subsequent modeling, independent of the learning algorithm. We
particularly utilize the concept of differentiable Gumbel-sampling to handle
the inherent data heterogeneity, sparsity, and semi-structuredness, while being
scalable to datasets with hundreds of millions of user-item interactions. Both
of our proposed approaches significantly outperform their respective
state-of-the-art and when used together, we observe 96-105% of $\infty$-AE's
performance on the full dataset with as little as 0.1% of the original dataset
size, leading us to explore the counter-intuitive question: Is more data what
you need for better recommendation?
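As a rough illustration of the two ingredients above, the sketch below (written for this summary, not taken from the authors' code) shows the closed-form kernel-regression solve that an infinitely-wide autoencoder reduces to under the NTK, and a standard Gumbel-softmax sampler of the kind a differentiable data-distillation objective can backpropagate through. The Gram matrices, the `reg` hyper-parameter name, and the function names are assumptions made for illustration.

```python
# Illustrative sketch only: assumes a precomputed NTK Gram matrix over users
# (e.g. from the neural-tangents library) and a binary user-item matrix X_train.
import jax
import jax.numpy as jnp

def infinite_ae_closed_form(K_train, K_test_train, X_train, reg=1.0):
    """Under the NTK, training the infinitely-wide autoencoder reduces to
    kernel ridge regression: one linear solve with a single hyper-parameter."""
    n = K_train.shape[0]
    alpha = jnp.linalg.solve(K_train + reg * jnp.eye(n), X_train)  # (n, items)
    return K_test_train @ alpha  # predicted interaction scores for held-out users

def gumbel_softmax(key, logits, tau=0.5):
    """Differentiable relaxation of categorical sampling (Gumbel-softmax),
    the kind of sampler a distillation objective can optimize end-to-end."""
    g = jax.random.gumbel(key, logits.shape)
    return jax.nn.softmax((logits + g) / tau, axis=-1)
```

Because gradients flow through both the solve and the sampler, a tiny synthetic summary can be optimized directly, which is roughly the data-centric step the abstract describes.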
Related papers
- Leveraging Variation Theory in Counterfactual Data Augmentation for Optimized Active Learning [19.962212551963383]
Active Learning (AL) allows models to learn interactively from user feedback.
This paper introduces a counterfactual data augmentation approach to AL.
arXiv Detail & Related papers (2024-08-07T14:55:04Z)
- Data curation via joint example selection further accelerates multimodal learning [3.329535792151987]
We show that jointly selecting batches of data is more effective for learning than selecting examples independently.
We derive a simple and tractable algorithm for selecting such batches, which significantly accelerates training beyond individually-prioritized data points.
arXiv Detail & Related papers (2024-06-25T16:52:37Z)
- Generative Expansion of Small Datasets: An Expansive Graph Approach [13.053285552524052]
We introduce an Expansive Synthesis model that generates large-scale, information-rich datasets from minimal samples.
An autoencoder with self-attention layers and optimal transport refines distributional consistency.
Results show comparable performance, demonstrating the model's potential to augment training data effectively.
arXiv Detail & Related papers (2024-06-25T02:59:02Z)
- Improved Distribution Matching for Dataset Condensation [91.55972945798531]
We propose a novel dataset condensation method based on distribution matching.
Our simple yet effective method outperforms most previous optimization-oriented methods while using far fewer computational resources.
arXiv Detail & Related papers (2023-07-19T04:07:33Z)
- HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
- Condensing Graphs via One-Step Gradient Matching [50.07587238142548]
We propose a one-step gradient matching scheme, which performs gradient matching for only one single step without training the network weights.
Our theoretical analysis shows this strategy can generate synthetic graphs that lead to lower classification loss on real graphs.
In particular, we are able to reduce the dataset size by 90% while approximating up to 98% of the original performance.
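To make the one-step idea concrete, here is a minimal sketch (illustrative only, not the paper's implementation): gradients of a task loss are computed once at a random initialization on a real and a synthetic batch, and their mismatch becomes the condensation objective; `loss_fn`, `params`, the batch tuples, and the cosine criterion are assumptions.

```python
import jax
import jax.numpy as jnp

def one_step_gradient_match(params, loss_fn, real_batch, syn_batch):
    """Match gradients at a single, untrained initialization: minimizing this
    over the synthetic data makes it induce the same first update as real data."""
    g_real = jax.grad(loss_fn)(params, *real_batch)
    g_syn = jax.grad(loss_fn)(params, *syn_batch)
    vr = jnp.concatenate([g.ravel() for g in jax.tree_util.tree_leaves(g_real)])
    vs = jnp.concatenate([g.ravel() for g in jax.tree_util.tree_leaves(g_syn)])
    # cosine distance between the two gradient directions
    return 1.0 - jnp.dot(vr, vs) / (jnp.linalg.norm(vr) * jnp.linalg.norm(vs) + 1e-8)
```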
arXiv Detail & Related papers (2022-06-15T18:20:01Z)
- Model Composition: Can Multiple Neural Networks Be Combined into a Single Network Using Only Unlabeled Data? [6.0945220518329855]
This paper investigates the idea of combining multiple trained neural networks using unlabeled data.
To this end, the proposed method makes use of generation, filtering, and aggregation of reliable pseudo-labels collected from unlabeled data.
Our method supports using an arbitrary number of input models with arbitrary architectures and categories.
arXiv Detail & Related papers (2021-10-20T04:17:25Z)
- Learning a Self-Expressive Network for Subspace Clustering [15.096251922264281]
We propose a novel framework for subspace clustering, termed Self-Expressive Network (SENet), which employs a properly designed neural network to learn a self-expressive representation of the data.
Our SENet can not only learn the self-expressive coefficients with desired properties on the training data, but also handle out-of-sample data.
In particular, SENet yields highly competitive performance on MNIST, Fashion MNIST and Extended MNIST and state-of-the-art performance on CIFAR-10.
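For readers unfamiliar with self-expressiveness, the sketch below evaluates the classic self-expression objective that SENet learns to produce with a network rather than solving for directly; the matrix form, the `lam` weight, and the L1 penalty are assumptions for illustration, not the paper's exact loss.

```python
import jax.numpy as jnp

def self_expression_objective(C, X, lam=0.1):
    # X: (n, d) data points as rows; C: (n, n) self-expressive coefficients.
    C = C - jnp.diag(jnp.diag(C))             # no point may reconstruct itself
    recon = jnp.sum((X - C @ X) ** 2)         # rebuild each row from the others
    return recon + lam * jnp.sum(jnp.abs(C))  # sparsity encourages subspace structure
```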
arXiv Detail & Related papers (2021-10-08T18:06:06Z)
- Solving Mixed Integer Programs Using Neural Networks [57.683491412480635]
This paper applies learning to the two key sub-tasks of a MIP solver: generating a high-quality joint variable assignment, and bounding the gap in objective value between that assignment and an optimal one.
Our approach constructs two corresponding neural network-based components, Neural Diving and Neural Branching, to use in a base MIP solver such as SCIP.
We evaluate our approach on six diverse real-world datasets, including two Google production datasets and MIPLIB, by training separate neural networks on each.
arXiv Detail & Related papers (2020-12-23T09:33:11Z)
- S^3-Rec: Self-Supervised Learning for Sequential Recommendation with Mutual Information Maximization [104.87483578308526]
We propose the model S3-Rec, which stands for Self-Supervised learning for Sequential Recommendation.
For our task, we devise four auxiliary self-supervised objectives to learn the correlations among attribute, item, subsequence, and sequence.
Extensive experiments conducted on six real-world datasets demonstrate the superiority of our proposed method over existing state-of-the-art methods.
arXiv Detail & Related papers (2020-08-18T11:44:10Z)
- AutoSimulate: (Quickly) Learning Synthetic Data Generation [70.82315853981838]
We propose an efficient alternative for optimal synthetic data generation based on a novel differentiable approximation of the objective.
We demonstrate that the proposed method finds the optimal data distribution faster (up to $50\times$), with significantly reduced training data generation (up to $30\times$) and better accuracy ($+8.7\%$) on real-world test datasets than previous methods.
arXiv Detail & Related papers (2020-08-16T11:36:11Z)