Related papers: MatRec: Matrix Factorization for Highly Skewed Dataset

URL: http://arxiv.org/abs/2011.04395v1
Date: Mon, 9 Nov 2020 12:55:38 GMT
Title: MatRec: Matrix Factorization for Highly Skewed Dataset
Authors: Hao Wang, Bing Ruan
Abstract summary: We propose a new matrix factorization algorithm for recommendation on highly skewed datasets. Experiments show that our method produces results comparable to those of popular recommender system algorithms.
Score: 4.658166900129066
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recommender systems are among the most successful AI technologies applied in internet corporations. Popular internet products such as TikTok, Amazon, and YouTube have all integrated recommender systems as a core product feature. Although recommender systems have achieved great success, it is well known that for highly skewed datasets, engineers and researchers need to adapt their methods to the specific problem to obtain good results. The inability to deal with highly skewed datasets usually creates hard computational problems for big data clusters and unsatisfactory results for customers. In this paper, we propose a new algorithm that solves the problem in the framework of matrix factorization. We model the data skewness factors in the theoretical modeling of the approach with easy-to-interpret and easy-to-implement formulas. We show in experiments that our method generates results comparably favorable to those of popular recommender system algorithms such as Learning to Rank, Alternating Least Squares, and Deep Matrix Factorization.

Related papers

Can Recommender Systems Teach Themselves? A Recursive Self-Improving Framework with Fidelity Control [82.30868101940068]
We propose a paradigm in which a model bootstraps its own performance without reliance on external data or teacher models. Our theoretical analysis shows that RSIR acts as a data-driven implicit regularizer, smoothing the optimization landscape. We show that even smaller models benefit, and weak models can generate effective training curricula for stronger ones.
arXiv Detail & Related papers (2026-02-17T15:31:32Z)
Does Weighting Improve Matrix Factorization for Recommender Systems? [36.1332376112504]
Matrix factorization is a widely used approach for top-N recommendation and collaborative filtering. In this paper, we conduct a systematic study of various weighting schemes and matrix factorization algorithms. We find that training with unweighted data can perform comparably to, and sometimes outperform, training with weighted data, especially for large models.
arXiv Detail & Related papers (2025-10-12T04:15:24Z)
OptiBench Meets ReSocratic: Measure and Improve LLMs for Optimization Modeling [62.19438812624467]
Large language models (LLMs) have exhibited their problem-solving abilities in mathematical reasoning. We propose OptiBench, a benchmark for end-to-end optimization problem-solving with human-readable inputs and outputs.
arXiv Detail & Related papers (2024-07-13T13:27:57Z)
CF Recommender System Based on Ontology and Nonnegative Matrix Factorization (NMF) [0.0]
This work addresses the recommender system's data sparsity and accuracy problems. The implemented approach efficiently reduces the sparsity of CF suggestions, improves their accuracy, and gives more relevant items as recommendations.
arXiv Detail & Related papers (2024-05-31T14:50:53Z)
Embedding in Recommender Systems: A Survey [54.55152033023537]
This survey presents a comprehensive analysis of advances in recommender system embedding techniques. In matrix-based scenarios, collaborative filtering generates embeddings that effectively model user-item preferences. We introduce emerging approaches, including AutoML, hashing techniques, and quantization methods, to enhance performance.
arXiv Detail & Related papers (2023-10-28T06:31:06Z)
Learning to Optimize Permutation Flow Shop Scheduling via Graph-based Imitation Learning [70.65666982566655]
Permutation flow shop scheduling (PFSS) is widely used in manufacturing systems. We propose to train the model via expert-driven imitation learning, which accelerates convergence more stably and accurately. Our model's network parameters are reduced to only 37% of theirs, and the solution gap of our model towards the expert solutions decreases from 6.8% to 1.3% on average.
arXiv Detail & Related papers (2022-10-31T09:46:26Z)
On the Generalizability and Predictability of Recommender Systems [33.46314108814183]
We give the first large-scale study of recommender system approaches. We create Reczilla, a meta-learning approach to recommender systems.
arXiv Detail & Related papers (2022-06-23T17:51:42Z)
HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models. We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
Ranking Cost: Building An Efficient and Scalable Circuit Routing Planner with Evolution-Based Optimization [49.207538634692916]
We propose a new algorithm for circuit routing, named Ranking Cost, to form an efficient and trainable router. In our method, we introduce a new set of variables called cost maps, which can help the A* router to find out proper paths. Our algorithm is trained in an end-to-end manner and does not use any artificial data or human demonstration.
arXiv Detail & Related papers (2021-10-08T07:22:45Z)
Solving weakly supervised regression problem using low-rank manifold regularization [77.34726150561087]
We solve a weakly supervised regression problem. By "weakly" we mean that for some training points the labels are known, for some they are unknown, and for others they are uncertain due to random noise or other reasons such as lack of resources. In the numerical section, we applied the suggested method to artificial and real datasets using Monte-Carlo modeling.
arXiv Detail & Related papers (2021-04-13T23:21:01Z)
You Only Compress Once: Optimal Data Compression for Estimating Linear Models [1.2845031126178592]
Many engineering systems that use linear models achieve computational efficiency through distributed systems and expert configuration. Conditionally sufficient statistics is a unified data compression and estimation strategy.
arXiv Detail & Related papers (2021-02-22T19:00:18Z)
Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques. Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance. We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches.
arXiv Detail & Related papers (2020-12-15T16:29:49Z)
Binary Matrix Factorisation via Column Generation [6.445605125467574]
We consider the problem of low-rank binary matrix factorisation (BMF) under arithmetic. We formulate the problem as a mixed integer linear program and use a large scale optimisation technique of column generation to solve it. Experimental results on real world datasets demonstrate that our proposed method is effective at producing highly accurate factorisations.
arXiv Detail & Related papers (2020-11-09T14:27:36Z)
Recommendation system using a deep learning and graph analysis approach [1.2183405753834562]
We propose a novel recommendation method based on Matrix Factorization and graph analysis methods. In addition, we leverage deep Autoencoders to initialize user and item latent factors, and a deep embedding method gathers users' latent factors from the user trust graph.
arXiv Detail & Related papers (2020-04-17T08:05:33Z)

This list is automatically generated from the titles and abstracts of the papers in this site.