Related papers: Accelerating Matrix Factorization by Dynamic Pruning for Fast Recommendation

URL: http://arxiv.org/abs/2404.04265v1
Date: Mon, 18 Mar 2024 16:27:33 GMT
Title: Accelerating Matrix Factorization by Dynamic Pruning for Fast Recommendation
Authors: Yining Wu, Shengyu Duan, Gaole Sai, Chenhong Cao, Guobing Zou,
Abstract summary: Matrix factorization (MF) is a widely used collaborative filtering algorithm for recommendation systems (RSs). With the dramatically increased number of users/items in current RSs, the computational complexity of training an MF model increases substantially. We propose algorithmic methods to accelerate MF without requiring any additional computational resources.
Score: 0.49399484784577985
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Matrix factorization (MF) is a widely used collaborative filtering (CF) algorithm for recommendation systems (RSs), due to its high prediction accuracy, great flexibility and high efficiency in big data processing. However, with the dramatically increased number of users/items in current RSs, the computational complexity of training an MF model increases substantially. Many existing works have accelerated MF either by adding computational resources or by using parallel systems, both of which introduce a large cost. In this paper, we propose algorithmic methods to accelerate MF without requiring any additional computational resources. Specifically, we observe fine-grained structured sparsity in the decomposed feature matrices when a certain threshold is applied. This fine-grained structured sparsity causes a large number of unnecessary operations during both matrix multiplication and latent factor update, increasing the computation time of the MF training process. Based on this observation, we first propose to rearrange the feature matrices based on joint sparsity, which potentially makes a latent vector with a smaller index denser than one with a larger index. The feature matrix rearrangement serves to limit the error caused by the pruning process performed afterwards. We then propose to prune the insignificant latent factors by an early stopping process during both matrix multiplication and latent factor update. The pruning process is performed dynamically according to the sparsity of the latent factors for different users/items, to accelerate the process. The experiments show that our method can achieve speedups of 1.2-1.65x, with an error increase of up to 20.08%, compared with the conventional MF training process. We also prove the proposed methods are applicable considering different hyperparameters including optimizer, optimization strategy and initialization method.
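
The two steps described in the abstract (rearranging latent dimensions by joint sparsity, then stopping the dot product early) can be illustrated with a rough sketch. This is not the authors' implementation. The function names, the `threshold` parameter, the layout of `P` (users x factors) and `Q` (items x factors), and the rule "stop after the last factor whose magnitude reaches the threshold" are all assumptions made for illustration.

```python
import numpy as np


def joint_sparsity_order(P, Q, threshold):
    """Return latent-dimension indices ordered from densest to sparsest.

    Assumption: a factor counts as "sparse" when its magnitude is below
    `threshold`, and sparsity is counted jointly over P and Q.
    """
    sparse_counts = (np.abs(P) < threshold).sum(axis=0) \
        + (np.abs(Q) < threshold).sum(axis=0)
    return np.argsort(sparse_counts, kind="stable")


def effective_length(vec, threshold):
    """Number of leading factors to keep: up to the last significant one."""
    significant = np.flatnonzero(np.abs(vec) >= threshold)
    return int(significant[-1]) + 1 if significant.size else 0


def pruned_dot(p_u, q_i, threshold):
    """Dot product that stops early once either vector has no significant
    factors left (hypothetical early-stopping rule)."""
    k = min(effective_length(p_u, threshold), effective_length(q_i, threshold))
    return float(p_u[:k] @ q_i[:k])


# Toy data: 2 users, 2 items, 3 latent factors (values are made up).
P = np.array([[0.9, 0.01, 0.5], [0.8, 0.02, 0.0]])
Q = np.array([[0.7, 0.0, 0.6], [0.5, 0.01, 0.4]])
threshold = 0.05

order = joint_sparsity_order(P, Q, threshold)
P_r, Q_r = P[:, order], Q[:, order]  # dense dimensions moved to the front
prediction = pruned_dot(P_r[0], Q_r[1], threshold)
```

In this toy case the rearrangement moves the mostly-small second dimension to the end, so the pruned dot product for user 0 and item 1 skips it and differs from the full dot product only by the contribution of that insignificant factor. A real training loop would apply the same cutoff to the latent factor update as well, which this sketch leaves out.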

Related papers

Orthogonal Finetuning Made Scalable [87.49040247077389]
Orthogonal finetuning (OFT) offers highly parameter-efficient adaptation while preventing catastrophic forgetting, but its high runtime and memory demands limit practical deployment. We identify the core computational bottleneck in OFT as its weight-centric implementation, which relies on costly matrix-matrix multiplications with cubic complexity. We propose OFTv2, an input-centric reformulation that instead uses matrix-vector multiplications (i.e., matrix-free computation), reducing the computational cost to quadratic. These modifications allow OFTv2 to achieve up to 10x faster training and 3x lower GPU memory usage without compromising performance.
arXiv Detail & Related papers (2025-06-24T17:59:49Z)
Preconditioned Additive Gaussian Processes with Fourier Acceleration [2.292881746604941]
We introduce a matrix-free method to achieve nearly linear complexity in the multiplication of kernel matrices and their derivatives. To address high-dimensional problems, we propose an additive kernel approach. Each sub-kernel captures lower-order feature interactions, allowing for the efficient application of the NFFT method.
arXiv Detail & Related papers (2025-04-01T07:14:06Z)
A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning [74.80956524812714]
We tackle the general differentiable meta learning problem that is ubiquitous in modern deep learning. These problems are often formalized as bi-level optimizations (BLO). We introduce a novel perspective by turning a given BLO problem into a stochastic optimization, where the inner loss function becomes a smooth distribution, and the outer loss becomes an expected loss over the inner distribution.
arXiv Detail & Related papers (2024-10-14T12:10:06Z)
An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks. The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions. We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute these covariance matrices.
arXiv Detail & Related papers (2023-09-30T15:57:14Z)
A Second-Order Majorant Algorithm for Nonnegative Matrix Factorization [2.646309221150203]
We introduce a general second-order optimization framework for NMF under both quadratic and $\beta$-divergence loss functions. Second-Order Majorant (SOM) constructs a local quadratic majorization of the loss function by majorizing its Hessian matrix. We show that mSOM consistently outperforms state-of-the-art algorithms across multiple loss functions.
arXiv Detail & Related papers (2023-03-31T12:09:36Z)
Unitary Approximate Message Passing for Matrix Factorization [90.84906091118084]
We consider matrix factorization (MF) with certain constraints, which finds wide applications in various areas. We develop a Bayesian approach to MF with an efficient message passing implementation, called UAMPMF. We show that UAMPMF significantly outperforms state-of-the-art algorithms in terms of recovery accuracy, robustness and computational complexity.
arXiv Detail & Related papers (2022-07-31T12:09:32Z)
Asymmetric Scalable Cross-modal Hashing [51.309905690367835]
Cross-modal hashing is a successful method for solving the large-scale multimedia retrieval problem. We propose a novel Asymmetric Scalable Cross-Modal Hashing (ASCMH) to address these issues. Our ASCMH outperforms the state-of-the-art cross-modal hashing methods in terms of accuracy and efficiency.
arXiv Detail & Related papers (2022-07-26T04:38:47Z)
FastSTMF: Efficient tropical matrix factorization algorithm for sparse data [0.0]
Matrix factorization, one of the most popular methods in machine learning, has recently benefited from introducing non-linearity in prediction tasks using the tropical semiring. In our work, we propose a new method, FastSTMF, based on Sparse Tropical Matrix Factorization (STMF). We evaluate FastSTMF on synthetic and real gene expression data from the TCGA database, and the results show that FastSTMF outperforms STMF in both accuracy and running time.
arXiv Detail & Related papers (2022-05-13T13:13:06Z)
Weighted Low Rank Matrix Approximation and Acceleration [0.5177947445379687]
Low-rank matrix approximation (LRMA) is one of the central concepts in machine learning. Low-rank matrix completion (LRMC) solves the LRMA problem when some observations are missing. We propose an algorithm for solving the weighted problem, as well as two acceleration techniques.
arXiv Detail & Related papers (2021-09-22T22:03:48Z)
Self-supervised Symmetric Nonnegative Matrix Factorization [82.59905231819685]
Symmetric nonnegative matrix factorization (SNMF) has been demonstrated to be a powerful method for data clustering. Inspired by ensemble clustering that aims to seek better clustering results, we propose self-supervised SNMF (S$^3$NMF). We take advantage of the sensitivity to code characteristic of SNMF, without relying on any additional information.
arXiv Detail & Related papers (2021-03-02T12:47:40Z)
Fast and Accurate Pseudoinverse with Sparse Matrix Reordering and Incremental Approach [4.710916891482697]
A pseudoinverse is a generalization of a matrix inverse, which has been extensively utilized in machine learning. FastPI is a novel incremental singular value decomposition (SVD) based pseudoinverse method for sparse matrices. We show that FastPI computes the pseudoinverse faster than other approximate methods without loss of accuracy.
arXiv Detail & Related papers (2020-11-09T07:47:10Z)
Rank and run-time aware compression of NLP Applications [12.965657113072325]
This paper proposes a new compression technique called Hybrid Matrix Factorization. It improves low-rank matrix factorization techniques by doubling the rank of the matrix. It can achieve more than 2.32x faster inference run-time than pruning and 16.77% better accuracy than LMF.
arXiv Detail & Related papers (2020-10-06T16:03:15Z)
Augmentation of the Reconstruction Performance of Fuzzy C-Means with an Optimized Fuzzification Factor Vector [99.19847674810079]
Fuzzy C-Means (FCM) is one of the most frequently used methods to construct information granules. In this paper, we augment the FCM-based degranulation mechanism by introducing a vector of fuzzification factors. Experiments completed for both synthetic and publicly available datasets show that the proposed approach outperforms the generic data reconstruction approach.
arXiv Detail & Related papers (2020-04-13T04:17:30Z)
A High-Performance Implementation of Bayesian Matrix Factorization with Limited Communication [10.639704288188767]
Matrix factorization algorithms can quantify uncertainty in their predictions and avoid over-fitting. They have not been widely used on large-scale data because of their prohibitive computational cost. We show that the state-of-the-art of both approaches to scalability can be combined.
arXiv Detail & Related papers (2020-04-06T11:16:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.