Mini-Hes: A Parallelizable Second-order Latent Factor Analysis Model
- URL: http://arxiv.org/abs/2402.11948v1
- Date: Mon, 19 Feb 2024 08:43:00 GMT
- Title: Mini-Hes: A Parallelizable Second-order Latent Factor Analysis Model
- Authors: Jialiang Wang, Weiling Li, Yurong Zhong, Xin Luo
- Abstract summary: This paper proposes a mini-block diagonal Hessian-free (Mini-Hes) optimization for building an LFA model.
Experiment results indicate that, with Mini-Hes, the LFA model outperforms several state-of-the-art models in addressing the missing data estimation task.
- Score: 8.06111903129142
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Interactions among a large number of entities are naturally high-dimensional and incomplete (HDI) in many big-data-related tasks. Behavioral characteristics of users are hidden in these interactions; hence, effective representation of HDI data is a fundamental task for understanding user behaviors. The latent factor analysis (LFA) model has proven effective in representing HDI data. The performance of an LFA model relies heavily on its training process, which is a non-convex optimization. It has been proven that incorporating local curvature and preprocessing gradients during training can lead to superior performance compared to LFA models built with first-order methods. However, as data volume escalates, second-order algorithms become increasingly infeasible. To address this pivotal issue, this paper proposes a mini-block diagonal Hessian-free (Mini-Hes) optimization for building an LFA model. Based on an analysis of the Hessian matrix of the LFA model, Mini-Hes leverages the dominant diagonal blocks of the generalized Gauss-Newton matrix and serves as an intermediary strategy bridging the gap between first-order and second-order optimization methods. Experiment results indicate that, with Mini-Hes, the LFA model outperforms several state-of-the-art models on missing data estimation tasks on multiple real HDI datasets from recommender systems. (The source code of Mini-Hes is available at
https://github.com/Goallow/Mini-Hes)
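To make the mini-block idea concrete, the sketch below works through a per-row block second-order step for an LFA model R ≈ PQᵀ trained on observed entries only. It is a minimal illustration under assumed squared loss and L2 regularization, not the authors' Mini-Hes implementation (see the repository above): under those assumptions, each row's block of the generalized Gauss-Newton matrix reduces to a small k × k system that can be solved exactly or with a few conjugate-gradient steps.

```python
# A minimal, illustrative sketch of per-row block second-order updates for an
# LFA model R ~= P @ Q.T trained on observed entries only. This is NOT the
# authors' Mini-Hes code; it assumes squared loss with L2 regularization,
# under which each row's block of the generalized Gauss-Newton matrix is a
# small k x k system that can be solved exactly.
import numpy as np

def user_block_step(P, Q, rows, cols, vals, lam=0.05):
    """One pass of block-Newton updates over the user factors P.

    rows, cols, vals hold the coordinates and values of observed entries.
    For squared loss, the Gauss-Newton block for user u is
        G_u = sum_{i in Omega(u)} q_i q_i^T + lam * I,
    and the block step solves G_u d = g_u, then sets p_u <- p_u - d.
    """
    k = P.shape[1]
    for u in np.unique(rows):
        mask = rows == u                    # observed entries in row u
        Qi = Q[cols[mask]]                  # (n_u, k) item factors involved
        err = Qi @ P[u] - vals[mask]        # residuals on those entries
        g = Qi.T @ err + lam * P[u]         # gradient of the block objective
        G = Qi.T @ Qi + lam * np.eye(k)     # dominant k x k diagonal block
        P[u] -= np.linalg.solve(G, g)       # exact solve; CG also works
    return P
```

The item factors Q get the symmetric update with rows and columns swapped, and alternating the two passes gives one training epoch. Because the blocks are independent, the loop parallelizes naturally across rows, which is the property the title emphasizes.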
Related papers
- Latent Semantic Consensus For Deterministic Geometric Model Fitting [109.44565542031384]
We propose an effective method called Latent Semantic Consensus (LSC).
LSC formulates the model fitting problem into two latent semantic spaces based on data points and model hypotheses.
LSC is able to provide consistent and reliable solutions within only a few milliseconds for general multi-structural model fitting.
arXiv Detail & Related papers (2024-03-11T05:35:38Z)
- Improved Distribution Matching for Dataset Condensation [91.55972945798531]
We propose a novel dataset condensation method based on distribution matching.
Our simple yet effective method outperforms most previous optimization-oriented methods with much fewer computational resources.
arXiv Detail & Related papers (2023-07-19T04:07:33Z)
- Robust Learning with Progressive Data Expansion Against Spurious Correlation [65.83104529677234]
We study the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features.
Our analysis suggests that imbalanced data groups and easily learnable spurious features can lead to the dominance of spurious features during the learning process.
We propose a new training algorithm called PDE that efficiently enhances the model's robustness for a better worst-group performance.
arXiv Detail & Related papers (2023-06-08T05:44:06Z)
- Fast Latent Factor Analysis via a Fuzzy PID-Incorporated Stochastic Gradient Descent Algorithm [1.984879854062214]
A stochastic gradient descent (SGD)-based latent factor analysis model is remarkably effective in extracting valuable information from an HDI matrix.
A standard SGD algorithm learns a latent factor relying only on the gradient of the current instance error, without considering past update information.
This paper proposes a Fuzzy PID-incorporated SGD algorithm with two ideas: 1) rebuilding the instance error by considering past update information in an efficient way following the principle of PID, and 2) implementing hyper-learning and gain adaptation following fuzzy rules.
arXiv Detail & Related papers (2023-03-07T14:51:09Z)
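As a rough illustration of the PID principle described in the entry above, the sketch below rebuilds the instance error from proportional, integral, and derivative terms before it enters the SGD update. The class name and gain values are hypothetical, and the paper's fuzzy-rule gain adaptation is omitted.

```python
# A hedged sketch of rebuilding the instance error with proportional,
# integral, and derivative terms, following the general PID principle the
# entry describes. Gains and state handling are illustrative; the paper's
# fuzzy-rule gain adaptation is omitted.
class PIDError:
    def __init__(self, kp=1.0, ki=0.1, kd=0.1):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0   # running sum of past errors (I term)
        self.prev = 0.0       # previous instance error (D term)

    def __call__(self, err):
        self.integral += err
        adjusted = (self.kp * err
                    + self.ki * self.integral
                    + self.kd * (err - self.prev))
        self.prev = err
        return adjusted       # used in place of the raw error in the SGD step
```

In an SGD update for latent factors, the raw instance error e = r_ui − p_uᵀq_i would simply be passed through this controller before the gradient is computed.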
- uGLAD: Sparse graph recovery by optimizing deep unrolled networks [11.48281545083889]
We present a novel technique to perform sparse graph recovery by optimizing deep unrolled networks.
Our model, uGLAD, builds upon and extends the state-of-the-art model GLAD to the unsupervised setting.
We evaluate model results on synthetic Gaussian data, non-Gaussian data generated from Gene Regulatory Networks, and present a case study in anaerobic digestion.
arXiv Detail & Related papers (2022-05-23T20:20:27Z)
- Graph-incorporated Latent Factor Analysis for High-dimensional and Sparse Matrices [9.51012204233452]
A high-dimensional and sparse (HiDS) matrix is frequently encountered in big data applications such as e-commerce systems or social network services.
This paper proposes a graph-incorporated latent factor analysis (GLFA) model to perform representation learning on an HiDS matrix.
Experimental results on three real-world datasets demonstrate that GLFA outperforms six state-of-the-art models in predicting the missing data of an HiDS matrix.
arXiv Detail & Related papers (2022-04-16T15:04:34Z)
- Model-Agnostic Multitask Fine-tuning for Few-shot Vision-Language Transfer Learning [59.38343286807997]
We propose Model-Agnostic Multitask Fine-tuning (MAMF) for vision-language models on unseen tasks.
Compared with model-agnostic meta-learning (MAML), MAMF discards the bi-level optimization and uses only first-order gradients.
We show that MAMF consistently outperforms the classical fine-tuning method for few-shot transfer learning on five benchmark datasets.
arXiv Detail & Related papers (2022-03-09T17:26:53Z)
- Rank-R FNN: A Tensor-Based Learning Model for High-Order Data Classification [69.26747803963907]
Rank-R Feedforward Neural Network (FNN) is a tensor-based nonlinear learning model that imposes Canonical/Polyadic decomposition on its parameters.
It handles inputs as multilinear arrays, bypassing the need for vectorization, and can thus fully exploit the structural information along every data dimension.
We establish the universal approximation and learnability properties of Rank-R FNN, and we validate its performance on real-world hyperspectral datasets.
arXiv Detail & Related papers (2021-04-11T16:37:32Z)
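To illustrate the CP-constrained parameterization described in the Rank-R FNN entry above: for a matrix input X, a hidden unit whose weight tensor is restricted to rank-R CP form W = Σ_r a_r b_rᵀ can be evaluated as Σ_r a_rᵀ X b_r, without vectorizing X. The sketch below is an assumption-laden illustration of that idea, not the paper's architecture; shapes, names, and the activation are hypothetical.

```python
# A hedged sketch of a single rank-R CP-constrained hidden unit: the weight
# matrix W = sum_r outer(a_r, b_r) is never materialized; the response
# <W, X> is computed as sum_r a_r^T X b_r. Shapes and the activation are
# illustrative assumptions, not the paper's exact design.
import numpy as np

def cp_unit(X, A, B):
    """X: (d1, d2) input array; A: (R, d1) and B: (R, d2) CP factors."""
    # sum_r a_r^T X b_r, computed without forming the d1 x d2 weight matrix
    return sum(a @ X @ b for a, b in zip(A, B))

# toy usage with random factors
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 6))
A, B = rng.normal(size=(3, 8)), rng.normal(size=(3, 6))
h = np.tanh(cp_unit(X, A, B))   # scalar hidden response with nonlinearity
```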
- Revisiting Graph based Collaborative Filtering: A Linear Residual Graph Convolutional Network Approach [55.44107800525776]
Graph Convolutional Networks (GCNs) are state-of-the-art graph-based representation learning models.
In this paper, we revisit GCN-based Collaborative Filtering (CF) for Recommender Systems (RS).
We show that removing non-linearities would enhance recommendation performance, consistent with the theories in simple graph convolutional networks.
We propose a residual network structure that is specifically designed for CF with user-item interaction modeling.
arXiv Detail & Related papers (2020-01-28T04:41:25Z)
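A minimal sketch of the linearity-plus-residual idea from the entry above: embeddings are propagated over a normalized user-item graph with no activation between layers, and the per-layer outputs are combined by residual summation. The normalization, layer count, and readout here are assumptions for illustration, not the paper's exact design.

```python
# A hedged sketch of linear (non-linearity-free) graph convolution for CF:
# embeddings are propagated over the normalized user-item graph and the
# per-layer outputs are combined with a residual/sum readout. Illustrative
# only; layer count, normalization, and readout are assumptions.
import numpy as np

def linear_gcn_embeddings(A_hat, E0, num_layers=3):
    """A_hat: normalized adjacency of the user-item graph (n x n);
    E0: initial embeddings (n x k). Returns the residual-summed embedding."""
    E, out = E0, E0.copy()
    for _ in range(num_layers):
        E = A_hat @ E            # purely linear propagation, no activation
        out = out + E            # residual accumulation across layers
    return out / (num_layers + 1)
```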
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.