Related papers: Binary Matrix Factorisation via Column Generation

Binary Matrix Factorisation via Column Generation

URL: http://arxiv.org/abs/2011.04457v3
Date: Tue, 3 Aug 2021 20:49:16 GMT
Title: Binary Matrix Factorisation via Column Generation
Authors: Reka A. Kovacs, Oktay Gunluk, Raphael A. Hauser
Abstract summary: We consider the problem of low-rank binary matrix factorisation (BMF) under arithmetic. We formulate the problem as a mixed integer linear program and use a large scale optimisation technique of column generation to solve it. Experimental results on real world datasets demonstrate that our proposed method is effective at producing highly accurate factorisations.
Score: 6.445605125467574
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Identifying discrete patterns in binary data is an important dimensionality reduction tool in machine learning and data mining. In this paper, we consider the problem of low-rank binary matrix factorisation (BMF) under Boolean arithmetic. Due to the hardness of this problem, most previous attempts rely on heuristic techniques. We formulate the problem as a mixed integer linear program and use a large scale optimisation technique of column generation to solve it without the need of heuristic pattern mining. Our approach focuses on accuracy and on the provision of optimality guarantees. Experimental results on real world datasets demonstrate that our proposed method is effective at producing highly accurate factorisations and improves on the previously available best known results for 15 out of 24 problem instances.

Related papers

Efficient algorithms for the Hadamard decomposition [14.653207365119796]
The Hadamard decomposition is a powerful technique for data analysis and matrix compression. In this paper, we develop a framework that decomposes a given matrix into the product of two or more low-rank approximations. We conduct experiments to compare our method with the existing descent-based approaches for the Hadamard decomposition.
arXiv Detail & Related papers (2025-04-18T11:14:25Z)
Computational Efficient Informative Nonignorable Matrix Completion: A Row- and Column-Wise Matrix U-Statistic Pseudo-Likelihood Approach [2.2306682526405868]
We establish a unified framework to deal with the high dimensional matrix completion problem. We derive a row- and column-wise matrix U-statistics type loss function, with the nuclear norm for regularization. A singular value proximal gradient algorithm is developed to solve the proposed optimization problem.
arXiv Detail & Related papers (2025-04-05T01:41:53Z)
Multiple Rotation Averaging with Constrained Reweighting Deep Matrix Factorization [22.487393413405954]
Multiple rotation averaging plays a crucial role in computer vision and robotics domains. This paper proposes an effective rotation averaging method for mining data patterns in a learning manner.
arXiv Detail & Related papers (2024-09-15T16:50:27Z)
Dynamic Incremental Optimization for Best Subset Selection [15.8362578568708]
Best subset selection is considered the gold standard for many learning problems. An efficient subset-dual algorithm is developed based on the primal and dual problem structures.
arXiv Detail & Related papers (2024-02-04T02:26:40Z)
Discretize Relaxed Solution of Spectral Clustering via a Non-Heuristic Algorithm [77.53604156112144]
We develop a first-order term to bridge the original problem and discretization algorithm. Since the non-heuristic method is aware of the original graph cut problem, the final discrete solution is more reliable.
arXiv Detail & Related papers (2023-10-19T13:57:38Z)
A Novel Maximum-Entropy-Driven Technique for Low-Rank Orthogonal Nonnegative Matrix Factorization with $\ell_0$-Norm sparsity Constraint [0.0]
In data-driven control and machine learning, a common requirement involves breaking down large matrices into smaller, low-rank factors. This paper introduces an innovative solution to the orthogonal nonnegative matrix factorization (ONMF) problem. The proposed method achieves comparable or improved reconstruction errors in line with the literature.
arXiv Detail & Related papers (2022-10-06T04:30:59Z)
Best Subset Selection with Efficient Primal-Dual Algorithm [24.568094642425837]
Best subset selection is considered the gold standard' for many learning problems. In this paper, we investigate the dual forms of a family of $ell$-regularized problems. An efficient primal-dual method has been developed based on the primal and dual problem structures.
arXiv Detail & Related papers (2022-07-05T14:07:52Z)
MIRACLE: Causally-Aware Imputation via Learning Missing Data Mechanisms [82.90843777097606]
We propose a causally-aware imputation algorithm (MIRACLE) for missing data. MIRACLE iteratively refines the imputation of a baseline by simultaneously modeling the missingness generating mechanism. We conduct extensive experiments on synthetic and a variety of publicly available datasets to show that MIRACLE is able to consistently improve imputation.
arXiv Detail & Related papers (2021-11-04T22:38:18Z)
Solving weakly supervised regression problem using low-rank manifold regularization [77.34726150561087]
We solve a weakly supervised regression problem. Under "weakly" we understand that for some training points the labels are known, for some unknown, and for others uncertain due to the presence of random noise or other reasons such as lack of resources. In the numerical section, we applied the suggested method to artificial and real datasets using Monte-Carlo modeling.
arXiv Detail & Related papers (2021-04-13T23:21:01Z)
Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_2,p$-norm regularization. We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z)
Multi-View Spectral Clustering with High-Order Optimal Neighborhood Laplacian Matrix [57.11971786407279]
Multi-view spectral clustering can effectively reveal the intrinsic cluster structure among data. This paper proposes a multi-view spectral clustering algorithm that learns a high-order optimal neighborhood Laplacian matrix. Our proposed algorithm generates the optimal Laplacian matrix by searching the neighborhood of the linear combination of both the first-order and high-order base.
arXiv Detail & Related papers (2020-08-31T12:28:40Z)
Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching. We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT)
arXiv Detail & Related papers (2020-06-10T15:00:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.