Binary Matrix Factorisation via Column Generation
- URL: http://arxiv.org/abs/2011.04457v3
- Date: Tue, 3 Aug 2021 20:49:16 GMT
- Title: Binary Matrix Factorisation via Column Generation
- Authors: Reka A. Kovacs, Oktay Gunluk, Raphael A. Hauser
- Abstract summary: We consider the problem of low-rank binary matrix factorisation (BMF) under arithmetic.
We formulate the problem as a mixed integer linear program and use a large scale optimisation technique of column generation to solve it.
Experimental results on real world datasets demonstrate that our proposed method is effective at producing highly accurate factorisations.
- Score: 6.445605125467574
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Identifying discrete patterns in binary data is an important dimensionality
reduction tool in machine learning and data mining. In this paper, we consider
the problem of low-rank binary matrix factorisation (BMF) under Boolean
arithmetic. Due to the hardness of this problem, most previous attempts rely on
heuristic techniques. We formulate the problem as a mixed integer linear
program and use a large scale optimisation technique of column generation to
solve it without the need of heuristic pattern mining. Our approach focuses on
accuracy and on the provision of optimality guarantees. Experimental results on
real world datasets demonstrate that our proposed method is effective at
producing highly accurate factorisations and improves on the previously
available best known results for 15 out of 24 problem instances.
Related papers
- Multiple Rotation Averaging with Constrained Reweighting Deep Matrix Factorization [22.487393413405954]
Multiple rotation averaging plays a crucial role in computer vision and robotics domains.
This paper proposes an effective rotation averaging method for mining data patterns in a learning manner.
arXiv Detail & Related papers (2024-09-15T16:50:27Z) - Dynamic Incremental Optimization for Best Subset Selection [15.8362578568708]
Best subset selection is considered the gold standard for many learning problems.
An efficient subset-dual algorithm is developed based on the primal and dual problem structures.
arXiv Detail & Related papers (2024-02-04T02:26:40Z) - Discretize Relaxed Solution of Spectral Clustering via a Non-Heuristic
Algorithm [77.53604156112144]
We develop a first-order term to bridge the original problem and discretization algorithm.
Since the non-heuristic method is aware of the original graph cut problem, the final discrete solution is more reliable.
arXiv Detail & Related papers (2023-10-19T13:57:38Z) - A Novel Maximum-Entropy-Driven Technique for Low-Rank Orthogonal
Nonnegative Matrix Factorization with $\ell_0$-Norm sparsity Constraint [0.0]
In data-driven control and machine learning, a common requirement involves breaking down large matrices into smaller, low-rank factors.
This paper introduces an innovative solution to the orthogonal nonnegative matrix factorization (ONMF) problem.
The proposed method achieves comparable or improved reconstruction errors in line with the literature.
arXiv Detail & Related papers (2022-10-06T04:30:59Z) - Best Subset Selection with Efficient Primal-Dual Algorithm [24.568094642425837]
Best subset selection is considered the gold standard' for many learning problems.
In this paper, we investigate the dual forms of a family of $ell$-regularized problems.
An efficient primal-dual method has been developed based on the primal and dual problem structures.
arXiv Detail & Related papers (2022-07-05T14:07:52Z) - MIRACLE: Causally-Aware Imputation via Learning Missing Data Mechanisms [82.90843777097606]
We propose a causally-aware imputation algorithm (MIRACLE) for missing data.
MIRACLE iteratively refines the imputation of a baseline by simultaneously modeling the missingness generating mechanism.
We conduct extensive experiments on synthetic and a variety of publicly available datasets to show that MIRACLE is able to consistently improve imputation.
arXiv Detail & Related papers (2021-11-04T22:38:18Z) - Solving weakly supervised regression problem using low-rank manifold
regularization [77.34726150561087]
We solve a weakly supervised regression problem.
Under "weakly" we understand that for some training points the labels are known, for some unknown, and for others uncertain due to the presence of random noise or other reasons such as lack of resources.
In the numerical section, we applied the suggested method to artificial and real datasets using Monte-Carlo modeling.
arXiv Detail & Related papers (2021-04-13T23:21:01Z) - Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature
Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_2,p$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z) - Multi-View Spectral Clustering with High-Order Optimal Neighborhood
Laplacian Matrix [57.11971786407279]
Multi-view spectral clustering can effectively reveal the intrinsic cluster structure among data.
This paper proposes a multi-view spectral clustering algorithm that learns a high-order optimal neighborhood Laplacian matrix.
Our proposed algorithm generates the optimal Laplacian matrix by searching the neighborhood of the linear combination of both the first-order and high-order base.
arXiv Detail & Related papers (2020-08-31T12:28:40Z) - Effective Dimension Adaptive Sketching Methods for Faster Regularized
Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT)
arXiv Detail & Related papers (2020-06-10T15:00:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.