Algorithms for Boolean Matrix Factorization using Integer Programming and Heuristics
- URL: http://arxiv.org/abs/2512.03807v2
- Date: Thu, 04 Dec 2025 08:38:45 GMT
- Title: Algorithms for Boolean Matrix Factorization using Integer Programming and Heuristics
- Authors: Christos Kolomvakis, Thomas Bobille, Arnaud Vandaele, Nicolas Gillis,
- Abstract summary: BMF approximates a given binary input matrix as the product of two smaller binary factors.<n>Unlike binary matrix factorization based on standard arithmetic, BMF employs the Boolean OR and AND operations for the matrix product.<n>It is also used in role mining and computer vision.
- Score: 11.53912933736867
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Boolean matrix factorization (BMF) approximates a given binary input matrix as the product of two smaller binary factors. Unlike binary matrix factorization based on standard arithmetic, BMF employs the Boolean OR and AND operations for the matrix product, which improves interpretability and reduces the approximation error. It is also used in role mining and computer vision. In this paper, we first propose algorithms for BMF that perform alternating optimization (AO) of the factor matrices, where each subproblem is solved via integer programming (IP). We then design different approaches to further enhance AO-based algorithms by selecting an optimal subset of rank-one factors from multiple runs. To address the scalability limits of IP-based methods, we introduce new greedy and local-search heuristics. We also construct a new C++ data structure for Boolean vectors and matrices that is significantly faster than existing ones and is of independent interest, allowing our heuristics to scale to large datasets. We illustrate the performance of all our proposed methods and compare them with the state of the art on various real datasets, both with and without missing data, including applications in topic modeling and imaging.
Related papers
- MatRL: Provably Generalizable Iterative Algorithm Discovery via Monte-Carlo Tree Search [37.24058519921229]
MatRL is a reinforcement learning framework that automatically discovers iterative algorithms for computing matrix functions.<n>We show that MatRL produces algorithms that outperform various baselines in the literature.
arXiv Detail & Related papers (2025-07-04T22:57:33Z) - An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks.
The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions.
We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute these covariance matrices.
arXiv Detail & Related papers (2023-09-30T15:57:14Z) - Algorithms for Boolean Matrix Factorization using Integer Programming [20.13379783488932]
We propose an alternating optimization (AO) strategy that solves the subproblem in one factor matrix in BMF.
We show how several solutions of BMF can be combined optimally using another IP.
Experiments show that our algorithms outperform the state of the art on medium-scale problems.
arXiv Detail & Related papers (2023-05-17T13:09:23Z) - A Second-Order Majorant Algorithm for Nonnegative Matrix Factorization [2.646309221150203]
We introduce a general second-order optimization framework for NMF under both quadratic and $beta$-divergence loss functions.<n>Second-Order Majorant (SOM) constructs a local quadratic majorization of the loss function by majorizing its Hessian matrix.<n>We show that mSOM consistently outperforms state-of-the-art algorithms across multiple loss functions.
arXiv Detail & Related papers (2023-03-31T12:09:36Z) - Asymmetric Scalable Cross-modal Hashing [51.309905690367835]
Cross-modal hashing is a successful method to solve large-scale multimedia retrieval issue.
We propose a novel Asymmetric Scalable Cross-Modal Hashing (ASCMH) to address these issues.
Our ASCMH outperforms the state-of-the-art cross-modal hashing methods in terms of accuracy and efficiency.
arXiv Detail & Related papers (2022-07-26T04:38:47Z) - Lossy compression of matrices by black-box optimisation of mixed-integer
non-linear programming [1.1066593559733056]
In edge computing, suppressing data size is a challenge for machine learning models that perform complex tasks such as autonomous driving.
Efficient lossy compression of matrix data has been introduced by decomposing it into the product of an integer and real matrices.
In this paper, we improve this optimisation by utilising recently developed black-box optimisation algorithms with an Ising solver for integer variables.
arXiv Detail & Related papers (2022-04-22T08:58:36Z) - Unfolding Projection-free SDP Relaxation of Binary Graph Classifier via
GDPA Linearization [59.87663954467815]
Algorithm unfolding creates an interpretable and parsimonious neural network architecture by implementing each iteration of a model-based algorithm as a neural layer.
In this paper, leveraging a recent linear algebraic theorem called Gershgorin disc perfect alignment (GDPA), we unroll a projection-free algorithm for semi-definite programming relaxation (SDR) of a binary graph.
Experimental results show that our unrolled network outperformed pure model-based graph classifiers, and achieved comparable performance to pure data-driven networks but using far fewer parameters.
arXiv Detail & Related papers (2021-09-10T07:01:15Z) - Binary Matrix Factorisation and Completion via Integer Programming [3.4376560669160394]
We present a compact and two exponential size integer programs (IPs) for the rank-k binary matrix factorisation problem (k-BMF)
We show that the compact IP has a weak LP relaxation, while the exponential size LPs have a stronger equivalent LP relaxation.
arXiv Detail & Related papers (2021-06-25T05:17:51Z) - Provably Faster Algorithms for Bilevel Optimization [54.83583213812667]
Bilevel optimization has been widely applied in many important machine learning applications.
We propose two new algorithms for bilevel optimization.
We show that both algorithms achieve the complexity of $mathcalO(epsilon-1.5)$, which outperforms all existing algorithms by the order of magnitude.
arXiv Detail & Related papers (2021-06-08T21:05:30Z) - Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature
Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_2,p$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z) - Multi-View Spectral Clustering with High-Order Optimal Neighborhood
Laplacian Matrix [57.11971786407279]
Multi-view spectral clustering can effectively reveal the intrinsic cluster structure among data.
This paper proposes a multi-view spectral clustering algorithm that learns a high-order optimal neighborhood Laplacian matrix.
Our proposed algorithm generates the optimal Laplacian matrix by searching the neighborhood of the linear combination of both the first-order and high-order base.
arXiv Detail & Related papers (2020-08-31T12:28:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.