Boolean Reasoning-Based Biclustering for Shifting Pattern Extraction
- URL: http://arxiv.org/abs/2104.12493v1
- Date: Mon, 26 Apr 2021 11:40:17 GMT
- Title: Boolean Reasoning-Based Biclustering for Shifting Pattern Extraction
- Authors: Marcin Michalak, Jesús S. Aguilar-Ruiz
- Abstract summary: Biclustering is a powerful approach to search for patterns in data, as it can be driven by a function that measures the quality of diverse types of patterns of interest.
Shifting patterns are especially interesting as they account for constant fluctuations in data.
This work shows that the induction of shifting patterns by means of Boolean reasoning rests on its ability to find all inclusion-maximal δ-shifting patterns.
- Score: 0.20305676256390928
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Biclustering is a powerful approach to search for patterns in data, as it can
be driven by a function that measures the quality of diverse types of patterns
of interest. However, due to its computational complexity, the exploration of
the search space is usually guided by an algorithmic strategy, sometimes
introducing random factors that reduce the computational cost (e.g. greedy
search or evolutionary computation).
Shifting patterns are especially interesting as they account for constant
fluctuations in data, i.e. they capture situations in which all the values in
the pattern move up or down along one dimension while maintaining the range
amplitude across all the dimensions. This behaviour is very common in nature, e.g. in the
analysis of gene expression data, where a subset of genes might go up or down
for a subset of patients or experimental conditions, identifying functionally
coherent categories.
Boolean reasoning was recently revealed as an appropriate methodology to
address the search for constant biclusters. In this work, this direction is
extended to search for more general biclusters that include shifting patterns.
The mathematical foundations are described in order to associate Boolean
concepts with shifting patterns, and the methodology is presented to show that
the induction of shifting patterns by means of Boolean reasoning rests on its
ability to find all inclusion-maximal δ-shifting patterns.
Experiments with a real dataset show the potential of our approach for finding
biclusters with δ-shifting patterns, which have been evaluated with the
mean squared residue (MSR); the resulting MSR values are very close to zero.
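As a rough illustration of the evaluation measure mentioned in the abstract, the sketch below computes the mean squared residue in its usual Cheng and Church form and applies it to a synthetic perfect shifting pattern, where each row equals a base profile plus a row-specific offset. The function name, the toy matrices, and the noise level are assumptions made for illustration; this is not the authors' code and it does not implement their Boolean reasoning procedure.

```python
import numpy as np


def mean_squared_residue(block: np.ndarray) -> float:
    """Mean squared residue of a bicluster given as a 2-D array.

    Residue of cell (i, j) = a_ij - row_mean_i - col_mean_j + overall_mean.
    """
    row_means = block.mean(axis=1, keepdims=True)
    col_means = block.mean(axis=0, keepdims=True)
    overall_mean = block.mean()
    residue = block - row_means - col_means + overall_mean
    return float(np.mean(residue ** 2))


# Hypothetical toy data: a base profile over 4 columns and 3 row offsets.
base_profile = np.array([1.0, 4.0, 2.0, 7.0])
row_offsets = np.array([0.0, 3.0, -2.0])

# Perfect shifting pattern: every row is the base profile moved up or down.
perfect = row_offsets[:, None] + base_profile[None, :]

# The same pattern with small Gaussian noise (noise scale is an assumption).
rng = np.random.default_rng(0)
noisy = perfect + rng.normal(scale=0.1, size=perfect.shape)

msr_perfect = mean_squared_residue(perfect)
msr_noisy = mean_squared_residue(noisy)

print(f"MSR of perfect shifting pattern: {msr_perfect:.6f}")
print(f"MSR of noisy shifting pattern:   {msr_noisy:.6f}")
```

In a perfect shifting pattern the residue of every cell is zero, which is why MSR values close to zero are read as evidence of a well-formed shifting bicluster.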