Related papers: Boolean Reasoning-Based Biclustering for Shifting Pattern Extraction


URL: http://arxiv.org/abs/2104.12493v1
Date: Mon, 26 Apr 2021 11:40:17 GMT
Title: Boolean Reasoning-Based Biclustering for Shifting Pattern Extraction
Authors: Marcin Michalak, Jesús S. Aguilar-Ruiz
Abstract summary: Biclustering is a powerful approach to search for patterns in data, as it can be driven by a function that measures the quality of diverse types of patterns of interest. Shifting patterns are especially interesting as they account for constant fluctuations in data. This work shows that the induction of shifting patterns by means of Boolean reasoning follows from its ability to find all inclusion-maximal δ-shifting patterns.
Score: 0.20305676256390928
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Biclustering is a powerful approach to search for patterns in data, as it can be driven by a function that measures the quality of diverse types of patterns of interest. However, due to its computational complexity, the exploration of the search space is usually guided by an algorithmic strategy, sometimes introducing random factors that reduce the computational cost (e.g. greedy search or evolutionary computation). Shifting patterns are especially interesting as they account for constant fluctuations in data, i.e. they capture situations in which all the values in the pattern move up or down for one dimension while maintaining the range amplitude for all the dimensions. This behaviour is very common in nature, e.g. in the analysis of gene expression data, where a subset of genes might go up or down for a subset of patients or experimental conditions, identifying functionally coherent categories. Boolean reasoning was recently revealed as an appropriate methodology to address the search for constant biclusters. In this work, this direction is extended to search for more general biclusters that include shifting patterns. The mathematical foundations are described in order to associate Boolean concepts with shifting patterns, and the methodology is presented to show that the induction of shifting patterns by means of Boolean reasoning follows from its ability to find all inclusion-maximal δ-shifting patterns. Experiments with a real dataset show the potential of our approach at finding biclusters with δ-shifting patterns, which have been evaluated with the mean squared residue (MSR), yielding values very close to zero.
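
To make the evaluation measure mentioned in the abstract concrete, here is a minimal sketch of the mean squared residue in its standard form (each cell minus its row mean and column mean, plus the overall mean, squared and averaged). It is not the authors' Boolean reasoning algorithm, and the function name and toy matrix are hypothetical, chosen only for illustration. A perfect shifting pattern, in which every row equals a base row plus a constant offset, should give an MSR of zero up to floating-point error.

```python
def mean_squared_residue(matrix, rows, cols):
    """MSR of the bicluster formed by the given row and column indices."""
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(sub), len(sub[0])

    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall_mean = sum(row_means) / n_rows

    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            # Residue: deviation of the cell from a purely additive model.
            residue = sub[i][j] - row_means[i] - col_means[j] + overall_mean
            total += residue ** 2
    return total / (n_rows * n_cols)


# Hypothetical toy data: on columns 0-2, rows 0-2 follow the base
# pattern [1, 4, 2] shifted by 0, +3 and -1 respectively.
data = [
    [1.0, 4.0, 2.0, 9.0],
    [4.0, 7.0, 5.0, 0.0],
    [0.0, 3.0, 1.0, 5.0],
    [6.0, 1.0, 8.0, 2.0],
]

shifting_msr = mean_squared_residue(data, [0, 1, 2], [0, 1, 2])
whole_msr = mean_squared_residue(data, [0, 1, 2, 3], [0, 1, 2, 3])
print(shifting_msr, whole_msr)
```

The 3x3 submatrix scores (numerically) zero because it is an exact shifting pattern, while the full matrix, which has no such structure, scores strictly higher.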

Related papers

A Mathematical Perspective On Contrastive Learning [5.66952471288857]
Multimodal contrastive learning is a methodology for linking different data modalities. We focus on the bimodal setting and interpret contrastive learning as the optimization of encoders that define conditional probability distributions.
arXiv Detail & Related papers (2025-05-30T02:09:37Z)
Interpetable Target-Feature Aggregation for Multi-Task Learning based on Bias-Variance Analysis [53.38518232934096]
Multi-task learning (MTL) is a powerful machine learning paradigm designed to leverage shared knowledge across tasks to improve generalization and performance. We propose an MTL approach at the intersection between task clustering and feature transformation based on a two-phase iterative aggregation of targets and features. In both phases, a key aspect is to preserve the interpretability of the reduced targets and features through the aggregation with the mean, which is motivated by applications to Earth science.
arXiv Detail & Related papers (2024-06-12T08:30:16Z)
Sample, estimate, aggregate: A recipe for causal discovery foundation models [28.116832159265964]
Causal discovery has the potential to uncover mechanistic insights from biological experiments. We propose a supervised model trained on large-scale, synthetic data to predict causal graphs. Our approach is enabled by the observation that typical errors in the outputs of a discovery algorithm remain comparable across datasets.
arXiv Detail & Related papers (2024-02-02T21:57:58Z)
Gradient-Based Feature Learning under Structured Data [57.76552698981579]
In the anisotropic setting, the commonly used spherical gradient dynamics may fail to recover the true direction. We show that appropriate weight normalization that is reminiscent of batch normalization can alleviate this issue. In particular, under the spiked model with a suitably large spike, the sample complexity of gradient-based training can be made independent of the information exponent.
arXiv Detail & Related papers (2023-09-07T16:55:50Z)
Efficient Failure Pattern Identification of Predictive Algorithms [15.02620042972929]
We propose a human-machine collaborative framework that consists of a team of human annotators and a sequential recommendation algorithm. The results empirically demonstrate the competitive performance of our framework on multiple datasets at various signal-to-noise ratios.
arXiv Detail & Related papers (2023-06-01T14:54:42Z)
Online Arbitrary Shaped Clustering through Correlated Gaussian Functions [0.0]
A novel online clustering algorithm is presented that can produce arbitrary shaped clusters from inputs in an unsupervised manner. The algorithm can be deemed more biologically plausible than model optimization through backpropagation, although practical applicability may require additional research.
arXiv Detail & Related papers (2023-02-13T13:12:55Z)
Learning to Bound Counterfactual Inference in Structural Causal Models from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm. The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources. It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z)
RandomSCM: interpretable ensembles of sparse classifiers tailored for omics data [59.4141628321618]
We propose an ensemble learning algorithm based on conjunctions or disjunctions of decision rules. The interpretability of the models makes them useful for biomarker discovery and patterns discovery in high dimensional data.
arXiv Detail & Related papers (2022-08-11T13:55:04Z)
Provable Guarantees for Sparsity Recovery with Deterministic Missing Data Patterns [30.553697242038233]
We consider the case in which the observed dataset is censored by a deterministic, non-uniform filter. We propose an efficient algorithm for missing value imputation by utilizing the topological property of the censorship filter.
arXiv Detail & Related papers (2022-06-10T06:14:45Z)
Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test. We train a variational inference model to predict the causal structure from observational/interventional data. Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z)
Skew-Symmetric Adjacency Matrices for Clustering Directed Graphs [5.301300942803395]
Cut-based directed graph (digraph) clustering often focuses on finding dense within-cluster or sparse between-cluster connections. For flow-based clusterings the edges between clusters tend to be oriented in one direction and have been found in migration data, food webs, and trade data.
arXiv Detail & Related papers (2022-03-02T20:07:04Z)
Cluster Analysis of a Symbolic Regression Search Space [2.055204980188575]
We take a closer look at the distribution of symbolic regression models generated by genetic programming in the search space. We identify unique models and cluster them based on phenotypic as well as genotypic similarity. By mapping solution candidates visited by GP to the enumerated search space we find that GP initially explores the whole search space and later converges to the subspace of highest quality expressions.
arXiv Detail & Related papers (2021-09-28T17:50:29Z)
Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward. We test our methodology on two tasks: generating molecules with user-defined properties and identifying short python expressions which evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z)

This list is automatically generated from the titles and abstracts of the papers on this site.