Greedy Discovery of Ordinal Factors
- URL: http://arxiv.org/abs/2302.11554v1
- Date: Sun, 19 Feb 2023 20:33:24 GMT
- Title: Greedy Discovery of Ordinal Factors
- Authors: Dominik Dürrschnabel, Gerd Stumme
- Abstract summary: In large datasets, it is hard to discover and analyze structure.
An ordinal factor arranges a subset of the tags in a linear order based on their underlying structure.
A complete ordinal factorization, which consists of such ordinal factors, precisely represents the original dataset.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In large datasets, it is hard to discover and analyze structure. It is thus
common to introduce tags or keywords for the items. In applications, such
datasets are then filtered based on these tags. Still, even medium-sized
datasets with a few tags result in systems that are complex and hard for
humans to navigate. In this work, we adopt the method of ordinal factor analysis to
address this problem. An ordinal factor arranges a subset of the tags in a
linear order based on their underlying structure. A complete ordinal
factorization, which consists of such ordinal factors, precisely represents the
original dataset. Based on such an ordinal factorization, we provide a way to
discover and explain relationships between different items and attributes in
the dataset. However, computing even just one ordinal factor of high
cardinality is computationally complex. We thus propose a greedy algorithm in
this work. This algorithm extracts ordinal factors using existing fast
algorithms developed in formal concept analysis. We then leverage this
algorithm to propose a comprehensive way to discover relationships in the
dataset. We furthermore
introduce a distance measure based on the representation emerging from the
ordinal factorization to discover similar items. To evaluate the method, we
conduct a case study on different datasets.
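The greedy idea described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm and does not use any formal concept analysis library. It assumes a dataset given as a mapping from items to tag sets, and it treats an ordinal factor loosely as an ordered list of tags whose item sets are nested. The names `greedy_factor`, `greedy_factorization`, `position`, and `distance` are hypothetical, and the Manhattan distance over factor positions is only one plausible stand-in for the distance measure mentioned above.

```python
from typing import Dict, List, Set, Tuple

Context = Dict[str, Set[str]]        # item -> set of tags (assumed input format)
Factor = List[Tuple[str, Set[str]]]  # ordered (tag, items that reach this tag)


def greedy_factor(context: Context, uncovered: Set[Tuple[str, str]]) -> Factor:
    """Build one chain of tags with nested item sets, greedily by coverage."""
    all_tags = set().union(*context.values())
    extent = set(context)  # items still "on the chain"
    used: Set[str] = set()
    chain: Factor = []
    while True:
        best = None  # (gain, tag, new_extent)
        for tag in sorted(all_tags - used):
            # Items that stay on the chain if this tag is appended.
            new_extent = {g for g in extent if tag in context[g]}
            # Count the not-yet-covered (item, tag) pairs this step would cover.
            gain = sum((g, tag) in uncovered for g in new_extent)
            if gain > 0 and (best is None or gain > best[0]):
                best = (gain, tag, new_extent)
        if best is None:
            return chain
        _, tag, extent = best
        used.add(tag)
        chain.append((tag, extent))


def greedy_factorization(context: Context) -> List[Factor]:
    """Extract factors until every (item, tag) pair is covered by some factor."""
    uncovered = {(g, t) for g, tags in context.items() for t in tags}
    factors: List[Factor] = []
    while uncovered:
        factor = greedy_factor(context, uncovered)
        factors.append(factor)
        for tag, items in factor:
            uncovered -= {(g, tag) for g in items}
    return factors


def position(item: str, factor: Factor) -> int:
    """How far along the chain of a factor an item reaches."""
    return sum(item in items for _, items in factor)


def distance(a: str, b: str, factors: List[Factor]) -> int:
    """Illustrative Manhattan distance between the position vectors of two items."""
    return sum(abs(position(a, f) - position(b, f)) for f in factors)


# Toy dataset: items a-d with tags w-z (made up for illustration).
context = {"a": {"x", "y", "z"}, "b": {"x", "y"}, "c": {"x"}, "d": {"w"}}
factors = greedy_factorization(context)
print([[tag for tag, _ in f] for f in factors])
print(distance("a", "b", factors), distance("a", "d", factors))
```

Each factor here covers only pairs that are present in the data, so the union of all factors reproduces the dataset exactly, which mirrors the notion of a complete factorization. The sketch makes no attempt to find factors of maximal size.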
Related papers
- Regularization-Based Methods for Ordinal Quantification [49.606912965922504]
We study the ordinal case, i.e., the case in which a total order is defined on the set of n>2 classes.
We propose a novel class of regularized OQ algorithms, which outperforms existing algorithms in our experiments.
arXiv Detail & Related papers (2023-10-13T16:04:06Z)
- Relation-aware Ensemble Learning for Knowledge Graph Embedding [68.94900786314666]
We propose to learn an ensemble by leveraging existing methods in a relation-aware manner.
Exploring these semantics using a relation-aware ensemble leads to a much larger search space than general ensemble methods.
We propose a divide-search-combine algorithm RelEns-DSC that searches the relation-wise ensemble weights independently.
arXiv Detail & Related papers (2023-10-13T07:40:12Z)
- Towards Ordinal Data Science [0.0]
Our aim is to establish Ordinal Data Science as a fundamentally new research agenda.
arXiv Detail & Related papers (2023-07-13T14:50:04Z)
- Maximal Ordinal Two-Factorizations [0.0]
We show that deciding on the existence of two-factorizations of a given size is an NP-complete problem.
We provide the algorithm Ord2Factor that allows us to compute large ordinal two-factorizations.
arXiv Detail & Related papers (2023-04-06T19:26:03Z)
- Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z)
- Ordinal Causal Discovery [2.0305676256390934]
This paper proposes an identifiable ordinal causal discovery method that exploits the ordinal information contained in many real-world applications to uniquely identify the causal structure.
We show that the proposed ordinal causal discovery method has favorable and robust performance compared to state-of-the-art alternative methods in both ordinal categorical and non-categorical data.
arXiv Detail & Related papers (2022-01-19T03:11:26Z)
- Estimating leverage scores via rank revealing methods and randomization [50.591267188664666]
We study algorithms for estimating the statistical leverage scores of rectangular dense or sparse matrices of arbitrary rank.
Our approach is based on combining rank revealing methods with compositions of dense and sparse randomized dimensionality reduction transforms.
arXiv Detail & Related papers (2021-05-23T19:21:55Z)
- Adversarial Examples for $k$-Nearest Neighbor Classifiers Based on Higher-Order Voronoi Diagrams [69.4411417775822]
Adversarial examples are a widely studied phenomenon in machine learning models.
We propose an algorithm for evaluating the adversarial robustness of $k$-nearest neighbor classification.
arXiv Detail & Related papers (2020-11-19T08:49:10Z)
- Relational Algorithms for k-means Clustering [17.552485682328772]
This paper gives a k-means approximation algorithm that is efficient in the relational algorithms model.
The running time is potentially exponentially smaller than $N$, the number of data points to be clustered that the relational database represents.
arXiv Detail & Related papers (2020-08-01T23:21:40Z)
- dMelodies: A Music Dataset for Disentanglement Learning [70.90415511736089]
We present a new symbolic music dataset that will help researchers demonstrate the efficacy of their algorithms on diverse domains.
This will also provide a means for evaluating algorithms specifically designed for music.
The dataset is large enough (approx. 1.3 million data points) to train and test deep networks for disentanglement learning.
arXiv Detail & Related papers (2020-07-29T19:20:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.