Discovering Frequent Gradual Itemsets with Imprecise Data
- URL: http://arxiv.org/abs/2005.11045v1
- Date: Fri, 22 May 2020 08:02:15 GMT
- Title: Discovering Frequent Gradual Itemsets with Imprecise Data
- Authors: Michaël Chirmeni Boujike, Jerry Lonlac, Norbert Tsopze, Engelbert
Mephu Nguifo
- Abstract summary: Gradual patterns, which model complex co-variations of attributes of the form "The more/less X, the more/less Y", play a crucial role in many real-world applications.
This paper proposes introducing gradualness thresholds beyond which an increase or a decrease is considered.
- Score: 0.4874780144224056
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gradual patterns, which model complex co-variations of attributes of
the form "The more/less X, the more/less Y", play a crucial role in many
real-world applications where large amounts of numerical data must be managed,
as is the case with biological data. Recently, these patterns have caught the
attention of the data mining community, and several methods have been defined
to automatically extract and manage them from different data models.
However, these methods often face the problem of managing the quantity of
mined patterns: in many practical applications, computing all of these
patterns can prove intractable for the user-defined frequency threshold, and
the lack of focus leads to huge collections of patterns. Another problem with
traditional approaches is that the concept of gradualness is defined simply as
an increase or a decrease; gradualness is considered as soon as the values of
an attribute differ between two objects. As a result, traditional algorithms
can present the user with many patterns whose gradualness is only a noise
effect in the data. To address this issue, this paper proposes introducing
gradualness thresholds beyond which an increase or a decrease is considered.
In contrast to existing approaches, the proposed approach takes into account
the distribution of attribute values as well as the user's preferences on the
gradualness threshold, making it possible to extract gradual patterns from
certain databases where existing approaches fail due to an overly large search
space. Moreover, results from an experimental evaluation on real databases
show that the proposed algorithm is scalable and efficient, and can eliminate
numerous patterns that do not satisfy specific gradualness requirements,
presenting a small set of patterns to the user.
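To make the idea of a gradualness threshold concrete, here is a minimal sketch of how pairwise support for a pattern "the more X, the more Y" could be computed with per-attribute thresholds. The function name, the data, and the pairwise support definition are illustrative assumptions for this sketch, not the paper's exact algorithm.

```python
# Illustrative sketch (not the paper's algorithm): support of the gradual
# pattern "the more X, the more Y" under per-attribute gradualness thresholds.
from itertools import combinations

def gradual_support(rows, x, y, tx, ty):
    """Fraction of object pairs whose X and Y values both vary in the
    same direction by at least the thresholds tx and ty."""
    n = len(rows)
    total = n * (n - 1) // 2
    concordant = 0
    for a, b in combinations(rows, 2):
        dx, dy = b[x] - a[x], b[y] - a[y]
        # Without thresholds, any nonzero difference counts as gradualness;
        # the thresholds tx and ty filter out noise-level variations.
        if (dx >= tx and dy >= ty) or (-dx >= tx and -dy >= ty):
            concordant += 1
    return concordant / total if total else 0.0

# Hypothetical data: objects described by two numerical attributes.
data = [
    {"age": 20, "salary": 1200},
    {"age": 25, "salary": 1250},
    {"age": 30, "salary": 2000},
    {"age": 35, "salary": 2100},
]
print(gradual_support(data, "age", "salary", 1, 100))  # 5 of 6 pairs qualify
```

With the thresholds set to 1 and 100, the pair (20, 1200)/(25, 1250) is discarded because its salary difference of 50 falls below the 100 threshold, whereas a traditional definition would count it as gradual.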
Related papers
- USD: Unsupervised Soft Contrastive Learning for Fault Detection in Multivariate Time Series [6.055410677780381]
We introduce a combination of data augmentation and soft contrastive learning, specifically designed to capture the multifaceted nature of state behaviors more accurately.
This dual strategy significantly boosts the model's ability to distinguish between normal and abnormal states, leading to a marked improvement in fault detection performance across multiple datasets and settings.
arXiv Detail & Related papers (2024-05-25T14:48:04Z) - Latent variable model for high-dimensional point process with structured missingness [4.451479907610764]
Longitudinal data are important in numerous fields, such as healthcare, sociology and seismology.
Real-world datasets can be high-dimensional, contain structured missingness patterns, and measurement time points can be governed by an unknown process.
We propose a flexible and efficient latent-variable model that is capable of addressing all these limitations.
arXiv Detail & Related papers (2024-02-08T15:41:48Z) - Stochastic Amortization: A Unified Approach to Accelerate Feature and
Data Attribution [67.28273187033693]
We show that training a network that directly predicts the desired output, known as amortization, is inexpensive and surprisingly effective.
This approach significantly accelerates several feature attribution and data valuation methods, often yielding an order of magnitude speedup over existing approaches.
arXiv Detail & Related papers (2024-01-29T03:42:37Z) - Graph Spatiotemporal Process for Multivariate Time Series Anomaly
Detection with Missing Values [67.76168547245237]
We introduce a novel framework called GST-Pro, which utilizes a graph spatiotemporal process and an anomaly scorer to detect anomalies.
Our experimental results show that the GST-Pro method can effectively detect anomalies in time series data and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2024-01-11T10:10:16Z) - ChiroDiff: Modelling chirographic data with Diffusion Models [132.5223191478268]
We introduce a powerful model-class namely "Denoising Diffusion Probabilistic Models" or DDPMs for chirographic data.
Our model named "ChiroDiff", being non-autoregressive, learns to capture holistic concepts and therefore remains resilient to higher temporal sampling rate.
arXiv Detail & Related papers (2023-04-07T15:17:48Z) - Learning with Noisy labels via Self-supervised Adversarial Noisy Masking [33.87292143223425]
We propose a novel training approach termed adversarial noisy masking.
It adaptively modulates the input data and labels simultaneously, preventing the model from overfitting noisy samples.
It is tested on both synthetic and real-world noisy datasets.
arXiv Detail & Related papers (2023-02-14T03:13:26Z) - Semi-supervised Deep Learning for Image Classification with Distribution
Mismatch: A Survey [1.5469452301122175]
Deep learning models rely on an abundance of labelled observations to train a prospective model.
Gathering labelled observations is expensive, which makes deep learning models impractical in many settings.
In many situations different unlabelled data sources might be available.
This raises the risk of a significant distribution mismatch between the labelled and unlabelled datasets.
arXiv Detail & Related papers (2022-03-01T02:46:00Z) - Perceptual Score: What Data Modalities Does Your Model Perceive? [73.75255606437808]
We introduce the perceptual score, a metric that assesses the degree to which a model relies on the different subsets of the input features.
We find that recent, more accurate multi-modal models for visual question-answering tend to perceive the visual data less than their predecessors.
Using the perceptual score also helps to analyze model biases by decomposing the score into data subset contributions.
arXiv Detail & Related papers (2021-10-27T12:19:56Z) - Goal-directed Generation of Discrete Structures with Conditional
Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short python expressions which evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z) - Variable Skipping for Autoregressive Range Density Estimation [84.60428050170687]
We show a technique, variable skipping, for accelerating range density estimation over deep autoregressive models.
We show that variable skipping provides 10-100x efficiency improvements when targeting challenging high-quantile error metrics.
arXiv Detail & Related papers (2020-07-10T19:01:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.