Mining Weighted Sequential Patterns in Incremental Uncertain Databases
- URL: http://arxiv.org/abs/2404.00746v1
- Date: Sun, 31 Mar 2024 17:32:08 GMT
- Title: Mining Weighted Sequential Patterns in Incremental Uncertain Databases
- Authors: Kashob Kumar Roy, Md Hasibul Haque Moon, Md Mahmudur Rahman, Chowdhury Farhan Ahmed, Carson Kai-Sang Leung,
- Abstract summary: We have developed an algorithm to mine frequent sequences in an uncertain database.
We have proposed two new techniques for mining when the database is incremental.
- Score: 2.668038211242538
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Due to the rapid development of science and technology, the importance of imprecise, noisy, and uncertain data is increasing at an exponential rate. Thus, mining patterns in uncertain databases have drawn the attention of researchers. Moreover, frequent sequences of items from these databases need to be discovered for meaningful knowledge with great impact. In many real cases, weights of items and patterns are introduced to find interesting sequences as a measure of importance. Hence, a constraint of weight needs to be handled while mining sequential patterns. Besides, due to the dynamic nature of databases, mining important information has become more challenging. Instead of mining patterns from scratch after each increment, incremental mining algorithms utilize previously mined information to update the result immediately. Several algorithms exist to mine frequent patterns and weighted sequences from incremental databases. However, these algorithms are confined to mine the precise ones. Therefore, we have developed an algorithm to mine frequent sequences in an uncertain database in this work. Furthermore, we have proposed two new techniques for mining when the database is incremental. Extensive experiments have been conducted for performance evaluation. The analysis showed the efficiency of our proposed framework.
Related papers
- RPS: A Generic Reservoir Patterns Sampler [1.09784964592609]
We introduce an approach that harnesses a weighted reservoir to facilitate direct pattern sampling from streaming batch data.
We present a generic algorithm capable of addressing temporal biases and handling various pattern types, including sequential, weighted, and unweighted itemsets.
arXiv Detail & Related papers (2024-10-31T16:25:21Z) - Localized Gaussians as Self-Attention Weights for Point Clouds Correspondence [92.07601770031236]
We investigate semantically meaningful patterns in the attention heads of an encoder-only Transformer architecture.
We find that fixing the attention weights not only accelerates the training process but also enhances the stability of the optimization.
arXiv Detail & Related papers (2024-09-20T07:41:47Z) - State-Space Modeling in Long Sequence Processing: A Survey on Recurrence in the Transformer Era [59.279784235147254]
This survey provides an in-depth summary of the latest approaches that are based on recurrent models for sequential data processing.
The emerging picture suggests that there is room for thinking of novel routes, constituted by learning algorithms which depart from the standard Backpropagation Through Time.
arXiv Detail & Related papers (2024-06-13T12:51:22Z) - Regularization-Based Methods for Ordinal Quantification [49.606912965922504]
We study the ordinal case, i.e., the case in which a total order is defined on the set of n>2 classes.
We propose a novel class of regularized OQ algorithms, which outperforms existing algorithms in our experiments.
arXiv Detail & Related papers (2023-10-13T16:04:06Z) - Towards Sequence Utility Maximization under Utility Occupancy Measure [53.234101208024335]
In the database, although utility is a flexible criterion for each pattern, it is a more absolute criterion due to neglect of utility sharing.
We first define utility occupancy on sequence data and raise the problem of High Utility-Occupancy Sequential Pattern Mining.
An algorithm called Sequence Utility Maximization with Utility occupancy measure (SUMU) is proposed.
arXiv Detail & Related papers (2022-12-20T17:28:53Z) - Research Trends and Applications of Data Augmentation Algorithms [77.34726150561087]
We identify the main areas of application of data augmentation algorithms, the types of algorithms used, significant research trends, their progression over time and research gaps in data augmentation literature.
We expect readers to understand the potential of data augmentation, as well as identify future research directions and open questions within data augmentation research.
arXiv Detail & Related papers (2022-07-18T11:38:32Z) - Towards Target High-Utility Itemsets [2.824395407508717]
In applied intelligence, utility-driven pattern discovery algorithms can identify insightful and useful patterns in databases.
Targeted high-utility itemset mining has emerged as a key research topic.
We propose THUIM (Targeted High-Utility Itemset Mining), which can quickly match high-utility itemsets during the mining process to select the targeted patterns.
arXiv Detail & Related papers (2022-06-09T18:42:58Z) - Incremental Mining of Frequent Serial Episodes Considering Multiple
Occurrence [11.387440344044315]
One of the fundamental research directions is to mine sequential patterns over data streams.
The pattern over a window of itemsets stream and their multiple occurrences provides additional capability to recognize the essential characteristics of the patterns.
We propose a corresponding efficient sequential miner with novel strategies to prune search space efficiently.
arXiv Detail & Related papers (2022-01-27T17:10:16Z) - Parallel algorithms for mining of frequent itemsets [0.0]
We develop a parallel method for mining of frequent itemsets on a distributed memory parallel computer.
Our method achieves speedup of 6 on 10 processors.
arXiv Detail & Related papers (2021-08-11T05:22:52Z) - Multi-source Data Mining for e-Learning [3.8673630752805432]
Pattern mining involves extracting interesting frequent patterns from data.
With the increase in the amount of data, multi-source and heterogeneous data has become an emerging challenge in this domain.
This challenge is the main focus of our work where we propose to mine multi-source data in order to extract interesting frequent patterns.
arXiv Detail & Related papers (2020-09-17T15:39:45Z) - Faster Secure Data Mining via Distributed Homomorphic Encryption [108.77460689459247]
Homomorphic Encryption (HE) is receiving more and more attention recently for its capability to do computations over the encrypted field.
We propose a novel general distributed HE-based data mining framework towards one step of solving the scaling problem.
We verify the efficiency and effectiveness of our new framework by testing over various data mining algorithms and benchmark data-sets.
arXiv Detail & Related papers (2020-06-17T18:14:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.