Efficient Model Selection for Predictive Pattern Mining Model by Safe
Pattern Pruning
- URL: http://arxiv.org/abs/2306.13561v1
- Date: Fri, 23 Jun 2023 15:34:20 GMT
- Title: Efficient Model Selection for Predictive Pattern Mining Model by Safe
Pattern Pruning
- Authors: Takumi Yoshida, Hiroyuki Hanada, Kazuya Nakagawa, Kouichi Taji, Koji
Tsuda, Ichiro Takeuchi
- Abstract summary: We propose the Safe Pattern Pruning (SPP) method to address the explosion of pattern numbers in predictive pattern mining.
To demonstrate the effectiveness of the proposed method, we conduct numerical experiments on regression and classification problems involving sets, graphs, and sequences.
- Score: 14.892080531048956
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Predictive pattern mining is an approach used to construct prediction models
when the input is represented by structured data, such as sets, graphs, and
sequences. The main idea behind predictive pattern mining is to build a
prediction model by considering substructures, such as subsets, subgraphs, and
subsequences (referred to as patterns), present in the structured data as
features of the model. The primary challenge in predictive pattern mining lies
in the exponential growth of the number of patterns with the complexity of the
structured data. In this study, we propose the Safe Pattern Pruning (SPP)
method to address the explosion of pattern numbers in predictive pattern
mining. We also discuss how it can be effectively employed throughout the
entire model building process in practical data analysis. To demonstrate the
effectiveness of the proposed method, we conduct numerical experiments on
regression and classification problems involving sets, graphs, and sequences.
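
The abstract does not spell out the SPP rule itself, so the following is only a minimal sketch of the general idea it builds on: enumerating patterns in a tree and discarding whole subtrees with a bound that holds for every super-pattern. It assumes item-set data and binary occurrence features, and it uses a standard anti-monotonicity bound rather than the paper's actual screening criterion. The function name `mine_patterns`, the per-sample `weights` (for example, residuals of a current model), and the `threshold` are all hypothetical choices made for illustration.

```python
def mine_patterns(transactions, weights, threshold, max_len=3):
    """Depth-first item-set enumeration with subtree pruning (illustrative only).

    transactions: list of sets of items (one set per sample)
    weights:      one real value per sample, e.g. residuals (assumption)
    Returns (kept, visited): patterns p with |sum_i w_i * [p is in sample i]|
    >= threshold, and the number of tree nodes that were examined.
    """
    items = sorted(set().union(*transactions))
    kept = []
    visited = 0

    def dfs(pattern, occurrences, start):
        nonlocal visited
        for k in range(start, len(items)):
            new_pattern = pattern | {items[k]}
            # Samples containing the extended pattern: a subset of `occurrences`.
            new_occ = [i for i in occurrences if items[k] in transactions[i]]
            visited += 1
            pos = sum(weights[i] for i in new_occ if weights[i] > 0)
            neg = -sum(weights[i] for i in new_occ if weights[i] < 0)
            # Any super-pattern occurs in a subset of new_occ, so its absolute
            # score cannot exceed max(pos, neg).
            if max(pos, neg) < threshold:
                continue  # prune: skip this node and its whole subtree
            score = abs(pos - neg)
            if score >= threshold:
                kept.append((frozenset(new_pattern), score))
            if len(new_pattern) < max_len:
                dfs(new_pattern, new_occ, k + 1)

    dfs(frozenset(), list(range(len(transactions))), 0)
    return kept, visited


if __name__ == "__main__":
    data = [{"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    residuals = [1.0, -1.0, 0.5, 2.0]
    patterns, n_visited = mine_patterns(data, residuals, threshold=2.0)
    for pattern, score in patterns:
        print(sorted(pattern), score)
    print("nodes visited:", n_visited)
```

The pruning step is safe in the sense that it never discards a pattern that would have passed the threshold, because the bound is valid for every descendant in the enumeration tree. How the paper derives its own bound and reuses it across model selection is not covered by this sketch.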
Related papers
- Learning Augmentation Policies from A Model Zoo for Time Series Forecasting [58.66211334969299]
We introduce AutoTSAug, a learnable data augmentation method based on reinforcement learning.
By augmenting the marginal samples with a learnable policy, AutoTSAug substantially improves forecasting performance.
arXiv Detail & Related papers (2024-09-10T07:34:19Z)
- Approximate learning of parsimonious Bayesian context trees [0.0]
The proposed framework is tested on synthetic and real-world data examples.
It outperforms existing sequence models when fitted to real protein sequences and honeypot computer terminal sessions.
arXiv Detail & Related papers (2024-07-27T11:50:40Z)
- Fusion of Gaussian Processes Predictions with Monte Carlo Sampling [61.31380086717422]
In science and engineering, we often work with models designed for accurate prediction of variables of interest.
Recognizing that these models are approximations of reality, it becomes desirable to apply multiple models to the same data and integrate their outcomes.
arXiv Detail & Related papers (2024-03-03T04:21:21Z)
- Generalizing Backpropagation for Gradient-Based Interpretability [103.2998254573497]
We show that the gradient of a model is a special case of a more general formulation using semirings.
This observation allows us to generalize the backpropagation algorithm to efficiently compute other interpretable statistics.
arXiv Detail & Related papers (2023-07-06T15:19:53Z)
- Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z)
- Pathologies of Pre-trained Language Models in Few-shot Fine-tuning [50.3686606679048]
We show that pre-trained language models with few examples show strong prediction bias across labels.
Although few-shot fine-tuning can mitigate the prediction bias, our analysis shows models gain performance improvement by capturing non-task-related features.
These observations warn that pursuing model performance with fewer examples may incur pathological prediction behavior.
arXiv Detail & Related papers (2022-04-17T15:55:18Z)
- Dichotomic Pattern Mining with Applications to Intent Prediction from Semi-Structured Clickstream Datasets [10.76469643992931]
We introduce a pattern mining framework that operates on semi-structured datasets.
We show that pattern embeddings play an integrator role between semi-structured data and machine learning models.
arXiv Detail & Related papers (2022-01-23T05:00:50Z)
- Regularization of Mixture Models for Robust Principal Graph Learning [0.0]
A regularized version of Mixture Models is proposed to learn a principal graph from a distribution of $D$-dimensional data points.
Parameters of the model are iteratively estimated through an Expectation-Maximization procedure.
arXiv Detail & Related papers (2021-06-16T18:00:02Z)
- Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short Python expressions which evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z)
- Pattern Similarity-based Machine Learning Methods for Mid-term Load Forecasting: A Comparative Study [0.0]
We use pattern similarity-based methods for forecasting monthly electricity demand expressing annual seasonality.
An integral part of the models is the time series representation using patterns of time series sequences.
We consider four such models: nearest neighbor model, fuzzy neighborhood model, kernel regression model and general regression neural network.
arXiv Detail & Related papers (2020-03-03T12:14:36Z)
- Predicting Multidimensional Data via Tensor Learning [0.0]
We develop a model that retains the intrinsic multidimensional structure of the dataset.
To estimate the model parameters, an Alternating Least Squares algorithm is developed.
The proposed model is able to outperform benchmark models present in the forecasting literature.
arXiv Detail & Related papers (2020-02-11T11:57:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.