Universal Transformer Hawkes Process with Adaptive Recursive Iteration
- URL: http://arxiv.org/abs/2112.14479v1
- Date: Wed, 29 Dec 2021 09:55:12 GMT
- Title: Universal Transformer Hawkes Process with Adaptive Recursive Iteration
- Authors: Lu-ning Zhang, Jian-wei Liu, Zhi-yan Song, Xin Zuo
- Abstract summary: Asynchronous event sequences are widespread in the natural world and in human activities, for example earthquake records and user activity on social media.
How to distill information from such seemingly disorganized data is a persistent research topic.
One of the most useful tools is the point process model, on whose basis researchers have obtained many notable results.
In recent years, point process models built on neural networks, especially recurrent neural networks (RNNs), have been proposed, and their performance is greatly improved compared with traditional models.
- Score: 4.624987488467739
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Asynchronous event sequences are widespread in the natural world and in
human activities, for example earthquake records and user activity on social media.
How to distill information from such seemingly disorganized data is a persistent
research topic. One of the most useful tools is the point process model, on whose
basis researchers have obtained many notable results. Moreover, in recent years,
point process models built on neural networks, especially recurrent neural networks
(RNNs), have been proposed, and their performance is greatly improved compared with
traditional models. Inspired by the Transformer, which learns from sequence data
efficiently without recurrent or convolutional structures, the Transformer Hawkes
process was introduced and achieves state-of-the-art performance. However, some
research has shown that re-introducing recursive computation into the Transformer
can further improve its performance. We therefore propose a new Transformer Hawkes
process model, the Universal Transformer Hawkes Process (UTHP), which combines a
recursive mechanism with self-attention; to improve the model's local perception
ability, we also introduce a convolutional neural network (CNN) in the
position-wise feed-forward part. We conduct experiments on several datasets to
validate the effectiveness of UTHP and to explore the changes brought by the
recursive mechanism. Experiments on multiple datasets demonstrate that our
proposed model achieves a measurable improvement over previous state-of-the-art
models.
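Below is a minimal, hypothetical sketch (not the authors' released code) of the core UTHP idea described in the abstract: a single self-attention block whose weights are shared and applied recursively for several steps, with a CNN-based position-wise feed-forward sublayer for local perception. The class name, dimensions, and the fixed step count are illustrative assumptions; in particular, the adaptive halting behind "adaptive recursive iteration" is omitted.

```python
# Illustrative sketch only; names, sizes, and the fixed number of recursive
# steps are assumptions, and adaptive halting (ACT) is deliberately omitted.
import torch
import torch.nn as nn

class UniversalHawkesEncoder(nn.Module):
    def __init__(self, d_model=64, n_heads=4, n_steps=4, kernel_size=3):
        super().__init__()
        # One set of weights, reused at every recursive step.
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.conv_ffn = nn.Sequential(
            nn.Conv1d(d_model, 4 * d_model, kernel_size, padding=kernel_size // 2),
            nn.ReLU(),
            nn.Conv1d(4 * d_model, d_model, kernel_size, padding=kernel_size // 2),
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.n_steps = n_steps

    def forward(self, x, attn_mask=None):
        # x: (batch, seq_len, d_model) embeddings of event types and times.
        for _ in range(self.n_steps):  # recursive reuse of the same layer
            a, _ = self.attn(x, x, x, attn_mask=attn_mask)
            x = self.norm1(x + a)
            # Conv1d expects (batch, channels, seq_len), hence the transposes.
            f = self.conv_ffn(x.transpose(1, 2)).transpose(1, 2)
            x = self.norm2(x + f)
        return x  # hidden states that would parameterise the conditional intensity

# Example usage: a batch of 8 sequences, each with 50 embedded events.
h = UniversalHawkesEncoder()(torch.randn(8, 50, 64))
```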
Related papers
- Differential Evolution Algorithm based Hyper-Parameters Selection of
Transformer Neural Network Model for Load Forecasting [0.0]
Transformer models have the potential to improve load forecasting because of their ability to learn long-range dependencies through their attention mechanism.
Our work compares the proposed Transformer-based neural network model, integrated with different metaheuristic algorithms, by load-forecasting performance on numerical metrics such as Mean Squared Error (MSE) and Mean Absolute Percentage Error (MAPE).
arXiv Detail & Related papers (2023-07-28T04:29:53Z)
- Consensus-Adaptive RANSAC [104.87576373187426]
We propose a new RANSAC framework that learns to explore the parameter space by considering the residuals seen so far via a novel attention layer.
The attention mechanism operates on a batch of point-to-model residuals, and updates a per-point estimation state to take into account the consensus found through a lightweight one-step transformer.
arXiv Detail & Related papers (2023-07-26T08:25:46Z)
- Emergent Agentic Transformer from Chain of Hindsight Experience [96.56164427726203]
We show that a simple transformer-based model performs competitively with both temporal-difference and imitation-learning-based approaches, the first time such a result has been demonstrated.
arXiv Detail & Related papers (2023-05-26T00:43:02Z)
- End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
- Full Stack Optimization of Transformer Inference: a Survey [58.55475772110702]
Transformer models achieve superior accuracy across a wide range of applications.
The amount of compute and bandwidth required for inference of recent Transformer models is growing at a significant rate.
There has been an increased focus on making Transformer models more efficient.
arXiv Detail & Related papers (2023-02-27T18:18:13Z)
- Temporal Attention Augmented Transformer Hawkes Process [4.624987488467739]
We propose a new Transformer-based Hawkes process model, the Temporal Attention Augmented Transformer Hawkes Process (TAA-THP).
We modify the traditional dot-product attention structure and introduce temporal encoding into the attention structure (a minimal sketch of this idea appears after this list).
We conduct numerous experiments on a wide range of synthetic and real-life datasets to validate the performance of our proposed TAA-THP model.
arXiv Detail & Related papers (2021-12-29T09:45:23Z)
- Transformers predicting the future. Applying attention in next-frame and time series forecasting [0.0]
Recurrent Neural Networks were, until recently, one of the best ways to capture temporal dependencies in sequences.
With the introduction of the Transformer, it has been shown that an architecture relying only on attention mechanisms, without any RNN, can improve results on various sequence processing tasks.
arXiv Detail & Related papers (2021-08-18T16:17:29Z)
- STAR: Sparse Transformer-based Action Recognition [61.490243467748314]
This work proposes a novel skeleton-based human action recognition model with sparse attention on the spatial dimension and segmented linear attention on the temporal dimension of data.
Experiments show that our model achieves comparable performance while using far fewer trainable parameters and offering high speed in training and inference.
arXiv Detail & Related papers (2021-07-15T02:53:11Z)
- Visformer: The Vision-friendly Transformer [105.52122194322592]
We propose a new architecture named Visformer, which is abbreviated from 'Vision-friendly Transformer'.
With the same computational complexity, Visformer outperforms both the Transformer-based and convolution-based models in terms of ImageNet classification accuracy.
arXiv Detail & Related papers (2021-04-26T13:13:03Z)
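As referenced in the TAA-THP entry above, the following is a minimal, hypothetical sketch of dot-product attention augmented with a temporal encoding term. The function name, tensor shapes, and the way the temporal term is combined with the content term are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch only; the combination of content and temporal scores is
# an assumption, not the TAA-THP authors' implementation.
import torch

def temporal_augmented_attention(q, k, v, temporal_enc):
    # q, k, v: (batch, seq_len, d_model); temporal_enc: (batch, seq_len, d_model)
    d = q.size(-1)
    content_scores = q @ k.transpose(-2, -1)              # standard dot-product term
    temporal_scores = q @ temporal_enc.transpose(-2, -1)  # query vs. temporal encoding
    weights = torch.softmax((content_scores + temporal_scores) / d ** 0.5, dim=-1)
    return weights @ v
```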