Characterization of anomalous diffusion through convolutional transformers
- URL: http://arxiv.org/abs/2210.04959v1
- Date: Mon, 10 Oct 2022 18:53:13 GMT
- Title: Characterization of anomalous diffusion through convolutional transformers
- Authors: Nicolás Firbas, Òscar Garibo-i-Orts, Miguel Ángel Garcia-March, J. Alberto Conejero
- Abstract summary: We propose a new transformer-based neural network architecture for the characterization of anomalous diffusion.
Our new architecture, the Convolutional Transformer (ConvTransformer), uses a bi-layered convolutional neural network to extract features from our diffusive trajectories.
We show that the ConvTransformer is able to outperform the previous state of the art at determining the underlying diffusive regime in short trajectories.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The results of the Anomalous Diffusion Challenge (AnDi Challenge) have shown
that machine learning methods can outperform classical statistical methodology
at the characterization of anomalous diffusion in both the inference of the
anomalous diffusion exponent alpha associated with each trajectory (Task 1),
and the determination of the underlying diffusive regime which produced such
trajectories (Task 2). Furthermore, of the five teams that finished in the top
three across both tasks of the AnDi Challenge, three used recurrent neural
networks (RNNs). While RNNs, like the long short-term memory
(LSTM) network, are effective at learning long-term dependencies in sequential
data, their key disadvantage is that they must be trained sequentially. To
facilitate training on larger data sets by training in parallel, we propose a
new transformer-based neural network architecture for the characterization of
anomalous diffusion. Our new architecture, the Convolutional Transformer
(ConvTransformer), uses a bi-layered convolutional neural network to extract
features from the diffusive trajectories, which can be thought of as words in
a sentence. These features are then fed to two
transformer encoding blocks that perform either regression or classification.
To our knowledge, this is the first time transformers have been used for
characterizing anomalous diffusion. Moreover, this may be the first time that a
transformer encoding block has been used with a convolutional neural network
and without the need for a transformer decoding block or positional encoding.
Apart from being able to train in parallel, we show that the ConvTransformer is
able to outperform the previous state of the art at determining the underlying
diffusive regime in short trajectories (length 10-50 steps), which are the most
important for experimental researchers.
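
As a rough illustration of the architecture described in the abstract, the sketch below shows a ConvTransformer-style model in PyTorch: a two-layer 1D CNN extracts feature vectors (the "words") from a trajectory, two transformer encoder blocks process them without any positional encoding or decoder, and a linear head performs either alpha regression (Task 1) or diffusive-regime classification (Task 2). The layer widths, kernel sizes, mean pooling, and the five-class output are illustrative assumptions, not the authors' published configuration.

```python
# Minimal sketch of a ConvTransformer-style model (assumed hyperparameters,
# not the authors' exact ones).
import torch
import torch.nn as nn


class ConvTransformerSketch(nn.Module):
    def __init__(self, n_classes=5, d_model=64, n_heads=4, task="classification"):
        super().__init__()
        self.task = task
        # Bi-layered 1D CNN: turns the raw 2D trajectory into a sequence of
        # feature vectors that play the role of words in a sentence.
        self.features = nn.Sequential(
            nn.Conv1d(in_channels=2, out_channels=32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(in_channels=32, out_channels=d_model, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # Two transformer encoder blocks; no positional encoding and no
        # decoder, mirroring the encoder-only design described above.
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
        # Output head: one scalar for alpha regression (Task 1) or class
        # logits for the diffusive-regime classification (Task 2).
        out_dim = 1 if task == "regression" else n_classes
        self.head = nn.Linear(d_model, out_dim)

    def forward(self, traj):
        # traj: (batch, 2, steps) -- a 2D trajectory in channel-first layout.
        feats = self.features(traj)          # (batch, d_model, steps)
        feats = feats.transpose(1, 2)        # (batch, steps, d_model)
        encoded = self.encoder(feats)        # (batch, steps, d_model)
        pooled = encoded.mean(dim=1)         # fixed-size summary of the sequence
        return self.head(pooled)


# Example: classify the diffusive regime of a batch of short trajectories.
model = ConvTransformerSketch(task="classification")
trajectories = torch.randn(8, 2, 25)         # 8 trajectories, 25 steps each
logits = model(trajectories)                  # (8, 5) class scores
```

Mean pooling over the encoded sequence is just one simple way to obtain a fixed-size summary of trajectories of varying length; the paper's actual readout and training setup may differ.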
Related papers
- In-Context Convergence of Transformers [63.04956160537308]
We study the learning dynamics of a one-layer transformer with softmax attention trained via gradient descent.
For data with imbalanced features, we show that the learning dynamics follow a stage-wise convergence process.
arXiv Detail & Related papers (2023-10-08T17:55:33Z) - Sparsifying Bayesian neural networks with latent binary variables and normalizing flows [10.865434331546126]
We will consider two extensions to the latent binary Bayesian neural networks (LBBNN) method.
Firstly, by using the local reparametrization trick (LRT) to sample the hidden units directly, we get a more computationally efficient algorithm.
More importantly, by using normalizing flows on the variational posterior distribution of the LBBNN parameters, the network learns a more flexible variational posterior distribution than the mean field Gaussian.
arXiv Detail & Related papers (2023-05-05T09:40:28Z) - Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation [105.22961467028234]
Skip connections and normalisation layers are ubiquitous for the training of Deep Neural Networks (DNNs).
Recent approaches such as Deep Kernel Shaping have made progress towards reducing our reliance on them.
But these approaches are incompatible with the self-attention layers present in transformers.
arXiv Detail & Related papers (2023-02-20T21:26:25Z) - DeMT: Deformable Mixer Transformer for Multi-Task Learning of Dense Prediction [40.447092963041236]
We present a novel MTL model that combines the merits of both deformable CNNs and query-based Transformers.
Our method, named DeMT, is based on a simple and effective encoder-decoder architecture.
Our model uses fewer GFLOPs and significantly outperforms current Transformer- and CNN-based competitive models.
arXiv Detail & Related papers (2023-01-09T16:00:15Z) - Error Correction Code Transformer [92.10654749898927]
We propose to extend for the first time the Transformer architecture to the soft decoding of linear codes at arbitrary block lengths.
We encode each channel's output into a high-dimensional representation so that the bit information can be processed separately.
The proposed approach demonstrates the extreme power and flexibility of Transformers and outperforms existing state-of-the-art neural decoders by large margins at a fraction of their time complexity.
arXiv Detail & Related papers (2022-03-27T15:25:58Z) - Redesigning the Transformer Architecture with Insights from Multi-particle Dynamical Systems [32.86421107987556]
We build upon recent developments in analyzing deep neural networks as numerical solvers of ordinary differential equations.
We formulate a temporal evolution scheme, TransEvolve, to bypass costly dot-product attention over multiple stacked layers.
We perform exhaustive experiments with TransEvolve on well-known encoder-decoder as well as encoder-only tasks.
arXiv Detail & Related papers (2021-09-30T14:01:06Z) - nnFormer: Interleaved Transformer for Volumetric Segmentation [50.10441845967601]
We introduce nnFormer, a powerful segmentation model with an interleaved architecture based on an empirical combination of self-attention and convolution.
nnFormer achieves tremendous improvements over previous transformer-based methods on two commonly used datasets Synapse and ACDC.
arXiv Detail & Related papers (2021-09-07T17:08:24Z) - Video Super-Resolution Transformer [85.11270760456826]
Video super-resolution (VSR), with the aim to restore a high-resolution video from its corresponding low-resolution version, is a spatial-temporal sequence prediction problem.
Recently, Transformer has been gaining popularity due to its parallel computing ability for sequence-to-sequence modeling.
In this paper, we present a spatial-temporal convolutional self-attention layer with a theoretical understanding to exploit the locality information.
arXiv Detail & Related papers (2021-06-12T20:00:32Z) - Transformers Solve the Limited Receptive Field for Monocular Depth Prediction [82.90445525977904]
We propose TransDepth, an architecture which benefits from both convolutional neural networks and transformers.
This is the first paper to apply transformers to pixel-wise prediction problems involving continuous labels.
arXiv Detail & Related papers (2021-03-22T18:00:13Z)