Related papers: AIN: Fast and Accurate Sequence Labeling with Approximate Inference Network

URL: http://arxiv.org/abs/2009.08229v2
Date: Mon, 12 Oct 2020 11:59:20 GMT
Title: AIN: Fast and Accurate Sequence Labeling with Approximate Inference Network
Authors: Xinyu Wang, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu
Abstract summary: The linear-chain Conditional Random Field (CRF) model is one of the most widely-used neural sequence labeling approaches. Exact probabilistic inference algorithms are typically applied in training and prediction stages of the CRF model. We propose to employ a parallelizable approximate variational inference algorithm for the CRF model.
Score: 75.44925576268052
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The linear-chain Conditional Random Field (CRF) model is one of the most widely-used neural sequence labeling approaches. Exact probabilistic inference algorithms such as the forward-backward and Viterbi algorithms are typically applied in training and prediction stages of the CRF model. However, these algorithms require sequential computation that makes parallelization impossible. In this paper, we propose to employ a parallelizable approximate variational inference algorithm for the CRF model. Based on this algorithm, we design an approximate inference network that can be connected with the encoder of the neural CRF model to form an end-to-end network, which is amenable to parallelization for faster training and prediction. The empirical results show that our proposed approaches achieve a 12.7-fold improvement in decoding speed with long sentences and a competitive accuracy compared with the traditional CRF approach.
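The contrast the abstract draws, sequential exact inference versus parallelizable approximate inference, can be illustrated with a small sketch. The abstract does not spell out the update rule, so the code below assumes a standard mean-field approximation for a linear-chain CRF; it is not the authors' AIN implementation, and the names `mean_field_crf`, `decode`, `unary`, `trans`, and `n_iters` are hypothetical. Every position is updated at once from its neighbours' current beliefs, so each iteration is a pair of matrix products rather than a left-to-right scan.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def mean_field_crf(unary, trans, n_iters=3):
    """Approximate per-position label marginals of a linear-chain CRF.

    unary: (T, L) emission scores (log-potentials) from an encoder.
    trans: (L, L) transition scores; trans[i, j] scores label i followed by j.
    Assumption: synchronous mean-field updates, which may differ from the paper.
    """
    q = softmax(unary)  # initialise beliefs from the emission scores alone
    for _ in range(n_iters):
        msg = np.zeros_like(unary)
        # Expected transition score coming from the left neighbour (t-1 -> t).
        msg[1:] += q[:-1] @ trans
        # Expected transition score coming from the right neighbour (t -> t+1).
        msg[:-1] += q[1:] @ trans.T
        # All positions are updated simultaneously; no sequential recursion.
        q = softmax(unary + msg)
    return q

def decode(unary, trans, n_iters=3):
    # Predict each label independently from its approximate marginal.
    return mean_field_crf(unary, trans, n_iters).argmax(axis=-1)
```

For example, calling `decode(unary, trans)` with a `(T, L)` score matrix returns one label index per position. A fixed, small `n_iters` keeps the cost independent of any sequential dependency along the sentence, which is the property the abstract credits for faster decoding on long inputs.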

Related papers

Front-propagation Algorithm: Explainable AI Technique for Extracting Linear Function Approximations from Neural Networks [0.0]
This paper introduces the front-propagation algorithm, a novel AI technique designed to elucidate the decision-making logic of deep neural networks. Unlike other popular explainability algorithms such as Integrated Gradients or Shapley Values, the proposed algorithm is able to extract an accurate and consistent linear function explanation of the network. We demonstrate its efficacy in providing accurate linear functions with three different neural network architectures trained on publicly available benchmark datasets.
arXiv Detail & Related papers (2024-05-25T14:50:23Z)
Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce Stochastic UnRolled Federated learning (SURF), a method that extends algorithm unrolling to federated learning. Our proposed method tackles two challenges of this extension, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z)
The Cascaded Forward Algorithm for Neural Network Training [61.06444586991505]
We propose a new learning framework for neural networks, the Cascaded Forward (CaFo) algorithm, which, like the Forward-Forward (FF) algorithm, does not rely on backpropagation (BP) optimization. Unlike FF, our framework directly outputs label distributions at each cascaded block and does not require the generation of additional negative samples. Each block can be trained independently, so the framework can be easily deployed on parallel acceleration systems.
arXiv Detail & Related papers (2023-03-17T02:01:11Z)
Design and Prototyping Distributed CNN Inference Acceleration in Edge Computing [85.74517957717363]
HALP accelerates inference by designing a seamless collaboration among edge devices (EDs) in Edge Computing. Experiments show that the distributed inference HALP achieves 1.7x inference acceleration for VGG-16. It is shown that the model selection with distributed inference HALP can significantly improve service reliability.
arXiv Detail & Related papers (2022-11-24T19:48:30Z)
Scalable computation of prediction intervals for neural networks via matrix sketching [79.44177623781043]
Existing algorithms for uncertainty estimation require modifying the model architecture and training procedure. This work proposes a new algorithm that can be applied to a given trained neural network and produces approximate prediction intervals.
arXiv Detail & Related papers (2022-05-06T13:18:31Z)
Regularized Frank-Wolfe for Dense CRFs: Generalizing Mean Field and Beyond [19.544213396776268]
We introduce regularized Frank-Wolfe, a general and effective inference algorithm for dense conditional random fields (CRFs). We show that our new algorithms yield significant improvements over strong neural network baselines.
arXiv Detail & Related papers (2021-10-27T20:44:47Z)
Attentive Gaussian processes for probabilistic time-series generation [4.94950858749529]
We propose a computationally efficient attention-based network combined with Gaussian process (GP) regression to generate real-valued sequences. We develop a block-wise training algorithm that allows mini-batch training of the network while the GP is trained using the full batch. The algorithm has been proved to converge and finds solutions of comparable, if not better, quality.
arXiv Detail & Related papers (2021-02-10T01:19:15Z)
Fast Convergence Algorithm for Analog Federated Learning [30.399830943617772]
We propose an AirComp-based FedSplit algorithm for efficient analog federated learning over wireless channels. We prove that the proposed algorithm converges linearly to the optimal solutions under the assumption that the objective function is strongly convex and smooth. Our algorithm is theoretically and experimentally verified to be much more robust to ill-conditioned problems, with faster convergence than other benchmark FL algorithms.
arXiv Detail & Related papers (2020-10-30T10:59:49Z)
Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale AUC maximization with a deep neural network. Our algorithm requires far fewer communication rounds in theory. Our experiments on several datasets show the effectiveness of our algorithm and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
