AIN: Fast and Accurate Sequence Labeling with Approximate Inference
Network
- URL: http://arxiv.org/abs/2009.08229v2
- Date: Mon, 12 Oct 2020 11:59:20 GMT
- Title: AIN: Fast and Accurate Sequence Labeling with Approximate Inference
Network
- Authors: Xinyu Wang, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei
Huang, Kewei Tu
- Abstract summary: The linear-chain Conditional Random Field (CRF) model is one of the most widely-used neural sequence labeling approaches.
Exact probabilistic inference algorithms are typically applied in training and prediction stages of the CRF model.
We propose to employ a parallelizable approximate variational inference algorithm for the CRF model.
- Score: 75.44925576268052
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The linear-chain Conditional Random Field (CRF) model is one of the most
widely-used neural sequence labeling approaches. Exact probabilistic inference
algorithms such as the forward-backward and Viterbi algorithms are typically
applied in training and prediction stages of the CRF model. However, these
algorithms require sequential computation that makes parallelization
impossible. In this paper, we propose to employ a parallelizable approximate
variational inference algorithm for the CRF model. Based on this algorithm, we
design an approximate inference network that can be connected with the encoder
of the neural CRF model to form an end-to-end network, which is amenable to
parallelization for faster training and prediction. The empirical results show
that our proposed approaches achieve a 12.7-fold improvement in decoding speed
with long sentences and a competitive accuracy compared with the traditional
CRF approach.
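The abstract does not spell out the update rule, so the following is only a minimal sketch of one common parallelizable approximation, mean-field variational inference on a linear-chain CRF. It is not the authors' implementation. The function names (`softmax`, `mean_field_decode`), the argument layout (`unary`, `trans`), and the iteration count are assumptions made for illustration. Each iteration updates every position at once from its neighbours' current beliefs, in contrast to the left-to-right recursion of Viterbi or forward-backward.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def mean_field_decode(unary, trans, num_iters=3):
    """Approximate decoding for a linear-chain CRF (illustrative sketch).

    unary: (n, L) array of per-position label scores (hypothetical encoder output).
    trans: (L, L) array; trans[a, b] scores label a followed by label b.
    Returns the per-position argmax labels and the approximate marginals q.
    """
    q = softmax(unary)  # initialise beliefs from the unary scores alone
    for _ in range(num_iters):
        left = np.zeros_like(unary)
        right = np.zeros_like(unary)
        # Expected transition score from the previous position's beliefs.
        left[1:] = q[:-1] @ trans
        # Expected transition score from the next position's beliefs.
        right[:-1] = q[1:] @ trans.T
        # All positions are updated simultaneously: no sequential scan.
        q = softmax(unary + left + right)
    return q.argmax(axis=-1), q

if __name__ == "__main__":
    # Toy example: transitions favour staying in the same label.
    unary = np.array([[3.0, 0.0], [0.0, 0.5], [3.0, 0.0]])
    trans = np.array([[2.0, -2.0], [-2.0, 2.0]])
    labels, q = mean_field_decode(unary, trans)
    print(labels)
```

In this toy input the middle position slightly prefers label 1 on its own, but its two confident neighbours pull it to label 0 after the first update. The number of iterations is fixed in advance, which is what would let such a procedure be unrolled into network layers; whether the paper uses exactly this factorization is not stated on this page.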
Related papers
- Front-propagation Algorithm: Explainable AI Technique for Extracting Linear Function Approximations from Neural Networks [0.0]
This paper introduces the front-propagation algorithm, a novel AI technique designed to elucidate the decision-making logic of deep neural networks.
Unlike other popular explainability algorithms such as Integrated Gradients or Shapley Values, the proposed algorithm is able to extract an accurate and consistent linear function explanation of the network.
We demonstrate its efficacy in providing accurate linear functions with three different neural network architectures trained on publicly available benchmark datasets.
arXiv Detail & Related papers (2024-05-25T14:50:23Z)
- Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce Stochastic UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z)
- The Cascaded Forward Algorithm for Neural Network Training [61.06444586991505]
We propose a new learning framework for neural networks, the Cascaded Forward (CaFo) algorithm, which, like the Forward-Forward (FF) algorithm, does not rely on backpropagation (BP) optimization.
Unlike FF, our framework directly outputs label distributions at each cascaded block, which does not require generation of additional negative samples.
In our framework each block can be trained independently, so it can be easily deployed into parallel acceleration systems.
arXiv Detail & Related papers (2023-03-17T02:01:11Z)
- Design and Prototyping Distributed CNN Inference Acceleration in Edge Computing [85.74517957717363]
HALP accelerates inference by designing a seamless collaboration among edge devices (EDs) in Edge Computing.
Experiments show that the distributed inference HALP achieves 1.7x inference acceleration for VGG-16.
It is shown that the model selection with distributed inference HALP can significantly improve service reliability.
arXiv Detail & Related papers (2022-11-24T19:48:30Z)
- Scalable computation of prediction intervals for neural networks via matrix sketching [79.44177623781043]
Existing algorithms for uncertainty estimation require modifying the model architecture and training procedure.
This work proposes a new algorithm that can be applied to a given trained neural network and produces approximate prediction intervals.
arXiv Detail & Related papers (2022-05-06T13:18:31Z)
- Regularized Frank-Wolfe for Dense CRFs: Generalizing Mean Field and Beyond [19.544213396776268]
We introduce regularized Frank-Wolfe, a general and effective inference algorithm for dense conditional random fields (CRFs).
We show that our new algorithms yield significant improvements over strong neural networks.
arXiv Detail & Related papers (2021-10-27T20:44:47Z)
- Attentive Gaussian processes for probabilistic time-series generation [4.94950858749529]
We propose a computationally efficient attention-based network combined with Gaussian process regression to generate real-valued sequences.
We develop a block-wise training algorithm to allow mini-batch training of the network while the GP is trained using full-batch.
The algorithm has been proved to converge and shows comparable, if not better, quality of the found solution.
arXiv Detail & Related papers (2021-02-10T01:19:15Z)
- Fast Convergence Algorithm for Analog Federated Learning [30.399830943617772]
We propose an AirComp-based FedSplit algorithm for efficient analog federated learning over wireless channels.
We prove that the proposed algorithm linearly converges to the optimal solutions under the assumption that the objective function is strongly convex and smooth.
Our algorithm is theoretically and experimentally verified to be much more robust to ill-conditioned problems, with faster convergence, than other benchmark FL algorithms.
arXiv Detail & Related papers (2020-10-30T10:59:49Z)
- Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale AUC maximization with a deep neural network as the predictive model.
Our algorithm requires far fewer communication rounds in theory.
Our experiments on several datasets show the effectiveness of our algorithm and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.