End to End Binarized Neural Networks for Text Classification
- URL: http://arxiv.org/abs/2010.05223v1
- Date: Sun, 11 Oct 2020 11:21:53 GMT
- Title: End to End Binarized Neural Networks for Text Classification
- Authors: Harshil Jain, Akshat Agarwal, Kumar Shridhar, Denis Kleyko
- Abstract summary: We propose an end-to-end binarized neural network architecture for the intent classification task.
The proposed architecture achieves results comparable to the state of the art on standard intent classification datasets.
- Score: 4.046236197219608
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks have demonstrated superior performance in almost
every Natural Language Processing task, but their increasing complexity
raises concerns. In particular, these networks are expensive to run on
computational hardware, and the training budget is a concern for many. Even for
a trained network, the inference phase can be too demanding for
resource-constrained devices, thus limiting its applicability. The
state-of-the-art transformer models are a vivid example. Simplifying the
computations performed by a network is one way of relaxing the complexity
requirements. In this paper, we propose an end-to-end binarized neural network
architecture for the intent classification task. In order to fully utilize the
potential of end-to-end binarization, both the input representations (vector
embeddings of token statistics) and the classifier are binarized. We
demonstrate the efficiency of such an architecture on the intent classification
of short texts over three datasets and on text classification with a larger
dataset. The proposed architecture achieves results comparable to the state of
the art on standard intent classification datasets while using roughly 20-40%
less memory and training time. Furthermore, the individual components of the
architecture, such as binarized vector embeddings of documents or binarized
classifiers, can be used separately in architectures that are not necessarily
fully binary.
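The core design in the abstract, binarizing both the vector embeddings of token statistics and the classifier that consumes them, can be illustrated with a minimal NumPy sketch. The sign-based binarization, the random projection, and all dimensions below are assumptions made for illustration; they are not the paper's exact embedding or training scheme.

```python
import numpy as np

# Minimal sketch of the two binarized components described in the abstract.
# All names, dimensions, and the random projection are illustrative
# assumptions, not the authors' actual construction.

def binarize(x):
    """Map real values to {-1, +1} via the sign function."""
    return np.where(x >= 0, 1, -1).astype(np.int32)

def binary_embedding(token_counts, projection):
    """Binarized vector embedding of token statistics: project a
    bag-of-tokens count vector, then binarize the result."""
    return binarize(token_counts @ projection)

def binary_classify(x_bin, w_bin):
    """Binarized linear classifier: with inputs and weights in {-1, +1},
    each dot product is integer-only (XNOR + popcount on real hardware)."""
    scores = x_bin @ w_bin.T
    return int(np.argmax(scores))

# Toy usage with assumed sizes: vocabulary 1000, embedding 256, 7 intents.
rng = np.random.default_rng(0)
projection = rng.standard_normal((1000, 256))    # hypothetical fixed projection
w_bin = binarize(rng.standard_normal((7, 256)))  # binarized classifier weights
counts = rng.integers(0, 3, size=1000)           # token counts for one document
print(binary_classify(binary_embedding(counts, projection), w_bin))
```

Because every stored value is +1 or -1, the embeddings and weights can be packed one bit per element and the dot products realized as XNOR plus popcount, which is where the memory and training-time savings of a fully binarized architecture come from.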
Related papers
- Homological Convolutional Neural Networks [4.615338063719135]
We propose a novel deep learning architecture that exploits the data structural organization through topologically constrained network representations.
We test our model on 18 benchmark datasets against 5 classic machine learning and 3 deep learning models.
arXiv Detail & Related papers (2023-08-26T08:48:51Z)
- Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks [49.808194368781095]
We show that three-layer neural networks have provably richer feature learning capabilities than two-layer networks.
This work makes progress towards understanding the provable benefit of three-layer neural networks over two-layer networks in the feature learning regime.
arXiv Detail & Related papers (2023-05-11T17:19:30Z)
- Convolution, aggregation and attention based deep neural networks for accelerating simulations in mechanics [1.0154623955833253]
We demonstrate three types of neural network architectures for efficient learning of deformations of solid bodies.
The first two are based on the recently proposed CNN U-NET and MAgNET frameworks which have shown promising performance for learning on mesh-based data.
The third architecture is Perceiver IO, a very recent architecture that belongs to the family of attention-based neural networks.
arXiv Detail & Related papers (2022-12-01T13:10:56Z)
- Investigating Neural Architectures by Synthetic Dataset Design [14.317837518705302]
Recent years have seen the emergence of many new neural network structures (architectures and layers).
We sketch a methodology to measure the effect of each structure on a network's ability, by designing ad hoc synthetic datasets.
We illustrate our methodology by building three datasets to evaluate each of the three following network properties.
arXiv Detail & Related papers (2022-04-23T10:50:52Z)
- An Efficient End-to-End 3D Model Reconstruction based on Neural Architecture Search [5.913946292597174]
We propose an efficient model reconstruction method utilizing neural architecture search (NAS) and binary classification.
Our method achieves significantly higher reconstruction accuracy using fewer network parameters.
arXiv Detail & Related papers (2022-02-27T08:53:43Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
- Neural Network Layer Algebra: A Framework to Measure Capacity and Compression in Deep Learning [0.0]
We present a new framework to measure the intrinsic properties of (deep) neural networks.
While we focus on convolutional networks, our framework can be extended to any network architecture.
arXiv Detail & Related papers (2021-07-02T13:43:53Z)
- Dual-constrained Deep Semi-Supervised Coupled Factorization Network with Enriched Prior [80.5637175255349]
We propose a new enriched prior based Dual-constrained Deep Semi-Supervised Coupled Factorization Network, called DS2CF-Net.
To extract hidden deep features, DS2CF-Net is modeled as a deep-structure and geometrical structure-constrained neural network.
Our network can obtain state-of-the-art performance for representation learning and clustering.
arXiv Detail & Related papers (2020-09-08T13:10:21Z)
- Neural networks adapting to datasets: learning network size and topology [77.34726150561087]
We introduce a flexible setup allowing for a neural network to learn both its size and topology during the course of a gradient-based training.
The resulting network has the structure of a graph tailored to the particular learning task and dataset.
arXiv Detail & Related papers (2020-06-22T12:46:44Z)
- A Semi-Supervised Assessor of Neural Architectures [157.76189339451565]
We employ an auto-encoder to discover meaningful representations of neural architectures.
A graph convolutional neural network is introduced to predict the performance of architectures.
arXiv Detail & Related papers (2020-05-14T09:02:33Z)
- Binarizing MobileNet via Evolution-based Searching [66.94247681870125]
We propose the use of evolutionary search to facilitate the construction and training scheme when binarizing MobileNet.
Inspired by one-shot architecture search frameworks, we adapt the idea of group convolution to design efficient 1-Bit Convolutional Neural Networks (CNNs).
Our objective is to come up with a tiny yet efficient binary neural architecture by exploring the best candidates of the group convolution.
arXiv Detail & Related papers (2020-05-13T13:25:51Z)