Reducing Computational Costs in Sentiment Analysis: Tensorized Recurrent
Networks vs. Recurrent Networks
- URL: http://arxiv.org/abs/2306.09705v1
- Date: Fri, 16 Jun 2023 09:18:08 GMT
- Title: Reducing Computational Costs in Sentiment Analysis: Tensorized Recurrent
Networks vs. Recurrent Networks
- Authors: Gabriel Lopez, Anna Nguyen, Joe Kaul
- Abstract summary: Anticipating audience reaction towards a certain text is integral to several facets of society, ranging from politics and research to commercial industry.
Sentiment analysis (SA) is a useful natural language processing (NLP) technique that uses lexical/statistical and deep learning methods to determine whether texts of different sizes exhibit positive, negative, or neutral emotions.
- Score: 0.12891210250935145
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Anticipating audience reaction towards a certain text is integral
to several facets of society, ranging from politics and research to commercial
industry. Sentiment analysis (SA) is a useful natural language processing
(NLP) technique that uses lexical/statistical and deep learning methods to
determine whether texts of different sizes exhibit positive, negative, or
neutral emotions. Recurrent networks are widely used in the machine-learning
community for problems with sequential data. However, a drawback of models
based on Long Short-Term Memory networks and Gated Recurrent Units is their
large number of parameters, which makes such models computationally expensive.
This drawback is even more significant when the available data are limited.
Such models also require significant over-parameterization and regularization
to achieve optimal performance. Tensorized models are a potential solution. In
this paper, we classify the sentiment of social media posts. We compare
traditional recurrent models with their tensorized versions and show that the
tensorized models reach comparable performance to the traditional models while
using fewer resources for training.
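As a concrete illustration of the parameter savings described above, here is a
minimal PyTorch sketch in which a simple low-rank factorization of the stacked
LSTM gate matrix stands in for the tensor decompositions studied in the paper;
the factorization, dimensions, and rank are illustrative assumptions, not the
authors' exact construction.

```python
# Hedged sketch: a low-rank factorization of the LSTM gate weights as a
# stand-in for tensorized recurrent models (assumption, not the paper's
# exact decomposition). It illustrates how factorizing the weights cuts
# the parameter count relative to a dense LSTM.
import torch
import torch.nn as nn


class LowRankLSTMCell(nn.Module):
    """LSTM cell whose stacked gate matrix W (4h x (d+h)) is factorized as U @ V."""

    def __init__(self, input_size: int, hidden_size: int, rank: int):
        super().__init__()
        self.hidden_size = hidden_size
        # W ~ U @ V: (4h x r) @ (r x (d+h)) instead of a dense 4h x (d+h) matrix.
        self.U = nn.Parameter(torch.randn(4 * hidden_size, rank) * 0.01)
        self.V = nn.Parameter(torch.randn(rank, input_size + hidden_size) * 0.01)
        self.bias = nn.Parameter(torch.zeros(4 * hidden_size))

    def forward(self, x, state):
        h, c = state
        # Apply the factorized weights to the concatenated input and hidden state.
        gates = (self.U @ (self.V @ torch.cat([x, h], dim=-1).T)).T + self.bias
        i, f, g, o = gates.chunk(4, dim=-1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, (h, c)


def n_params(m: nn.Module) -> int:
    return sum(p.numel() for p in m.parameters())


d, h = 300, 256          # e.g. word-embedding dim and hidden size (illustrative)
dense = nn.LSTM(d, h)    # standard dense LSTM for reference
lowrank = LowRankLSTMCell(d, h, rank=16)
print(f"dense LSTM params:    {n_params(dense):,}")
print(f"low-rank cell params: {n_params(lowrank):,}")
```

With these illustrative sizes (d = 300, h = 256, rank 16), the factorized cell
has roughly twenty times fewer parameters than the dense LSTM, at the cost of
constraining the gate matrix to rank 16.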
Related papers
- A Dynamical Model of Neural Scaling Laws [79.59705237659547]
We analyze a random feature model trained with gradient descent as a solvable model of network training and generalization.
Our theory shows how the gap between training and test loss can gradually build up over time due to repeated reuse of data.
arXiv Detail & Related papers (2024-02-02T01:41:38Z) - A Systematic Approach to Robustness Modelling for Deep Convolutional
Neural Networks [0.294944680995069]
Recent work raises questions about the ability of even larger models to generalize to data outside the controlled train and test sets.
We provide a method that uses induced failures to model the probability of failure as a function of time.
We examine the various trade-offs between cost, robustness, latency, and reliability to find that larger models do not significantly aid in adversarial robustness.
arXiv Detail & Related papers (2024-01-24T19:12:37Z) - Towards a Better Theoretical Understanding of Independent Subnetwork Training [56.24689348875711]
We take a closer theoretical look at Independent Subnetwork Training (IST), a recently proposed and highly effective technique for distributed training.
We identify fundamental differences between IST and alternative approaches, such as distributed methods with compressed communication.
arXiv Detail & Related papers (2023-06-28T18:14:22Z) - Interaction Decompositions for Tensor Network Regression [0.0]
We show how to assess the relative importance of different regressors as a function of their interaction degree, and we introduce a new type of tensor network model that is explicitly trained on only a small subset of interaction degrees.
Our results suggest that standard tensor network models utilize their regressors inefficiently, with the lower-degree terms vastly underutilized.
arXiv Detail & Related papers (2022-08-11T20:17:27Z) - On the Versatile Uses of Partial Distance Correlation in Deep Learning [47.11577420740119]
This paper revisits a (less widely known) measure from statistics, called distance correlation (and its partial variant), designed to evaluate correlation between feature spaces of different dimensions.
We describe the steps necessary to carry out its deployment for large scale models.
This opens the door to a surprising array of applications, from conditioning one deep model w.r.t. another and learning disentangled representations to optimizing diverse models that are directly more robust to adversarial attacks.
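As a concrete reference for the quantity being deployed, here is a small NumPy
sketch of the (non-partial) biased sample distance correlation between two
feature matrices of different dimensions; the data and shapes are illustrative,
and the paper's partial variant and large-scale deployment steps are not shown.

```python
import numpy as np


def distance_correlation(X: np.ndarray, Y: np.ndarray) -> float:
    """Biased sample distance correlation between two feature matrices
    that share the number of rows (the column counts may differ)."""

    def centered(Z):
        # Pairwise Euclidean distance matrix, then double-centering.
        D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
        return D - D.mean(axis=0, keepdims=True) - D.mean(axis=1, keepdims=True) + D.mean()

    A, B = centered(X), centered(Y)
    dcov2 = (A * B).mean()                      # squared distance covariance
    dvarx, dvary = (A * A).mean(), (B * B).mean()
    return float(np.sqrt(dcov2 / np.sqrt(dvarx * dvary)))


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))                    # features from one model
Y = X[:, :16] + 0.1 * rng.normal(size=(200, 16))  # dependent, lower-dim view
print(distance_correlation(X, Y))                 # near 1 for dependent features
```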
arXiv Detail & Related papers (2022-07-20T06:36:11Z) - Dynamically-Scaled Deep Canonical Correlation Analysis [77.34726150561087]
Canonical Correlation Analysis (CCA) is a method for feature extraction of two views by finding maximally correlated linear projections of them.
We introduce a novel dynamic scaling method for training an input-dependent canonical correlation model.
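For reference, the standard CCA objective behind this entry seeks projection
vectors w_x, w_y that maximize the correlation of the projected views, where
\Sigma_{xy} is the cross-covariance of the two views and \Sigma_{xx},
\Sigma_{yy} are their covariances:

```latex
\rho = \max_{w_x,\, w_y}
  \frac{w_x^{\top} \Sigma_{xy}\, w_y}
       {\sqrt{w_x^{\top} \Sigma_{xx}\, w_x}\; \sqrt{w_y^{\top} \Sigma_{yy}\, w_y}}
```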
arXiv Detail & Related papers (2022-03-23T12:52:49Z) - How Well Do Sparse Imagenet Models Transfer? [75.98123173154605]
Transfer learning is a classic paradigm by which models pretrained on large "upstream" datasets are adapted to yield good results on "downstream" datasets.
In this work, we perform an in-depth investigation of this phenomenon in the context of convolutional neural networks (CNNs) trained on the ImageNet dataset.
We show that sparse models can match or even outperform the transfer performance of dense models, even at high sparsities.
arXiv Detail & Related papers (2021-11-26T11:58:51Z) - Multi-fidelity regression using artificial neural networks: efficient
approximation of parameter-dependent output quantities [0.17499351967216337]
We present the use of artificial neural networks applied to multi-fidelity regression problems.
The introduced models are compared against a traditional multi-fidelity scheme, co-kriging.
We also show an application of multi-fidelity regression to an engineering problem.
arXiv Detail & Related papers (2021-02-26T11:29:00Z) - Firearm Detection via Convolutional Neural Networks: Comparing a
Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Detecting weapons and aggressive behavior from live video can enable rapid prevention of potentially deadly incidents.
One way of achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis.
We compare a traditional monolithic end-to-end deep learning model with a previously proposed model based on an ensemble of simpler neural networks that detect firearms via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z) - Learning from Context or Names? An Empirical Study on Neural Relation
Extraction [112.06614505580501]
We study the effect of two main information sources in text: textual context and entity mentions (names).
We propose an entity-masked contrastive pre-training framework for relation extraction (RE).
Our framework can improve the effectiveness and robustness of neural models in different RE scenarios.
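As a minimal sketch of one plausible reading of the entity-masking step (the
mask token, span format, and helper below are illustrative assumptions, not
the paper's exact scheme), replacing entity mentions with a mask token forces
a model to rely on textual context rather than entity names:

```python
# Illustrative helper (hypothetical, not from the paper): replace each
# entity mention span with a single mask token before pre-training.
def mask_entities(tokens, entity_spans, mask_token="[ENT]"):
    out, i = [], 0
    for start, end in sorted(entity_spans):
        out.extend(tokens[i:start])  # keep context tokens unchanged
        out.append(mask_token)       # hide the entity name itself
        i = end
    out.extend(tokens[i:])
    return out


tokens = "Marie Curie was born in Warsaw".split()
print(mask_entities(tokens, [(0, 2), (5, 6)]))
# ['[ENT]', 'was', 'born', 'in', '[ENT]']
```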
arXiv Detail & Related papers (2020-10-05T11:21:59Z) - Prediction of Hilbertian autoregressive processes : a Recurrent Neural
Network approach [0.0]
We compare the classical prediction methodology, based on estimation of the autocorrelation operator, with a neural network learning approach.
The latter is based on a popular variant of Recurrent Neural Networks: Long Short-Term Memory networks.
arXiv Detail & Related papers (2020-08-25T16:43:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.