A Systematic Comparison of Architectures for Document-Level Sentiment
Classification
- URL: http://arxiv.org/abs/2002.08131v2
- Date: Wed, 2 Feb 2022 13:26:52 GMT
- Title: A Systematic Comparison of Architectures for Document-Level Sentiment
Classification
- Authors: Jeremy Barnes and Vinit Ravishankar and Lilja Øvrelid and Erik Velldal
- Abstract summary: We compare hierarchical models and transfer learning for document-level sentiment classification.
We show that non-trivial hierarchical models outperform previous baselines and transfer learning on document-level sentiment classification in five languages.
- Score: 14.670220716382515
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Documents are composed of smaller pieces (paragraphs, sentences, and tokens)
that have complex relationships between one another. Sentiment classification
models that take into account the structure inherent in these documents have a
theoretical advantage over those that do not. At the same time, transfer
learning models based on language model pretraining have shown promise for
document classification. However, these two paradigms have not been
systematically compared and it is not clear under which circumstances one
approach is better than the other. In this work we empirically compare
hierarchical models and transfer learning for document-level sentiment
classification. We show that non-trivial hierarchical models outperform
previous baselines and transfer learning on document-level sentiment
classification in five languages.
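To make the contrast in the abstract concrete, here is a minimal sketch of a flat versus a hierarchical document encoder. It is a toy illustration only, not the models evaluated in the paper: the embedding table, the mean-pooling composition, and all function names (`embed`, `flat_encode`, `hierarchical_encode`) are assumptions made for this example.

```python
from statistics import fmean

# Hypothetical 2-dimensional toy embeddings; a real model would learn these.
TOY_EMBEDDINGS = {
    "great": [1.0, 0.0],
    "terrible": [-1.0, 0.0],
    "plot": [0.0, 1.0],
}


def embed(token):
    """Look up a token vector, falling back to zeros for unknown tokens."""
    return TOY_EMBEDDINGS.get(token, [0.0, 0.0])


def mean_pool(vectors):
    """Average a non-empty list of equal-length vectors, dimension by dimension."""
    return [fmean(dim) for dim in zip(*vectors)]


def flat_encode(document):
    """Ignore sentence boundaries and pool every token in the document at once."""
    tokens = [token for sentence in document for token in sentence]
    return mean_pool([embed(token) for token in tokens])


def hierarchical_encode(document):
    """Pool tokens into sentence vectors, then pool sentence vectors into a document vector."""
    sentence_vectors = [
        mean_pool([embed(token) for token in sentence]) for sentence in document
    ]
    return mean_pool(sentence_vectors)


# A document is a list of sentences; a sentence is a list of tokens.
document = [["great"], ["terrible", "plot", "plot", "plot"]]

flat_vector = flat_encode(document)
hierarchical_vector = hierarchical_encode(document)
print(flat_vector, hierarchical_vector)
```

In the flat encoder the long second sentence dominates because it contributes more tokens; in the hierarchical encoder each sentence contributes equally to the document vector. Hierarchical models in the literature typically replace the mean pooling used here with learned components such as recurrent layers or attention.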
Related papers
- Language Models for Text Classification: Is In-Context Learning Enough? [54.869097980761595]
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings.
An advantage of these models over more standard approaches is the ability to understand instructions written in natural language (prompts).
This makes them suitable for addressing text classification problems for domains with limited amounts of annotated instances.
arXiv Detail & Related papers (2024-03-26T12:47:39Z)
- On Search Strategies for Document-Level Neural Machine Translation [51.359400776242786]
Document-level neural machine translation (NMT) models produce a more consistent output across a document.
In this work, we aim to answer the question of how best to utilize a context-aware translation model in decoding.
arXiv Detail & Related papers (2023-06-08T11:30:43Z)
- Towards Unsupervised Recognition of Token-level Semantic Differences in Related Documents [61.63208012250885]
We formulate recognizing semantic differences as a token-level regression task.
We study three unsupervised approaches that rely on a masked language model.
Our results show that an approach based on word alignment and sentence-level contrastive learning has a robust correlation to gold labels.
arXiv Detail & Related papers (2023-05-22T17:58:04Z)
- Inspecting class hierarchies in classification-based metric learning models [0.0]
We train a softmax classifier and three metric learning models with several training options on benchmark and real-world datasets.
We evaluate the hierarchical inference performance by inspecting learned class representatives and the hierarchy-informed performance, i.e., the classification performance, and the metric learning performance by considering predefined hierarchical structures.
arXiv Detail & Related papers (2023-01-26T12:40:12Z)
- Unsupervised Document Embedding via Contrastive Augmentation [48.71917352110245]
We present a contrastive learning approach with data augmentation techniques to learn document representations in an unsupervised manner.
Inspired by recent contrastive self-supervised learning algorithms used for image and NLP pretraining, we hypothesize that high-quality document embeddings should be invariant to diverse paraphrases.
Our method can decrease the classification error rate by up to 6.4% over the SOTA approaches on the document classification task, matching or even surpassing fully-supervised methods.
arXiv Detail & Related papers (2021-03-26T15:48:52Z)
- Improving Document-Level Sentiment Classification Using Importance of Sentences [3.007949058551534]
We propose a document-level sentence classification model based on deep neural networks.
We conduct experiments using sentiment datasets in four different domains: movie reviews, hotel reviews, restaurant reviews, and music reviews.
The experimental results show that the importance of sentences should be considered in a document-level sentiment classification task.
arXiv Detail & Related papers (2021-03-09T01:29:08Z)
- A Comparison of Approaches to Document-level Machine Translation [34.2276281264886]
This paper presents a systematic comparison of selected approaches from the literature on two benchmarks for which document-level phenomena evaluation suites exist.
We find that a simple method based purely on back-translating monolingual document-level data performs as well as much more elaborate alternatives.
arXiv Detail & Related papers (2021-01-26T19:21:09Z)
- Text Classification Using Label Names Only: A Language Model Self-Training Approach [80.63885282358204]
Current text classification methods typically require a good number of human-labeled documents as training data.
We show that our model achieves around 90% accuracy on four benchmark datasets including topic and sentiment classification.
arXiv Detail & Related papers (2020-10-14T17:06:41Z)
- Aspect-based Document Similarity for Research Papers [4.661692753666685]
We extend similarity with aspect information by performing a pairwise document classification task.
We evaluate our aspect-based document similarity for research papers.
Our results show SciBERT as the best performing system.
arXiv Detail & Related papers (2020-10-13T13:51:21Z)
- SPECTER: Document-level Representation Learning using Citation-informed Transformers [51.048515757909215]
SPECTER generates document-level embedding of scientific documents based on pretraining a Transformer language model.
We introduce SciDocs, a new evaluation benchmark consisting of seven document-level tasks ranging from citation prediction to document classification and recommendation.
arXiv Detail & Related papers (2020-04-15T16:05:51Z)
- Pairwise Multi-Class Document Classification for Semantic Relations between Wikipedia Articles [5.40541521227338]
We model the problem of finding the relationship between two documents as a pairwise document classification task.
To find semantic relations between documents, we apply a series of techniques, such as GloVe, Paragraph-Vectors, BERT, and XLNet.
We perform our experiments on a newly proposed dataset of 32,168 Wikipedia article pairs and Wikidata properties that define the semantic document relations.
arXiv Detail & Related papers (2020-03-22T12:52:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.