Fortunately, Discourse Markers Can Enhance Language Models for Sentiment
Analysis
- URL: http://arxiv.org/abs/2201.02026v1
- Date: Thu, 6 Jan 2022 12:33:47 GMT
- Title: Fortunately, Discourse Markers Can Enhance Language Models for Sentiment
Analysis
- Authors: Liat Ein-Dor, Ilya Shnayderman, Artem Spector, Lena Dankin, Ranit
Aharonov and Noam Slonim
- Abstract summary: We propose to leverage sentiment-carrying discourse markers to generate large-scale weakly-labeled data.
We show the value of our approach on various benchmark datasets, including the finance domain.
- Score: 13.149482582098429
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, pretrained language models have revolutionized the NLP
world, achieving state-of-the-art performance in various downstream tasks.
However, in many cases, these models do not perform well when labeled data is
scarce and the model is expected to operate in the zero- or few-shot setting.
Recently, several works have shown that continual pretraining, or performing a
second phase of pretraining (inter-training) that is better aligned with the
downstream task, can lead to improved results, especially in the scarce-data
setting. Here, we propose to leverage sentiment-carrying discourse markers to
generate large-scale weakly-labeled data, which in turn can be used to adapt
language models for sentiment analysis. Extensive experimental results show the
value of our approach on various benchmark datasets, including the finance
domain. Code, models and data are available at
https://github.com/ibm/tslm-discourse-markers.
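To make the idea concrete, below is a minimal sketch of the weak-labeling step, assuming a small hand-picked marker lexicon; the actual marker lists, filtering heuristics, and data pipeline live in the linked repository and may differ.

```python
import re

# Hypothetical marker lexicon: sentence-initial discourse markers that tend
# to signal the sentiment of the clause they introduce. The lists used in
# the paper and repository may differ.
POSITIVE_MARKERS = ("fortunately", "luckily", "thankfully")
NEGATIVE_MARKERS = ("unfortunately", "sadly", "regrettably")

def weak_label(sentence: str):
    """Return (text_without_marker, label) or None if no marker fires."""
    lowered = sentence.strip().lower()
    markers = [(m, "positive") for m in POSITIVE_MARKERS] + \
              [(m, "negative") for m in NEGATIVE_MARKERS]
    for marker, label in markers:
        if lowered.startswith(marker):
            # Strip the marker (and an optional comma) so the adapted model
            # cannot simply memorize the trigger word during inter-training.
            text = re.sub(rf"^{marker}\s*,?\s*", "", sentence.strip(),
                          flags=re.IGNORECASE)
            return text, label
    return None

corpus = [
    "Fortunately, the battery lasts all day.",
    "Unfortunately the screen cracked within a week.",
    "The package arrived on Tuesday.",        # no marker -> skipped
]
weakly_labeled = [ex for s in corpus if (ex := weak_label(s)) is not None]
print(weakly_labeled)
# [('the battery lasts all day.', 'positive'),
#  ('the screen cracked within a week.', 'negative')]
```

Stripping the marker before inter-training matters: if the trigger word stayed in the text, the model could learn the marker itself rather than the sentiment of the surrounding clause.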
Related papers
- Improving Classification Performance With Human Feedback: Label a few, we label the rest [2.7386128680964408]
This paper focuses on understanding how a continuous feedback loop can refine models, thereby enhancing their accuracy, recall, and precision.
We benchmark this approach on the Financial Phrasebank, Banking, Craigslist, Trec, and Amazon Reviews datasets to show that, with just a few labeled examples, we are able to surpass the accuracy of zero-shot large language models.
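As a rough illustration of such a feedback loop (not the paper's implementation), the sketch below labels a small seed set, trains a simple classifier as a stand-in for the benchmarked models, and routes the least-confident prediction back to a human each round; the data, model, and selection rule are all illustrative assumptions.

```python
# A rough sketch of a "label a few, we label the rest" feedback loop,
# with a TF-IDF + logistic regression classifier standing in for the
# models benchmarked in the paper.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["great product", "terrible service", "works perfectly",
         "broke instantly", "love it", "waste of money"]
labels = np.array([1, 0, 1, 0, 1, 0])   # ground truth the "human" reveals

seed_idx = [0, 1]                        # the few examples labeled up front
pool_idx = [i for i in range(len(texts)) if i not in seed_idx]

X = TfidfVectorizer().fit_transform(texts)

for _ in range(2):                       # a couple of feedback rounds
    clf = LogisticRegression().fit(X[seed_idx], labels[seed_idx])
    proba = clf.predict_proba(X[pool_idx])
    # Route the prediction the model is least sure of back to the human.
    uncertain = pool_idx[int(np.abs(proba[:, 1] - 0.5).argmin())]
    seed_idx.append(uncertain)           # human supplies the true label
    pool_idx.remove(uncertain)

# The refined model labels everything that is left.
clf = LogisticRegression().fit(X[seed_idx], labels[seed_idx])
print({texts[i]: int(clf.predict(X[i])[0]) for i in pool_idx})
```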
arXiv Detail & Related papers (2024-01-17T19:13:05Z)
- mFACE: Multilingual Summarization with Factual Consistency Evaluation [79.60172087719356]
Abstractive summarization has enjoyed renewed interest in recent years, thanks to pre-trained language models and the availability of large-scale datasets.
Despite promising results, current models still suffer from generating factually inconsistent summaries.
We leverage factual consistency evaluation models to improve multilingual summarization.
arXiv Detail & Related papers (2022-12-20T19:52:41Z)
- A Multi-dimensional Evaluation of Tokenizer-free Multilingual Pretrained Models [87.7086269902562]
We show that subword-based models might still be the most practical choice in many settings.
We encourage future work in tokenizer-free methods to consider these factors when designing and evaluating new models.
arXiv Detail & Related papers (2022-10-13T15:47:09Z)
- Efficient Training of Language Models to Fill in the Middle [17.118891860985123]
We show that autoregressive language models can learn to infill text after we apply a straightforward transformation to the dataset.
We use a series of ablations to prescribe strong default settings and best practices for training FIM models.
We have released our best infilling model, trained with these best practices, in our API, and we release our infilling benchmarks to aid future research.
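The transformation itself can be sketched in a few lines: cut a document into prefix, middle, and suffix, then rearrange it so the middle comes last. The sentinel strings below are placeholders; real implementations use tokenizer-specific special tokens, and the paper also studies variants of this ordering.

```python
import random

# Placeholder sentinel strings; the actual special tokens are
# tokenizer-specific assumptions, not the paper's exact choices.
PRE, SUF, MID = "<|fim_prefix|>", "<|fim_suffix|>", "<|fim_middle|>"

def fim_transform(doc: str, rng: random.Random) -> str:
    """Rearrange a document into prefix-suffix-middle (PSM) order so a
    plain left-to-right model learns to generate the middle span given
    the text on both sides of it."""
    i, j = sorted(rng.sample(range(len(doc) + 1), 2))
    prefix, middle, suffix = doc[:i], doc[i:j], doc[j:]
    return f"{PRE}{prefix}{SUF}{suffix}{MID}{middle}"

print(fim_transform("def add(a, b):\n    return a + b\n", random.Random(0)))
```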
arXiv Detail & Related papers (2022-07-28T17:40:47Z)
- Super-Prompting: Utilizing Model-Independent Contextual Data to Reduce Data Annotation Required in Visual Commonsense Tasks [3.42658286826597]
We analyze different prompt-based fine-tuning techniques to improve results on both language and multimodal causal transformer models.
Our results show that, with simple model-agnostic prompt-based fine-tuning, comparable results can be reached using only 35%-40% of the fine-tuning dataset.
arXiv Detail & Related papers (2022-04-25T18:56:55Z)
- A Generative Language Model for Few-shot Aspect-Based Sentiment Analysis [90.24921443175514]
We focus on aspect-based sentiment analysis, which involves extracting aspect terms and categories and predicting their corresponding polarities.
We propose to reformulate the extraction and prediction tasks as a sequence generation task, using a generative language model with unidirectional attention.
Our approach outperforms the previous state-of-the-art (based on BERT) in average performance by a large margin in both few-shot and full-shot settings.
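A hedged sketch of the reformulation: each (aspect term, category, polarity) triple is serialized into a flat target string that a left-to-right generative model can be trained to emit; the delimiters and template here are illustrative, not the paper's exact format.

```python
# Illustrative serialization of ABSA triples into a generation target;
# the actual templates and delimiters in the paper may differ.
def to_generation_example(sentence, triples):
    """Flatten (aspect term, category, polarity) triples into one string
    a left-to-right generative model is trained to emit."""
    target = " ; ".join(f"{term} | {category} | {polarity}"
                        for term, category, polarity in triples)
    return {"input": sentence, "target": target}

example = to_generation_example(
    "The pasta was great but the service was slow.",
    [("pasta", "food", "positive"), ("service", "service", "negative")],
)
print(example["target"])
# pasta | food | positive ; service | service | negative
```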
arXiv Detail & Related papers (2022-04-11T18:31:53Z)
- Comparing Test Sets with Item Response Theory [53.755064720563]
We evaluate 29 datasets using predictions from 18 pretrained Transformer models on individual test examples.
We find that Quoref, HellaSwag, and MC-TACO are best suited for distinguishing among state-of-the-art models.
We also observe that the span selection task format, which is used for QA datasets like QAMR or SQuAD2.0, is effective in differentiating between strong and weak models.
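For intuition, a toy one-parameter (Rasch) IRT fit is sketched below: given a models-by-items matrix of correct/incorrect outcomes, it jointly estimates a per-model ability and a per-item difficulty by gradient ascent on the Bernoulli log-likelihood. The paper's exact IRT variant and fitting procedure may differ.

```python
# Toy one-parameter (Rasch) IRT fit on a fake models-by-items outcome
# matrix; the paper's IRT variant and estimation method may differ.
import numpy as np

rng = np.random.default_rng(0)
outcomes = rng.integers(0, 2, size=(18, 50)).astype(float)  # 1 = correct

ability = np.zeros(18)       # one ability score per model
difficulty = np.zeros(50)    # one difficulty score per test item
lr = 0.1
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(ability[:, None] - difficulty[None, :])))
    grad = outcomes - p                   # dLL/dlogits for a Bernoulli
    ability += lr * grad.mean(axis=1)     # right answers push ability up
    difficulty -= lr * grad.mean(axis=0)  # widely missed items become "hard"

print("hardest item:", int(difficulty.argmax()))
```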
arXiv Detail & Related papers (2021-06-01T22:33:53Z)
- Few-shot learning through contextual data augmentation [74.20290390065475]
Machine translation models need to adapt to new data to maintain their performance over time.
We show that adaptation on the scale of one to five examples is possible.
Our model achieves better accuracy scores than a reference system trained on an average of 313 parallel examples.
arXiv Detail & Related papers (2021-03-31T09:05:43Z)
- Self-Training Pre-Trained Language Models for Zero- and Few-Shot Multi-Dialectal Arabic Sequence Labeling [7.310390479801139]
We self-train pre-trained language models in zero- and few-shot scenarios to improve performance on data-scarce dialect varieties.
Our work opens up opportunities for developing dialectal Arabic (DA) models that exploit only Modern Standard Arabic (MSA) resources.
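Below is a minimal pseudo-labeling self-training loop, with a logistic regression on synthetic features standing in for the pre-trained sequence labeler; the confidence threshold, features, and data here are illustrative only.

```python
# Minimal pseudo-labeling self-training loop; a logistic regression on
# synthetic features stands in for the pre-trained language model, and
# the 0.9 confidence threshold is an illustrative choice.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_lab = rng.normal(size=(20, 5))
y_lab = (X_lab[:, 0] > 0).astype(int)        # toy gold labels
X_unlab = rng.normal(size=(200, 5))

for _ in range(3):                           # self-training rounds
    clf = LogisticRegression().fit(X_lab, y_lab)
    proba = clf.predict_proba(X_unlab)
    confident = proba.max(axis=1) > 0.9      # keep only confident predictions
    if not confident.any():
        break
    X_lab = np.vstack([X_lab, X_unlab[confident]])
    y_lab = np.concatenate([y_lab, proba[confident].argmax(axis=1)])
    X_unlab = X_unlab[~confident]

print(f"{len(y_lab) - 20} pseudo-labeled examples added to training")
```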
arXiv Detail & Related papers (2021-01-12T21:29:30Z)
- Comparison of Interactive Knowledge Base Spelling Correction Models for Low-Resource Languages [81.90356787324481]
Spelling normalization for low-resource languages is a challenging task because the patterns are hard to predict.
This work compares a neural model and character language models trained with varying amounts of target-language data.
Our usage scenario is interactive correction with almost no initial training examples, improving the models as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.