Neural Transfer Learning with Transformers for Social Science Text Analysis
- URL: http://arxiv.org/abs/2102.02111v1
- Date: Wed, 3 Feb 2021 15:41:20 GMT
- Title: Neural Transfer Learning with Transformers for Social Science Text Analysis
- Authors: Sandra Wankmüller
- Abstract summary: Transformer-based models for transfer learning have the potential to achieve higher prediction accuracies with relatively few training data instances.
This paper explains how these methods work, why they might be advantageous, and what their limitations are.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, there have been substantial increases in the
prediction performances of natural language processing models on text-based
supervised learning tasks. Especially deep learning models that are based on
the Transformer architecture (Vaswani et al., 2017) and are used in a transfer
learning setting have contributed to this development. As Transformer-based
models for transfer learning have the potential to achieve higher prediction
accuracies with relatively few training data instances, they are likely to
benefit social scientists who seek text-based measures that are as accurate as
possible but have only limited resources for annotating training data. To
enable social scientists to leverage these potential benefits for their
research, this paper explains how these methods work, why they might be
advantageous, and what their limitations are. Additionally, three
Transformer-based models for transfer learning, BERT (Devlin et al., 2019),
RoBERTa (Liu et al., 2019), and the Longformer (Beltagy et al., 2020), are
compared to conventional machine learning algorithms on three social science
applications. Across all evaluated tasks, textual styles, and training data set
sizes, the conventional models are consistently outperformed by transfer
learning with Transformer-based models, thereby demonstrating the potential
benefits these models can bring to text-based social science research.
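To make the comparison concrete, the sketch below contrasts a conventional bag-of-words pipeline (TF-IDF features with logistic regression) with fine-tuning a pretrained Transformer via the Hugging Face transformers library. It is a minimal illustration with placeholder texts and labels, not the paper's exact experimental setup; the model choice ("bert-base-uncased"), the hyperparameters, and the full-batch training loop are simplifying assumptions.
```python
# Minimal sketch: conventional baseline vs. fine-tuning a pretrained Transformer.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

texts = ["example document one", "example document two"] * 50   # placeholder corpus
labels = [0, 1] * 50                                             # placeholder binary labels

# --- Conventional approach: TF-IDF features + logistic regression ---
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)
baseline = LogisticRegression(max_iter=1000).fit(X, labels)

# --- Transfer learning: fine-tune a pretrained Transformer (here BERT) ---
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
enc = tokenizer(texts, padding=True, truncation=True, max_length=128, return_tensors="pt")
y = torch.tensor(labels)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for epoch in range(3):                       # a few epochs usually suffice for fine-tuning
    optimizer.zero_grad()
    out = model(**enc, labels=y)             # full batch for brevity; use mini-batches in practice
    out.loss.backward()
    optimizer.step()
```
In practice one would add mini-batching, a held-out evaluation split, and task-specific preprocessing; the key point is that only a small annotated data set and a few epochs are needed to adapt the pretrained model.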
Related papers
- Transformers for Supervised Online Continual Learning [11.270594318662233]
We propose a method that leverages transformers' in-context learning capabilities for online continual learning.
Our method demonstrates significant improvements over previous state-of-the-art results on CLOC, a challenging large-scale real-world benchmark for image geo-localization.
arXiv Detail & Related papers (2024-03-03T16:12:20Z)
- Cheap Learning: Maximising Performance of Language Models for Social Data Science Using Minimal Data [1.8692054990918079]
We review three 'cheap' techniques that have been developed in recent years: weak supervision, transfer learning, and prompt engineering.
For the latter, we review the particular case of zero-shot prompting of large language models.
We show good performance for all techniques, and in particular we demonstrate how prompting of large language models can achieve high accuracy at very low cost.
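As a rough illustration of zero-shot prompting for a social science coding task, the sketch below builds a classification prompt and maps the model's free-text answer back onto a fixed label set. The call_llm function and the sentiment labels are hypothetical placeholders, not part of the reviewed paper; any hosted or locally run large language model client could be substituted.
```python
# Illustrative zero-shot prompting sketch with a hypothetical LLM client.
LABELS = ["positive", "negative", "neutral"]

def build_prompt(text: str) -> str:
    # Zero-shot: the task is described in natural language, no labeled examples given.
    return (
        "Classify the sentiment of the following social media post as "
        f"{', '.join(LABELS)}. Answer with a single word.\n\n"
        f"Post: {text}\nAnswer:"
    )

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM API or local model call."""
    raise NotImplementedError

def zero_shot_classify(text: str) -> str:
    answer = call_llm(build_prompt(text)).strip().lower()
    # Map the free-text answer back onto the label set, defaulting to 'neutral'.
    return next((label for label in LABELS if label in answer), "neutral")
```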
arXiv Detail & Related papers (2024-01-22T19:00:11Z)
- Few-shot learning for automated content analysis: Efficient coding of arguments and claims in the debate on arms deliveries to Ukraine [0.9576975587953563]
Pre-trained language models (PLMs) based on transformer neural networks offer great opportunities to improve automatic content analysis in communication science.
Three characteristics have so far impeded the widespread adoption of these methods in the disciplines that apply them: the dominance of English language models in NLP research, the necessary computing resources, and the effort required to produce training data to fine-tune PLMs.
We test our approach on a realistic use case from communication science to automatically detect claims and arguments together with their stance in the German news debate on arms deliveries to Ukraine.
arXiv Detail & Related papers (2023-12-28T11:39:08Z)
- Supervised Pretraining Can Learn In-Context Reinforcement Learning [96.62869749926415]
In this paper, we study the in-context learning capabilities of transformers in decision-making problems.
We introduce and study Decision-Pretrained Transformer (DPT), a supervised pretraining method where the transformer predicts an optimal action.
We find that the pretrained transformer can be used to solve a range of RL problems in-context, exhibiting both exploration online and conservatism offline.
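A toy sketch of this supervised pretraining idea, under strong simplifying assumptions (a multi-armed bandit instead of a full RL problem, random in-context interactions, and a small PyTorch Transformer encoder), might look as follows; it is not the authors' implementation.
```python
# Toy DPT-style pretraining sketch: predict the optimal action from in-context interactions.
import torch
import torch.nn as nn

n_actions, d_model, ctx_len, batch = 5, 64, 20, 8

embed = nn.Linear(2, d_model)                              # embeds (action index, reward) pairs
query_embed = nn.Parameter(torch.zeros(1, 1, d_model))     # learned query token
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True), num_layers=2
)
head = nn.Linear(d_model, n_actions)
opt = torch.optim.Adam(
    [*embed.parameters(), query_embed, *encoder.parameters(), *head.parameters()], lr=1e-3
)

for step in range(200):
    means = torch.rand(batch, n_actions)                   # sample a batch of bandit tasks
    actions = torch.randint(n_actions, (batch, ctx_len))   # random in-context interactions
    rewards = means.gather(1, actions) + 0.1 * torch.randn(batch, ctx_len)
    context = embed(torch.stack([actions.float(), rewards], dim=-1))
    tokens = torch.cat([context, query_embed.expand(batch, 1, d_model)], dim=1)
    logits = head(encoder(tokens)[:, -1])                  # predict action at the query token
    loss = nn.functional.cross_entropy(logits, means.argmax(dim=1))  # target: the optimal arm
    opt.zero_grad(); loss.backward(); opt.step()
```
Because the cross-entropy target is each task's optimal action, the network must infer the task from the in-context interactions alone, which is what enables in-context decision-making after pretraining.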
arXiv Detail & Related papers (2023-06-26T17:58:50Z)
- Learning to Grow Pretrained Models for Efficient Transformer Training [72.20676008625641]
We learn to grow pretrained transformers by learning a linear map from the parameters of the smaller model to an initialization for the larger model.
Experiments across both language and vision transformers demonstrate that our learned Linear Growth Operator (LiGO) can save up to 50% of the computational cost of training from scratch.
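The sketch below conveys the underlying idea under an assumed factorized form (separate learned expansions of the input and output dimensions); the actual LiGO parameterization and training procedure differ in detail.
```python
# Rough conceptual sketch of "learning to grow": a learnable linear map that
# expands a small pretrained weight matrix into an initialization for a wider layer.
import torch
import torch.nn as nn

d_small, d_large = 256, 512
w_small = torch.randn(d_small, d_small)                   # stands in for a pretrained small-model weight

A = nn.Parameter(torch.randn(d_large, d_small) * 0.02)    # learned output-dimension expansion
B = nn.Parameter(torch.randn(d_large, d_small) * 0.02)    # learned input-dimension expansion

def grow(w_small: torch.Tensor) -> torch.Tensor:
    """Linearly map the small weight to a large-model initialization."""
    return A @ w_small @ B.t()                            # shape: (d_large, d_large)

# A and B would be trained briefly (e.g., on the pretraining objective) so that the
# grown weights give the large model a strong starting point instead of random init.
w_large_init = grow(w_small)
```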
arXiv Detail & Related papers (2023-03-02T05:21:18Z) - Constructing Effective Machine Learning Models for the Sciences: A
Multidisciplinary Perspective [77.53142165205281]
We show how flexible non-linear solutions will not always improve upon manually adding transforms and interactions between variables to linear regression models.
We discuss how to recognize this before constructing a data-driven model and how such analysis can help us move to intrinsically interpretable regression models.
arXiv Detail & Related papers (2022-11-21T17:48:44Z) - BERT WEAVER: Using WEight AVERaging to enable lifelong learning for
transformer-based models in biomedical semantic search engines [49.75878234192369]
We present WEAVER, a simple, yet efficient post-processing method that infuses old knowledge into the new model.
We show that applying WEAVER in a sequential manner results in similar word embedding distributions as doing a combined training on all data at once.
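A minimal sketch of the weight-averaging step, assuming the old and new checkpoints share an identical architecture and that a simple uniform interpolation is used (the paper's exact averaging scheme may differ):
```python
# Post-processing by weight averaging: fold the old model's parameters into the new one.
import torch
import torch.nn as nn

def weight_average(old_state, new_state, alpha=0.5):
    """Interpolate two state dicts element-wise; alpha weights the new model."""
    return {k: alpha * new_state[k] + (1.0 - alpha) * old_state[k] for k in new_state}

# Toy stand-ins for the "old" and "new" fine-tuned models with identical architecture.
old_model, new_model = nn.Linear(10, 2), nn.Linear(10, 2)
merged = weight_average(old_model.state_dict(), new_model.state_dict())
new_model.load_state_dict(merged)   # the new model now carries averaged old + new knowledge
```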
arXiv Detail & Related papers (2022-02-21T10:34:41Z) - Transformers for prompt-level EMA non-response prediction [62.41658786277712]
Ecological Momentary Assessments (EMAs) are an important psychological data source for measuring cognitive states, affect, behavior, and environmental factors.
Non-response, in which participants fail to respond to EMA prompts, is an endemic problem.
The ability to accurately predict non-response could be utilized to improve EMA delivery and develop compliance interventions.
arXiv Detail & Related papers (2021-11-01T18:38:47Z) - What is being transferred in transfer learning? [51.6991244438545]
We show that when training from pre-trained weights, the model stays in the same basin in the loss landscape, and different instances of such a model are similar in feature space and close in parameter space.
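One common way to probe such a claim is to interpolate linearly between two sets of fine-tuned weights and check that the loss stays low along the path; the sketch below, with placeholder model and loss functions, illustrates that kind of check rather than the paper's exact analysis.
```python
# Linear interpolation check between two models fine-tuned from the same pretrained weights.
import torch

def interpolate_loss(model_a, model_b, make_model, loss_fn,
                     alphas=(0.0, 0.25, 0.5, 0.75, 1.0)):
    losses = []
    sd_a, sd_b = model_a.state_dict(), model_b.state_dict()
    for alpha in alphas:
        mixed = {k: (1 - alpha) * sd_a[k] + alpha * sd_b[k] for k in sd_a}
        m = make_model()              # fresh instance of the same architecture
        m.load_state_dict(mixed)
        with torch.no_grad():
            losses.append(loss_fn(m).item())
    return losses                     # a flat curve (no loss barrier) indicates a shared basin
```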
arXiv Detail & Related papers (2020-08-26T17:23:40Z)
- On the comparability of Pre-trained Language Models [0.0]
Recent developments in unsupervised representation learning have successfully established the concept of transfer learning in NLP.
More elaborated architectures are making better use of contextual information.
Larger corpora are used as resources for pre-training large language models in a self-supervised fashion.
Advances in parallel computing as well as in cloud computing have made it possible to train models of ever-growing capacity in the same or even shorter time than previously established models.
arXiv Detail & Related papers (2020-01-03T10:53:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.