Transfer Learning for Improving Results on Russian Sentiment Datasets
- URL: http://arxiv.org/abs/2107.02499v1
- Date: Tue, 6 Jul 2021 09:31:36 GMT
- Title: Transfer Learning for Improving Results on Russian Sentiment Datasets
- Authors: Anton Golubev and Natalia Loukachevitch
- Abstract summary: The best results were achieved using a three-step approach of sequential training on general, thematic, and original train samples.
The BERT-NLI model, treating the sentiment classification problem as a natural language inference task, reached the human level of sentiment analysis on one of the datasets.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this study, we test a transfer learning approach on Russian sentiment
benchmark datasets using an additional training sample created with a distant
supervision technique. We compare several variants of combining the additional
data with the benchmark train samples. The best results were achieved using a
three-step approach of sequential training on general, thematic, and original
train samples. For most datasets, the results improved on the current
state-of-the-art methods by more than 3%. The BERT-NLI model, which treats the
sentiment classification problem as a natural language inference task, reached
the human level of sentiment analysis on one of the datasets.
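Concretely, the three-step scheme amounts to fine-tuning one checkpoint three times in sequence: first on the general distant-supervision sample, then on the thematic sample, then on the original benchmark train sample. Below is a minimal sketch using Hugging Face transformers; the model name (the conversational Russian BERT mentioned in the authors' earlier work), the number of labels, and all hyperparameters are illustrative assumptions, not values from the paper.

```python
# Sketch of three-step sequential fine-tuning (assumptions noted above).
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_NAME = "DeepPavlov/rubert-base-cased-conversational"  # assumed checkpoint

def three_step_finetune(general_ds, thematic_ds, original_ds):
    """Fine-tune one model sequentially on the general, thematic, and
    original train samples (datasets are assumed pre-tokenized)."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME, num_labels=3)  # assumed positive/negative/neutral scheme
    for stage, dataset in [("general", general_ds),
                           ("thematic", thematic_ds),
                           ("original", original_ds)]:
        args = TrainingArguments(output_dir=f"ckpt-{stage}",
                                 num_train_epochs=2,
                                 per_device_train_batch_size=32)
        Trainer(model=model, args=args, train_dataset=dataset).train()
    return tokenizer, model
```

The key design point is that the same weights carry over between stages, so later, smaller in-domain samples refine rather than replace what was learned from the large distant-supervision sample.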
Related papers
- How Hard is this Test Set? NLI Characterization by Exploiting Training Dynamics [49.9329723199239]
We propose a method for the automated creation of a challenging test set without relying on the manual construction of artificial and unrealistic examples.
We categorize the test set of popular NLI datasets into three difficulty levels by leveraging methods that exploit training dynamics.
When our characterization method is applied to the training set, models trained with only a fraction of the data achieve comparable performance to those trained on the full dataset.
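The abstract does not spell out the exact statistics, but training-dynamics characterizations in this vein (e.g. dataset cartography) typically track the gold-label probability of each example across epochs. A minimal sketch under that assumption, with thresholds chosen purely for illustration:

```python
import numpy as np

def characterize_examples(gold_probs: np.ndarray) -> np.ndarray:
    """Bin examples into three difficulty levels from training dynamics.

    gold_probs: shape (n_epochs, n_examples), the model's probability for
    each example's gold label at each training epoch.
    """
    confidence = gold_probs.mean(axis=0)   # mean gold-label probability
    variability = gold_probs.std(axis=0)   # spread across epochs

    levels = np.empty(confidence.shape[0], dtype=object)
    # Thresholds below are illustrative assumptions, not from the paper.
    levels[confidence >= 0.75] = "easy"        # learned early and stably
    levels[(confidence < 0.75) & (variability >= 0.25)] = "ambiguous"
    levels[(confidence < 0.75) & (variability < 0.25)] = "hard"
    return levels
```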
arXiv Detail & Related papers (2024-10-04T13:39:21Z)
- Downstream-Pretext Domain Knowledge Traceback for Active Learning [138.02530777915362]
We propose a downstream-pretext domain knowledge traceback (DOKT) method that traces the data interactions of downstream knowledge and pre-training guidance.
DOKT consists of a traceback diversity indicator and a domain-based uncertainty estimator.
Experiments conducted on ten datasets show that our model outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-20T01:34:13Z) - Dataset Quantization with Active Learning based Adaptive Sampling [11.157462442942775]
We show that maintaining performance is feasible even with uneven sample distributions.
We propose a novel active learning based adaptive sampling strategy to optimize the sample selection.
Our approach outperforms the state-of-the-art dataset compression methods.
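The summary does not state the selection criterion, so the sketch below shows a generic active-learning choice, predictive-entropy uncertainty; the loader protocol and function names are assumptions for illustration only.

```python
import torch

def select_batch(model, pool_loader, k: int, device: str = "cpu"):
    """Pick the k most uncertain pool examples by predictive entropy.

    Assumes pool_loader yields (index, features) pairs and that
    model(features) returns class logits.
    """
    model.eval()
    scores, indices = [], []
    with torch.no_grad():
        for idx, x in pool_loader:
            probs = torch.softmax(model(x.to(device)), dim=-1)
            entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
            scores.append(entropy.cpu())
            indices.append(idx)
    top = torch.cat(scores).topk(k).indices
    return torch.cat(indices)[top]  # pool indices to keep/label next
```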
arXiv Detail & Related papers (2024-07-09T23:09:18Z) - A Cross-Dataset Study for Text-based 3D Human Motion Retrieval [13.673377919543228]
We employ a unified SMPL body format for all datasets, which allows us to perform training on one dataset, testing on the other, as well as training on a combination of datasets.
Our results suggest that there exist dataset biases in standard text-motion benchmarks such as HumanML3D, KIT Motion-Language, and BABEL.
arXiv Detail & Related papers (2024-05-27T07:58:20Z) - Learning to Paraphrase Sentences to Different Complexity Levels [3.0273878903284275]
Sentence simplification is an active research topic in NLP, but its adjacent tasks of sentence complexification and same-level paraphrasing are not.
To train models on all three tasks, we present two new unsupervised datasets.
arXiv Detail & Related papers (2023-08-04T09:43:37Z)
- MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training [58.07391711548269]
We propose the Masked Voxel Jigsaw and Reconstruction (MV-JAR) method for LiDAR-based self-supervised pre-training.
arXiv Detail & Related papers (2023-03-23T17:59:02Z)
- M3ST: Mix at Three Levels for Speech Translation [66.71994367650461]
We propose the Mix at Three Levels for Speech Translation (M3ST) method to increase the diversity of the augmented training corpus.
In the first stage of fine-tuning, we mix the training corpus at three levels, including word level, sentence level and frame level, and fine-tune the entire model with mixed data.
Experiments on MuST-C speech translation benchmark and analysis show that M3ST outperforms current strong baselines and achieves state-of-the-art results on eight directions with an average BLEU of 29.9.
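The abstract names word-, sentence-, and frame-level mixing but not the exact operations; the snippet below sketches only a mixup-style frame-level interpolation of two speech feature sequences, as one plausible reading rather than the authors' implementation.

```python
import torch

def frame_level_mix(feats_a: torch.Tensor, feats_b: torch.Tensor,
                    alpha: float = 0.2):
    """Mixup-style interpolation of two speech feature sequences.

    feats_*: tensors of shape (frames, feature_dim); alpha parameterizes
    the Beta prior over the mixing coefficient (value assumed here).
    Returns the mixed features and the sampled coefficient lambda.
    """
    lam = torch.distributions.Beta(alpha, alpha).sample()
    n = min(feats_a.size(0), feats_b.size(0))  # align sequence lengths
    return lam * feats_a[:n] + (1 - lam) * feats_b[:n], lam
```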
arXiv Detail & Related papers (2022-12-07T14:22:00Z)
- Dataset Distillation by Matching Training Trajectories [75.9031209877651]
We propose a new formulation that optimizes our distilled data to guide networks to a state similar to those trained on real data.
Given a network, we train it for several iterations on our distilled data and optimize the distilled data with respect to the distance between the synthetically trained parameters and the parameters trained on real data.
Our method handily outperforms existing methods and also allows us to distill higher-resolution visual data.
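In sketch form, the matching objective is the distance between the parameters a student reaches by training on distilled data and the parameters an expert reached on real data, normalized by how far the expert moved over the same span. The helper below is an illustrative PyTorch rendering of that loss, not the authors' code:

```python
import torch

def trajectory_matching_loss(student_params, expert_start, expert_end):
    """Normalized parameter-space distance for trajectory matching.

    student_params: parameters after training on distilled data, starting
    from expert_start; expert_start/expert_end: expert checkpoints trained
    on real data. All three are matching lists of tensors.
    """
    num = sum((s - e).pow(2).sum() for s, e in zip(student_params, expert_end))
    den = sum((a - b).pow(2).sum() for a, b in zip(expert_start, expert_end))
    return num / den.clamp_min(1e-12)

# Outer loop (not shown): backpropagate this loss through the student's
# training steps into the distilled pixels and update them by gradient descent.
```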
arXiv Detail & Related papers (2022-03-22T17:58:59Z)
- A Simple and Efficient Ensemble Classifier Combining Multiple Neural Network Models on Social Media Datasets in Vietnamese [2.7528170226206443]
This study aims to classify Vietnamese texts on social media from three different Vietnamese benchmark datasets.
Advanced deep learning models are used and optimized in this study, including CNN, LSTM, and their variants.
Our ensemble model achieves the best performance on all three datasets.
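The abstract does not say how the member models are combined; a simple soft-voting average of their predicted class probabilities is one common choice, sketched below under that assumption.

```python
import numpy as np

def soft_vote(member_probs: list[np.ndarray]) -> np.ndarray:
    """Average class probabilities from several models (e.g. a CNN and an
    LSTM) and return the argmax class per example.

    Each array in member_probs has shape (n_examples, n_classes).
    """
    return np.mean(member_probs, axis=0).argmax(axis=-1)
```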
arXiv Detail & Related papers (2020-09-28T04:28:48Z)
- Improving Results on Russian Sentiment Datasets [0.0]
We show that for all sentiment tasks in this study the conversational variant of Russian BERT performs better.
The best results were achieved by the BERT-NLI model, which treats the sentiment classification task as a natural language inference task.
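Recasting sentiment classification as NLI means pairing the input text (premise) with one hypothesis per sentiment label and choosing the label whose hypothesis is most strongly entailed. The sketch below uses an off-the-shelf multilingual NLI model and illustrative hypothesis templates; it is not the authors' BERT-NLI setup.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "joeddav/xlm-roberta-large-xnli"  # any multilingual NLI model works
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)

def nli_sentiment(text: str,
                  labels=("positive", "negative", "neutral")) -> str:
    """Score one hypothesis per label and return the best-entailed label."""
    scores = []
    for label in labels:
        hypothesis = f"The sentiment of this text is {label}."  # assumed template
        inputs = tokenizer(text, hypothesis, return_tensors="pt", truncation=True)
        with torch.no_grad():
            logits = model(**inputs).logits
        # Assumes the model's config maps an "entailment" class name.
        ent = model.config.label2id["entailment"]
        scores.append(logits.softmax(dim=-1)[0, ent])
    return labels[int(torch.stack(scores).argmax())]
```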
arXiv Detail & Related papers (2020-07-28T15:29:19Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We further propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.