Rumor Detection on Twitter Using Multiloss Hierarchical BiLSTM with an
Attenuation Factor
- URL: http://arxiv.org/abs/2011.00259v2
- Date: Mon, 14 Dec 2020 11:20:30 GMT
- Title: Rumor Detection on Twitter Using Multiloss Hierarchical BiLSTM with an
Attenuation Factor
- Authors: Yudianto Sujana, Jiawen Li, Hung-Yu Kao
- Abstract summary: Social media platforms such as Twitter have become a breeding ground for unverified information or rumors.
Our model achieves better performance than that of state-of-the-art machine learning and vanilla deep learning models.
- Score: 14.717465036484292
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Social media platforms such as Twitter have become a breeding ground for
unverified information or rumors. These rumors can threaten people's health,
endanger the economy, and affect the stability of a country. Many researchers
have developed models to classify rumors using traditional machine learning or
vanilla deep learning models. However, previous studies on rumor detection have
achieved low precision and are time-consuming. Inspired by the hierarchical
model and multitask learning, a multiloss hierarchical BiLSTM model with an
attenuation factor is proposed in this paper. The model is divided into two
BiLSTM modules: post level and event level. By means of this hierarchical
structure, the model can extract deep information from limited quantities of
text. Each module has a loss function that helps to learn bilateral features
and reduce the training time. An attenuation factor is added at the post level
to increase the accuracy. The results on two rumor datasets demonstrate that
our model achieves better performance than that of state-of-the-art machine
learning and vanilla deep learning models.
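
The architecture described above maps onto two stacked encoders: a post-level BiLSTM that encodes each tweet, an event-level BiLSTM that encodes the resulting sequence of post vectors, and one classification loss per level, with the post-level term scaled by the attenuation factor. The following is a minimal PyTorch sketch of that structure, assuming word-index inputs, single-layer BiLSTMs, final-hidden-state pooling, posts inheriting their event's label, and an attenuation factor of 0.5; these names, sizes, and choices are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class MultilossHierarchicalBiLSTM(nn.Module):
    """Sketch of a two-level (post/event) BiLSTM rumor classifier.

    Sizes, pooling, and head layout are illustrative assumptions,
    not the authors' exact configuration.
    """

    def __init__(self, vocab_size, embed_dim=300, hidden_dim=128, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # Post-level BiLSTM: encodes the tokens of each individual tweet.
        self.post_lstm = nn.LSTM(embed_dim, hidden_dim,
                                 batch_first=True, bidirectional=True)
        # Event-level BiLSTM: encodes the sequence of post representations.
        self.event_lstm = nn.LSTM(2 * hidden_dim, hidden_dim,
                                  batch_first=True, bidirectional=True)
        # One classifier head per level -> one loss per level (multiloss).
        self.post_head = nn.Linear(2 * hidden_dim, num_classes)
        self.event_head = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):
        # token_ids: (batch, num_posts, num_tokens) of word indices.
        b, p, t = token_ids.shape
        emb = self.embedding(token_ids.view(b * p, t))            # (b*p, t, e)
        _, (h_post, _) = self.post_lstm(emb)                      # (2, b*p, h)
        post_repr = torch.cat([h_post[0], h_post[1]], dim=-1)     # (b*p, 2h)
        post_logits = self.post_head(post_repr)                   # per-post logits
        _, (h_event, _) = self.event_lstm(post_repr.view(b, p, -1))
        event_repr = torch.cat([h_event[0], h_event[1]], dim=-1)  # (b, 2h)
        event_logits = self.event_head(event_repr)                # per-event logits
        return post_logits.view(b, p, -1), event_logits


def multiloss(post_logits, event_logits, event_labels, attenuation=0.5):
    """Event-level loss plus an attenuated post-level loss.

    Each post inherits its event's label; the 0.5 attenuation value is
    an assumption, not the paper's reported setting.
    """
    ce = nn.CrossEntropyLoss()
    b, p, c = post_logits.shape
    post_labels = event_labels.unsqueeze(1).expand(b, p).reshape(-1)
    loss_post = ce(post_logits.reshape(-1, c), post_labels)
    loss_event = ce(event_logits, event_labels)
    return loss_event + attenuation * loss_post


# Tiny smoke test with random data (3 events x 5 posts x 20 tokens).
model = MultilossHierarchicalBiLSTM(vocab_size=5000)
tokens = torch.randint(0, 5000, (3, 5, 20))
labels = torch.randint(0, 2, (3,))
post_logits, event_logits = model(tokens)
loss = multiloss(post_logits, event_logits, labels)
loss.backward()
```

Keeping the attenuation factor below one mirrors the abstract's motivation: individual posts are short and noisy compared with the aggregated event, so the post-level loss should guide training without dominating the event-level objective.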
Related papers
- Unifying Multimodal Large Language Model Capabilities and Modalities via Model Merging [103.98582374569789]
Model merging aims to combine multiple expert models into a single model, thereby reducing storage and serving costs.
Previous studies have primarily focused on merging visual classification models or Large Language Models (LLMs) for code and math tasks.
We introduce the model merging benchmark for MLLMs, which includes multiple tasks such as VQA, Geometry, Chart, OCR, and Grounding, providing both LoRA and full fine-tuning models.
arXiv Detail & Related papers (2025-05-26T12:23:14Z)
- Machine-generated text detection prevents language model collapse [17.34282527020344]
We investigate the impact of decoding strategy on model collapse.
We train a machine-generated text detector and propose an importance sampling approach to alleviate model collapse.
arXiv Detail & Related papers (2025-02-21T18:22:36Z)
- Inheritune: Training Smaller Yet More Attentive Language Models [61.363259848264725]
Inheritune is a simple yet effective training recipe for developing smaller, high-performing language models.
We demonstrate that Inheritune enables the training of various sizes of GPT-2 models on datasets like OpenWebText-9B and FineWeb_edu.
arXiv Detail & Related papers (2024-04-12T17:53:34Z)
- Improving Discriminative Multi-Modal Learning with Large-Scale Pre-Trained Models [51.5543321122664]
This paper investigates how to better leverage large-scale pre-trained uni-modal models to enhance discriminative multi-modal learning.
We introduce Multi-Modal Low-Rank Adaptation learning (MMLoRA).
arXiv Detail & Related papers (2023-10-08T15:01:54Z)
- The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute [66.84421705029624]
We introduce an experimental protocol that enables model comparisons based on equivalent compute, measured in accelerator hours.
We pre-process an existing large, diverse, and high-quality dataset of books that surpasses existing academic benchmarks in quality, diversity, and document length.
This work also provides two baseline models: a feed-forward model derived from the GPT-2 architecture and a recurrent model in the form of a novel LSTM with ten-fold throughput.
arXiv Detail & Related papers (2023-09-20T10:31:17Z)
- On the Steganographic Capacity of Selected Learning Models [1.0640226829362012]
We consider the question of the steganographic capacity of learning models.
For a wide range of models, we determine the number of low-order bits that can be overwritten.
Of the models tested, the steganographic capacity ranges from 7.04 KB for our LR experiments, to 44.74 MB for InceptionV3.
arXiv Detail & Related papers (2023-08-29T10:41:34Z)
- Steganographic Capacity of Deep Learning Models [12.974139332068491]
We consider the steganographic capacity of several learning models.
We train a Multilayer Perceptron (MLP), Convolutional Neural Network (CNN), and Transformer model on a challenging malware classification problem.
We find that the steganographic capacity of the learning models tested is surprisingly high, and that in each case, there is a clear threshold after which model performance rapidly degrades.
arXiv Detail & Related papers (2023-06-25T13:43:35Z)
- Two Independent Teachers are Better Role Model [7.001845833295753]
We propose a new deep learning model called 3D-DenseUNet.
It uses adaptable global aggregation blocks in down-sampling to address the issue of spatial information loss.
We also propose a new method called Two Independent Teachers, which summarizes the model weights instead of label predictions.
arXiv Detail & Related papers (2023-06-09T08:22:41Z)
- Comparative study of Transformer and LSTM Network with attention mechanism on Image Captioning [0.0]
This study compares the Transformer and the LSTM-with-attention-block model on the MS-COCO dataset.
Both models are discussed with state-of-the-art accuracy.
arXiv Detail & Related papers (2023-03-05T11:45:53Z)
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
- Analyzing Robustness of End-to-End Neural Models for Automatic Speech Recognition [11.489161072526677]
We investigate robustness properties of pre-trained neural models for automatic speech recognition.
In this work, we perform a robustness analysis of the pre-trained neural models wav2vec2, HuBERT and DistilHuBERT on the LibriSpeech and TIMIT datasets.
arXiv Detail & Related papers (2022-08-17T20:00:54Z)
- Self-Supervised Learning for speech recognition with Intermediate layer supervision [52.93758711230248]
We propose Intermediate Layer Supervision for Self-Supervised Learning (ILS-SSL).
ILS-SSL forces the model to concentrate on content information as much as possible by adding an additional SSL loss on the intermediate layers.
Experiments on LibriSpeech test-other set show that our method outperforms HuBERT significantly.
arXiv Detail & Related papers (2021-12-16T10:45:05Z)
- Understanding Self-supervised Learning with Dual Deep Networks [74.92916579635336]
We propose a novel framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks.
We prove that in each SGD update of SimCLR with various loss functions, the weights at each layer are updated by a covariance operator.
To further study what role the covariance operator plays and which features are learned in such a process, we model data generation and augmentation processes through a hierarchical latent tree model (HLTM).
arXiv Detail & Related papers (2020-10-01T17:51:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.