QiBERT -- Classifying Online Conversations Messages with BERT as a Feature
- URL: http://arxiv.org/abs/2409.05530v1
- Date: Mon, 9 Sep 2024 11:38:06 GMT
- Title: QiBERT -- Classifying Online Conversations Messages with BERT as a Feature
- Authors: Bruno D. Ferreira-Saraiva, Zuil Pirola, João P. Matos-Carvalho, Manuel Marques-Pita,
- Abstract summary: This paper aims to use data obtained from online social conversations in Portuguese schools to observe behavioural trends.
This project used the state of the art (SoA) Machine Learning (ML) algorithms and methods, through BERT based models to classify if utterances are in or out of the debate subject.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent developments in online communication and their usage in everyday life have caused an explosion in the amount of a new genre of text data, short text. Thus, the need to classify this type of text based on its content has a significant implication in many areas. Online debates are no exception, once these provide access to information about opinions, positions and preferences of its users. This paper aims to use data obtained from online social conversations in Portuguese schools (short text) to observe behavioural trends and to see if students remain engaged in the discussion when stimulated. This project used the state of the art (SoA) Machine Learning (ML) algorithms and methods, through BERT based models to classify if utterances are in or out of the debate subject. Using SBERT embeddings as a feature, with supervised learning, the proposed model achieved results above 0.95 average accuracy for classifying online messages. Such improvements can help social scientists better understand human communication, behaviour, discussion and persuasion.
Related papers
- Analysis of the User Perception of Chatbots in Education Using A Partial
Least Squares Structural Equation Modeling Approach [0.0]
Key behavior-related aspects, such as Optimism, Innovativeness, Discomfort, Insecurity, Transparency, Ethics, Interaction, Engagement, and Accuracy, were studied.
Results showed that Optimism and Innovativeness are positively associated with Perceived Ease of Use (PEOU) and Perceived Usefulness (PU)
arXiv Detail & Related papers (2023-11-07T00:44:56Z) - Learning From Free-Text Human Feedback -- Collect New Datasets Or Extend
Existing Ones? [57.16050211534735]
We investigate the types and frequency of free-text human feedback in commonly used dialog datasets.
Our findings provide new insights into the composition of the datasets examined, including error types, user response types, and the relations between them.
arXiv Detail & Related papers (2023-10-24T12:01:11Z) - Prompt-and-Align: Prompt-Based Social Alignment for Few-Shot Fake News
Detection [50.07850264495737]
"Prompt-and-Align" (P&A) is a novel prompt-based paradigm for few-shot fake news detection.
We show that P&A sets new states-of-the-art for few-shot fake news detection performance by significant margins.
arXiv Detail & Related papers (2023-09-28T13:19:43Z) - Fine-Tuning Llama 2 Large Language Models for Detecting Online Sexual
Predatory Chats and Abusive Texts [2.406214748890827]
This paper proposes an approach to detection of online sexual predatory chats and abusive language using the open-source pretrained Llama 2 7B- parameter model.
We fine-tune the LLM using datasets with different sizes, imbalance degrees, and languages (i.e., English, Roman Urdu and Urdu)
Experimental results show a strong performance of the proposed approach, which performs proficiently and consistently across three distinct datasets.
arXiv Detail & Related papers (2023-08-28T16:18:50Z) - AutoConv: Automatically Generating Information-seeking Conversations
with Large Language Models [74.10293412011455]
We propose AutoConv for synthetic conversation generation.
Specifically, we formulate the conversation generation problem as a language modeling task.
We finetune an LLM with a few human conversations to capture the characteristics of the information-seeking process.
arXiv Detail & Related papers (2023-08-12T08:52:40Z) - NewsDialogues: Towards Proactive News Grounded Conversation [72.10055780635625]
We propose a novel task, Proactive News Grounded Conversation, in which a dialogue system can proactively lead the conversation based on some key topics of the news.
To further develop this novel task, we collect a human-to-human Chinese dialogue dataset tsNewsDialogues, which includes 1K conversations with a total of 14.6K utterances.
arXiv Detail & Related papers (2023-08-12T08:33:42Z) - Fine-Tuning Approach for Arabic Offensive Language Detection System:
BERT-Based Model [0.0]
This study investigates the effects of fine-tuning across several Arabic offensive language datasets.
We develop multiple classifiers that use four datasets individually and in combination to gain knowledge about online Arabic offensive content.
arXiv Detail & Related papers (2022-02-07T17:26:35Z) - Training Conversational Agents with Generative Conversational Networks [74.9941330874663]
We use Generative Conversational Networks to automatically generate data and train social conversational agents.
We evaluate our approach on TopicalChat with automatic metrics and human evaluators, showing that with 10% of seed data it performs close to the baseline that uses 100% of the data.
arXiv Detail & Related papers (2021-10-15T21:46:39Z) - Few-Shot Bot: Prompt-Based Learning for Dialogue Systems [58.27337673451943]
Learning to converse using only a few examples is a great challenge in conversational AI.
The current best conversational models are either good chit-chatters (e.g., BlenderBot) or goal-oriented systems (e.g., MinTL)
We propose prompt-based few-shot learning which does not require gradient-based fine-tuning but instead uses a few examples as the only source of learning.
arXiv Detail & Related papers (2021-10-15T14:36:45Z) - Transfer Learning Approach for Arabic Offensive Language Detection
System -- BERT-Based Model [0.0]
Cyberhate, online harassment and other misuses of technology are on the rise.
Applying advanced techniques from the Natural Language Processing (NLP) field to support the development of an online hate-free community is a critical task for social justice.
This study aims at investigating the effects of fine-tuning and training Bidirectional Representations from Transformers (BERT) model on multiple Arabic offensive language datasets individually.
arXiv Detail & Related papers (2021-02-09T04:58:18Z) - Using Machine Learning and Natural Language Processing Techniques to
Analyze and Support Moderation of Student Book Discussions [0.0]
The IMapBook project aims at improving the literacy and reading comprehension skills of elementary school-aged children by presenting them with interactive e-books and letting them take part in moderated book discussions.
This study aims to develop and illustrate a machine learning-based approach to message classification that could be used to automatically notify the discussion moderator of a possible need for an intervention and also to collect other useful information about the ongoing discussion.
arXiv Detail & Related papers (2020-11-23T20:33:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.