SocialBERT -- Transformers for Online Social Network Language Modelling
- URL: http://arxiv.org/abs/2111.07148v1
- Date: Sat, 13 Nov 2021 16:37:15 GMT
- Title: SocialBERT -- Transformers for Online Social Network Language Modelling
- Authors: Ilia Karpov and Nick Kartashev
- Abstract summary: We present SocialBERT - the first model that uses knowledge about the author's position in the network during text analysis.
The evaluation shows that embedding this information maintains good generalization.
The proposed model has been trained on the majority of groups for the chosen social network and is still able to work with previously unknown groups.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The ubiquity of contemporary language understanding tasks makes it important
to develop generalized yet highly efficient models that utilize all knowledge provided by
the data source. In this work, we present SocialBERT - the first model that uses knowledge
about the author's position in the network during text analysis. We investigate possible
models for learning social network information and successfully inject it into the baseline
BERT model. The evaluation shows that embedding this information maintains good
generalization while improving the quality of the probabilistic model for a given author by
up to 7.5%. The proposed model has been trained on the majority of groups for the chosen
social network and is still able to work with previously unknown groups. The obtained model,
as well as the code of our experiments, is available for download and use in applied tasks.
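The abstract does not say exactly how the author's network position is injected into BERT; the sketch below shows one plausible reading, in which a learned per-group embedding is added to the token embeddings of a baseline BERT encoder. The class name `SocialBertSketch`, the `num_groups` parameter, and the `group_ids` input are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch (not the authors' released code): inject a learned per-group
# embedding into a baseline BERT encoder by adding it to every token embedding.
import torch.nn as nn
from transformers import BertModel


class SocialBertSketch(nn.Module):
    def __init__(self, num_groups: int, bert_name: str = "bert-base-uncased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)
        hidden = self.bert.config.hidden_size
        # One trainable vector per group (the author's position in the network);
        # index 0 is reserved here for previously unseen groups.
        self.group_embeddings = nn.Embedding(num_groups, hidden)

    def forward(self, input_ids, attention_mask, group_ids):
        # Token embeddings from the baseline BERT embedding matrix.
        token_emb = self.bert.embeddings.word_embeddings(input_ids)
        # Broadcast the group vector over the sequence length and add it, so the
        # encoder sees "who is speaking" alongside the usual token information.
        group_emb = self.group_embeddings(group_ids).unsqueeze(1)
        outputs = self.bert(
            inputs_embeds=token_emb + group_emb,
            attention_mask=attention_mask,
        )
        return outputs.last_hidden_state
```

Under this reading, texts from groups that were not seen during training could be routed to the reserved index, which would be consistent with the abstract's claim that the model still works for previously unknown groups.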
Related papers
- Information-Theoretic Distillation for Reference-less Summarization [67.51150817011617]
We present a novel framework to distill a powerful summarizer based on the information-theoretic objective for summarization.
We start off from Pythia-2.8B as the teacher model, which is not yet capable of summarization.
We arrive at a compact but powerful summarizer with only 568M parameters that performs competitively against ChatGPT.
arXiv Detail & Related papers (2024-03-20T17:42:08Z)
- Social-LLM: Modeling User Behavior at Scale using Language Models and Social Network Data [13.660150473547766]
We introduce a novel approach tailored for modeling social network data in user detection tasks.
Our method integrates localized social network interactions with the capabilities of large language models.
We conduct a thorough evaluation of our method across seven real-world social network datasets.
arXiv Detail & Related papers (2023-12-31T05:13:13Z)
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z) - Commonsense Knowledge Transfer for Pre-trained Language Models [83.01121484432801]
We introduce commonsense knowledge transfer, a framework to transfer the commonsense knowledge stored in a neural commonsense knowledge model to a general-purpose pre-trained language model.
It first exploits general texts to form queries for extracting commonsense knowledge from the neural commonsense knowledge model.
It then refines the language model with two self-supervised objectives: commonsense mask infilling and commonsense relation prediction.
arXiv Detail & Related papers (2023-06-04T15:44:51Z)
- Foundation Models for Natural Language Processing -- Pre-trained Language Models Integrating Media [0.0]
Foundation Models are pre-trained language models for Natural Language Processing.
They can be applied to a wide range of different media and problem domains, ranging from image and video processing to robot control learning.
This book provides a comprehensive overview of the state of the art in research and applications of Foundation Models.
arXiv Detail & Related papers (2023-02-16T20:42:04Z)
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
- Interpreting Language Models Through Knowledge Graph Extraction [42.97929497661778]
We compare BERT-based language models through snapshots of acquired knowledge at sequential stages of the training process.
We present a methodology to unveil a knowledge acquisition timeline by generating knowledge graph extracts from cloze "fill-in-the-blank" statements.
We extend this analysis to a comparison of pretrained variations of BERT models (DistilBERT, BERT-base, RoBERTa).
arXiv Detail & Related papers (2021-11-16T15:18:01Z)
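For the cloze-style probing described in the entry above, a minimal sketch is shown below using the Hugging Face fill-mask pipeline; the example prompt and the choice of bert-base-uncased are assumptions for illustration, not the setup used in that paper.

```python
# Minimal sketch (assumed setup, not the paper's code): probe what a BERT-style
# model "knows" by filling cloze statements and reading off the top predictions.
from transformers import pipeline

# Any masked language model checkpoint works here; bert-base-uncased is an example.
unmasker = pipeline("fill-mask", model="bert-base-uncased")

# A cloze "fill-in-the-blank" statement; the predictions can be collected into
# (subject, relation, object) candidates for a knowledge-graph extract.
for prediction in unmasker("The capital of France is [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 3))
```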
- Learning Purified Feature Representations from Task-irrelevant Labels [18.967445416679624]
We propose a novel learning framework called PurifiedLearning to exploit task-irrelevant features extracted from task-irrelevant labels.
Our work is built on solid theoretical analysis and extensive experiments, which demonstrate the effectiveness of PurifiedLearning.
arXiv Detail & Related papers (2021-02-22T12:50:49Z)
- Knowledge-driven Data Construction for Zero-shot Evaluation in Commonsense Question Answering [80.60605604261416]
We propose a novel neuro-symbolic framework for zero-shot question answering across commonsense tasks.
We vary the set of language models, training regimes, knowledge sources, and data generation strategies, and measure their impact across tasks.
We show that, while an individual knowledge graph is better suited for specific tasks, a global knowledge graph brings consistent gains across different tasks.
arXiv Detail & Related papers (2020-11-07T22:52:21Z)
- InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective [84.78604733927887]
Large-scale language models such as BERT have achieved state-of-the-art performance across a wide range of NLP tasks.
Recent studies show that such BERT-based models are vulnerable to textual adversarial attacks.
We propose InfoBERT, a novel learning framework for robust fine-tuning of pre-trained language models.
arXiv Detail & Related papers (2020-10-05T20:49:26Z)