Author2Vec: A Framework for Generating User Embedding
- URL: http://arxiv.org/abs/2003.11627v1
- Date: Tue, 17 Mar 2020 23:31:11 GMT
- Title: Author2Vec: A Framework for Generating User Embedding
- Authors: Xiaodong Wu, Weizhe Lin, Zhilin Wang, and Elena Rastorgueva
- Abstract summary: We propose a novel end-to-end neural network-based user embedding system, Author2Vec.
The model combines sentence representations generated by BERT with a novel unsupervised pre-training objective: authorship classification.
Author2Vec successfully encodes useful user attributes, and the generated user embeddings perform well in downstream classification tasks.
- Score: 5.805785001237604
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Online forums and social media platforms provide noisy but valuable data
every day. In this paper, we propose a novel end-to-end neural network-based
user embedding system, Author2Vec. The model incorporates sentence
representations generated by BERT (Bidirectional Encoder Representations from
Transformers) with a novel unsupervised pre-training objective, authorship
classification, to produce better user embeddings that encode useful
user-intrinsic properties. This user embedding system was pre-trained on post
data of 10k Reddit users and was analyzed and evaluated on two user
classification benchmarks: depression detection and personality classification,
on which the model outperformed traditional count-based and
prediction-based methods. We show that Author2Vec successfully encodes
useful user attributes and that the generated user embeddings perform well in
downstream classification tasks without further fine-tuning.
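The pipeline the abstract describes, sentence representations pooled into a user embedding and pre-trained with an authorship-classification objective, can be sketched schematically. The sketch below is a toy analogue, not the paper's implementation: L2-normalized bag-of-words vectors stand in for BERT sentence representations, and nearest-embedding matching stands in for the learned authorship classifier; all user names and posts are illustrative.

```python
import math

def build_vocab(corpus):
    # Toy vocabulary over all training posts (stand-in for BERT's tokenizer).
    tokens = sorted({t for post in corpus for t in post.lower().split()})
    return {t: i for i, t in enumerate(tokens)}

def sentence_embedding(post, vocab):
    # Bag-of-words vector, L2-normalized: a crude stand-in for the BERT
    # sentence representations the actual Author2Vec model uses.
    vec = [0.0] * len(vocab)
    for t in post.lower().split():
        if t in vocab:
            vec[vocab[t]] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def user_embedding(posts, vocab):
    # User embedding = mean of the user's sentence embeddings.
    embs = [sentence_embedding(p, vocab) for p in posts]
    return [sum(col) / len(embs) for col in zip(*embs)]

def classify_author(post, user_embs, vocab):
    # Authorship classification: assign the post to the user whose embedding
    # is most similar (dot product) -- the pre-training objective in spirit.
    e = sentence_embedding(post, vocab)
    return max(user_embs, key=lambda u: sum(a * b for a, b in zip(e, user_embs[u])))

# Illustrative data: two hypothetical users and their posts.
users = {
    "alice": ["i love hiking in the mountains", "mountains and trails are great"],
    "bob": ["deep learning models are fascinating", "training neural networks daily"],
}
vocab = build_vocab([p for posts in users.values() for p in posts])
embs = {u: user_embedding(posts, vocab) for u, posts in users.items()}
print(classify_author("hiking the mountain trails", embs, vocab))  # -> alice
```

In the actual system, the classifier head is discarded after pre-training and the pooled representation is reused directly as the user embedding for downstream tasks, which is what "without further fine-tuning" refers to.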
Related papers
- Fact Checking Beyond Training Set [64.88575826304024]
We show that the retriever-reader suffers from performance deterioration when it is trained on labeled data from one domain and used in another domain.
We propose an adversarial algorithm to make the retriever component robust against distribution shift.
We then construct eight fact-checking scenarios and compare our model to a set of strong baseline models.
arXiv Detail & Related papers (2024-03-27T15:15:14Z)
- Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
arXiv Detail & Related papers (2023-05-31T16:47:20Z)
- Latent User Intent Modeling for Sequential Recommenders [92.66888409973495]
Sequential recommender models learn to predict the next items a user is likely to interact with based on their interaction history on the platform.
Most sequential recommenders, however, lack a higher-level understanding of user intents, which often drive user behaviors online.
Intent modeling is thus critical for understanding users and optimizing long-term user experience.
arXiv Detail & Related papers (2022-11-17T19:00:24Z)
- Machine and Deep Learning Applications to Mouse Dynamics for Continuous User Authentication [0.0]
This article builds upon our previously published work by evaluating our dataset of 40 users with three machine learning and deep learning algorithms.
The top performer is a 1-dimensional convolutional neural network with a peak average test accuracy of 85.73% across the top 10 users.
Multi-class classification is also examined using an artificial neural network, which reaches a peak accuracy of 92.48%.
arXiv Detail & Related papers (2022-05-26T21:43:59Z)
- Class Token and Knowledge Distillation for Multi-head Self-Attention Speaker Verification Systems [20.55054374525828]
This paper explores three novel approaches to improve the performance of speaker verification systems based on deep neural networks (DNNs).
First, we propose the use of a learnable vector called a Class token to replace the average global pooling mechanism when extracting the embeddings.
Second, we add a distilled representation token for training a teacher-student pair of networks using the Knowledge Distillation (KD) philosophy.
arXiv Detail & Related papers (2021-11-06T09:47:05Z)
- PETGEN: Personalized Text Generation Attack on Deep Sequence Embedding-based Classification Models [9.630961791758168]
Malicious users can evade deep detection models by manipulating their behavior.
Here we create a novel adversarial attack model against deep user sequence embedding-based classification models.
In the attack, the adversary generates a new post to fool the classifier.
arXiv Detail & Related papers (2021-09-14T15:48:07Z)
- Hierarchical Bi-Directional Self-Attention Networks for Paper Review Rating Recommendation [81.55533657694016]
We propose a Hierarchical bi-directional self-attention Network framework (HabNet) for paper review rating prediction and recommendation.
Specifically, we leverage the hierarchical structure of the paper reviews with three levels of encoders: a sentence encoder (level one), an intra-review encoder (level two), and an inter-review encoder (level three).
We are able to identify useful predictors for making the final acceptance decision, as well as to help discover inconsistencies between numerical review ratings and the text sentiment conveyed by reviewers.
arXiv Detail & Related papers (2020-11-02T08:07:50Z)
- Towards Open-World Recommendation: An Inductive Model-based Collaborative Filtering Approach [115.76667128325361]
Recommendation models can effectively estimate underlying user interests and predict one's future behaviors.
We propose an inductive collaborative filtering framework that contains two representation models.
Our model achieves promising results for recommendation on few-shot users with limited training ratings and new unseen users.
arXiv Detail & Related papers (2020-07-09T14:31:25Z)
- Federated Learning of User Authentication Models [69.93965074814292]
We propose Federated User Authentication (FedUA), a framework for privacy-preserving training of machine learning models.
FedUA adopts a federated learning framework to enable a group of users to jointly train a model without sharing their raw inputs.
We show that our method is privacy-preserving, scales with the number of users, and allows new users to be added to training without changing the output layer.
arXiv Detail & Related papers (2020-07-09T08:04:38Z)
- Large-scale Hybrid Approach for Predicting User Satisfaction with Conversational Agents [28.668681892786264]
Measuring user satisfaction level is a challenging task, and a critical component in developing large-scale conversational agent systems.
Human annotation based approaches are easier to control, but hard to scale.
A novel alternative approach is to collect users' direct feedback via a feedback elicitation system embedded in the conversational agent system.
arXiv Detail & Related papers (2020-05-29T16:29:09Z)
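The federated training loop described in the FedUA entry above rests on a federated-averaging pattern: each user computes a model update on private data, and the server aggregates only the parameters. The sketch below illustrates that pattern, not FedUA's actual authentication model; the linear-regression task, learning rate, and all data are illustrative assumptions.

```python
def local_update(weights, data, lr=0.1):
    # One local gradient step of linear regression (y ~ w*x + b) on a user's
    # private (x, y) pairs; the raw samples never leave this function.
    w, b = weights
    gw = gb = 0.0
    for x, y in data:
        err = (w * x + b) - y
        gw += err * x
        gb += err
    n = len(data)
    return (w - lr * gw / n, b - lr * gb / n)

def federated_round(weights, per_user_data):
    # FedAvg-style server step: average the users' locally updated weights.
    # Only model parameters are exchanged, matching FedUA's privacy premise.
    updates = [local_update(weights, d) for d in per_user_data]
    k = len(updates)
    return (sum(u[0] for u in updates) / k, sum(u[1] for u in updates) / k)

# Three users, each holding private samples of the same function y = 2x.
per_user_data = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)], [(0.5, 1.0), (4.0, 8.0)]]
weights = (0.0, 0.0)
for _ in range(2000):
    weights = federated_round(weights, per_user_data)
print(round(weights[0], 2))  # -> 2.0
```

The server recovers the shared slope without ever seeing any user's (x, y) pairs; scaling to new users only means including their updates in the average, which is consistent with the entry's claim that new users can be added without changing the output layer.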
This list is automatically generated from the titles and abstracts of the papers in this site.