Low-resource Personal Attribute Prediction from Conversation
- URL: http://arxiv.org/abs/2211.15324v1
- Date: Mon, 28 Nov 2022 14:04:51 GMT
- Title: Low-resource Personal Attribute Prediction from Conversation
- Authors: Yinan Liu and Hu Chen and Wei Shen and Jiaoyan Chen
- Abstract summary: We propose a novel framework PEARL to predict personal attributes from conversations.
PEARL combines the biterm semantic information with the word co-occurrence information seamlessly via employing the updated prior attribute knowledge.
- Score: 20.873276038560057
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Personal knowledge bases (PKBs) are crucial for a broad range of applications
such as personalized recommendation and Web-based chatbots. A critical
challenge in building PKBs is extracting personal attribute knowledge from users'
conversation data. Given some users of a conversational system, a personal
attribute and these users' utterances, our goal is to predict the ranking of
the given personal attribute values for each user. Previous studies often rely
on a considerable amount of resources such as labeled utterances and external data,
yet the attribute knowledge embedded in unlabeled utterances is underutilized
and their performance in predicting some difficult personal attributes is still
unsatisfactory. In addition, it is found that some text classification methods
could be employed to resolve this task directly. However, they also do not perform
well on those difficult personal attributes. In this paper, we propose a
novel framework PEARL to predict personal attributes from conversations by
leveraging the abundant personal attribute knowledge from utterances under a
low-resource setting in which no labeled utterances or external data are
utilized. PEARL combines the biterm semantic information with the word
co-occurrence information seamlessly via employing the updated prior attribute
knowledge to refine the biterm topic model's Gibbs sampling process in an
iterative manner. The extensive experimental results show that PEARL
outperforms all the baseline methods not only on the task of personal attribute
prediction from conversations over two data sets, but also on the more general
weakly supervised text classification task over one data set.
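The abstract refers to biterms, the unit the biterm topic model works on: an unordered pair of words that co-occur in the same short text. The following is a minimal sketch of biterm extraction only, not of PEARL itself. The function names, the whitespace tokenization, and the sample utterances are assumptions made for illustration and do not come from the paper.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) in one short utterance.

    Each pair is sorted so that ("a", "b") and ("b", "a") count as the same biterm.
    """
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def count_biterms(utterances):
    """Count biterms over a collection of utterances.

    Assumption: lowercasing plus whitespace splitting stands in for real tokenization.
    """
    counts = Counter()
    for utterance in utterances:
        tokens = utterance.lower().split()
        counts.update(extract_biterms(tokens))
    return counts


# Hypothetical utterances from one user.
utterances = ["i teach math", "i teach kids"]
biterm_counts = count_biterms(utterances)
```

A topic model built on biterms would then assign topics to these pairs rather than to single words, which is what makes the approach suited to short texts such as utterances.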
Related papers
- PersonalSum: A User-Subjective Guided Personalized Summarization Dataset for Large Language Models [3.516029765200171]
We propose a high-quality, personalized, manually annotated abstractive summarization dataset called PersonalSum.
This dataset is the first to investigate whether the focus of public readers differs from the generic summaries generated by Large Language Models.
arXiv Detail & Related papers (2024-10-04T20:12:39Z)
- Prompt-based Personality Profiling: Reinforcement Learning for Relevance Filtering [8.20929362102942]
Author profiling is the task of inferring characteristics about individuals by analyzing content they share.
We propose a new method for author profiling which aims at distinguishing relevant from irrelevant content first, followed by the actual user profiling only with relevant data.
We evaluate our method for Big Five personality trait prediction on two Twitter corpora.
arXiv Detail & Related papers (2024-09-06T08:43:10Z)
- Speaker Profiling in Multiparty Conversations [31.518453682472575]
This research paper explores the task of Speaker Profiling in Conversations (SPC).
The primary objective of SPC is to produce a summary of persona characteristics for each individual speaker present in a dialogue.
To address the task of SPC, we have curated a new dataset named SPICE, which comes with specific labels.
arXiv Detail & Related papers (2023-04-18T08:04:46Z)
- AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators [98.11286353828525]
GPT-3.5 series models have demonstrated remarkable few-shot and zero-shot ability across various NLP tasks.
We propose AnnoLLM, which adopts a two-step approach, explain-then-annotate.
We build the first conversation-based information retrieval dataset employing AnnoLLM.
arXiv Detail & Related papers (2023-03-29T17:03:21Z)
- Unsupervised Neural Stylistic Text Generation using Transfer learning and Adapters [66.17039929803933]
We propose a novel transfer learning framework which updates only 0.3% of model parameters to learn style specific attributes for response generation.
We learn style specific attributes from the PERSONALITY-CAPTIONS dataset.
arXiv Detail & Related papers (2022-10-07T00:09:22Z)
- Personal Attribute Prediction from Conversations [9.208339833472051]
We aim to predict the personal attribute value for the user, which is helpful for the enrichment of personal knowledge bases (PKBs).
We propose a framework based on the pre-trained language model with a noise-robust loss function to predict personal attributes from conversations without requiring any labeled utterances.
Our framework obtains the best performance compared with all the twelve baselines in terms of nDCG and MRR.
arXiv Detail & Related papers (2022-08-29T15:21:53Z)
- Dialogue History Matters! Personalized Response Selection in Multi-turn Retrieval-based Chatbots [62.295373408415365]
We propose a personalized hybrid matching network (PHMN) for context-response matching.
Our contributions are two-fold: 1) our model extracts personalized wording behaviors from user-specific dialogue history as extra matching information.
We evaluate our model on two large datasets with user identification, i.e., the personalized Ubuntu dialogue corpus (P-Ubuntu) and the personalized Weibo dataset (P-Weibo).
arXiv Detail & Related papers (2021-03-17T09:42:11Z)
- Vyaktitv: A Multimodal Peer-to-Peer Hindi Conversations based Dataset for Personality Assessment [50.15466026089435]
We present a novel peer-to-peer Hindi conversation dataset, Vyaktitv.
It consists of high-quality audio and video recordings of the participants, with Hinglish textual transcriptions for each conversation.
The dataset also contains a rich set of socio-demographic features, such as income and cultural orientation, among several others, for all the participants.
arXiv Detail & Related papers (2020-08-31T17:44:28Z)
- Deep Active Learning with Crowdsourcing Data for Privacy Policy Classification [6.5443502434659955]
Active learning and crowdsourcing techniques are used to develop an automated classification tool named Calpric.
Calpric is able to perform annotations equivalent to those done by skilled human annotators with high accuracy while minimizing the labeling cost.
Our model is able to achieve the same F1 score using only 62% of the original labeling effort.
arXiv Detail & Related papers (2020-08-07T02:13:31Z)
- TIPRDC: Task-Independent Privacy-Respecting Data Crowdsourcing Framework for Deep Learning with Anonymized Intermediate Representations [49.20701800683092]
We present TIPRDC, a task-independent privacy-respecting data crowdsourcing framework with anonymized intermediate representation.
The goal of this framework is to learn a feature extractor that can hide the privacy information from the intermediate representations while maximally retaining the original information embedded in the raw data, so that the data collector can accomplish unknown learning tasks.
arXiv Detail & Related papers (2020-05-23T06:21:26Z)
- A Neural Topical Expansion Framework for Unstructured Persona-oriented Dialogue Generation [52.743311026230714]
Persona Exploration and Exploitation (PEE) is able to extend the predefined user persona description with semantically correlated content.
PEE consists of two main modules: persona exploration and persona exploitation.
Our approach outperforms state-of-the-art baselines in terms of both automatic and human evaluations.
arXiv Detail & Related papers (2020-02-06T08:24:33Z)